
Using NVivo to structure a computational ontology

28 January 2019 - BY Matthew Hanchard

We all know that NVivo is an incredibly powerful tool for qualitative and mixed-methods data analysis, but it can be used for much more than that! In this post, I explain how coding to both Nodes and Relationships can be used to help develop a computational ontology, without losing richness and nuance or imposing a pre-existing structure onto the data.


Recent advances in computing offer researchers many opportunities, from carrying out more complex correspondence analyses of large statistical datasets than ever before through to the automated transcription of interview recordings. In digital humanities research, these advances have been embraced wholeheartedly. For example, at Beyond the Multiplex[1] (UKRI, 2017) we are currently developing a computational ontology to explore data from a large-scale mixed-methods research project, which includes data from:
  • Longitudinal survey - 3 waves over 6 months (N=5,000, n=500, n=500)
  • Semi-structured interviews with audience members (x 200)
  • Expert interviews with policymakers (x 32)
  • Film-elicitation focus groups with audience members (x 16)
  • Policy documents (250+)
A computational ontology allows researchers to classify “…components and characteristics of a particular knowledge domain…” (Pidd & Rodgers, 2018) as either (1) ‘entities’; (2) ‘characteristics’ of entities; or (3) ‘relationships’ between two entities. Rather than using this three-part classification to “…dictate how data is described, structured, and related…” (Ibid.) within a data model (a typical approach to database development), computational ontologies allow researchers to specify exactly how entities and their characteristics relate to one another. To clarify, a data model can help identify a relationship between the film genre ‘horror’ (as an entity) and the perception of horror films as ‘scary’ (as an entity characteristic). However, it cannot tell us anything about that relationship. By contrast, a computational ontology allows researchers to classify the relationship itself. For example, if the relationship between ‘horror’ and ‘scary’ is classified as ‘not too’, it might allow us to understand that some people prefer horror films when they perceive them as being ‘not too’ scary.
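As a rough illustration of this three-part classification, the ‘horror’/‘scary’ example can be sketched as a set of classified triples. This is a minimal Python sketch of the idea only; the class and field names are ours, not part of any ontology standard or of the project’s actual data model.

```python
from dataclasses import dataclass

# Illustrative names only: 'Entity' covers both entities and entity
# characteristics here, and 'relation_type' carries the classification
# of the relationship itself (the part a plain data model cannot express).

@dataclass(frozen=True)
class Entity:
    name: str

@dataclass(frozen=True)
class Relationship:
    source: Entity        # e.g. the genre 'horror' (an entity)
    relation_type: str    # e.g. 'not too' (the classified relationship)
    target: Entity        # e.g. 'scary' (an entity characteristic)

horror = Entity("horror")
scary = Entity("scary")

# The relationship is classified, not merely recorded:
preference = Relationship(horror, "not too", scary)
print(f"{preference.source.name} -[{preference.relation_type}]-> {preference.target.name}")
# prints: horror -[not too]-> scary
```

The point of the sketch is that the label on the edge (‘not too’) is first-class data, which is exactly what distinguishes a computational ontology from a data model that only records that an edge exists.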

In our project, we study specialised film audiences and their film-watching practices, connecting both to national policy and to industry practices. We draw on a computational ontology to explore data across the project holistically, and to query across all data types, e.g. to see how well a concept developed in our analysis of interviews scales up through survey data or compares to national policy. For this, the ontology and our extensive use of NVivo Relationships and Relationship Types provide a way for us to draw together concepts developed in separate analyses (and separate NVivo Projects), and to explore how they relate to one another.[2]

Typically, software developers use data models and computational ontologies to provide a structure for data. That structure is imposed onto data, and any later data is adjusted to fit the pre-existing structure – a process laden with personal assumptions, prejudices, and bias. By contrast, we code to Relationships in NVivo, and name the Relationship Types in order to build a structure from the ground up (inductively). This ensures that as we develop a computational ontology, it remains grounded within, and driven by, the data.

To develop the computational ontology, we first used NVivo to code transcripts of interviews and focus groups. We coded to Nodes (to develop entities and entity characteristics). We then built (and coded to) Relationships between Nodes, and assigned them to a set of Relationship Types that we developed throughout our coding.

Whether you are carrying out a small-scale qualitative analysis or building a computational ontology from a large mixed-methods dataset, coding to Relationships and Relationship Types in NVivo provides a useful way to explore how the Items within your Project (e.g. Nodes) relate to one another. Creating Relationships and Relationship Types is relatively easy:

Step 1:


Select ‘Create’ from the ribbon bar, then locate and select ‘Relationships’ within the ‘Nodes’ group (Figure 1).

Figure 1

Step 2:


When the dialogue box pops up, use the two ‘Select’ buttons to access a second dialogue box. This enables you to search for and select the two Project Items you want to connect in a new Relationship (Figure 2).

When you create a new Relationship, the Relationship Type will be designated as ‘Associated’ and not assigned to any specific direction (Figure 3). If you want to designate the Relationship as a specific Type, simply follow the third step below.

Figure 2

Step 3:


Within the dialogue box described in Step 2, select the ‘New’ button. This opens an additional dialogue box that allows you to create a Relationship Type, and to define its direction (Figure 3). For example, when we looked at people’s choice of film-viewing platform, we found that video-on-demand services such as Amazon Prime and Netflix are starting to replace DVD collections at home, but that the reverse was not true. To that effect, we generated a new Relationship Type called ‘REPLACES’ to connect an ‘entity’ called ‘Video-on-Demand Services’ with an ‘entity characteristic’ called ‘DVD collection’ (Figure 3).
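The difference between the default undirected ‘Associated’ Type (Step 2) and a directed Type such as ‘REPLACES’ can be sketched outside NVivo. The helper below and its triple representation are illustrative assumptions of ours, not NVivo’s internal format.

```python
def relates(a, rel_type, b, directed):
    """Return the (source, type, target) triples implied by a relationship.

    A directed Type (e.g. 'REPLACES') holds one way only; an undirected
    Type (e.g. the default 'Associated') holds in both directions.
    """
    triples = [(a, rel_type, b)]
    if not directed:
        triples.append((b, rel_type, a))  # undirected: also holds in reverse
    return triples

# Default: 'Associated', with no assigned direction
print(relates("Video-on-Demand Services", "Associated", "DVD collection", directed=False))

# A directed 'REPLACES' Type: one-way, since the reverse was not true in our data
print(relates("Video-on-Demand Services", "REPLACES", "DVD collection", directed=True))
```

In other words, directionality is analytically meaningful: recording that video-on-demand REPLACES DVD collections, but not vice versa, is a finding in itself.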

Figure 3


To code the data, we drew on an initial set of high-level Nodes (entities and characteristics), e.g. ‘times’, ‘places’, etc., before expanding them through our analysis of the data. This followed a pilot study (Corbett et al., 2014), in which we defined the initial set of entities and entity characteristics for the ontology. By using NVivo in this way, we found that rather than having to force a structure onto the data and then coding towards that structure, we could work inductively, and therefore stay closer to our data whilst keeping consistency across datasets.

Our research involves several universities, some of which have not yet upgraded their licensing models (either to NVivo 11 Server or to NVivo 12 for Teams). For that reason, we used NVivo 11 Pro (Standard). This meant our researchers each generated separate Projects for each set of interviews, policy documents, and focus groups. Keeping a set of high-level Nodes in place across all Projects enabled us to integrate the datasets more easily when Merging our Projects within NVivo prior to running Extracts and Exports. At the same time, our coding of data in each Project developed a complex hierarchy of Nodes beneath the initial high-level set (Figure 4). By extension, the Nodes, Relationships, and Relationship Types generated through data analyses led to iterative revision of the computational ontology’s structure.

Figure 4



There is some post-NVivo work required to get Nodes, Relationships, and Relationship Types into an ontology. For example, we ran Extracts in NVivo to get XML files of all text coded to Nodes, and to identify all Intersecting Nodes. We also Exported all Relationships (and Relationship Types) as HTML files. After parsing the extracts and exports together in XML, we used JavaScript to ready them for building into a SQL-based database and the computational ontology itself. However, alternative approaches could have been used.
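As one of those alternative approaches, the gist of this pipeline, parsing an export of Relationships into a relational table, can be sketched in a few lines. This sketch uses Python rather than the JavaScript we used, and the XML structure shown is invented for illustration; NVivo’s actual Extract and Export formats differ.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical export structure, standing in for a real NVivo Export.
SAMPLE_EXPORT = """
<relationships>
  <relationship type="REPLACES" direction="one-way">
    <from>Video-on-Demand Services</from>
    <to>DVD collection</to>
  </relationship>
</relationships>
"""

# Build a SQL table of relationship triples (in-memory for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationship (source TEXT, rel_type TEXT, direction TEXT, target TEXT)"
)

# Parse the export and load each relationship as one row.
root = ET.fromstring(SAMPLE_EXPORT)
for rel in root.iter("relationship"):
    conn.execute(
        "INSERT INTO relationship VALUES (?, ?, ?, ?)",
        (rel.findtext("from"), rel.get("type"), rel.get("direction"), rel.findtext("to")),
    )
conn.commit()

for row in conn.execute("SELECT source, rel_type, direction, target FROM relationship"):
    print(row)
```

Once the triples are in a database, the ontology’s entities, characteristics, and classified relationships can be queried together, which is what enables the cross-dataset queries described above.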

As a side note, we find that coding towards a computational ontology generates a lot of Nodes (see Figure 3: there are 4,134 Nodes in our Project!). This contradicts the common advice in the qualitative and mixed-methods coding literature, where researchers are often warned not to let their coding scheme ‘go viral’ or descend into unmanageable repetition, but to maintain a small and tightly focussed coding scheme instead (Bazeley and Jackson, 2013; Andrews, 2008). For generating theory from data, this advice is well heeded, and in our conceptually-driven coding we certainly followed it. However, coding towards an ontology requires a far more voluminous range of descriptive codes. For example, one of our more conceptual Nodes, ‘Value of cinema’, is well-focussed with only 7 Nodes (or subcategories) beneath it. Meanwhile, a descriptive Node called ‘Film and Film Series Titles’ holds 788 separate Nodes beneath it, allowing for later comparison with survey data and other national datasets through the computational ontology.

Overall, we found that by using NVivo both to code our data and to structure the coding scheme, we were able to provide an analysis that was suitable for a computational ontology. This enabled us to go beyond traditional mixed-methods research, and to work with a large volume of empirical data without forcing pre-conceived ideas onto the research itself. 

 

References:

Andrews, G. (2008) ‘Coding Fetishism’, in Given, L. (ed.) The Sage Encyclopaedia of Qualitative Research Methods: Vol. 2, M-Z Index. London: Sage, pp. 286–287.
Bazeley, P. and Jackson, K. (2013) Qualitative Data Analysis with NVivo. London: Sage.
Corbett, S., Wessels, B., Forrest, D., and Pidd, M. (2014) How Audiences Form: Exploring Film Provision and Participation in the North of England. Available at: https://www.showroomworkstation.org.uk/media/FilmHubNorth/How_Audiences_Form_Full_Report_UPDATED.pdf (Accessed: 07-Jan-2019).
Pidd, M. and Rodgers, K. (2018) Why use an ontology? Mixed methods produce mixed data, Available at: https://www.beyondthemultiplex.net/why-use-an-ontology-mixed-methods-produce-mixed-data/ (Accessed: 07-Jan-2019).
UKRI (2017) Beyond the Multiplex: Audiences for Specialised Film in English Regions, UKRI gateway to publicly funded research and innovation. Available at: https://gtr.ukri.org/projects?ref=AH%2FP005780%2F1 (Accessed: 07-Jan-2019).
 

[1] Beyond the Multiplex is an Arts and Humanities Research Council funded project (grant reference AH/P005780/1). Researchers include Bridgette Wessels, David Forrest, Andrew Higson, Mike Pidd, Simeon Yates, Matthew Hanchard, Huw Jones, Peter Merrington, Katherine Rogers, Roderik Smits, and Nathan Townsend.