Seeking the greatest common divisor in qualitative coding


This post is based on a talk I gave at ICQI 2018, which itself leads on from a talk from last year on the Entomologies of qualitative coding.


Good qualitative data is rich and detailed - a fertile medium for understanding and interpreting the world. But the detail of the data comes at a price: qualitative data sources are usually lengthy, and are about a lot of different things. You don't just ask a single question that can be answered with a one-word answer; you inquire and explore a range of issues around the topic to draw out detail and explore the 'why' behind the answers.

This means that the analysis of qualitative data starts with reading the data, to get a sense of the landscape of it, but an intermediary stage is coding - and this is the part of qualitative analysis that qualitative software like Quirkos can help with. We create codes which are like themes, and read through the text and put sections of text which are relevant to them into each code. Tagging the data in this way lets us bring quotes together that fit a theme, to eventually support (or disprove) a hypothesis. But what should these codes be? What features do we highlight that help us see the similarities and differences in the data?

Broadly speaking, these themes can be of two types: very low order, basic descriptive codes, or 'higher level' conceptual codes. It's difficult to describe the process and the difference between these types of coding, but you can conceptualise it as moving from the lowest common denominator across the data to the highest (although it's also possible to do it the other way around).

This concept is often found in high-school level math when trying to add fractions. If they have different denominators (the bottom number, which shows how many equal parts the whole is divided into) you have to multiply them out to get a common denominator - in other words, a number that both of the original denominators divide into exactly. It’s also the same game that entomologists are playing when trying to create a taxonomy of insects or other animals. Think about how you would describe the features that are common to the butterflies in the top image. It can be a specific spot of a certain colour (a basic low level feature) or a feature that may look very different, but has a similar purpose - like an antenna (high level).
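For anyone who wants the fraction arithmetic spelled out, here is a minimal sketch in Python. The fractions 1/4 and 1/6 and the function name are just illustrative choices of mine, not part of the original talk.

```python
from fractions import Fraction
from math import gcd

def add_with_common_denominator(a_num, a_den, b_num, b_den):
    """Add a_num/a_den and b_num/b_den by first rewriting both fractions
    over the lowest common denominator."""
    # The lowest common denominator is the smallest number that
    # both original denominators divide into exactly.
    common = a_den * b_den // gcd(a_den, b_den)
    # Scale each numerator by the same factor as its denominator.
    a_scaled = a_num * (common // a_den)
    b_scaled = b_num * (common // b_den)
    return common, a_scaled, b_scaled, Fraction(a_scaled + b_scaled, common)

common, a_scaled, b_scaled, total = add_with_common_denominator(1, 4, 1, 6)
print(f"1/4 + 1/6 = {a_scaled}/{common} + {b_scaled}/{common} = {total}")
# prints: 1/4 + 1/6 = 3/12 + 2/12 = 5/12
```

Once both fractions are expressed in twelfths they can be compared and added directly, which is the point of the metaphor: a shared unit makes different things comparable.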

This is a bit like what qualitative coding often tries to do - find common themes that occur across all the sources. At a very basic level these will probably be very simple. Everyone in our sources is talking about 'Politics' in a general sense, 'The Media' and 'Opinions'. Creating these descriptive codes, and putting text into them is a useful way to start the coding process. It creates a 'map' or list of everything everyone is saying about 'The Media' and we can then read all these quotes together and look for patterns.
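If it helps to think of this 'map' as a data structure, it can be sketched roughly as below. This is only an illustration in Python and not how Quirkos works internally; the interview names and quotes are invented, and only the three code names come from the example above.

```python
from collections import defaultdict

# Hypothetical coded segments: (source, quoted text, codes applied to it).
segments = [
    ("Interview 1", "I stopped trusting the news after the last election.",
     ["The Media", "Politics"]),
    ("Interview 2", "My views are my own, I don't follow a party line.",
     ["Opinions", "Politics"]),
    ("Interview 3", "Cable news just tells people what they want to hear.",
     ["The Media"]),
]

def quotes_by_code(coded_segments):
    """Build a map from each code to every quote tagged with it."""
    index = defaultdict(list)
    for source, text, codes in coded_segments:
        for code in codes:
            # A quote tagged with several codes appears under each of them.
            index[code].append((source, text))
    return dict(index)

index = quotes_by_code(segments)

# Read everything said about 'The Media' together, across all sources.
for source, text in index["The Media"]:
    print(f"{source}: {text}")
```

The same quote can sit under more than one code, which is why a single passage can support several themes at once.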

But there is a risk of creating codes that are so 'common' and so basic that they are pretty meaningless on their own. The vaguer the theme, the more data will fit into it, but the less useful the filtering of the data becomes. Remember, the end goal is to find data that will answer your research question, and it is unlikely to be as vague as 'What do people say about Politics?'. Usually you are looking for a much more specific insight, such as 'How do libertarian leaning people distance themselves from the policies of the Republican party?'. To do this, and to make a meaningful conclusion, we need to move to something more akin to the highest common denominator. In other words, what are the highest level, most significant and specific insights that are common themes in the data?

These are the 'highest common divisors' - in math, the largest number that divides exactly into every number in a set (or into the numerator and denominator of a fraction). Every number is divisible by itself and by one. In the same way, everything an individual says is true about themselves, and each word or statement they make is true in itself. However, neither of these is particularly interesting or insightful in itself, without some point of comparison. An opinion or experience does not have to be common to everyone to be relevant in qualitative research; in fact, that's the strength of this methodology. However, you could argue that dissenting or different views are only interesting in comparison.
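To make the arithmetic side of the metaphor concrete, here is a small Python sketch. The numbers 12, 18 and 30 are arbitrary examples chosen for illustration.

```python
from functools import reduce
from math import gcd

def common_divisors(numbers):
    """List every whole number that divides all of the given numbers exactly."""
    return [d for d in range(1, min(numbers) + 1)
            if all(n % d == 0 for n in numbers)]

def greatest_common_divisor(numbers):
    """Fold math.gcd across the list to get the largest shared divisor."""
    return reduce(gcd, numbers)

values = [12, 18, 30]
print(common_divisors(values))          # prints: [1, 2, 3, 6]
print(greatest_common_divisor(values))  # prints: 6
```

One is always in the list of common divisors, but it tells you nothing about the numbers. The largest shared divisor is the informative one, in the same way that the most specific theme that still runs across your sources is the one worth finding.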

The highest common denominator codes should be at a high enough conceptual level to cover a variety of opinions, while bringing them together under a common theory. It's not just saying that some people think one way about their political leanings and some think another, but showing how they create a political identity. High level codes should be a close match to a theoretical interpretation of the world, such as “Gender is a performative act” (Butler 1988). These may be an existing theory, or a new theory you are discovering by applying a grounded theory approach.

But it's usually pretty hard to jump straight to this level of understanding of your data. Maybe you can read through all the sources once and just see a new conceptual understanding of the world emerge. However, this is rare, and you would probably still want to have quotes to illustrate and support your understanding. That's why creating your coding structure of the lowest common denominator first can help you to get to the next levels. And there may be multiple levels of coding, involving moving up, grouping and refining codes to support a deeper hypothesis. It's one reason why qualitative analysis is often described as a cyclical, iterative process.
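The 'moving up and grouping' step can be pictured as a simple roll-up from low level codes to the concepts they sit under. The Python sketch below is only an illustration: every code name, grouping and quote count in it is hypothetical, and counting quotes is a stand-in for the real work of reading them.

```python
# Hypothetical number of quotes coded to each low level, descriptive code.
quote_counts = {
    "Party loyalty": 14,
    "Voting habits": 6,
    "Distrust of news": 9,
    "Social media use": 11,
}

# Hypothetical grouping of descriptive codes under higher level concepts.
parent_of = {
    "Party loyalty": "Political identity",
    "Voting habits": "Political identity",
    "Distrust of news": "Relationship with the media",
    "Social media use": "Relationship with the media",
}

def roll_up(counts, parents):
    """Sum the quote counts of low level codes under their parent code."""
    totals = {}
    for code, count in counts.items():
        # A code with no parent yet simply stays at its own level.
        parent = parents.get(code, code)
        totals[parent] = totals.get(parent, 0) + count
    return totals

higher_level = roll_up(quote_counts, parent_of)
print(higher_level)
```

Changing the grouping and running the roll-up again mirrors the iterative cycle described above: the descriptive codes stay put while the conceptual layer on top of them is refined.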

Quirkos is designed to help you create and manage these different stages of coding with the ‘levels’ feature. This lets you create groups of codes from different coding iterations, and even have some that belong to all or just some of the levels. There’s a lot more information about how they work (and can be used to do other things) in this blog post on levels and groups. However, you can also always download the full version of Quirkos for free and try it for a month. It’s the easiest qualitative analysis software package to learn, as well as being one of the cheapest and most visual.

But remember that even once you have created and populated a high level coding framework, this is not the same as analysis. You still need to make the leap from coding to qualitative analysis and actually read through the coded data, keep re-conceptualising it, and eventually match it to your research questions so that they can be answered. However, if you can keep coming back to the butterfly categorisation or fraction addition metaphors above, it might help you keep an eye out for both the low and high level themes in your research, and develop a rich coding framework that will help your insights and conclusions bubble up from your data.