Merging and splitting themes in qualitative analysis


To merge or to split qualitative codes, that is the question…


One of the most asked questions when designing a qualitative coding structure is ‘How many codes should I have?’. It’s easy to start out a project thinking that just a few themes will cover the research questions, but sooner or later qualitative analysis tends towards ballooning thematic structure, and before you’ve even started you might have a framework with dozens of codes. And while going through and analysing the data, you might end up with another couple dozen more. So it’s quite common for researchers to end up with more than a hundred codes (or sometimes hundreds)!


This can be alarming for students doing qualitative analysis for the first time, but I would argue it’s fine in most situations. While it can be confusing and disorienting if you are using paper and highlighters, when using CAQDAS software a large number of themes can be quite manageable. However, this itself can be a problem, since qualitative software makes it almost too easy to create an unwieldy number of codes. While some restraint is always advisable, when I am running workshops I usually advise new coders not to worry, since with the software it is easier to merge codes later than to split them apart.


I’m going to use the example of Quirkos here, but the same principle applies to any qualitative data analysis package. When you are going through and analysing your qualitative text sources, reading and coding them is the most time-consuming part. If you create a new code for a theme halfway through coding your data because you can see it is becoming important, you will have to go back to the beginning and read through the already coded sources to make sure you have complete coverage. That’s why it’s normally easier to think through codes before starting a code/read through.


Of course there is some methodological variance here: if you are doing certain types of grounded theory this may not apply, as you will want to create themes on the fly. It’s also worth noting that good qualitative coding is an iterative process, and you should expect to go through the data several times anyway. Usually each time you do this you will look at the code structure in a different way – maybe creating a higher-level, theory-driven coding framework on each pass.


However, there is another way that QDA software helps you manage your qualitative themes: it is simple to merge smaller codes together under a more general heading. In Quirkos, just right click on the code bubble you want to keep, and you will see the dialogue below:


merging qualitative codes in quirkos

Then select from the drop down list of other themes in your project which topic you want to merge into the Quirk you selected first. That’s it! All the coded text in the second bubble will get added to the first one, and it will keep the name of that code, appended with “(merged)” so you can identify it.
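If it helps to picture what a merge does to your data, here is a minimal sketch in plain Python. It is a toy model and not how Quirkos (or any other package) works internally: the codebook is assumed to be a simple dictionary mapping code names to lists of quotes, and the function name merge_codes and the example quotes are made up for illustration. Only the “(merged)” suffix is taken from the behaviour described above.

```python
def merge_codes(codebook, keep, absorb):
    """Return a new codebook where the quotes from `absorb` are added to `keep`.

    `codebook` is a hypothetical dict of code name -> list of quotes.
    The surviving code is renamed with a "(merged)" suffix.
    """
    merged = dict(codebook)            # copy, so the original stays untouched
    combined = list(merged.pop(keep))  # start with the quotes of the kept code
    for quote in merged.pop(absorb):   # then add quotes from the absorbed code
        if quote not in combined:      # skip text that was coded to both
            combined.append(quote)
    merged[f"{keep} (merged)"] = combined
    return merged


codebook = {
    "Fast food": ["We grabbed burgers on the way home."],
    "Mexican": ["The taco place near work is our Friday treat."],
}
result = merge_codes(codebook, keep="Fast food", absorb="Mexican")
print(result)
```

Notice that the merge needs no decisions from the researcher: every quote simply moves across. That is the reason merging is the cheap direction.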


Since it is so easy to merge topics in qualitative software, I generally suggest that people aren’t afraid to create a large number of very specific topics, knowing they can merge them together later. For example, if you are creating a code for when people are talking about eating out at a restaurant, why not start with separate codes for Fast food, Mexican, Chinese, Haute cuisine etc. – since you can always merge them later under the generic ‘Restaurant’ theme if you decide you don’t need that much detail.


It is also possible to retroactively split broad codes into smaller categories, but this is a much more involved process. To do this in Quirkos, I would start by taking the code you want to expand (say Restaurant) and make sure it is a top level code – in other words, that it is not a subcategory of another code. Then, create the codes you want to break out (for example Thai, Vegetarian, Organic) and make them subcategories of the main node. Then, double click on the top Quirk, and you will get a list of all the text coded to the top node (Restaurant). From this view in Quirkos, you can drag and drop each quote into the relevant subcategory (e.g. Organic, Thai):

splitting qualitative codes in quirkos

Once you have gone through and recoded all the quotes into new codes, you can either delete the quotes from the top level code (Restaurant) one by one (by right clicking on the highlight stripe), or remove all quotes from that node by deleting the top-node entirely. If you still want to have a Restaurant Quirk at the top to contain the sub categories, just recreate it, and add the sub-categories to it. That way you will have a blank ‘Restaurant’ theme to keep the subcategories (Thai, Organic) together.
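To show why splitting takes more effort than merging, here is a similar toy sketch in Python. Again, this is not Quirkos code: the function split_code, the decisions dictionary and the example quotes are all hypothetical. The point is that a split cannot happen without the researcher supplying a decision for each quote, which is what the drag and drop step above amounts to.

```python
def split_code(quotes, decisions):
    """Distribute the quotes of one broad code into narrower subcodes.

    `decisions` is a hypothetical dict of quote -> list of subcode names,
    standing in for the researcher re-reading and recoding each quote.
    Quotes without a decision stay behind in the broad code.
    """
    subcodes = {}
    remaining = []
    for quote in quotes:
        targets = decisions.get(quote, [])
        if not targets:
            remaining.append(quote)  # not recoded yet
        for target in targets:
            subcodes.setdefault(target, []).append(quote)
    return subcodes, remaining


restaurant = [
    "The Thai place does a great green curry.",
    "The vegetarian cafe uses only organic produce.",
    "Eating out is too expensive these days.",
]
decisions = {
    restaurant[0]: ["Thai"],
    restaurant[1]: ["Vegetarian", "Organic"],  # one quote can go to two subcodes
}
subcodes, remaining = split_code(restaurant, decisions)
print(subcodes)
print(remaining)
```

The third quote has no obvious subcategory, so it is left in the broad code, which is one reason you may want to keep a top level ‘Restaurant’ theme around.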


So to summarise, don’t be afraid to have too many codes in CAQDAS software – use the flexibility it gives you to experiment. While you can always have too much of a good thing, the software will help you see all the coding options at once, so you can decide the best place to categorise each quote. With the ability to merge, and even split apart codes with a little effort, it’s always possible to adjust your coding framework later – in fact you should anticipate the need to do this as you refine your interpretations. You can also save your project at one stage of the coding, and go back to that point if you need to revert to an earlier state to try a different approach. For more information about large or small coding strategies, this blog post goes into more depth.

If you want to see how this works in Quirkos, just download the free trial and try for yourself. Quirkos makes operations like merge and split really easy, and the software is designed to be intuitive, visual and colourful. So give it a try, and always contact us if you have any questions or suggestions on how we can make common operations like this quicker and simpler!



Using qualitative analysis software to teach critical thought



It’s a key part of the curriculum for British secondary school and American high school education to teach critical thought and analysis. It’s a vital life skill: the ability to look at who is saying what, and pick apart what is being said. I’ve been thinking about the possible role for qualitative analysis in education, and how qualitative data analysis software in particular could help develop critical analysis skills in students of all ages.


While using qualitative analysis software is fairly common at university level, it’s a little unusual (possibly unprecedented at a quick glance) to use it at higher/secondary level with pre-college students. But why is this the case? It may well be that previously the software was too complex or expensive to use in mainstream schools, especially when you consider the amount of training the teachers and educators would have to have.


However, Quirkos was designed to make qualitative analysis more accessible by being easier to learn and teach, while also reducing the cost of licences. Thus it may make a better fit than previous options for the higher education sector. But how would such an approach work, and how would it fit into an already tight curriculum?


First of all, the notion of critical reading and analysis is prominent as a ‘core skill’ in UK secondary and USA K-12 education. For example the UK English curriculum states that:
“Critical reading, discussing, appreciating and exploring texts is essential for learning across the curriculum”

In History, teachers should:
“equip pupils to ask perceptive questions, think critically, weigh evidence, sift arguments, and develop perspective and judgement… [and] understand the methods of historical enquiry, including how evidence is used rigorously to make historical claims, and discern how and why contrasting arguments and interpretations of the past have been constructed”

Even in the USA the Common Core State Standards “stresses critical-thinking, problem-solving, and analytical skills that are required for success in college, career, and life”

I would argue that including qualitative analysis in a curriculum can tick many of these boxes, and provide a flexible way to integrate these skills into other lesson plans. For example, in History, students could be given a number of newspaper articles covering an important historical event. These may come from different countries or papers with different viewpoints, and using qualitative software students could perform comparative analysis, identifying sections of the text that show bias or contradict each other.


In an English class, students could be provided with a digital copy of a book on the reading list, and given a framework with topics to explore, encouraging them to identify metaphors, similes, or more specific issues like ‘representations of women’ or other recurring themes. If qualitative analysis software became a standard tool in schools, it could easily fit into a variety of activities, with teachers easily able to look at students’ outputs for marking and group discussion.


Finally, students of any age could be encouraged to do their own qualitative research project, surveying their peers or community on topics both topical and relevant to the curriculum. That way, children can also learn about setting research questions, bias, and presenting results, helping them better understand and critique the barrage of studies they are exposed to in the media.



The visual, colourful and interactive interface of Quirkos is very intuitive to the digital touch-screen generation: it not only looks like a game, but provides visual stimulation and feedback in the same way. Watching their bubble codes grow, and organising topics like petals in a flower should be intuitive for children of all ages, but is also fundamentally teaching them the basics of qualitative analysis: sorting and categorising data, and thinking about what different sources are saying.


We are talking to educators in the UK already about developing example lesson plans and curriculums around Quirkos and qualitative analysis. There are a lot of practical hurdles to overcome, including the plurality of different IT systems schools use, and the limited amount of time teachers get to learn and enact new methods.


But the benefits are considerable: a background in qualitative research and analysis techniques is a great transferable skill for students to take into their working life. Although there don’t seem to be a lot of jobs outside research that make qualitative analysis experience an essential criterion, many jobs involve dealing with written text in just such a way. Few workers in office environments can get by without engaging with company or government policy documents, and in areas like HR, staff have to critically appraise (in a replicable and guided way) written documents like CVs and covering letters on a regular basis.


And it’s a frequent complaint from employers that these are exactly the kind of skills applicants are lacking:

“In survey after survey, they rate young applicants as deficient in such key workplace skills as written and oral communication, critical thinking and analytical reasoning.”

The Collegiate Learning Assessment Plus test used in the US university system measures analytical reasoning, critical thinking, document literacy, writing and communication skills – all considered essential areas by employers from all backgrounds. A recent study found that 40% of students, even at university level, lacked proficiency in these areas.


Qualitative analysis requires students to develop all of these skills, and getting started at a young age will not only help high school students begin their academic studies, where critical reasoning will become a daily task, but also set them on the right path to employment, and to becoming engaged and informed members of society.



In vivo coding and revealing life from the text

Ged Carrol

Following on from the last blog post on creating weird and wonderful categories to code your qualitative data, I want to talk about an often overlooked way of creating coding topics – using direct quotes from participants to name codes or topics. This is sometimes called “in vivo” coding, from the Latin ‘in life’, and is not to be confused with the ubiquitous qualitative analysis software ‘NVivo’, which can be used for any type of coding, not just in vivo!

In an older article I did talk about having a category for ‘key quotes’ - those beautiful times when a respondent articulates something perfectly, and you know that quote is going to appear in an article, or even be the article title. However, with in vivo coding, a researcher will create a coding category based on a key phrase or word used by a participant. For example someone might say ‘It felt like I was hit by a bus’ to describe their shock at the event, and rather than creating a topic/node/category/Quirk for ‘shock’, the researcher will name it ‘hit by a bus’. This is especially useful when metaphors like this are commonly used, or someone uses an especially vivid turn of phrase.

In vivo coding doesn’t just apply to metaphor or emotions, and can keep researchers close to the language that respondents themselves are using. For example when talking about how their bedroom looks, someone might talk about ‘mess’, ‘chaos’, or ‘disorganised’ and their specific choice of word may be revealing about their personality and embarrassment. It can also mitigate the tendency for a researcher to impose their own discourse and meaning onto the text.

This method is discussed in more depth in Johnny Saldaña’s book, The Coding Manual for Qualitative Researchers, which also points out how a read-through of the text to create in vivo codes can be a useful process to create a summary of each source.

Ryan and Bernard (2003) use a different terminology, indigenous categories or typologies, after Patton (1990). However, here the meaning is a little different – they are looking for unusual or unfamiliar terms which respondents use in their own subculture. Good examples of these are slang terms unique to a particular group, such as drug users, surfers, or the shifting vernacular of teenagers. Again, conceptualising the lives of participants in their own words can create a more accurate interpretation, especially later down the line when you are working more exclusively with the codes.

Obviously, this method is really a type of grounded theory, letting codes and theory emerge from the data. In a way, you could consider that if in vivo coding is ‘from life’ or grows from the data, then framework coding to an existing structure is more akin to ‘in vitro’ (in glass), where codes are based on a more rigid interpretation of theory. This is just like the controlled laboratory conditions of in vitro research, with more consistent, but less creative, results.

However, there are problems in trying to interpret the data in this way. Most obviously, how ubiquitous will an in vivo code from one source be across everyone’s transcripts? If someone talks about a shocking event in one source as feeling like being ‘hit by a bus’ and another says the ‘world dropped out from under me’, would we code the same text together? Both are clearly about ‘shock’ and would probably belong in the same theme, but does the different language require a slightly different interpretation? Wouldn’t you lose some of the nuance of the in vivo coding process if similar themes like these were lumped together?

The answer to all of these issues is probably ‘yes’. However, they are not insurmountable. In fact, Johnny Saldaña suggests that an in vivo coding process works best as a first reading of the data, creating not just a summary if read in order, but a framework from each source which should be later combined with a ‘higher’ level of second coding across all the data. So after completing in vivo coding, the researcher can go back and create grouped coding categories based around common elements (like shock) and/or conceptual theory level codes (like long term psychological effects) which resonate across all the sources.

This sounds like it would be a very time consuming process, but in fact multi-level coding (which I often advocate) can be very efficient, especially with an in vivo coding as the first process. It may be that you just highlight some of these key words, on paper or Word, or create a series of columns in Excel adjacent to each sentence or paragraph of source material. Since the researcher doesn’t have to ponder the best word or phrase to describe the category at this stage, creating the coding framework is quick. It’s also a great process for participatory analysis, since respondents can quickly engage with selecting juicy morsels of text.

Don’t forget, you don’t have to use an exclusively in vivo coding framework: just remember that it’s an option, and use it for key illuminating quotes alongside your other codes. Again, there is no one-size-fits-all approach for qualitative analysis, but knowing the range of methods allows you to choose the best way forward for each research question or project.

CAQDAS/QDA software makes it easy to keep all the different stages of your coding process together, and also create new topics by splitting and merging existing codes. While the procedure will vary a little across the different qualitative analysis packages, the basics are very similar, so I’ll give a quick example of how you might do this in Quirkos.

Not a lot of people know this, but you can create a new Quirk/topic in Quirkos by dropping a section of text directly onto the create new bubble button, so this is a good way to create a lot of themes on the fly (as with in vivo coding). Just name these according to the in vivo phrase, and make sure that you highlight the whole section of relevant text for coding, so that you can easily see the context and what they are talking about.

Once you have done a full (or partial) reading and coding of your qualitative data, you can work with these codes in several ways. Perhaps the easiest is to create an umbrella (or parent) code (like shock) to which you can make relevant in vivo codes subcategories, just by dragging and dropping them onto the top node. Now, when you double click on the main node, you will see quotes from all the in vivo subcategories in one place.
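As a rough illustration of the umbrella idea, the Python sketch below gathers quotes from several in vivo codes under one parent. It is only a model of the concept, not of Quirkos itself: the dictionaries, the function quotes_under and the second quote are assumptions made for the example.

```python
# Hypothetical in vivo codes, each named after the participant's own phrase.
in_vivo = {
    "hit by a bus": ["It felt like I was hit by a bus."],
    "world dropped out": ["The world dropped out from under me."],
}

# The umbrella (parent) code lists its in vivo subcategories.
parents = {"shock": ["hit by a bus", "world dropped out"]}


def quotes_under(parent, parents, codes):
    """Collect (subcategory, quote) pairs for every child of `parent`."""
    return [
        (child, quote)
        for child in parents[parent]
        for quote in codes[child]
    ]


shock_quotes = quotes_under("shock", parents, in_vivo)
for child, quote in shock_quotes:
    print(f"{child}: {quote}")
```

Because each quote keeps the name of its in vivo code, grouping in this way does not throw away the participants’ own language.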


qualitative research software - quirkos


It’s also possible to use the Levels feature in Quirkos to group your codes: this is especially useful when you might want to put an in vivo code into more than one higher level group. For example, the ‘hit by a bus’ code might belong in ‘shock’ but also a separate category called ‘metaphors’. You can create levels from the Quirk Properties dialogue of any Quirk, assign codes to one or more of these levels, and explore them using the query view. See this blog post for more on how to use levels in Quirkos.
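The difference between a parent code and a level is that a code can sit in several levels at once. Here is a small Python sketch of that many-to-many idea, assuming a made-up dictionary of code name to set of levels. The function codes_in_levels and the ‘mess’ level assignment are hypothetical, and this is not a description of how the Quirkos query view is implemented.

```python
# Hypothetical assignment of in vivo codes to one or more levels.
levels = {
    "hit by a bus": {"shock", "metaphors"},
    "world dropped out": {"shock", "metaphors"},
    "mess": {"bedroom"},
}


def codes_in_levels(levels, wanted):
    """Return the codes that belong to at least one of the `wanted` levels."""
    return sorted(code for code, groups in levels.items() if groups & wanted)


metaphor_codes = codes_in_levels(levels, {"metaphors"})
several = codes_in_levels(levels, {"shock", "bedroom"})
print(metaphor_codes)
print(several)
```

Looking at one level, or several together, is then just a matter of changing the set you ask for.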

It’s also possible to save a snapshot of your project at any point, and then actually merge codes together to keep them all under the same Quirk. You will lose most of the original in vivo codes this way (which is why the other options are usually better), but if you find yourself just dealing with too many codes, or want to create a neat report based on a few key concepts, this can be a good way to go. Just right click on the Quirk you want to keep, and select ‘Merge Quirk with...’ to choose another topic to be absorbed into it. Don’t forget all actions in Quirkos have Undo and Redo options!

We don’t have an example dataset coded using in vivo quotes, but if you look at some of the sources from our Scottish Independence research project, you will see some great comments about politics and politicians that leap out of the page and would work great for in vivo coding. So why not try it out, and give in vivo coding a whirl with the free trial of Quirkos: affordable, flexible qualitative software that makes coding all these different approaches a breeze!



Turning qualitative coding on its head

CC BY-SA 2.0,

For the first time in ages I attended a workshop on qualitative methods, run by the wonderful Johnny Saldaña. Developing software has become a full time (and then some) occupation for me, which means I have little scope for my own professional development as a qualitative researcher. This session was not only a welcome change, but also an eye-opening critique of the way that many in the room (myself included) approach coding qualitative data.


Professor Saldaña has written an excellent Coding Manual for Qualitative Researchers, and the workshop really brought to life some of the lessons and techniques in the book. Fundamental to all the approaches was a direct challenge to researchers doing qualitative coding: code different.


Like many researchers, I am guilty of taking coding as a reductive, mechanical exercise. My codes tend to be very basic and descriptive – what is often called index coding. My codes are often a summary word of what the sentence or section of text is literally about. From this, I will later take a more ‘grand-stand’ view of the text and codes themselves, looking at connections between themes to create categories that are closer to theory and insight.


However, Professor Saldaña gave (by my count) at least 12 different coding frameworks and strategies that were completely new to me. While I am not going to go into them all here (that’s what the textbook, courses and the companion website are for!), it was not one particular strategy that stuck with me, but the diversity of approaches.


It’s easy when you start out with qualitative data analysis to try a simple strategy – after all it can be a time consuming and daunting conceptual process. And when you have worked with a particular approach for many years (and are surrounded by colleagues who have a similar outlook) it is difficult to challenge yourself. But as I have said before, to prevent coding being merely a reductive and descriptive act, it needs to be continuous and iterative. To truly be analysis and interrogate not just the data, but the researcher’s conceptualisation of the data, it must challenge and encourage different readings of the data.


For example, Professor Saldaña actually has a background in performance and theatre, and brings some common approaches from that sphere to the coding process: exactly the kind of cross-disciplinary inspiration I love! When an actor or actress is approaching a scene or character, they may engage with the script (which is much like a qualitative transcript) looking at the character's objectives, conflicts, tactics, attitudes, emotions and subtexts. The question is: what is the character trying to do or communicate, and how? This sort of actor-centred approach works really well in qualitative analysis, in which people, narratives and stories are often central to the research question.


So if you have an interview with someone, for example on their experience with the adoption process, imagine you are a writer dissecting the motivations of a character in a novel. What are they trying to do? Justify how they would be a good parent (objectives)? Ok, so how are they doing this (tactics)? And what does this reveal about their attitudes and emotions? Is there a subtext here – are they hurt because of a previous rejection?


Other techniques focused on the importance of creating codes based around emotions, participants’ values, or even actions: for example, can you make all your codes gerunds (words that end in –ing)? While there was a distinct message that researchers can mix and match these different coding categories, it felt to me a really good challenge to try and view the whole data set from one particular view point (for example conflicts) and then step to one side and look again with a different lens.


It’s a little like trying to understand a piece of contemporary sculpture: you need to see it up close, far away, and then from different angles to appreciate the different forms and meaning. Looking at qualitative data can be similar – sometimes the whole picture looks so abstract or baffling, that you have to dissect it in different ways. But often the simplest methods of analysis are not going to provide real insight. Analysing a Henry Moore sculpture by the simplest categories (what is the material, size, setting) may not give much more understanding. Cutting up a work into sections like head, torso or leg does little to explore the overall intention or meaning. And certain data or research questions suit particular analytical approaches. If a sculpture is purely abstract, it is not useful to try and look for aspects of human form - even if the eye is constantly looking for such associations.


Here, context is everything. Can you get a sense of what the artist wanted to say? Was it an emotion, a political statement, a subtle treatise on conventional beauty? And much like impressionist painting, sometimes a very close reading stops the viewer from being able to see the trees for the brush strokes.


Another talk I went to, on how researchers use qualitative analysis software, noted that some users assumed that the software and the coding process were either a replacement for, or a better activity than, a close reading of the text. While I don’t think that coding qualitative data can ever replace a detailed reading or familiarity with the source text, coding exercises can help you read in different ways, and hence allow new interpretations to come to light. Use them to read your data sideways, backwards, and through someone else’s eyes.


But software can help manage and make sense of these different readings. If you have different coding categories from different interpretations, you can store these together, and use different bits from each interpretation. But it can also make it easier to experiment, and look at different stages of the process at any time. In Quirkos you can use the Levels feature to group different categories of coding together, and look at any one (or several) of those lenses at a time.


Whatever approach you take to coding, try to really challenge yourself, so that you are forced to categorise and thus interpret the data in different ways. And don't be surprised if the first approach isn't the one that reveals the most insight!


There is a lot more on our blog about coding, for example populating a coding framework and coding your codes. There will also be more articles on coding qualitative data to come, so make sure to follow us on Twitter, and if you are looking for simple, unobtrusive software for qualitative analysis check out Quirkos!