Codebooks for qualitative research

This post is hosted by Quirkos, qualitative analysis software that makes your qualitative text analysis simple, fun and beautiful. Get closer to your data by starting a free trial today!

Many people undertaking qualitative analysis will use some form of coding to help explore and categorise their data. Often, the researcher will use hundreds of codes to do this.

What is a qualitative codebook?

A qualitative codebook is a reference which describes the codes in a coding framework: the list of qualitative codes, themes or topics that are used to analyse the data.

What should a qualitative codebook include?

A codebook doesn’t include extracts of data themselves, but it should include a detailed description of the codes, how they should be used, their relationship to each other, and what should be included and excluded in each code.

The term 'codebook' actually comes from quantitative statistics, where a codebook is used to store metadata such as variable names, valid ranges and data types. In qualitative coding, the codebook has a similar function: collecting useful meta-information about the codes that cannot be contained solely within the code name.
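If you like to think in terms of data structures, the ingredients listed above can be sketched as a small record per code. This is only one possible layout: the class name, the field names and the example entry are all invented for illustration, and a table in a word processor would do the same job.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CodebookEntry:
    """One code in the framework. All field names are illustrative assumptions."""
    name: str
    description: str
    include: List[str] = field(default_factory=list)  # what belongs in this code
    exclude: List[str] = field(default_factory=list)  # what does not belong here
    parent: Optional[str] = None                       # parent code, if hierarchical
    related: List[str] = field(default_factory=list)  # non-hierarchical links


def summarise(entry: CodebookEntry) -> str:
    """Render an entry as a short plain-text block for a shared document."""
    lines = [f"{entry.name}: {entry.description}"]
    lines += [f"  include: {item}" for item in entry.include]
    lines += [f"  exclude: {item}" for item in entry.exclude]
    if entry.parent:
        lines.append(f"  parent: {entry.parent}")
    if entry.related:
        lines.append("  related: " + ", ".join(entry.related))
    return "\n".join(lines)


# An invented example entry, not taken from any real project.
injustice = CodebookEntry(
    name="Injustice",
    description="Participant describes being treated unfairly.",
    include=["unfair treatment by institutions"],
    parent="Emotions",
)
print(summarise(injustice))
```

The point of the separate include and exclude lists is that the name alone cannot carry them, which is exactly the gap a codebook is meant to fill.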

When should a codebook be used?

Often the concept of codebooks is connected to framework analysis, or 'a priori' analytical techniques where most of the codes are decided in advance. However, it’s just as helpful for grounded theory or emergent analysis, although here the codebook will be constantly updated with new codes and themes as they are identified in the data.

Why do I need a codebook for my qualitative analysis?

  • A codebook can help you to remember when and why you made particular analytical decisions.
  • A codebook will help you remember the specific word definitions you are using in your codes, and why you chose those in particular.
  • You can use a codebook to log different methodologies associated with different codes, so you can take multiple approaches to the same data (such as grounded theory or discourse analysis).
  • A codebook will help you to justify your decisions and provide a paper trail when you are writing up your qualitative methodology and analytical process for publication.

It may feel like creating a formal codebook isn’t necessary when there is just one person doing the analysis, as there isn’t an immediate need to communicate what the codes are with someone else. After all, I know what I mean, right? However, there is often a need for self-communication – a note to yourself that acts as an aide-mémoire for how a code should be used, through a dynamic process that might take months. As you work through your data, the exact meaning of a code or theme often evolves, as different people have different perspectives on the same thing.

A code that sounds simple like ‘Anger’ may start off being easy to understand, but as you hear from more people you may realise what seems like ‘Anger’ is really better theorised as ‘Emotional Outlets’, ‘Injustice’, ‘Powerlessness’ or even ‘Rage’. The subtleties of how a code is defined can be crucial to correctly interpreting the data, so while it can be convenient to just use a single word to define a theme, you might change that word over time, and develop a more detailed explanation for what that code means.

A codebook is also useful for communicating with your future self – the poor person who actually has to write up the data, describe how it was analysed, and possibly debate the coding with supervisors or journal reviewers. If your codebook contains a history of how codes evolved, guidelines for what to code into them, and detailed descriptions of what they mean, it makes the process of writing up your research a lot easier. It’s tempting to consider that the coded data itself (i.e. the quotes/highlights) is the main thing to draw on when writing up findings, but justifying the way it was structured and organised is often just as important. Often the meaning of themes and codes will shift as more data is coded, and the structure that a coding framework imposes on a dataset fundamentally shapes how it is interpreted and the conclusions drawn from it.
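One way to keep that history is to append a dated revision every time a code changes, rather than overwriting the old definition. The sketch below assumes a very simple log; the class names, the dates and the wording of each revision are hypothetical and only there to show the shape of the idea.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Revision:
    when: date
    name: str        # the code's name as of this revision
    definition: str
    reason: str      # why the change was made


@dataclass
class CodeHistory:
    revisions: List[Revision] = field(default_factory=list)

    def revise(self, when: date, name: str, definition: str, reason: str) -> None:
        """Append a revision instead of overwriting the previous definition."""
        self.revisions.append(Revision(when, name, definition, reason))

    @property
    def current(self) -> Revision:
        """Latest revision (raises IndexError if nothing has been logged yet)."""
        return self.revisions[-1]

    def trail(self) -> List[str]:
        """One line per revision, oldest first, as a paper trail for writing up."""
        return [f"{r.when.isoformat()} {r.name}: {r.reason}" for r in self.revisions]


# Invented dates and wording, following the 'Anger' example above.
history = CodeHistory()
history.revise(date(2024, 1, 10), "Anger",
               "Strong negative emotion felt by the participant.",
               "initial code from first interviews")
history.revise(date(2024, 3, 2), "Powerlessness",
               "Anger arising from lack of control over circumstances.",
               "later interviews suggested anger was about lack of control")

print(history.current.name)
for line in history.trail():
    print(line)
```

When it comes to writing the methodology section, the trail gives you the when and why of each decision without relying on memory.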

Example of a qualitative codebook entry

One good way to document how codes should be used is to write coding examples, or use-case rules for each code. This might be a literal example: ‘I was really angry with them’ or a set of descriptive guidelines, such as:

‘Use for anger, rage, other violent emotions, when felt by the participant, about other people or circumstances. Do not include instances of ‘Angry with self’ - use the code ‘Personal Frustrations’. Do not use when the person is talking about someone else being angry – use ‘Other people’s feelings’.’
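Guidelines like these have a clear structure: a default code, plus exclusions that each point to an alternative code. As a rough sketch, the same rule can be written as data. The keys 'who_feels' and 'directed_at' are hypothetical labels a human coder would assign while reading, so this does not automate any interpretation; it just makes the rule explicit and checkable.

```python
# The 'Anger' guideline written as data. The condition keys are assumptions
# made for this sketch, not part of any established coding scheme.
ANGER_RULES = {
    "code": "Anger",
    "use_for": "anger, rage, other violent emotions felt by the participant "
               "about other people or circumstances",
    "redirects": [
        # (condition that excludes the extract, code to use instead)
        ({"who_feels": "participant", "directed_at": "self"}, "Personal Frustrations"),
        ({"who_feels": "someone else"}, "Other people's feelings"),
    ],
}


def choose_code(rules: dict, extract: dict) -> str:
    """Return the alternative code if an exclusion matches, else the code itself."""
    for condition, alternative in rules["redirects"]:
        if all(extract.get(key) == value for key, value in condition.items()):
            return alternative
    return rules["code"]


print(choose_code(ANGER_RULES, {"who_feels": "participant", "directed_at": "others"}))
# prints: Anger
print(choose_code(ANGER_RULES, {"who_feels": "participant", "directed_at": "self"}))
# prints: Personal Frustrations
print(choose_code(ANGER_RULES, {"who_feels": "someone else", "directed_at": "others"}))
# prints: Other people's feelings
```

Writing the exclusions down alongside their alternatives also shows at a glance which other codes the framework needs to contain.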

Now it might seem really daunting to have this level of detail about your coding framework when you start, but it’s usually something that will develop as you progress through coding, especially when taking an iterative and cyclical approach to analysis. However, thinking through exactly how a code should (and shouldn’t) be used can help you work through what other codes might be needed, and how each is conceptualised to link back to answering the research question.

Another key part of a codebook is the relationship between codes. For example, you may have sub-categories (and sub-sub-categories), and you can show these hierarchies and relationships in the codes. Alternatively you may have a ‘flat’ structure, where there are no codes under each other. A codebook might also show the connections between codes, even when non-hierarchical (for example that Anger was often coded together with Frustration). It may also show a diagram of how codes developed into themes, and snapshots through the process, important when using iterative approaches like open and axial coding. It might also have ‘origin’ details, either who created the code (if working as part of a group) or which source of data it was first created for.
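Both kinds of relationship are easy to sketch. Below, the hierarchy is stored as a parent for each code, and the non-hierarchical connections are estimated by counting which codes were applied to the same extract. The code names reuse examples from this post, but the hierarchy and the coded segments are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Parent of each code (None means top level). An assumed hierarchy.
PARENTS = {
    "Emotions": None,
    "Anger": "Emotions",
    "Frustration": "Emotions",
    "Rage": "Anger",
}


def path_to_root(code: str) -> list:
    """Return the chain from a code up to its top-level ancestor."""
    chain = [code]
    while PARENTS[chain[-1]] is not None:
        chain.append(PARENTS[chain[-1]])
    return chain


def co_occurrence(segments: list) -> Counter:
    """Count how often each pair of codes was applied to the same segment."""
    pairs = Counter()
    for codes in segments:
        # sorted(set(...)) gives each pair once per segment, in a stable order
        pairs.update(combinations(sorted(set(codes)), 2))
    return pairs


# Each inner list holds the codes applied to one extract (invented data).
coded_segments = [
    ["Anger", "Frustration"],
    ["Anger", "Frustration", "Rage"],
    ["Frustration"],
]

print(" > ".join(reversed(path_to_root("Rage"))))
# prints: Emotions > Anger > Rage
print(co_occurrence(coded_segments).most_common(1))
# prints: [(('Anger', 'Frustration'), 2)]
```

A ‘flat’ structure is simply the case where every parent is None, so the same layout covers both approaches.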

Codebooks for collaborative qualitative analysis

And of course, if you are using a collaborative or team-based approach to qualitative coding, codebooks are a really, really good idea. Regardless of whether people are coding together, each taking different sources, or just reviewing or appraising coding at the end, every person involved needs to know exactly what codes to use, and when. A codebook is invaluable for this, and while it is easy to integrate one in a software tool like Quirkos, you should also have one that you can share when working manually – this might be printed out, or a shared document somewhere. You will also need to consider whether everyone can update and modify the codebook, or if it will be ‘set in stone’ at the beginning of the analysis, having been decided on by the whole team.

There’s also another crucial use for codebooks, and that’s when archiving, reviewing or allowing secondary analysis of your data. If a thesis or paper reviewer wants to see how you have coded qualitative data, the codebook should be a clear way to document that. And when archiving data (increasingly a requirement for publicly funded research), the codebook would be key to someone else being able to understand your coding and interpretation.
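For sharing and archiving, a plain format that any spreadsheet can open is a safe choice. Here is a minimal sketch that writes a codebook to CSV and reads it back using Python's standard csv module. The column names are assumptions made for this example, and this is not the export format of Quirkos or any other package.

```python
import csv
import io

# Assumed column layout for this sketch.
FIELDS = ["name", "description", "include", "exclude", "parent"]


def codebook_to_csv(entries: list) -> str:
    """Serialise a list of dicts (one per code) to CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def codebook_from_csv(text: str) -> list:
    """Parse CSV text back into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


entries = [
    {
        "name": "Anger",
        "description": "Anger, rage, other violent emotions felt by the participant",
        "include": "about other people or circumstances",
        "exclude": "angry with self; someone else being angry",
        "parent": "",
    },
]

csv_text = codebook_to_csv(entries)
print(csv_text)
assert codebook_from_csv(csv_text) == entries  # round trip preserves the content
```

The csv module quotes fields containing commas automatically, so long free-text descriptions survive the round trip.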

Want to learn more about qualitative research? Try our free qualitative research course, an interactive journey through the whole research process from designing a good research question, to collecting qualitative data and qualitative analysis, through to writing up your research.

Using qualitative analysis software to manage your codebook

In qualitative software, you can usually describe codes in more detail, so your framework in the software can potentially function as your codebook. In Quirkos, codes can have 'descriptions', a much longer definition that appears when you hover your mouse over a code, or underneath the code title in List View. The description may also note what type of code it is (for example deductive, emergent, axial, thematic or linguistic), especially if you use more than one coding approach, or have several stages.

One of the great reasons to use qualitative analysis software like Quirkos is that it will allow you to manage and update your codebook as you go along, and export it at any time. Quirkos aims to make qualitative analysis more accessible, by making software that is affordable, simple to learn and visual. Try the free trial and see for yourself!

There are a couple of articles that can guide the creation of codebooks, and go into more detail: for example, DeCuir-Gunby et al. (2011) and Sage Methods Datasets (2019). However, it’s one of those classic things that everyone seems to use, but few people bother formally discussing. Most of my go-to textbooks on general qualitative research don't mention it, but one exception is the wonderful Coding Manual for Qualitative Researchers by Johnny Saldaña (2012).