Analyzing Qualitative Data
July 21, 2016
A common question from newcomers to qualitative research is, what’s the right sample size? How many people do I need to have in my project to get a good answer for my research questions? For research based on quantitative data, there is usually a definitive answer: you can decide ahead of time what sample size is needed to gain a significant result for a particular test or method.
This post is hosted by Quirkos, simple and affordable qualitative analysis software. Download a one-month free trial today!
In qualitative research, there is no neat measure of significance, so getting a good sample size is more difficult. The literature often talks about reaching ‘saturation point’, a term taken from physical science to describe the moment during analysis when the same themes keep recurring and additional sources of data yield no new insights. In the physical sense, saturation is the point at which a sponge, for example, can absorb no more water; in research, though, having more data than you need is not necessarily a bad thing. Saturation in qualitative research is a difficult concept to define (Bowen, 2008), but it has come to be associated with the point in a project when there is enough data to ensure the research questions can be answered.
However, as with all aspects of qualitative research, the depth of the data is often more important than the numbers (Burmeister & Aitken, 2012). A small number of rich interviews or sources, especially as part of an ethnography, can carry the weight of dozens of shorter interviews. For Fusch (2015):
“The easiest way to differentiate between rich and thick data is to think of rich as quality and thick as quantity. Thick data is a lot of data; rich data is many-layered, intricate, detailed, nuanced, and more. One can have a lot of thick data that is not rich; conversely, one can have rich data but not a lot of it. The trick, if you will, is to have both.”
So the quantity of the data is only one part of the story. The researcher needs to engage with it at an early stage to ensure “all data [has] equal consideration in the analytic coding procedures. Frequency of occurrence of any specific incident should be ignored. Saturation involves eliciting all forms of types of occurrences, valuing variation over quantity” (Morse, 1995). When the amount of variation in the data is levelling off, and new perspectives and explanations are no longer coming from the data, you may be approaching saturation. It is also worth checking that every part of the research question has been covered: Brod et al. (2009), for example, recommend constructing a ‘saturation grid’ that lists the major topics or research questions against the interviews or other sources, to ensure all bases have been covered.
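If you keep your coding in a spreadsheet or script, a saturation grid of this kind is straightforward to build. The short Python sketch below is only an illustration of the idea, not the procedure described by Brod et al.: the interview names, the topic labels and the helper functions saturation_grid and uncovered_topics are all made up for the example.

```python
# Hypothetical coded data: each interview mapped to the topics raised in it.
interviews = {
    "interview_1": {"access", "cost"},
    "interview_2": {"cost", "trust"},
    "interview_3": {"access", "trust"},
}

# Hypothetical list of major topics taken from the research questions.
topics = ["access", "cost", "trust", "stigma"]


def saturation_grid(interviews, topics):
    """Return {topic: {interview: bool}} marking where each topic appears."""
    return {
        topic: {name: topic in found for name, found in interviews.items()}
        for topic in topics
    }


def uncovered_topics(grid):
    """List the topics that no interview has touched on yet."""
    return [topic for topic, row in grid.items() if not any(row.values())]


grid = saturation_grid(interviews, topics)
missing = uncovered_topics(grid)

# Print one row per topic, with an x wherever an interview covered it.
for topic, row in grid.items():
    marks = " ".join("x" if hit else "." for hit in row.values())
    print(f"{topic:<8} {marks}")
print("Not yet covered:", missing)
```

A topic whose row is still empty is a prompt to go back to the field, or at least to ask whether that topic really belongs in the research questions.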
But despite this, is it still possible to put rough numbers on how many sources are required for a qualitative research project? Many papers have attempted to do this and, as could be expected, the results vary greatly. Mason (2010) looked at the number of respondents in PhD theses based on qualitative research. The study found that an average of 30 sources were used, but with a low of 1 source, a high of 95 and a standard deviation of 18.5! It is interesting to look at the data tables, as they show succinctly the differences in sample size expected for different methodological approaches, such as case study, ethnography, narrative enquiry, or semi-structured interviews.
While 30 in-depth interviews may seem high (especially for what is practical in a PhD study), others work with far fewer: a retrospective examination of a qualitative project by Guest et al. (2006) found that even though they conducted 60 interviews, they had reached saturation after 12, with most of the themes emerging after just 6. On the other hand, students whose supervisors have more of a mixed-method or quantitative background will often struggle to justify the low number of participants suggested for methods of qualitative enquiry.
The important thing to note is that it is nearly impossible for a researcher to know when they have reached saturation point unless they are analysing the data as it is collected. This exposes one of the key ties of the saturation concept to grounded theory: it requires an iterative approach to data collection and analysis. Instead of setting a fixed number of interviews or focus groups to conduct at the start of the project, the investigator should be continuously going through cycles of collection and analysis until nothing new is being revealed.
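One simple way to keep an eye on this as you go is to count how many new themes each additional source contributes. The Python sketch below is a rough illustration under stated assumptions: the theme labels are invented, and the rule of "three sources in a row with nothing new" is an arbitrary threshold chosen for the example, not a figure from the literature cited here.

```python
def new_themes_per_source(coded_sources):
    """Count how many previously unseen themes each source adds, in order."""
    seen = set()
    counts = []
    for themes in coded_sources:
        fresh = set(themes) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def may_be_saturated(counts, window=3):
    """True if the last `window` sources added no new themes.

    The window size is an assumption for this sketch, not an established rule.
    """
    return len(counts) >= window and all(c == 0 for c in counts[-window:])


# Hypothetical themes coded in five interviews, in the order they were analysed.
coded = [
    {"cost", "access"},
    {"access", "trust"},
    {"trust"},
    {"cost"},
    {"access"},
]

counts = new_themes_per_source(coded)
print(counts)
print(may_be_saturated(counts))
```

A run of zeros only shows that nothing new has come up in the themes you are already coding for, so treat it as a cue to reflect rather than a stopping rule: it says nothing about the richness of the data or about perspectives you have not yet sampled.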
This can be a difficult notion to work with, especially when ethics committees or institutional review boards, limited time or funds place a practical upper limit on the quantity of data collection. Indeed, Morse et al. (2014) found that in most dissertations they examined, the sample size was often chosen for practical reasons, not because a claim of saturation was made.
You should also be aware that many take umbrage at the idea that one should use the concept of saturation at all. O’Reilly (2003) notes that since the concept of saturation comes out of grounded theory, it’s not always appropriate to apply it to other research projects, and that the term has become overused in the literature. It’s also not a good indicator by itself of the quality of qualitative research.
For more on these issues, I would recommend any of the articles referenced above, as well as discussion with supervisors, peers and colleagues. There is also more on sampling considerations in qualitative research in our previous blog post.
Finally, don’t forget that Quirkos can help you take an iterative approach to analysis and data collection, allowing you to quickly analyse your qualitative data as you go through your project and helping you visualise your path to saturation (if you choose this approach!). Download a free trial for yourself, and take a closer look at the rest of the features the software offers.