Archiving qualitative data: will secondary analysis become the norm?


 

Last month, Quirkos was invited to a one-day workshop in New York on archiving qualitative data. The event was hosted by Syracuse University, and you can read a short summary of the event here. This links neatly into the KWALON-led initiative to create a common standard for interchange of coded data between qualitative software packages.


The eventual aim is to develop a standardised file format for qualitative data, which not only allows data to be used in any qualitative analysis software, but also allows coded qualitative data sets to be made available in online archives for researchers to access and explore. There are several such initiatives around the world, for example QDR in the United States, and the UK Data Archive in the United Kingdom. Increasingly, research funders are requiring that data from research is deposited in such public archives.


A qualitative archive should be a dynamic and accessible resource. It is of little use creating something like the warehouse at the end of ‘Raiders of the Lost Ark’: a giant safehouse of boxed up data which is difficult to get to. Just like the UK Data Archive it must be searchable, easy to download and display, and have enough detail and metadata to make sure data is discoverable and properly catalogued. While there is a direct benefit to having data from previous projects archived, the real value comes from the reuse of that data: to answer new questions or provide historical context. To do this, we need a change in practice from researchers, participants and funding bodies.


In some disciplines, secondary analysis of archival data is commonplace: think for example of history research that looks at letters and news from the Jacobean period (e.g. Coast 2014). The ‘Digital Humanities’ movement in academia is a cross-discipline look at how digital archives of data (often qualitative and text based) can be better utilised by researchers. However, there is a missed opportunity here, as many researchers in digital humanities don’t use standard qualitative analysis software, preferring instead their own bespoke solutions. Most of the time this is because the archived data comes in such a variety of different formats.


However, there is a real benefit to making not just historical data like letters and newspapers, but contemporary qualitative research projects (of the typical semi-structured interview / survey / focus-group type) openly available. First, it allows others to examine the full dataset, and check or challenge conclusions drawn from the data. This allows for a higher level of rigour in research - a benefit not just restricted to qualitative data, but which can help challenge the view that qualitative research is subjective, by making the full data and analysis process available to everyone.

 

Second, there is huge potential for secondary analysis of qualitative data. Researchers from different parts of the country working on similar themes can examine other data sources to compare and contrast differences in social trends regionally or historically. Data that was collected for a particular research question may also have valuable insights about other issues, something especially applicable to rich qualitative data. Asking people about healthy eating for example may also record answers which cover related topics like attitudes to health fads in the media, or diet and exercise regimes.

 

At the QDR meeting last month, it was postulated that the generation of new data might become a rare activity for researchers: it is expensive, time consuming, and often the questions can be answered by examining existing data. With research funding facing an increasing squeeze, many funding bodies are taking the opinion that they get better value from grants when the research has impact beyond one project, when the data can be reused again and again.

 

There is still an element of elitism about generating new data in research – it is an important career pathway, and the prestige given to PIs running their own large research projects is not shared with those doing ‘desk-based’ secondary analysis of someone else’s data. However, these attitudes are mostly irrational and institutionally driven: they need to change. Those undertaking new data collection will increasingly need to make sure that they design research projects to maximise secondary analysis of their data, by providing good documentation on the collection process, research questions and detailed metadata of the sources.

 

The UK now has a very popular research grant programme specifically to fund researchers to do secondary data analysis. Louise Corti told the group that despite uptake being slow in the first year, the call has become very competitive (like the rest of grant funding). These initiatives will really help raise the profile and acceptability of doing secondary analysis, and make sure that the most value is being gained from existing data.


However, making qualitative data available for secondary analysis does not necessarily mean that it is publicly available. Consent and ethics may not allow for the release of the complete data, making it difficult to archive. While researchers should increasingly start planning research projects and consent forms to make data archivable and shareable, it is not always appropriate. Sometimes, despite attempts to anonymise qualitative data, the detail of life stories and circumstances that participants share can make them identifiable. However, the UK Data Archive has a sort of ‘vetting’ scheme, where more sensitive datasets can only be accessed by verified researchers who have signed appropriate confidentiality agreements. There are many levels of access so that the maximum amount of data can be made available, with appropriate safeguards to protect participants.


I also think that it is a fallacy to claim that participants wouldn’t want their data to be shared with other researchers – in fact I think many respondents assume this will happen. If they are going to give their time to take part in a research project, they expect this to have maximum value and impact; many would be outraged to think of it sitting on the shelf unused for many years.


But these data archives still don’t have a good way to share coded qualitative data or the project files generated by qualitative software. Again, there was much discussion on this at the meeting, and Louise Corti gave a talk about her attempts to produce a standard for qualitative data (QuDex). But such a format can become very complicated, if we wish to support and share all the detailed aspects of qualitative research projects. These include not just multimedia sources and coding, but date and metadata for sources, information about the researchers and research questions, and possibly even the researcher’s journals and notes, which could be argued to be part of the research itself.

 

The QuDex format is extensive, but complicated to implement – especially as it requires researchers to spend a lot of time entering metadata to be worthwhile. Really, this requires a behaviour change for researchers: they need to record and document more aspects of their research projects. However, in some ways the standard was not comprehensive enough – it could not capture the different ways that notes and memos were recorded in Atlas.ti for example. This standard has yet to gain traction as there was little support from the qualitative software developers (for commercial and other reasons). However, the software landscape and attitudes to open data are changing, and major agreements from qualitative software developers (including Quirkos) mean that work is underway to create a standard that should eventually allow not just for the interchange of coded qualitative data, but hopefully for easy archival storage as well.


So the future of qualitative data archiving requires behaviour change from researchers, ethics boards, participants, funding bodies and the designers of qualitative analysis software. However, I would say that these changes, while not necessarily fast enough, are already in motion, and the world is moving towards a more open future for qualitative (and quantitative) research.


For things to be aware of when considering using secondary sources of qualitative data and social media, read this post. And while you are at it, why not give Quirkos a try and see if it would be a good fit for your next qualitative research project – be it primary or secondary data. There is a free trial of the full version to download, with no registration or obligations. Quirkos is the simplest and most visual software tool for qualitative research, so see if it is right for you today!

 

 

Stepping back from coding software and reading qualitative data


There is a lot of concern that qualitative analysis software distances people from their data. Some say that it encourages reductive behaviour, prevents deep reading of the data, and leads to a very quantified type of qualitative analysis (e.g. Savin-Baden and Major 2013).

 

I generally don’t agree with these statements, and other qualitative bloggers such as Christina Silver and Kristi Jackson have written responses to critics of qualitative analysis software recently. However, I want to counter this a little with a suggestion that it is also possible to be too close to your data, and in fact this is a considerable risk when using any software approach.

 

I know this is starting to sound contradictory, but it is important to strike a happy balance so you can see the wood for the trees. It’s best to have both a close, detailed reading and analysis of your data, as well as a sense of the bigger picture emerging across all your sources and themes. That was the impetus behind the design of Quirkos: the canvas view of your themes, where the size of each bubble shows the amount of data coded to it, gives you a live bird’s-eye overview of your data at all times. It’s also why we designed the cluster view, to graphically show you the connections between themes and nodes in your qualitative data analysis.

 

It is very easy to treat analysis as a close reading exercise, taking each source in turn, reading it through and assigning sections to codes or themes as you go. This is a valid first step, but only part of what should be an iterative, cyclical process. There are also lots of ways to challenge your coding strategy to keep you alert to new things coming from the data, and seeing trends in different ways.

 

However, I have a confession. I am a bit of a Luddite in some ways: I still prefer to print out and read transcripts of data from qualitative projects away from the computer. This may sound shocking coming from the director of a qualitative analysis software company, but for me there is something about both the physicality of reading from paper, and the process of stepping away from the analysis process that still endears paper-based reading to me. This is not just at the start of the analysis process either, but during. I force myself to stop reading line-by-line, put myself in an environment where it is difficult to code, and try and read the corpus of data at more of a holistic scale.
I waste a lot of trees this way (even with recycled paper), but I always return to the qualitative software with a fresh perspective and finish my coding and analysis there, having made the best of both worlds. Yes, it is time consuming to have so many readings of the data, but I think good qualitative analysis deserves this time.

 

I know I am not the only researcher who likes to work in this way, and we designed Quirkos to make this easy to do. One of the most distinctive and ‘wow’ features of Quirkos is how you can create a standard Word document of all the data from your project, with all the coding preserved as colour-coded highlights. This makes it easy to print out, take away and read at your leisure, but still see how you have defined and analysed your data so far.


 

There are also some other really useful things you can do with the Word export, like share your coded data with a supervisor, colleague or even some of your participants. Even if you don’t have Microsoft Office, you can use free alternatives like LibreOffice or Google Docs, so pretty much everyone can see your coded data. But my favourite way to read away from the computer is to make a mini booklet, with turn-able pages – I find this much more engaging than just a large stack of A4/Letter pages stapled in the top corner. If you have a duplex printer that can print on both sides of the page, generate a PDF from the Word file (just use Save As…) and even the free version of Adobe Reader has an awesome setting in Print to automatically create and format a little booklet.

 

 

I always get a fresh look at the data like this, and although I am trying not to be too micro-analytical and do a lot of coding, I am always able to scribble notes in the margin. Of course, there is nothing to stop you stepping back and doing a reading like this in the software itself, but I don’t like staring at a screen all day, and I am not disciplined enough to work on the computer and not get sucked into a little more coding. Coding can be a very satisfying and addictive process, but when I have to define higher-level themes in the coding framework, I need to step back and think about the bigger picture before I dive into creating something based on the last source or theme I looked at. It’s also important to get the flow and causality of the sources sometimes, especially when doing narrative and temporal analysis. It’s difficult to read the direction of an interview or series of stories just from looking at isolated coded snippets.

 

Of course, you can also print out a report from Quirkos, containing all the coded data, and the list of codes and their relations. This is sometimes handy as a key on the side, especially if there are codes you think you are underusing. Normally at this stage in the blog I point out how you can do this with other software as well, but actually, for such a commonly required step, I find this very hard to do in other software packages. It is very difficult to get all the ‘coding stripes’ to display properly in NVivo text outputs, and MAXQDA has lots of options to export coded data, but not whole coded sources that I can see. Atlas.ti does better here with the Print with Margin feature, which shows stripes and code names in the margin – however this only generates a PDF file, so is not editable.

 

So download the trial of Quirkos today, and every now and then step back and make sure you don’t get too close to your qualitative data…

 

 

Problems with quantitative polling, and answers from qualitative data

 

The results of the US elections this week show a surprising trend: modern quantitative polling keeps failing to predict the outcome of major elections.

 

In the UK this is nothing new: in both the 2015 general election and the EU referendum, polling failed to predict the outcome. In 2015 the polls suggested very close levels of support for Labour and the Conservative party, but on the night the Conservatives won a significant majority. Then the polls for the referendum on leaving the EU indicated there was a slight preference for Remain, when voters actually voted to Leave by a narrow margin. We now have a similar situation in the States, where despite polling ahead of Donald Trump, Hillary Clinton lost in the Electoral College (while narrowly winning the popular vote). There are also recent examples of polling errors in Israel, Greece and the Scottish Independence Referendum.

 

Now, it’s fair to say that most of these polls were within the margin of error (typically 3%), and so you would expect these inaccurate outcomes to happen periodically. However, there seems to be a systematic bias here, each time underestimating the support for more conservative attitudes. There is much hand-wringing about this in the press, see for example this declaration of failure from the New York Times. The suggestion that journalists and traditional media outlets are out of touch with most of the population may be true, but does not explain the polling discrepancies.
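
As a rough illustration of where a figure like 3% comes from, here is a minimal sketch using the standard textbook approximation for the margin of error of a simple random sample at 95% confidence. The function name and the sample sizes are illustrative assumptions, not taken from any particular poll, and real polls are weighted and rarely simple random samples, so treat this as a back-of-envelope guide only.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a simple random sample.

    proportion=0.5 is the worst case (widest margin);
    z=1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical sample sizes, chosen only to show how the margin shrinks.
for n in (500, 1000, 10000, 100000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

With about 1,000 respondents the margin comes out at roughly 3 points, and it narrows only slowly as the sample grows. Note that this covers random sampling error alone: it says nothing about systematic bias, which extra respondents will not fix.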

 

There are many methodological problems: the number of people responding to telephone surveys is falling, perhaps not surprising considering the proliferation of nuisance calls in the UK. But this remains for most pollsters a vital way to get access to the largest group of voters: older people. In contrast, previous attempts to predict elections through social media and big data approaches have been fairly inaccurate, and likely will remain that way if social media continues to be dominated by the young.

 

However, I think there is another problem here: pollsters are not asking the right questions. Look how terribly worded the exit poll questions are: they try to get people to put themselves in a box as quickly as possible, demographically, religiously, and politically. Then they ask a series of binary questions like “Should illegal immigrants working in the U.S. be offered legal status or deported to their home country?” giving no opportunity for nuance. The aim is clear – just get to a neat quantifiable output that matches a candidate or hot topic.

 

There’s another question which I think in all its many iterations is poorly worded: Who are you going to vote for? People might change which politician they support at any moment in time (including in a polling booth), but are unlikely to suddenly decide their family is not important to them. It’s often been shown that support for a candidate is not a reliable metric: people give you answers influenced by the media and the researcher, and of course they can change their mind. But when you ask people questions about their beliefs, not a specific candidate, their answers tend to be much more accurate. Nor does believing a candidate is good always translate into voting for them. As we saw in Brexit, and possibly with the last US election, many people want to register a protest vote – they are not being heard or represented well, and people aren’t usually asked if this is one of the reasons they vote. It’s also very important to consider that people are often strategic voters, and are themselves influenced by the polls which are splashed everywhere. The polls have become a constant daily race of who’s ahead, possibly increasing voter fatigue and leading to complacency for supporters of whoever is ahead on the day of the finishing line. These make future predictions much more difficult.

 


In contrast, here are two examples of qualitative focus group data on the US election. The first is a very heavily moderated CBS group, which got very aggressive. Here, although there is a strong attempt to ask for one word answers on candidates, what comes out is a general distrust of the whole political system. This is also reflected in the Lord Ashcroft focus groups in different American states, which also include interviews with local journalists and party leaders. When people are not asked specific policy or candidate based questions, there is surprising agreement: everyone is sick of the political system and the election process.


This qualitative data is really no more subjective than polls based on who answers a phone on a particular day, but provides a level of nuance lacking in the quantitative polls and mainstream debate, which helps explain why people are voting different ways – something many are still baffled by. There are problems with this type of data as well: it is difficult to accurately summarise and report on, and rarely are complete transcripts available for scrutiny. But if you want to better gauge the mood of a nation, discussion around the water-cooler or down the pub can be a lot more illuminating, especially when as a researcher or ethnographer you are able to get out of the way and listen (as you should when collecting qualitative data in focus groups).

 

Political data doesn’t have to be focus group driven either – these group discussions are done because they are cheap, but qualitative semi-structured interviews can really let you understand key individuals that might help explain larger trends. We did this before the 2015 general election, and the results clearly predicted and explained the collapse in support for the Labour party in Scotland.

 

There has been a rush in the polling to add more and more numbers to the surveys, with many reaching tens or even hundreds of thousands of respondents. But these give a very limited view of voter opinions, and as we’ve seen above can be very skewed by question and sampling method. It feels to me that deep qualitative conversations with a much smaller number of people from across the country would be a better way of gauging the social and political climate. And it’s important to make sure that participants have the power to set the agenda, because pollsters don’t always know what issues matter most to people. And for qualitative researchers and pollsters alike: if the right questions don’t get asked, you won’t get the right answers!

 

Don't forget to try Quirkos, the simplest and most visual way to analyse your qualitative text and mixed method data. We work for you, with a free trial and training materials, licences that don't expire and expert researcher support. Download Quirkos and try for yourself!

 

 

 

Tips for running effective focus groups

In the last blog article I looked at some of the justifications for choosing focus groups as a method in qualitative research. This week, we will focus on some practical tips to make sure that focus groups run smoothly, and to ensure you get good engagement from your participants.

 


1. Make sure you have a helper!

It’s very difficult to run focus groups on your own. If you want to lay out the room, greet people, deal with refreshment requests, check recording equipment is working, start video cameras, take notes, ask questions, let in late-comers and facilitate discussion, it’s much easier with two people, or even three for larger groups. You will probably want to focus on listening to the discussion, not have to take notes and problem solve at the same time. Having another facilitator or helper around can make a lot of difference to how well the session runs, as well as how much good data is recorded from it.

 


2. Check your recording strategy

Most people will record audio and transcribe their focus groups later. You need to make sure that your recording equipment will pick up everyone in the room, and also that you have a backup dictaphone and batteries! There are many more tips in this blog post. If you are planning to video the session, think this through carefully.

 

Do you have the right equipment? A phone camera might seem OK, but they usually struggle to record long sessions, and are difficult to position in a way that will show everyone clearly. Special cameras designed for gig and band practice are actually really good for focus groups: they tend to have wide-angle lenses and good microphones so you don’t need to record separate audio. You might also want to have more than one camera (in a round-table discussion, someone will always have their back to the camera). Then you will want to think about using qualitative analysis software like Transana that will support multiple video feeds.

 

You also need to make sure that video is culturally appropriate for your group (some religions and cultures don’t approve of taking images) and that it won’t make people nervous and clam up in discussion. Usually I find a dictaphone less imposing than a camera lens, but you then lose the ability to record the body language of the group. Video also makes it much easier to identify different speakers!

 


3. Consent and introductions

I always prefer to do the consent forms and participant information before the session. Faffing around with forms to sign at the start or end of the session takes up a lot of time best used for discussion, and makes people rush through the project information. E-mail this to people ahead of time, so at least they can just sign on the day, or bring a completed form with them. I really feel that participants should get the option to see what they are signing up for before they agree to come to a session, so they are not made uncomfortable on the day if it doesn't sound right for them. However, make sure there is an opportunity for people to ask any questions, and state any additional preferences, privately or in public.

 


4. Food and drink

You may decide not to have refreshments at all (your venue might dictate that) but I really love having a good spread of food and drink at a focus group. It makes it feel more like a party or family occasion than an interrogation procedure, and really helps people open up.

 

While tea, coffee and biscuits/cookies might be enough for most people, I love baking and always bring something home-baked like a cake or cookies. Getting to talk about and offer food is a great icebreaker, and also makes people feel valued when you have spent the time to make something. A key part of getting good data from a good focus group is to set a congenial atmosphere, and an interesting choice of drinks or fruit can really help this. Don’t forget to get dietary preferences ahead of time, and consider the need for vegetarian, diabetic and gluten-free options.

 


5. The venue and layout

A lot has already been said about the best way to set out a focus group discussion (see Chambers 2002), but there are a few basic things to consider. First, a round or rectangular table arrangement works best, not lecture hall-type rows. Everyone should be able to see the face of everyone else. It’s also important not to have the researcher/facilitator at the head or even centre of the table. You are not the boss of the session, merely there to guide the debate. There is already a power dynamic because you have invited people, and are running the session. Try and sit yourself on the side as an observer, not director of the session.

 

In terms of the venue, try and make sure it is as quiet as possible. Good natural light and even high ceilings can also help spark creative discussion (Meyers-Levy and Zhu 2007).

 


6. Set and state the norms

A common problem in qualitative focus group discussions is that some people dominate the debate, while others are shy and contribute little. Chambers (2002) suggests simply saying at the beginning of the session that this tends to happen, to make people conscious of sharing too much or too little. You can also try and actively manage this during the session by prompting other people to speak, going round the room person by person, or having more formal systems where people raise their hands to talk or have to be holding a stone. These methods are more time consuming for the facilitator and can stifle open discussion, so it's best to use them only when necessary.

 

You should also set out ground rules, attempting to create an open space for uncritical discussion. It's not usually the aim for people to criticise the view of others, nor for the facilitator to be seen as the leader and boss. Make these things explicit at the start to make sure there is the right atmosphere for sharing: one where there is no right or wrong answer, and everyone has something valuable to contribute.

 


7. Exercises and energisers

To prompt better discussion when people are tired or not forthcoming, you can use exercises such as card ranking, role play, and prompts for discussion such as stories or newspaper articles. Chambers (2002) suggests dozens of these, as well as some off-the-wall 'energizer' exercises: fun games to get people to wake up and encourage discussion. There is more on this in the last blog post. It can really help to go round the room and have people introduce themselves with a fun fact, not just to get the names and voices on tape for later identification, but as a warm up.

 

Also, the first question, exercise or discussion point should be easy. If the first topic is 'How did you feel when you had cancer?' that can be pretty intimidating to start with. Much simpler questions, such as 'What was hospital food like?' or even 'How was your trip here?', are topics everyone can easily contribute to and safely argue over, gaining confidence to share something deeper later on.

 


8. Step back, and step out

In focus groups, the aim is usually to get participants to discuss with each other, not to hold a series of dialogues with the facilitator. The power dynamics of the group need to reflect this, and as soon as things are set in motion, the researcher should try and intervene as little as possible – occasionally asking for clarification or to set things back on track. Thus it's also their role to help participants understand this, and allow the group discussion to be as co-interactive as possible.

 

“When group dynamics worked well the co-participants acted as co-researchers taking the research into new and often unexpected directions and engaging in interaction which were both complementary (such as sharing common experiences) and argumentative”
- Kitzinger 1994

 


9. Anticipate depth

Focus groups usually last a long time, rarely less than 2 hours, but even a half or whole day of discussion can be appropriate if there are lots of complex topics to discuss. It's OK to consider having participants do multiple focus groups if there is lots to cover, just consider what will best fit around the lives of your participants.

 

At the end of these you should find there is a lot of detailed and deep qualitative data for analysis. To help digest this, it is worth making lots of notes during the session: a summary of key issues, your own reflexive comments on the process, and the unspoken subtext (who wasn't sharing on what topics, what people mean when they say, 'you know, that lady with the big hair').


You may also find that qualitative analysis software like Quirkos can help pull together all the complex themes and discussions from your focus groups, and break down the mass of transcribed data you will end up with! We designed Quirkos to be very simple and easy to use, so do download and try for yourself...