Content from Qualitative Data and QualCoder
Last updated on 2025-04-03
Estimated time: 60 minutes
Overview
Questions
- What can we learn from existing qualitative data?
- How is qualitative interview data typically structured?
- How can documents and projects be imported into QualCoder?
Objectives
- Practice finding and downloading data from a qualitative data repository
- Create a QualCoder project
- Practice importing and organizing documents
For decades, a movement has been building among quantitative researchers to share data and analysis as completely as possible. In part, this is meant to improve scientific transparency, so findings can be checked and verified, but it also serves to provide a basis for future studies.
Qualitative researchers have been much slower to adopt data and analysis sharing, for a few different reasons, including both technical and normative constraints. On the technical side, qualitative data has had limited standards, and analysis has often been conducted manually or with specialized software with limited compatibility with other software. More importantly, however, qualitative researchers tend to emphasize the role of both the context of data collection and the perspective of the researcher in the research process - in contrast to quantitative methods that often seek precisely to minimize the subjectivity of their methods and draw wide-ranging conclusions about entire populations.
Moreover, qualitative data often presents challenges for protecting privacy when sharing data, because interviews, focus groups, and other qualitative methods provide rich detail that can make it easier to identify participants even if direct identifiers are removed, as well as often dealing with sensitive issues.
Despite these challenges, however, there is a small but growing movement to share and reuse qualitative and mixed methods data and analysis, helped along by recent technical developments.
In this lesson, we will start by exploring what advantages sharing and reusing qualitative research can offer that might offset the challenges it presents, then practice finding, reusing, and sharing data and analysis projects with the largest qualitative data archive, QDR, and an open-source coding and qualitative data analysis tool, QualCoder.
The data we will work with are real data, although the scenario we use is somewhat artificial. Our goal is twofold - to help build literacy and skills for working with secondary qualitative data using relevant tools and to help develop an imagination for how you might reuse and share qualitative and mixed methods data to improve your own research.
Reusing Qualitative Data
The first step to reuse is to understand what elements might be reused and why. Historically, research products like articles and books have been the primary shared output of qualitative research. However, there are at least four other common research products that may be valuable for reuse or adaptation:
- methodologies and instruments
- raw data
- codebooks
- analysis
In the next step, we’ll download some real qualitative data and explore how each of these products might be useful to another researcher.
Types of data
Throughout this lesson, we’ll be using semi-structured interviews, one of the most common types of qualitative data, in our examples and discussion. Content analysis generally involves similar concerns and processes, but uses other kinds of secondary data, such as published print or audio-visual materials. Other qualitative methods, such as participant observation and focus groups, may be somewhat less suited to these tools and approaches and may require adaptation beyond what is discussed here.
Part of the power of qualitative research is its recognition that no method can be truly universal and that all analysis is contextual. We encourage you to treat the discussion and tools here as starting points, rather than templates.
Finding and Downloading QDR Data
Discussing data reuse in the abstract can only get you so far. Instead, we will imagine we face the following scenario and work together to discover how we can take advantage of others’ hard work to improve our research:
You are preparing to conduct a study of social media users in multiple countries, with a focus on understanding how people make decisions about the privacy of their posts and profiles.
Participants will complete brief surveys about their demographic background, participate in one or more semi-structured interviews, and provide access to their posts on each platform they regularly use for a period of 1 month. You are the lead researcher, but are joined by both long-term collaborators and student research assistants, who are likely to come and go throughout the expected duration of the research.
Your training in qualitative research makes you skeptical of the value of data that was collected for other purposes, but you also know you won’t have much time to collect the data and need to ensure you make the most of the opportunity and structure your interviews so they can reveal critical findings.
One of your collaborators referenced a dissertation that used qualitative methods to study data sharing in qualitative and big social research, and you noticed that the dissertation mentioned that anonymized data are available online. Although your research questions differ from theirs, you wonder if their interviews might help guide how you approach your study and decide to take a look.
The first step to finding out whether this data can help is to get the data.
Following the link in the dissertation, visit the data’s summary page at the Qualitative Data Repository, or QDR. QDR describes itself as
…a dedicated archive for storing and sharing digital data (and accompanying documentation) generated or collected through qualitative and multi-method research in the social sciences and related disciplines.
Essentially, QDR is a place for researchers to share qualitative and mixed methods data in a variety of forms, as well as their analysis projects. Some data are restricted and require an application or agreement before using, but other data, including what we are interested in, are openly available to any registered user.

This project contains a wide variety of information, including:
- interview transcripts from three different populations
- interview analysis
- participant summaries
- interview guides
- study documents (consent forms, solicitations, IRB)
Both the topic and the range of data available seem promising, and the QDR Standard Access license agreement allows any registered user to download the data, so let’s download it and see what we find!
If you don’t already have one, you’ll need to create an account at QDR for the next step. To do so with your email, click Register in the top right of the summary page. Alternatively, you can use an ORCID or Google account to register by clicking Login and finding the appropriate option. Fill out the registration form and click Create new account, performing any necessary verification before proceeding.
QDR or your local internet network may be overwhelmed if larger groups download the full project simultaneously. For this workshop, you may choose to have learners instead exclude the NVPX (NVivo) file, which will reduce the download size by around 95%. To do this:
1. Check the box at the top-left of the file list (below the project description).
2. Click select all 42 files in this data project.
3. Scroll to the bottom of the list, navigate to the last page, and uncheck the NVPX file.
4. Click Download.
Alternatively, learners can select only the files that will be used in the workshop. This is the most efficient option but the least like common data reuse patterns, where researchers won’t necessarily know which raw data files hold the information of interest to them.
List of files for workshop:
- README_Mannheimer.txt
- Mannheimer_BSR01_Transcript.pdf
- Mannheimer_BSR02_Transcript.pdf
- Mannheimer_BSR05_Transcript.pdf
- Mannheimer_Redacted_Interview_Analysis.qdpx
Once you have a QDR account, log in with the button on the top right and return to the summary page. Click Download Data Project (see image below), download the ZIP file to your desktop, and extract the files into a folder on your desktop.

File types
Open the folder using Finder (Mac) or Explorer (Windows) and inspect the file extensions (the part after the . in the filename). What types of files are included and what software would typically be used to open them?
Don’t worry if you don’t recognize all of the file types. Just identify what you know. If you have time during the exercise, you can use a web search like txt extension to search for information.
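If you would rather tally the file types programmatically, the inspection described above can be sketched in Python. This is only an illustration and not part of the lesson's workflow; the folder name `qdr_download` is a hypothetical placeholder for wherever you extracted the ZIP file.

```python
from collections import Counter
from pathlib import Path


def count_extensions(folder):
    """Count files by lowercase extension (the part after the last dot)."""
    counts = Counter()
    for path in Path(folder).rglob("*"):
        if path.is_file():
            # Path.suffix includes the leading dot (".pdf"), so strip it.
            counts[path.suffix.lower().lstrip(".") or "(none)"] += 1
    return counts


# Hypothetical location of the extracted download; adjust to your own folder.
project_folder = Path.home() / "Desktop" / "qdr_download"
if project_folder.is_dir():
    for extension, n in sorted(count_extensions(project_folder).items()):
        print(f"{extension}: {n}")
```

Lowercasing the extension means that `Transcript.PDF` and `Transcript.pdf` are counted together.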
The project includes the following four types of files:
- txt files are plain text and can be opened in any notepad or document software
- pdf files can be opened with Adobe Reader or other document software
- nvpx files are NVivo for Mac projects and can only be opened in NVivo for Mac or converted in NVivo for Windows to a Windows format
- qdpx files are the REFI-QDA qualitative interchange format and can be opened in a variety of qualitative software
In this project, the txt files provide metadata, or information about the project, while all other files are part of the data itself. QualCoder can’t read nvpx files, so the most important files to us will be the pdf interview transcripts and the qdpx project file.
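To get a feel for what an interchange file holds before importing it, you can peek inside it with Python. The sketch below assumes, based on the REFI-QDA exchange format, that a qdpx file is a ZIP archive bundling a project description with its source documents; it simply lists whatever members it finds rather than assuming their names. The `qdr_download` folder is a hypothetical location.

```python
import zipfile
from pathlib import Path


def list_archive_members(path):
    """Return the member names of a ZIP-based file, or [] if it is not a ZIP."""
    if not zipfile.is_zipfile(path):
        return []
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# The file name comes from the workshop file list; the folder is hypothetical.
qdpx_path = (
    Path.home() / "Desktop" / "qdr_download"
    / "Mannheimer_Redacted_Interview_Analysis.qdpx"
)
if qdpx_path.is_file():
    for name in list_archive_members(qdpx_path):
        print(name)
```

Listing the members does not modify the file, so it is a safe way to check a download before opening it in qualitative software.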
Creating a QualCoder Project
Once we have a general sense of the data we’ll be using and what format it needs to be in, the next step is generally to create a project in a qualitative software package and import the de-identified raw data.
QualCoder is an open-source and cross-platform tool for coding and analysis of textual data. It is capable of working with a variety of data formats and has tools for both qualitative and mixed methods analysis.
Qualitative software is sometimes called CAQDAS (Computer-Assisted Qualitative Data Analysis Software) or QDAS. Whatever the name, these packages support the two essential functions of qualitative research software:
- Apply codes or tags systematically to a collection of data sources
- Analyze codes (and sometimes raw data or secondary sources) to draw conclusions
CAQDAS have various features that may be useful for particular types of data or methodologies. This lesson includes an alternative software options page for those interested in learning about other CAQDAS.
Starting QualCoder
If you haven’t already installed QualCoder, please follow the instructions in Summary and Setup (../index.html) before proceeding.
When you first open QualCoder, it will display the Action Log tab with some information about project settings and any currently-open projects, like the example below.

Creating a project
Work in QualCoder is organized in projects, each of which contains its own set of files, codes, and other information.
Click Project - Create New Project and, after choosing a location where you can easily find it later (like your Desktop), give the project a clear name like Social_media_privacy_project_planning. This creates a new folder whose name ends in .qda so QualCoder can identify it easily as a project.
Clicking Save returns us to the action log, which now displays a summary of the new project.
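Because a project is an ordinary folder, you can look inside it from outside QualCoder. The sketch below is an assumption-laden illustration: QualCoder is generally described as keeping project data in a SQLite database inside the .qda folder, but the file name `data.sqlite` and the Desktop location used here are assumptions to verify on your own machine. The helper lists table names without assuming what they are and opens the file read-only so the project cannot be altered.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_sqlite_tables(db_path):
    """Return the names of the tables in a SQLite file, opened read-only."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as connection:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Hypothetical paths: the project name comes from the lesson, but the Desktop
# location and the "data.sqlite" file name are assumptions to check locally.
project = Path.home() / "Desktop" / "Social_media_privacy_project_planning.qda"
database = project / "data.sqlite"
if database.is_file():
    print(list_sqlite_tables(database))
```

Close QualCoder before inspecting a project this way, and work on a copy if you plan to do anything beyond reading.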
Importing and navigating files
Now that we have a QualCoder project, let’s take a look at what the original data contains. QDAS projects store copies of sources within their directory. To see the project description, we’ll first need to import README_Mannheimer.txt into our project.
Documents can be imported with the Manage - Manage Files command. Add README_Mannheimer.txt by clicking on the Page icon with a + or pressing CTRL/CMD+2.
In collections of data or other multi-file downloads, there is often a file named readme.txt or something similar. Such files are meant to provide an overview of the collection and how to go about using it.
To view the contents of the file, open Code text from the Coding menu at the top-left, then click on the name of the file you just imported at the left. Zero in on the first paragraph under Data Description and Collection Overview:
The data in this study was collected using semi-structured interviews that centered around specific incidents of qualitative data archiving or reuse, big social research, or data curation. The participants for the interviews were therefore drawn from three categories: researchers who have used big social data, qualitative researchers who have published or reused qualitative data, and data curators who have worked with one or both types of data. Six key issues were identified in a literature review, and were then used to structure three interview guides for the semi-structured interviews. The six issues are context, data quality and trustworthiness, data comparability, informed consent, privacy and confidentiality, and intellectual property and data ownership.
This short paragraph packs a great deal of information about the interview topics and questions. Our topic of social media privacy may share a good deal with some of the issues the original interviews focused on.
For now, we’re going to focus on the Big Social Research group, because the type of data they were interviewed about their work with is the most similar to what our participants may be sharing. You may have drawn a different conclusion, and likewise, each group might provide unique insights.
If we can adapt questions and anticipate concerns and challenges relevant to our data collection, we may be able to improve the quality of our study and interview design, without repeating all the background research from the previous study.
Raw Data
Interview schedules and study plans provide insight into the research team’s expectations and approaches. The power of raw data, such as interview transcripts, is that it shows (much more directly) how the study participants construct their own views of the study topics and respond to the questions. Context and constructivism are core concerns of most qualitative researchers, which makes the raw data possibly the most critical product for faithful and effective reuse.
It’s possible to conduct entirely secondary studies if enough raw data are available that address relevant topics in a relevant context. Often, however, existing data serves as a type of pilot study, providing initial evidence as to where to begin and what to ask while still allowing the new research team to explore aspects of the data that were less relevant or less prominent in the original study.
One of the most commonly-analyzed types of raw qualitative data is interview transcripts, like we have in our project. QualCoder provides support for working directly with audio and video files. Text, however, is often preferred because it simplifies the process of removing potentially identifying information and enables both rapid scanning of content and accessibility technology.
Let’s open one of the Big Social transcripts, Mannheimer_BSR01_Transcript.pdf, in your default PDF reader (not QualCoder) and look at how transcripts are structured in this project.

PDF files appear as page images, but most modern PDFs allow you to select text and other elements. In this transcript, we see a few header elements:
- The document title
- A set of summary keywords identified by the original investigators as relevant to this interview
- A list of all speakers
After the speaker list, the transcript itself begins with each paragraph denoting a switch in speaker. The speaker’s identifier and a time stamp for the start of the conversation comprise the first line, followed by the transcribed text.
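The turn structure described above (a header line with speaker and time stamp, followed by the spoken text) lends itself to simple parsing once a transcript is in plain text. The following is a minimal sketch under stated assumptions: the header layout in the `HEADER` pattern, the blank line between turns, and the placeholder sample text are all hypothetical, and the real transcripts may be laid out differently.

```python
import re
from dataclasses import dataclass

# Assumed header layout: a speaker label, whitespace, then a time stamp such
# as "03:49" or "00:03:49". The real transcripts may differ.
HEADER = re.compile(r"^(?P<speaker>.+?)\s+(?P<time>\d{1,2}:\d{2}(?::\d{2})?)\s*$")


@dataclass
class Turn:
    speaker: str
    timestamp: str
    text: str


def parse_turns(transcript):
    """Split a plain-text transcript into turns, one per blank-line-separated block."""
    turns = []
    for paragraph in re.split(r"\n\s*\n", transcript.strip()):
        lines = paragraph.splitlines()
        match = HEADER.match(lines[0]) if lines else None
        if match:
            text = " ".join(line.strip() for line in lines[1:])
            turns.append(Turn(match["speaker"], match["time"], text))
    return turns


# Placeholder content only; this is not text from the actual interviews.
sample = (
    "Sara Mannheimer 00:00\n"
    "Placeholder question text.\n"
    "\n"
    "BSR01 00:12\n"
    "Placeholder answer text\n"
    "continued on a second line.\n"
)
for turn in parse_turns(sample):
    print(turn.timestamp, turn.speaker, "-", turn.text)
```

Blocks whose first line does not look like a header, such as the title or keyword list, are skipped rather than guessed at.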
Reading even this first section in the image, we can get some insight into decisions the research team made and how they worked. From BSR01’s very first sentence, we see the researchers sent a list of interview questions in advance but with mixed results:
I haven’t looked at the questions you sent. Yeah, but this is an active project. So I don’t think I should have a problem with any of them.
The respondent may not have read all the questions but did identify a project that the rest of the interview will revolve around.
Import additional files
Import the following three files into QualCoder, then open the BSR_01 transcript and compare the formatting in the preview to what it looked like in the original PDF.
Files to import
- Mannheimer_BSR01_Transcript.pdf
- Mannheimer_BSR02_Transcript.pdf
- Mannheimer_BSR05_Transcript.pdf
QualCoder converts all text and PDF documents to plain text when importing, which removes formatting and visual elements. The QualCoder Wiki provides guidance on how to pre-convert PDFs to a text format if you experience readability problems.
In general, the best results will come from importing text documents in text (TXT/RTF), Word (DOC/DOCX), or webpage (HTM/HTML) formats rather than PDF.
Documents can also be created and edited in QualCoder, but its interface is not optimized for significant edits. In general, any changes, including the removal of personally identifiable information from data that will be shared, should take place before they are added to the project.
In the next section of the workshop, we will develop multiple kinds of qualitative codes and apply them to the interviews you just imported in QualCoder.
Key Points
- Qualitative data can take many forms, but text or transcribed audio-visual data are among the most common
- Reusing existing qualitative data can help plan studies more efficiently and effectively
- The Qualitative Data Repository is a source for vetted qualitative and mixed methods data
- QualCoder’s interface has sections for managing various aspects of projects and uses a mix of icons, menus, and pop-up windows to perform actions
Content from Best Practices for Qualitative Coding
Last updated on 2025-04-16
Estimated time: 60 minutes
Overview
Questions
- What is the difference between inductive and deductive coding?
- How can I set up and apply a flexible code tree in QualCoder?
- How can I code text and view coded text in QualCoder?
Objectives
- Distinguish between inductive and deductive approaches to coding
- Develop deductive codes relevant to project objectives
- Code text and view coded text in QualCoder
Getting Started with Coding: Why We Code Qualitative Data
Many qualitative researchers spend much of their analytic effort on coding data, i.e. assigning labels to excerpts in the data. That is also what we will focus on today. But before doing so, it is still worthwhile to focus on why we code in the first place.
Coding is a form of abstraction: we make sense of large (sometimes overwhelming) amounts of qualitative data (sometimes referred to as “unstructured data”) and generate some structure by coding it. This abstraction is not cost free: by forcing codes onto your data, you may lose some nuance and specificity. Some qualitative traditions will therefore encourage you to only start coding once you are deeply familiar with your data. Most traditions encourage an ongoing back-and-forth between data and codes to make sure data and codes match.
Moreover, the role of codes varies strongly between qualitative research traditions. In some traditions, codes are principally a background tool, used to organize materials for later writing. That is the case, for example, for most ethnographic writing, as well as for many historically oriented approaches such as comparative historical analysis (as used in sociology and political science) or process tracing (as used in political science and administrative sciences). You will rarely find a mention of codes, coding schemas, or a codebook in published work using these methods, and not all of its practitioners may apply codes to data at all.
In other approaches, such as thematic analysis or discourse analysis, codes are a key analytic outcome. Here, codes don’t just organize the data but are used as a tool to give it meaning. In publications using these methods, the codes and their development are typically made explicit and often form the core of the methods section. The tools we provide in this workshop can help you with either type of analysis, but as you develop your coding schema, you want to be clear where you situate yourself methodologically.
Beginning qualitative researchers often want to jump right into analyzing data once documents are added to a project, but taking the time to develop a coding protocol first can save time and improve the transparency and quality of research.
The first step is typically to choose a coding philosophy, that is, to decide how and why code labels will be chosen and applied. Coding philosophies range along a spectrum from inductive to deductive approaches.
Deductive and Inductive Reasoning
Deductive reasoning begins from assumptions and hypotheses. It seeks to determine, using logic or data, whether the hypotheses can be shown to be false (or true).
Experimental methods in many disciplines follow this pattern. First, a hypothesis or prediction about the effect of some cause is developed based on past research and observation. An experiment is conducted to isolate that specific potential cause, and conclusions are drawn based on the presence (or lack) of a difference produced by the intervention.
A medicine trial is a classic example of a deductive approach. Scientists predict a treatment will help people in a specific way (for example, by reducing the length of an infection). They recruit participants (patients with the infection) who are randomly assigned so that some receive the medicine and others receive a placebo with no medicine. If the length of the infection is significantly shorter, in the statistical sense, in the patients who received the medicine, that is taken as evidence that the medicine likely produced the desired result.
Deductive reasoning relies on minimizing exposure to outside variables that might affect the outcome of interest, or otherwise statistically adjusting for potential confounding factors.
Inductive reasoning, by contrast, seeks to draw more naturalistic conclusions by making close observations while recognizing personal biases and limits to observation. Inductive qualitative research draws on patterns observed in data to make predictions, generalizations, or analogies about more general patterns. Put differently, deductive research is typically explanatory in nature, while inductive research is exploratory.
Inductive social research often draws on critical or constructivist perspectives that emphasize how individuals and groups describe their own experience.
Taken in a longer scope, science can only develop through the complementary use of induction and deduction, sometimes visualized as a circular cycle. Observation is used to develop hypotheses, which are tested deductively through more observation. If hypotheses are not fully confirmed, inductive reasoning is used to develop revised or alternative hypotheses.
Even though deductive and inductive reasoning are both part of nearly every study in some way, how qualitative data is coded depends on the general purpose of the study. Exploratory studies tend to adopt an inductive method, while explanatory studies use more deductive approaches to code development.
Inductive coding
In 1967, US sociologists Glaser and Strauss formalized grounded theory, one method for conducting structured qualitative research without presupposing a hypothesis or theory. They recognized that how people experience the world can be at least as important as traditional measures (e.g., personal income or gross domestic product).
In grounded theory and other inductive coding methods, qualitative data like interview transcripts are read carefully and initial codes are applied that match the language and interpretation proposed by study participants themselves as closely as possible.
Often, researchers label codes in this open coding phase by using their judgment and experience to discern underlying themes in the experiences expressed across interviews.
In our scenario, the goal of this analysis is to prepare for collecting and analyzing new data related to social media privacy and confidentiality. BSR_05 is an interview with a PhD student studying political communication.
Open the Coding tab to Code text in QualCoder and click on BSR05_transcript_deidentified in the Documents list to open it for coding on the right. Sara Mannheimer is the interviewer and BSR05 replaces the name of the student for privacy protection.

Codes (labels) are applied to text segments that reflect a particular concept, theme or idea. Before applying a code, it must be created in the Codes list or the Code organizer.
Codes can be nested within one or more levels of categories, which can distinguish coding processes, larger themes, or code status (such as draft). Importantly, and different from many other QDAS, QualCoder does not allow you to code directly to a category or to create subcodes of a code. They serve different purposes.
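The distinction between categories and codes described above can be pictured as a small tree in which categories are containers and codes are leaves. The sketch below is a conceptual model only, not QualCoder's internal representation; the class and attribute names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Code:
    """A label that can be applied to text. Codes are leaves: no sub-codes."""
    name: str
    memo: str = ""


@dataclass
class Category:
    """A container for codes and sub-categories; it is never applied to text."""
    name: str
    codes: list = field(default_factory=list)
    subcategories: list = field(default_factory=list)

    def all_codes(self):
        """Collect every code in this category and any nested categories."""
        found = list(self.codes)
        for sub in self.subcategories:
            found.extend(sub.all_codes())
        return found


open_coding = Category(
    "open coding", codes=[Code("privacy"), Code("identity protection")]
)
project_codes = Category("all codes", subcategories=[open_coding])
print([code.name for code in project_codes.all_codes()])
```

Keeping the two types separate mirrors the rule in the text: only a `Code` would ever be attached to an excerpt, while a `Category` exists purely to organize.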
Open coding
We will start by creating open (inductive) codes. Return to the text coding screen (Coding - Code text). Select BSR05 again. Scrolling 3 minutes and 49 seconds into the interview, we learn this person was using a large dataset of Twitter posts from the #MeToo movement with some significant privacy risks, for example:
And a big part of our paper was we came up with a way of identifying which tweets were disclosures which were not, and we have to describe the method in the paper. And more or less, that’s a method to identify survivors of sexual violence in our in our data set.
The next challenge is flexible and can be done as a class or as a think-pair-share, where individuals spend 1-2 minutes brainstorming alone, then share with a partner and discuss, then with the class.
Creating open codes
Discuss as a class what kinds of labels you, as a social media privacy researcher, might apply to part or all of the excerpt above.
Open codes can range from very specific to more general, but theoretically fruitful codes are often somewhere in the middle: general enough to apply in multiple situations but specific enough that the coded excerpts have something meaningful in common.
There are multiple words or phrases that might capture some of this excerpt’s relevance for our research. QualCoder allows applying multiple codes to identical or overlapping excerpts. Doing so makes it easier to consider overlapping concepts, like “privacy” and “identity protection,” as well as other aspects of the context, such as “sexual violence” or “crime victimization”.
Codes must be created before they can be applied.
First, right-click in the code list and Add a new category called open coding for our codes.
Next, right-click the category and Add a new code to category, naming it after a theme or concept you identified (e.g., privacy).
Highlight the relevant text chunk at right with your mouse (“And a big part… in our data set”) then click your new code. If it worked correctly, you will see a count of 1 next to your code and, once you click somewhere else, the text will be highlighted in the code’s color.
How many codes?
A single sentence often relates to multiple topics, so it’s common to apply more than one code to a single chunk of text or apply codes to overlapping text chunks. The next part of the workshop will discuss some of the reasons overlapping codes can be valuable.
To apply multiple codes to the same (or overlapping) text, repeat the coding process with another code. QualCoder will display the overlap as underlined text. Clicking on the underlined text will change it to one of the highlight colors, which can be cycled by pressing the o key or clicking the code of interest.
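One way to reason about overlapping codes is to treat each coded excerpt as a range of character positions and check which ranges intersect. The sketch below illustrates that idea with part of the excerpt quoted earlier; the `Segment` structure and the code names are assumptions for illustration, not a description of how QualCoder stores or detects overlaps.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Segment:
    code: str
    start: int  # index of the first coded character
    end: int    # index one past the last coded character


def overlaps(a, b):
    """Two half-open ranges overlap when each starts before the other ends."""
    return a.start < b.end and b.start < a.end


def overlapping_pairs(segments):
    return [(a, b) for a, b in combinations(segments, 2) if overlaps(a, b)]


excerpt = "a method to identify survivors of sexual violence in our in our data set"


def segment_for(code, phrase):
    """Build a segment covering the first occurrence of `phrase` in the excerpt."""
    start = excerpt.find(phrase)
    if start == -1:
        raise ValueError(f"phrase not found: {phrase!r}")
    return Segment(code, start, start + len(phrase))


segments = [
    segment_for("identity protection", "identify survivors of sexual violence"),
    segment_for("sexual violence", "sexual violence"),
    segment_for("dataset", "data set"),
]
for a, b in overlapping_pairs(segments):
    print(f"{a.code!r} overlaps {b.code!r}")
```

Here only the first two segments intersect, which corresponds to the stretch of text that would appear underlined.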
How much to code?
Decisions about how large excerpts should be are challenging but important. Highlights generally should be only long enough to provide context and convey meaning. Some researchers always code full sentences or even paragraphs, while others make decisions case by case.
I might create codes for both “privacy” and “identity protection” but later realize they overlap so much conceptually they don’t need to be separated.
To merge two codes, load them in the Code organizer, right-click the code you want to replace, and choose Merge code into code then Apply. After merging, any text coded to one or both of the original codes will carry the name of the “into” code, and the other code will be removed.
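Conceptually, merging relabels every excerpt coded with the replaced code so that it carries the “into” code instead. The sketch below models that on a plain list of (code, start, end) tuples; the data layout, the example positions, and the choice to drop exact duplicates after relabeling are assumptions made for illustration rather than documented QualCoder behavior.

```python
def merge_code(codings, source, target):
    """Relabel codings from `source` to `target`, keeping each span only once."""
    merged = []
    seen = set()
    for code, start, end in codings:
        relabelled = (target if code == source else code, start, end)
        if relabelled not in seen:
            seen.add(relabelled)
            merged.append(relabelled)
    return merged


# Hypothetical character positions for excerpts in one document.
codings = [
    ("privacy", 0, 40),
    ("identity protection", 0, 40),
    ("identity protection", 55, 90),
]
print(merge_code(codings, source="identity protection", target="privacy"))
```

After the merge, no coding refers to the replaced code any more, which matches the outcome described above.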
Unmarking coded text
Beyond merging codes, sometimes you may accidentally mark an excerpt and want to remove one or more codes entirely. To remove a code, right-click and choose Unmark. Trying to unmark text where multiple codes have been applied will trigger a dialog box asking which codes to unmark.
Unmark can also be used to adjust the start and end point of an excerpt. There are built-in tools available with right-click but they rely on counting the number of characters to move the endpoint so it may be easier to unmark and re-code instead.
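Adjusting an excerpt by counting characters amounts to simple arithmetic on its end position. The sketch below shows that arithmetic in isolation, with bounds checks so the excerpt never becomes empty or runs past the document; the function name and the half-open (start, end) convention are assumptions, not QualCoder's own implementation.

```python
def shift_end(start, end, delta, text_length):
    """Move an excerpt's end point by `delta` characters, keeping it valid."""
    # Never shrink below one character and never extend past the document.
    new_end = max(start + 1, min(end + delta, text_length))
    return start, new_end


text = "So it was kind of a sensitive data set."
start, end = 20, 29  # covers "sensitive"
print(text[start:end])
start, end = shift_end(start, end, 9, len(text))
print(text[start:end])
```

Having to work out that the extension is exactly nine characters is the kind of counting that makes unmarking and re-coding the simpler route in practice.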
In vivo coding
An alternative approach to inductive coding, in vivo coding, tries to further reduce researcher bias effects by creating initial code labels only from the language used in the interviews themselves.
Let’s look at a passage slightly earlier in the paragraph we’ve been working with in BSR_05.
And we ended up using that data for a project on looking at how people disclosed early in the hashtag campaign, and how that may have produced stigma around kind of disclosing for other women to disclose experiencing sexual violence. So it was kind of a sensitive data set.
The person being interviewed used a number of words and phrases that may be relevant to data privacy protection in these sentences, including disclosed, stigma, disclosing, disclose, experiencing, sexual violence, and sensitive data set. Rather than categorizing themes at this stage, in vivo coding retains language used by the participants.
After creating another category for in vivo codes, highlight “disclosed”, right-click, and choose in vivo code. This will add the exact word or phrase highlighted as a code (if it does not exist). Do the same for other language related to disclosure or sexual violence in the short passage.
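The defining rule of in vivo coding, that the code's name is exactly the highlighted text and is created only if it does not already exist, can be expressed in a few lines. The sketch below is a conceptual illustration using part of the passage quoted above; the dictionary layout and function name are assumptions, not QualCoder's internals.

```python
def add_in_vivo_code(codebook, text, phrase):
    """Code the first occurrence of `phrase`, naming the code after the phrase itself."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"phrase not found: {phrase!r}")
    span = (start, start + len(phrase))
    # setdefault creates the code only if it does not exist yet.
    spans = codebook.setdefault(phrase, [])
    if span not in spans:
        spans.append(span)
    return phrase


passage = (
    "how people disclosed early in the hashtag campaign, "
    "and how that may have produced stigma"
)
codebook = {}
for phrase in ["disclosed", "stigma", "disclosed"]:
    add_in_vivo_code(codebook, passage, phrase)
print(codebook)
```

Because names are matched exactly, "disclosed" and "disclosing" would become separate codes, which is why consolidating related in vivo codes afterwards can be useful.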
Managing in vivo codes
Before using in vivo code in QualCoder, it is best to ensure all existing codes, such as themes, are added to categories. This prevents mixing up in vivo codes with other codes.
When you have finished in vivo coding, codes can be moved to the in vivo category by dragging them onto the category name or right-clicking and selecting move code to.
At this point, you may also want to consolidate in vivo codes of the same root word, such as disclose and disclosure. Instructions for merging can be found under How much to code? above.
Be careful, however. If you move codes before coding is complete and later code the same word or phrase, it will be created as a separate code.
In vivo codes can be analyzed individually to understand specific language people use or aggregated into themes during the axial coding process.
Deductive coding
Deductive codes are applied much the same way as open codes, but development of a coding tree takes place earlier, ideally before data collection, because tags and themes reflect theories and hypotheses the study is designed to test.
Code deductively
Our research team has adapted 3 key themes from Sarikakis and Winter’s 2017 review of social media users’ consciousness of data privacy:
1. Autonomy: users desire control over when and to whom their data is disclosed
2. Compromise: users recognize privacy’s importance but also circumvent protections when seeking information
3. Stake: concern derives from being personally affected by privacy or sharing
Create the 3 themes above as codes within a new category called deductive. Add descriptions based on the background given above by right-clicking and choosing View or edit memo. You can use memos to clarify exactly what a code means, to note questions about whether something should be changed, or any other purpose. Memos provide flexibility within a project to fill in gaps between formal parts of the project.
In BSR_02
, read and highlight the text block quoted
below (from the section starting at 1:44
).
I’ve done some work on on Twitter on how like social people who are users of Twitter, I’ve done, did a project on on people who have been harassed on Twitter and the subject of kind of coordinated harassment campaigns, and how kind of their experiences.
In groups of 2-3 people, discuss whether each theme is relevant and why, then apply relevant codes to the excerpt.
Don’t focus on finding key words or synonyms in deciding whether to apply a code. Look for sections that suggest a relationship to themes of interest.
Autonomy might apply depending on whether the harassment people experienced on Twitter included doxxing or otherwise involved personal information. Without access to the original interviewee, it’s probably not reliable enough to use in drawing conclusions.
Compromise does not seem relevant here, as the presence of online harassment suggests the advantage of stricter protections, if anything.
Stake clearly applies. The Twitter users are described as experiencing coordinated harassment campaigns on Twitter, meaning not only is the platform used to harass, but to encourage others to do the same.
Like inductive and deductive reasoning, the separation between inductive and deductive coding is rarely complete. As a researcher, how closely you adhere to a single reasoning or coding model will likely depend on personal research questions and values, as well as norms in the fields where the research will be shared.
Axial coding
Most qualitative projects require more than one round of coding for a few reasons:
- The first documents rarely highlight every relevant theme. Themes and language important in later interviews may still be reflected in early interviews but less obvious before the theme was brought to the researcher’s attention. Revisiting early interviews supports consistent code application.
- Multiple coders may use tags in slightly different ways, which eventually need to be adjusted to a consistent scheme.
- Key relationships between codes may only become obvious once initial codes are considered, leading to consolidation or the development of new tags as researchers become more familiar with the documents and themes.
- When coding in vivo, people may use language differently and multiple phrases may represent a single idea.
Axial coding primarily addresses the third and fourth reasons, as it involves relating or further breaking down primary themes or codes.
For example, the passage in BSR_05
coded to
in vivo - disclosure
concerned self-disclosure of personal
information during the #MeToo movement with the hope of reducing stigma
and producing a positive outcome. But Twitter users in
BSR_02
may have been subject to disclosure of personal
information without consent. Axial coding might involve distinguishing
disclosure based on whether it was voluntary and self-directed or occurred without consent.
Axial coding takes many other forms, depending on the research topic and methods. Deductive researchers may revise and further specify existing theory based on new data in their study. In vivo codes or open codes that seemed distinct may turn out to be conceptually indistinguishable. Participants may be provided summaries of initial findings and asked if they reflect their personal experience. In all cases, axial coding is a tool to clarify analysis and theory.
Qualitative studies typically have one or more rounds of initial coding, followed by any amount of axial coding necessary to represent key concepts intelligibly to both researchers and study participants.
Importing QDPX projects
The data collection we found includes a QDPX
project
containing not only all the interview transcripts, but also the coding
scheme and coded text used by the original research team.
To import, select
Project - Import - REFI-QDA Project Import
(not codebook
import). Choose a name and location to store the project then navigate
to the QDPX file and complete the import.
Importing a project creates a new QualCoder project. Since your
existing work is auto-saved, you can switch between the two later using
Project - Open Recent Project
.
Before going further, use Manage - Manage files
to check
what files are in the new project.
Coding reports
Coding view (Coding - Code text
) is optimized for
reading through transcripts or documents and applying a variety of
codes, but it can be difficult to get a sense of everything that matches a
particular code in your document corpus using coding view.
The Coding report
allows you to view every text passage
coded to a single code (or set of codes), optionally restricted to
specific source documents. This can be a great help to axial coding
because you can look at one or more related codes more closely to see if
they should be merged, split, relabeled, or left as-is.
Create a report using Reports - Coding report
. Nothing
should display in the right window, but sources will be listed in the
top-left and codes in the bottom-left.
Select only the Big Social Research interviews by right-clicking in
the sources, choosing Select files like
and entering
BSR
in the dialogue (it must be all-caps).
Let’s focus on privacy harms using the codes from the original study
team. Find and click the
privacy - considering potential harms
category. When you
select a category, any codes linked below it in the coding tree will
also be included. Finally press the play button
(right-facing triangle) to run the report.
The report shows all three text chunks coded to any of the selected privacy codes in the BSR interviews, with information on the code, file, and coder in a highlighted row above the excerpt.
Categories with codes
Other major CAQDAS packages, unlike QualCoder, allow coding directly to categories. This project was originally created in NVivo and had text coded to both codes and categories. When importing a QDPX project, QualCoder creates a code with an identical name whenever it encounters a category that has text coded to it.
Privacy - considering potential harms
is an example of
this behavior, where a code duplicating the category was automatically
created.
Axial coding with reports
Read through the three results in the above report. Each one currently has a different code applied. Based on the text and names, discuss with the group whether you think they represent similar enough concepts that their codes should be combined if these were the only interviews being considered.
There is no authoritative right or wrong answer here. Whether to combine or split codes depends on the research questions, methods, and subjectivities, as well as on what participants tell you. The most important guide is to ensure that you are treating your research participants in a way that is faithful to their own experience, as you understand it.
We won’t make any changes at this time, but if you did want to merge
two codes, you would use the Code organizer
or
Code text
screen, right-click one, and select
Merge code into code
. Now both will be a single code by the
into
name, which you can rename if needed.
QualCoder’s coding tools (specifically Code text
,
Code organizer
and Coding report
) comprise the
core suite for the coding process itself. In the next section, we will
build on that foundation and explore qualitative and mixed methods
analysis methods that can be used in QualCoder.
Key Points
- Qualitative code protocols are developed based on research goals and philosophies
- Inductive research focuses on discovering or exploring themes and often uses open or in vivo coding
- Deductive research focuses on testing hypotheses and typically applies a predefined coding scheme based on theory
- In QualCoder, codes are labels applied to highlighted excerpts of text
Content from Qualitative Data Analysis
Last updated on 2025-04-16
Estimated time: 60 minutes
Overview
Questions
- How can QualCoder help analyze coded data?
- What are some common approaches to analyzing qualitative data?
Objectives
- Practice drawing conclusions about cases and themes with QualCoder
- Distinguish questions answerable using cases, themes, or only in combination
By the end of coding, researchers can be quite familiar with the data. Even if they have already drawn some tentative conclusions, structured data analysis is important to validate findings, discover alternatives, and document evidence and rationales. This step increases the research’s impact and value not just for others, but for future revision or expansion of your own research.
How you can analyze your data depends on multiple decisions, including your software, the type of data you have, and how you structured your codes. But the choice of methods also depends on your research questions.
Cases
In qualitative analysis, cases most often represent individual people, like those interviewed in the BSR interviews.
You may have already done some informal case-based analysis by observing the types of privacy concerns different researchers encountered or how they went about dealing with them.
Case analysis considers the similarities and differences between individuals to help understand people holistically, including their unique contexts. In its most basic form, reading an interview is a form of case analysis. Often, researchers keep notes about individual cases, which may include summaries of relevant information and thoughts about how different themes and personal characteristics seem related in that individual’s view of the world.
Groups
A case is not always a person. It can also be a document, an organization, a news source, or another unit of aggregation whose members are categorically distinct from one another. This primarily occurs in content analysis, rather than interview and focus group research.
Brittany Shaughnessy, for example, wrote a thesis studying gun rights messaging in the 2020 US election. She performed qualitative content analysis on Twitter posts from the official accounts of two advocacy organizations: Everytown for Gun Safety (supports gun control) and the National Rifle Association (supports gun rights).
In this situation, individual media relations personnel are not the primary interest, even if we could identify them. The purpose of the research is to compare the topics and language used by advocacy organizations with contrasting goals.
Case analysis practice
The cases we have examined discuss privacy for a variety of social and review platforms, including (interview and starting timestamp in parentheses):
- Academic peer review (BSR_01, 25:03)
- Wikipedia (BSR_02, 35:25)
- Twitter (BSR_03, 33:40)
Discuss some differences between the platforms in what concerns researchers express about data privacy and the challenges of resolving them. Are there common themes that emerge across all three?
Treat this exercise as inductive and try to consider what you read as a whole, rather than focusing on the deductive themes we coded.
Case analysis may be the primary focus of a study, particularly when the goal is to understand individual thought processes or group cultures.
Themes
Themes inevitably emerge when studying cases, but in case-based analysis are treated primarily as features of specific contexts.
Thematic analysis, by contrast, focuses on how themes are similar or different across cases. Goals can include constructing general models of a concept, discovering how circumstances can impact an individual’s mental model of a concept, and testing the validity of theoretical propositions in lived experience.
Each labeled code can be treated as a potential theme, and
Coding report
provides a direct way to view all passages
coded to a specific tag. To see more context around the coded passage,
right-click the highlighted header above that passage in the coding
report, then choose View in context
. This displays a pop-up
window centered on the passage that allows you to scroll through the
rest of the file.
You may want to take notes on sub-themes or variations within a theme
in a journal
, a special type of memo in QualCoder meant to
record your thoughts when coding. Create or edit journals from the
Manage - Journals
dialog.
Alternately, you can apply additional codes as you work more closely
with individual themes, although codes can only be applied after
returning to the Code text
view.
QualCoder also provides a count of the number of passages to which each code has been applied, which can give a quick sense of how ubiquitous themes are across your data. Be cautious about using such counts to draw conclusions, however. A theme may be mentioned only a small number of times but still be critical to understanding a topic or how subgroups of individuals think about that theme.
The Code frequencies
report also allows counting the
number of times themes appear in a specific subset of files.
Code frequencies may provide an impression of how widely relevant specific themes are. But again, counts cannot reveal the richness of the stories qualitative research is designed to engage, so exercise judgment before using them as a primary analytic tool.
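To make the distinction between raw counts and spread across files concrete, here is a minimal Python sketch. It is not QualCoder's implementation: the `codings` list, the file names, and the code names are invented for illustration, and the two helper functions are hypothetical. It counts how often each code was applied, optionally within a subset of files, and separately counts how many files each code appears in.

```python
from collections import Counter

# Invented example data: one (file, code) pair per coded passage.
codings = [
    ("file_1", "harm"), ("file_1", "harm"), ("file_1", "consent"),
    ("file_2", "harm"), ("file_3", "consent"), ("file_3", "anonymity"),
]


def code_frequencies(codings, files=None):
    """Count coded passages per code, optionally limited to some files."""
    return Counter(
        code for file, code in codings if files is None or file in files
    )


def files_per_code(codings):
    """Count the number of distinct files in which each code appears."""
    seen = {}
    for file, code in codings:
        seen.setdefault(code, set()).add(file)
    return {code: len(files) for code, files in seen.items()}


all_counts = code_frequencies(codings)
subset_counts = code_frequencies(codings, files={"file_1", "file_2"})
spread = files_per_code(codings)
```

In this toy data, one code is applied more often than another yet appears in the same number of files, which is one reason counts alone should be read with care.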
Select all files (click on one file in the box then press
CMD/CTRL+a
to select all). Select only the
code
privacy - considering potential harms
, although it is
within a category of the same name.
Framework matrices
Framework matrices are a type of visual organizer some qualitative researchers use to conduct and interpret analysis. A framework matrix places one case or group in each row and one theme in each column, with the themes related to a single overarching framework. Once the table is set up (on a computer or by hand), the researcher fills each cell with one or more quotes or summaries that encapsulate that theme for the case or group.
This process is undertaken systematically, following these steps outlined by Laurie J. Goldsmith:
- Data familiarization
- Identifying a thematic framework
- Indexing all study data against the framework
- Charting to summarize the indexed data
- Mapping and interpretation of patterns found within the chart
The table below is an abbreviated example of what a completed
framework matrix might look like, using modified versions of some of
Sarah Mannheimer’s privacy
codes as an example:
data collection | data security | data sharing | |
---|---|---|---|
BSR_03 |
“if you assemble it in one place, it makes it easier to find. And so we would not make that data publicly available. sources are public, anyone can go do what we did. But nope, not gonna make that publicly available.” “one of my students is working on review based recommendation and making use of that review text. And they’re, like, since it’s review text, if I really wanted to figure out who a user is I and I could go figure out who wrote that review, because it’s on, it’s like, they just scraped Goodreads public reviews. Yeah, but I’m not going to go do that. Because re-identifying users is not the business for it.” |
||
BSR_05 | “I was always trying to wonder like, how many copies of this data should I have? And like, where should these be. But because the more copies there are like, the more chances there are that someone is gonna be able to touch it that shouldn’t be able to touch it.” |
“we generally try not to quote people who aren’t like public, big public figures or who, like, wouldn’t expect that their tweet could be quoted… So what we ended up doing for those was altering, we didn’t report actually direct quotes, we altered the text. And we do like altered texts, and we mismatch—mash together, like similar tweets, so that, hopefully, they shouldn’t be identifiable. Like, you shouldn’t be able to reverse look them up or something like that” “what pushed that conversation into releasing them was that I was able to propose [the data repository we used] as kind of this—I think someone referred to it like as a walled garden approach, like the data is there, you can see it, but like, there’s a wall around it, that only certain people can get through” |
|
BSR_07 | “the approach that I’ve been told is sufficient or is a good approach is basically that you are storing the data on a password protected computer. I backed things up to a external hard drive, and that’s also password protected. And I know I don’t, oh, and also, I shared some data with my co-author on Google Drive. And, and it was, you know, again, it was it was shared only with him. So I feel, I feel like those are sufficient steps to, to safeguard confidentiality and privacy.” | ||
BSR_08 | “we usually would not, in the paper publish, the Twitter handles, or the names of individuals, except for organizations” | ||
BSR_10 | “we collected a tweet posted by the libraries in [disaster areas] and then wanted to look at how they communicated during a certain like [disasters]. And there were some just library patrons communicating with these libraries. So in the case, when I publish things, I try not to focus on these individuals.” |
In the matrix above, Coding report
was used to find
sections of BSR interviews related to each of the three topics. For
example, all codes in these three categories were used as sources for
the first column (based on the report shown below):
- privacy - assembling a lot of data can threaten privacy
- privacy - try to collect as little data as possible
- privacy - research design - privacy - data collection methods to support privacy

Much of the work of analysis and theory-building is part of creating a framework matrix, so by the time you finish, you’ll likely already be much closer to answering your research questions. That said, a framework matrix is also useful to work with during the analytic process, and it serves as a summary tool for others.
Reading across columns (within a row) on a framework matrix allows for analyzing cases. Reading down rows (within a column) allows for thematic analysis. And having both summarized together opens up options to study how clusters of cases may share similar approaches to themes. This kind of intersectional analysis can be done informally, or can be used to create formal case classifications or thematic typologies to stimulate further theory-building and research.
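The two reading directions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the case labels and summaries below are invented placeholders (not quotes from the BSR interviews), and `build_matrix`, `row`, and `column` are hypothetical helpers rather than QualCoder features.

```python
from collections import defaultdict

# Invented placeholder records: (case, theme, charted summary).
charted = [
    ("case_A", "data collection", "avoids assembling data in one place"),
    ("case_B", "data security", "limits the number of copies"),
    ("case_B", "data sharing", "shares through a restricted repository"),
    ("case_C", "data security", "password-protected storage"),
]
themes = ["data collection", "data security", "data sharing"]


def build_matrix(records, themes):
    """Place each summary in the cell for its case (row) and theme (column)."""
    matrix = defaultdict(lambda: {theme: [] for theme in themes})
    for case, theme, summary in records:
        matrix[case][theme].append(summary)
    return dict(matrix)


def row(matrix, case):
    """Case analysis: every theme for a single case."""
    return matrix[case]


def column(matrix, theme):
    """Thematic analysis: one theme across all cases that mention it."""
    return {case: cells[theme] for case, cells in matrix.items() if cells[theme]}


matrix = build_matrix(charted, themes)
security_view = column(matrix, "data security")
case_a_view = row(matrix, "case_A")
```

Empty cells are kept as empty lists, so gaps in the matrix stay visible rather than disappearing from the row.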
Sentiment and degree
Sometimes, particularly when considering deductive hypotheses, it is not enough to code only for the presence or absence of a theme. In such situations, semi-quantitative coding may be applied in one of at least two ways.
Sentiment codes indicate whether the feeling or attitude expressed in an excerpt of text is positive, neutral, negative, or mixed in relation to a theme. Neutral and mixed can be hard to distinguish. Neutral sentiment is generally unbothered about good or bad in relation to something, while a mixed sentiment includes both positive and negative feelings, often toward different aspects or implications.
In the excerpt below from BSR_02
, certain Wikipedia
contributors are attributed a negative sentiment toward contribution
disclosures, which might also be framed as a positive sentiment toward
privacy.
Some of them like hold ideological views that are against like the counting of contributions. And they’re just like, “I don’t believe that that’s something we should be doing. And so I want to remove myself from this list.”
Sentiment codes can be applied in the same way as other codes, by
creating a category like sentiment
then adding codes like
positive
. They work best when applied to the same excerpts
as a thematic code, so there is no ambiguity as to which theme is
associated with the sentiment.
An alternative way to integrate degrees of valuation into qualitative
coding is to code on a scale. For example, the amount of stake that a
sexual assault survivor has in protecting their identity from disclosure
is higher than that of a Wikipedia contributor who wishes to remain
anonymous to avoid attention. Scales are typically ordinal, with a
relatively small number of rating points, such as a three-point
Low
, Medium
, High
scale.
Occasionally, qualitative studies also ask about specific quantitative measures that may have more natural units, such as a study of childhood reading experiences that asks how many minutes a day each parent reads to their child.
QualCoder provides no option to attach a numeric rating directly to a code. As with sentiment, you can create a separate set of codes to capture ratings, but this is not equivalent to the integrated code scoring that some other CAQDAS packages provide.
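If ratings are captured as separate codes applied to the same excerpt as a thematic code, they can be paired up again outside QualCoder. The Python sketch below shows one possible way to do this. Everything in it is an assumption for illustration: the tuple layout (file, start, end, code), the interview names, and the `ratings_by_theme` helper are invented, and the mapping of `Low`, `Medium`, `High` to 1, 2, 3 is one arbitrary ordinal choice.

```python
from collections import defaultdict

# Assumed ordinal mapping for a three-point scale.
SCALE = {"Low": 1, "Medium": 2, "High": 3}

# Invented codings: (file, start position, end position, code name).
codings = [
    ("interview_A", 120, 310, "Stake"),
    ("interview_A", 120, 310, "High"),
    ("interview_B", 45, 200, "Stake"),
    ("interview_B", 45, 200, "Low"),
    ("interview_B", 400, 520, "Autonomy"),
]


def ratings_by_theme(codings, scale=SCALE):
    """Pair each thematic code with scale codes on the identical excerpt."""
    by_span = defaultdict(list)
    for file, start, end, code in codings:
        by_span[(file, start, end)].append(code)

    ratings = defaultdict(list)
    for codes in by_span.values():
        levels = [scale[code] for code in codes if code in scale]
        themes = [code for code in codes if code not in scale]
        for theme in themes:
            for level in levels:
                ratings[theme].append(level)
    return dict(ratings)


stake_ratings = ratings_by_theme(codings)
```

The pairing only works when the rating and the theme cover exactly the same span, which is why applying both to the same excerpt matters. Themes with no rating on the same excerpt are simply left out of the result.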
Key Points
- Case analysis focuses on the unique situation of each person or group
- Theme analysis focuses on how the study population perceives or discusses themes or ideas
- Framework matrices are a formal method to combine case and theme analysis using a visual organizer
- Information on sentiment, degree, or quantity can also be encoded for qualitative analysis
Content from Saving and Sharing Qualitative Data
Last updated on 2025-04-16
Estimated time: 60 minutes
Overview
Questions
- How can QualCoder projects be shared or archived in common data formats?
- What are the advantages of sharing qualitative data?
- What can I do to continue learning?
Objectives
- Recognize reasons to consider archiving or sharing qualitative data
- Practice import/export of various parts of QualCoder projects
Many qualitative researchers prefer not to share or archive their data and analysis. In large part, this is because of the amount of interpretive labor they invest, along with frequent direct trusted connections with groups and individuals being studied. This trust can take time to develop before and during data collection, whether as participants or observers.
There are important advantages to archiving and sharing data, however, even for qualitative data. Each researcher ultimately must make their own decisions, but this final section will outline potential values and risks of archiving or sharing data, as well as how to go about doing so in QualCoder.
Callout
Data sharing and interchangeable data formats have only recently become somewhat widespread in qualitative research. Be aware that information in this section may rapidly become out of date, and if in doubt, consult with a human subjects protection board, archivist, or librarian at your institution or the Qualitative Data Repository before sharing data.
Import and export in QualCoder
QualCoder projects are automatically saved to your local computer as
you make changes to them. We previously used the
REFI-QDA Project import
functionality (in the
Project - Import
menu) to import Dr. Mannheimer’s analyzed
project.
Sharing a project with another QualCoder user only requires copying
the directory you set up when you created the project. To share with
users of other CAQDAS, however, you need to export the project with the
Project - Export - REFI-QDA Project export
dialog. Choose
an option on the warning screen; if you’re not sure, choose the default
option. Choose or create a directory for the project and click
Open
. This will create a file ending in
QDPX
.
QDPX
files use the REFI-QDA Project
standard, an XML-based archive format that stores both source files
and core components of qualitative projects in a way that most major
CAQDAS packages can recognize and import. The components that can be
included are below (* indicates a feature is not fully supported in
imports/exports as of QualCoder 3.6).
- sources (text, PDF, image and multimedia formats)*
- segments (specified chunks of materials/data identified as meaningful by the user)
- codes (labelled ‘tags’ attached to segments)
- annotations (comments attached to segments, codes, sources or links)
- memos (writing spaces for analytical notes, either standalone or attached to sources and/or codes)
- links between codes, segments or memos
- cases
- sets/groups of entities*
- visual representations of linked entities in the project*
- user information
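Because a QDPX file is an archive wrapped around an XML project description, its contents can be inspected with general-purpose tools. The following is a minimal Python sketch, not part of QualCoder. It builds a toy archive in memory (an invented stand-in, far simpler than a real export) and then lists the archive members and code names. The member name `project.qde`, the `sources` folder, and the `Code` element with a `name` attribute reflect our reading of the REFI-QDA layout; treat them as assumptions to check against a real export.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def list_members_and_codes(qdpx_bytes):
    """Return (archive member names, code names) for QDPX content."""
    with zipfile.ZipFile(io.BytesIO(qdpx_bytes)) as archive:
        members = archive.namelist()
        # Assumption: the project description is the member ending in .qde.
        qde_name = next(n for n in members if n.lower().endswith(".qde"))
        root = ET.fromstring(archive.read(qde_name))
    # Ignore the XML namespace so the sketch does not depend on its exact URI.
    codes = [
        element.get("name")
        for element in root.iter()
        if element.tag.rsplit("}", 1)[-1] == "Code"
    ]
    return members, codes


# Toy stand-in for an exported project (invented, not a complete project file).
toy_xml = """<Project xmlns="urn:QDA-XML:project:1.0" name="demo">
  <CodeBook><Codes>
    <Code name="deductive" isCodable="false">
      <Code name="Autonomy" isCodable="true"/>
      <Code name="Stake" isCodable="true"/>
    </Code>
  </Codes></CodeBook>
</Project>"""

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as toy_archive:
    toy_archive.writestr("project.qde", toy_xml)
    toy_archive.writestr("sources/BSR_02.txt", "transcript text")

members, codes = list_members_and_codes(buffer.getvalue())
```

To try this on a real export, read the file's bytes (for example with `pathlib.Path(...).read_bytes()`) and pass them to the function. Keep in mind that the archive contains the raw source documents, so handle it with the same care as the original data.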
You can also export only the codebook
(REFI-QDA Codebook export
) if you only want to share the
names of codes and structure of the coding tree. This can be used by
researchers with similar data to create a parallel coding structure,
without the potential risks of sharing raw data. Codebook export files
end in QDC
instead of QDPX
.
For more information on imports and exports, see the QualCoder Manual.
Other qualitative data formats
Each major CAQDAS package has its own proprietary data format,
specialized to its features. In most cases, projects cannot be moved
between software in these formats, although QualCoder does provide
direct import for RQDA
data.
Data files in some software may also be restricted to specific platforms
(Windows or Mac) or software versions. In general, we recommend keeping
a copy of your final project in both the original format and in
QDPX
format.
Data sharing for collaboration
Working with collaborators is one of the most common reasons for sharing qualitative projects. QualCoder doesn’t offer a cloud storage service that would enable real-time sharing, but projects can be stored on a secure shared storage platform (such as Google Drive or Microsoft OneDrive) and used by multiple users.
Here are a few tips to keep in mind to improve the collaborative QualCoder experience:
- Plan ahead to ensure only one collaborator at a time is using the data. Making changes to a project on multiple computers simultaneously may lead to data loss, and you may not even realize anything is missing or corrupted immediately.
- When working collaboratively, make sure to change the active user in Project - Settings every time you open the project. This ensures coding and other changes can be tracked by user, and also allows for later comparison of coding or inter-rater reliability calculations with Coding comparison reports.
- If multiple people are coding the same text, it may be easier to agree on a single coding scheme and create the code tree before starting the coding process.
- If adding codes after initial tree creation in a collaborative project, consider placing them under a Draft codes category and using journals or code memos to document your reasoning until you are able to discuss them with your team.
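For readers unfamiliar with inter-rater reliability, Cohen's kappa is one commonly used statistic: it compares the agreement two coders actually reached with the agreement expected by chance. The Python sketch below is a generic illustration and is not necessarily how QualCoder's report calculates its figures. The two lists are invented decisions (1 = code applied, 0 = not applied) by two hypothetical coders over the same eight segments.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' labels on the same segments."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of segments")
    n = len(coder_a)
    # Observed agreement: share of segments where the labels match.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: from each coder's own label proportions.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1:
        return 1.0  # Both coders used a single identical label throughout.
    return (observed - expected) / (1 - expected)


# Invented decisions for one code across eight segments.
coder_a = [1, 1, 0, 0, 1, 0, 1, 0]
coder_b = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(coder_a, coder_b)
```

Here the coders agree on 6 of 8 segments (0.75), chance agreement is 0.5, and kappa is therefore 0.5. A comparison like this is only possible when each coder's work was recorded under their own user name.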
Exporting project elements
Sometimes, however, you only want to share a specific portion of the
project with collaborators. Most views and reports in QualCoder can be
exported in a variety of formats, including XLS
or
CSV
(spreadsheet), HTML
(webpage), and
TXT
(plain text). Unlike sharing a whole project, exporting
individual project elements gives fine-grained control over the setup of
the exported file.
Transparency and reuse
Certain scientific disciplines have called for research transparency, including sharing data and analytic procedures, with momentum growing in many social sciences. Until recently, that effort was primarily applied to quantitative data, such as surveys, experiments, and “big data” gathered from the internet.
A small number of initiatives are working to change that perception
and promote the thoughtful, informed, careful sharing of qualitative and
mixed methods data as well, led in part by the Qualitative Data Repository and enabled
by improvements in sharing like REFI-QDA
.
Helping future you
If the arguments above don’t convince you that it’s good and worth the effort to archive qualitative data securely, there is one more important case to be made.
Archiving your data in an open and interchangeable format that is backed up in multiple locations can help future you. If your only copy of your project is on your laptop and it is lost or destroyed before analysis and review are complete, or if you decide you want to do additional analysis later, there isn’t much that can be done short of repeating a large amount of work.
QualCoder only stores data on your computer. Whether or not you plan to share or reuse your data, consider exporting your project at key points, such as after you’ve completed initial or axial coding.
Next steps
This lesson has covered basic principles of qualitative research, as well as how to use QualCoder for qualitative coding and data analysis. With that foundation, the next step is to find or collect your own data and try it for yourself.
In addition to the Qualitative Data Repository (used and discussed here), some other social science data archives such as ICPSR have qualitative sections. For qualitative content analysis, there are thousands of open-access archives of documents, research, education resources, and even entire books.
Key Points
- Archiving and sharing qualitative data can help you, your collaborators, and other researchers
- QualCoder’s import and export options can help you share key components of your work
- There are many resources available to continue learning about qualitative research and finding, reusing and sharing qualitative data