First-Cycle Coding Methods

6-Step Guide

Coding is the foundational analytic act in qualitative research. It is the process of assigning short labels --- codes --- to segments of your data in order to sort, synthesize, and begin to interpret the meaning within your transcripts, field notes, or documents. If data collection is about hearing your participants, coding is about beginning to understand them. This guide introduces the most widely used first-cycle coding methods and walks you through the practical process of coding qualitative data for the first time.

Step 1: Understand What Coding Is (and Is Not)

A code is a researcher-generated word or short phrase that symbolically assigns a summative, salient, or evocative attribute to a portion of data. Coding is not merely labeling --- it is an analytic process that involves reading data closely, interpreting meaning, and making decisions about what matters and why.

Coding is also not the whole of analysis. It is a transitional step between data collection and more intensive analysis: it reduces your data into manageable segments while preserving (and flagging) the complexity within those segments. Think of codes as handles you attach to your data so you can pick it up, examine it, and rearrange it during later analytic phases.

Johnny Saldaña, whose The Coding Manual for Qualitative Researchers is the essential reference in this field, distinguishes between first-cycle and second-cycle coding methods. First-cycle methods are applied during your initial pass through the data. Second-cycle methods (pattern coding, focused coding, axial coding, theoretical coding) reorganize and synthesize first-cycle codes into higher-order categories and themes. This guide focuses on the first cycle.

Step 2: Learn the Major First-Cycle Methods

There are dozens of first-cycle coding methods, but five are used most frequently across qualitative traditions. Understanding the logic of each will help you choose the right approach for your study.

Open Coding

Open coding is the most common starting point, especially in grounded theory. You read through your data line by line or segment by segment, remaining open to whatever the data suggest. You generate codes freely, without trying to fit data into preconceived categories. The goal is to fracture the data --- to break it apart so you can examine its components.

**Interview excerpt:** "When I walked into the lecture hall on the first day, I just froze. There were 300 people in there. I came from a high school with 200 kids total. I didn't know where to sit, who to talk to. I just found a seat in the back and tried to disappear."

Open codes applied:

  • Shock at scale
  • Size comparison (high school vs. university)
  • Disorientation
  • Social isolation
  • Self-protective withdrawal
  • Desire for invisibility

In Vivo Coding

In vivo codes use the participant's own words as the code label. This method honors participants' voices and keeps analysis grounded in their language rather than the researcher's conceptual frameworks. In vivo codes are placed in quotation marks to signal that they are direct participant language.

**Interview excerpt:** "I felt like a fraud the whole first semester. Everyone else seemed to know the rules --- how to email a professor, how to use the library database, what office hours even were. I was just faking it every single day."

In vivo codes applied:

  • "felt like a fraud"
  • "know the rules"
  • "faking it"

In vivo coding is especially powerful in phenomenological research, where capturing lived experience in participants' own language is a methodological priority.
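
If you keep codes in a script or notebook, one practical check is that each in vivo code really is the participant's wording. The following is a minimal Python sketch under stated assumptions: codes are stored as plain strings, the function name `find_non_verbatim` is hypothetical, and the fourth code is an invented label added only to show what gets flagged.

```python
def find_non_verbatim(excerpt: str, in_vivo_codes: list[str]) -> list[str]:
    """Return the codes that do not appear word for word in the excerpt."""
    # Normalize case and whitespace so line breaks do not cause false alarms.
    normalized = " ".join(excerpt.lower().split())
    missing = []
    for code in in_vivo_codes:
        # Strip the surrounding quotation marks before comparing.
        phrase = " ".join(code.strip('"').lower().split())
        if phrase not in normalized:
            missing.append(code)
    return missing


excerpt = (
    "I felt like a fraud the whole first semester. Everyone else seemed to "
    "know the rules --- how to email a professor, how to use the library "
    "database, what office hours even were. I was just faking it every "
    "single day."
)
codes = [
    '"felt like a fraud"',
    '"know the rules"',
    '"faking it"',
    '"impostor syndrome"',  # hypothetical: researcher language, not the participant's
]

print(find_non_verbatim(excerpt, codes))
```

A flagged code is not necessarily wrong. It may simply belong to a different method, such as descriptive coding, rather than in vivo coding.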

Descriptive Coding

Descriptive coding assigns labels that summarize the topic of a data segment in a word or short noun phrase. Descriptive codes identify what a passage is about, not what it means. They are useful for organizing large data sets and are often a good first step before applying more interpretive coding methods.

**Interview excerpt:** "My advisor told me I needed to pick a methodology before I could write my proposal. But I didn't even know the difference between qualitative and quantitative at that point. My program didn't have a standalone methods course --- it was just one chapter in the research foundations class."

Descriptive codes applied:

  • Advisor guidance
  • Methodology selection
  • Methods training gap
  • Curriculum limitations

Process Coding

Process coding uses gerunds (words ending in "-ing") to label observable or conceptual actions in the data. This method is ideal when your research focuses on how people do things, navigate processes, or manage sequences of action. It is particularly useful in grounded theory, where understanding social processes is central.

**Interview excerpt:** "I started by looking at what other dissertations in my program had done. Then I read a couple of methods textbooks. I kept going back and forth between phenomenology and case study for weeks before I finally committed."

Process codes applied:

  • Reviewing precedent
  • Consulting literature
  • Deliberating between approaches
  • Committing to a methodology
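
The gerund convention is easy to spot-check once your codes are in a list. This is a rough Python sketch, not a grammatical parser: it only looks at whether the first word ends in "-ing", the function name `flag_non_gerund_codes` is hypothetical, and the last code in the list is an invented counterexample.

```python
def flag_non_gerund_codes(process_codes: list[str]) -> list[str]:
    """Return codes whose first word does not end in '-ing'.

    This is a crude spelling heuristic: it would accept a noun such as
    "Thing", so treat the result as a prompt for review, not a verdict.
    """
    flagged = []
    for code in process_codes:
        words = code.split()
        if not words or not words[0].lower().endswith("ing"):
            flagged.append(code)
    return flagged


process_codes = [
    "Reviewing precedent",
    "Consulting literature",
    "Deliberating between approaches",
    "Committing to a methodology",
    "Methodology choice",  # hypothetical: a topic label, not an action
]

print(flag_non_gerund_codes(process_codes))
```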

Initial Coding

Initial coding is Charmaz's term for the grounded theory approach to first-cycle coding. It combines elements of open coding and process coding, emphasizing line-by-line analysis that stays close to the data while remaining open to all possible theoretical directions. Initial coding avoids applying preexisting theoretical frameworks, instead letting concepts emerge from the data themselves.

Step 3: Read Your Data Before You Code

Before assigning a single code, read through your entire data set --- all transcripts, all field notes --- at least once without coding. This immersive reading familiarizes you with the breadth and depth of your data. During this initial reading, jot down impressions, surprises, and recurring ideas in a separate memo document. These early memos will inform your coding decisions.

Some researchers read through the data twice before coding: once to absorb the content and a second time to start noticing patterns, tensions, and silences. Resist the temptation to skip this step. Coding without first immersing yourself in the data leads to superficial, topic-level codes that miss the deeper interpretive possibilities.

Step 4: Generate Your Codes

Begin coding your first transcript. Work through the data segment by segment. A segment might be a single line, a sentence, a paragraph, or a meaningful unit of thought --- the appropriate size depends on your method and your data.

For each segment, ask yourself: What is happening here? What is this about? What does this mean to the participant? What process is at work? Write your code in the margin, in qualitative data analysis software (NVivo, ATLAS.ti, Dedoose), or in a simple spreadsheet with columns for the data excerpt, the code, and your analytic memo.
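
If you take the spreadsheet route, the three-column layout can be produced with Python's standard `csv` module. The sketch below is one possible setup, not a required format: the column names, the function name `write_coding_sheet`, and the memo text are assumptions made for illustration, and the excerpts come from the open coding example earlier in this guide.

```python
import csv
import io

# Assumed column names matching the three columns described above.
COLUMNS = ["excerpt", "code", "memo"]

rows = [
    {
        "excerpt": "I just found a seat in the back and tried to disappear.",
        "code": "Self-protective withdrawal",
        "memo": "Hypothetical memo: first-day reaction or a lasting pattern?",
    },
    {
        "excerpt": "I didn't know where to sit, who to talk to.",
        "code": "Disorientation",
        "memo": "Hypothetical memo: compare with other first-day accounts.",
    },
]


def write_coding_sheet(rows, handle):
    """Write coded segments as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


# Written to an in-memory buffer here so the example has no side effects.
buffer = io.StringIO()
write_coding_sheet(rows, buffer)
sheet_text = buffer.getvalue()
print(sheet_text)
```

To save a real file, pass a handle opened with `open("coding_sheet.csv", "w", newline="", encoding="utf-8")` instead of the buffer; the file name is only an example.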

During first-cycle coding, err on the side of generating too many codes rather than too few. You can always consolidate later. A typical first-cycle coding process for a single transcript might generate 80 to 150 codes. Across an entire data set, you may have 300 to 800 initial codes. This is normal and expected.
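
When the code list grows this large, a simple tally helps you see what you have before you consolidate. The following Python sketch assumes coded segments are stored as (transcript id, code) pairs; the transcript ids and the data are hypothetical.

```python
from collections import Counter

# Hypothetical coded segments: (transcript id, code) pairs.
coded_segments = [
    ("T01", "Disorientation"),
    ("T01", "Social isolation"),
    ("T01", "Disorientation"),
    ("T02", "Social isolation"),
    ("T02", "Seeking mentorship"),
    ("T03", "Disorientation"),
]

# How often each code was applied across the whole data set.
code_counts = Counter(code for _, code in coded_segments)

# Which distinct codes appear in each transcript.
distinct_per_transcript = {}
for transcript_id, code in coded_segments:
    distinct_per_transcript.setdefault(transcript_id, set()).add(code)

for code, count in code_counts.most_common():
    print(f"{code}: {count}")
for transcript_id, codes in sorted(distinct_per_transcript.items()):
    print(f"{transcript_id}: {len(codes)} distinct codes")
```

Frequency is only an organizing aid. A code applied once may matter more analytically than one applied fifty times.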

"The first time I coded a transcript, I panicked because I had over a hundred codes from a single interview. My advisor told me, 'Good --- that means you're actually reading the data. Now the real work begins.' She was right. The mess is part of the process."
Dr. Marcus Webb

Step 5: Build Your Codebook

As you code, build a codebook --- a structured document that catalogs every code you are using. A codebook entry typically includes four elements:

  1. Code name: A concise, descriptive label (e.g., "Navigating bureaucracy")
  2. Code definition: A clear statement of what the code means and when to apply it (e.g., "Participant describes experiences of dealing with institutional administrative processes, including registration, financial aid, advising systems, or departmental requirements")
  3. Inclusion criteria: What counts as an instance of this code
  4. Example: A representative data excerpt that illustrates the code
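
The four elements above map naturally onto a small record type. Here is a minimal Python sketch using a dataclass; the class name `CodebookEntry` and its field names are assumptions, and the inclusion criteria text is hypothetical wording written for this illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class CodebookEntry:
    """One codebook entry with the four elements listed above."""
    name: str
    definition: str
    inclusion_criteria: str
    example: str


entry = CodebookEntry(
    name="Navigating bureaucracy",
    definition=(
        "Participant describes experiences of dealing with institutional "
        "administrative processes, including registration, financial aid, "
        "advising systems, or departmental requirements"
    ),
    # Hypothetical wording; write criteria that fit your own study.
    inclusion_criteria=(
        "Apply when the participant recounts a specific interaction with "
        "an office, form, or formal requirement."
    ),
    example=(
        "I spent three days trying to figure out how to register for "
        "classes. Nobody could tell me which form I needed."
    ),
)

# Keying the codebook by code name makes entries easy to look up and revise.
codebook = {entry.name: entry}
print([f.name for f in fields(CodebookEntry)])
```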

Your codebook is a living document. It will grow, change, and undergo revision throughout your analysis. Codes will be split when you realize a single code is capturing two different ideas. Codes will be merged when you realize two codes are describing the same phenomenon. Codes will be renamed when you find more precise language. This iterative refinement is not a sign of failure --- it is the engine of qualitative analysis.
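
Merging, splitting, and renaming can each be sketched as a small operation on coded segments. The Python below is one possible approach, assuming each segment is a dict with an `id` and a `code`; the function names, segment ids, and the replacement code labels are all hypothetical. Note that splitting cannot be automated: the researcher has to decide, segment by segment, which new code applies.

```python
def rename_code(segments, old, new):
    """Return a copy of the segments with every `old` label replaced by `new`."""
    return [{**s, "code": new if s["code"] == old else s["code"]} for s in segments]


def merge_codes(segments, sources, target):
    """Return a copy with every code in `sources` collapsed into `target`."""
    return [{**s, "code": target if s["code"] in sources else s["code"]} for s in segments]


def split_code(segments, old, reassignments):
    """Replace `old` using a researcher-supplied map of segment id -> new code."""
    updated = []
    for segment in segments:
        if segment["code"] == old:
            # Raises KeyError if a segment was left without a decision.
            segment = {**segment, "code": reassignments[segment["id"]]}
        updated.append(segment)
    return updated


segments = [
    {"id": 1, "code": "Impostor feelings"},
    {"id": 2, "code": "Feeling like a fraud"},
    {"id": 3, "code": "Advisor guidance"},
    {"id": 4, "code": "Advisor guidance"},
]

merged = merge_codes(segments, {"Impostor feelings", "Feeling like a fraud"}, "Impostor feelings")
split = split_code(merged, "Advisor guidance", {3: "Advisor directives", 4: "Advisor support"})
renamed = rename_code(split, "Impostor feelings", "Doubting belonging")

final_codes = [s["code"] for s in renamed]
print(final_codes)
```

Each function returns a new list rather than editing in place, so earlier versions of your coding stay available for comparison. Whenever you apply one of these changes, update the matching codebook definition as well.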

| Code Name | Definition | Example Data |
| --- | --- | --- |
| Navigating bureaucracy | Participant describes dealings with institutional administrative processes | "I spent three days trying to figure out how to register for classes. Nobody could tell me which form I needed." |
| Seeking mentorship | Participant describes actively looking for guidance from more experienced individuals | "I found a senior student from my hometown and basically followed her around for the first month." |
| Impostor feelings | Participant expresses doubt about belonging or qualification in academic setting | "I kept waiting for someone to realize I didn't belong there." |

Step 6: Tips for Beginning Coders

Code with a question in mind. Your research questions should orient your coding without rigidly constraining it. You are looking for data relevant to your questions, but you also remain open to unexpected findings.

Use analytic memos. Every time you notice something interesting, surprising, confusing, or contradictory, write a memo. Memos are where your real analysis happens. A code labels the data; a memo begins to interpret it.

Code collaboratively when possible. If you have a research partner or peer, independently code the same transcript and then compare. Discrepancies are not problems --- they are analytic opportunities that force you to articulate why you see the data as you do.
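
If both coders record their codes per segment, the comparison itself can be listed mechanically, leaving the discussion to you. A minimal Python sketch, assuming each coder's work is a dict mapping a segment id to a set of codes; the function name `find_discrepancies` and the sample data are hypothetical.

```python
def find_discrepancies(coder_a, coder_b):
    """List, per segment, the codes that only one of the two coders applied."""
    discrepancies = {}
    for segment_id in sorted(coder_a.keys() | coder_b.keys()):
        codes_a = coder_a.get(segment_id, set())
        codes_b = coder_b.get(segment_id, set())
        if codes_a != codes_b:
            discrepancies[segment_id] = {
                "only_a": codes_a - codes_b,
                "only_b": codes_b - codes_a,
            }
    return discrepancies


# Hypothetical independent coding of the same three segments.
coder_a = {
    1: {"Disorientation", "Social isolation"},
    2: {"Impostor feelings"},
    3: {"Advisor guidance"},
}
coder_b = {
    1: {"Disorientation"},
    2: {"Impostor feelings"},
    3: {"Methods training gap"},
}

for segment_id, diff in find_discrepancies(coder_a, coder_b).items():
    print(segment_id, diff)
```

The output is a discussion agenda, not a score. Each listed segment is a place to ask why the two of you read the data differently.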

Do not confuse quantity with quality. Having hundreds of codes does not mean your analysis is thorough. The quality of your code definitions, the precision of your code applications, and the depth of your memos matter far more than the number of codes.

Expect discomfort. First-cycle coding is inherently messy, ambiguous, and uncertain. You will question whether you are "doing it right." That uncertainty is a feature of qualitative analysis, not a bug. Trust the process, keep coding, and let patterns emerge gradually.

Take breaks. Coding is cognitively demanding. Work in focused sessions of 60 to 90 minutes, then step away. Fresh eyes see patterns that fatigued eyes miss.
