How to Build Themes From Codes (Theme Mapping Worksheet + Worked Example)

Andrew Russo

Posted in Zoom Mar 6 · 8 Mar, 2026

You build themes from codes by grouping related codes into clusters, naming what those clusters mean, and then testing each theme against your data until it has clear boundaries and strong evidence. A theme mapping worksheet helps you do this in a repeatable way, so your themes stay grounded in quotes, not guesses.

This guide shows a practical worksheet you can copy, plus a worked example with code clusters, theme labels, boundaries, and the evidence you should collect to support each theme.

Key takeaways

A code is a tag you apply to data; a theme is a meaningful pattern across many data points.
Use a theme mapping worksheet to move from “lists of codes” to “themes with boundaries and proof.”
Good themes have: a clear label, a short definition, inclusion/exclusion rules, and enough evidence (quotes) to stand up to review.
Expect to merge, split, and rename themes as you test them against the full dataset.

Codes vs. themes: what changes when you “go thematic”?

Codes stay close to the surface of what people said or did, like “long wait time” or “confusing instructions.” Themes sit one level higher and explain what those codes add up to, like “friction in the service journey.”

If you try to turn each code into a theme, you will end up with too many themes and not enough insight.

A quick definition you can use in your project notes

Code: A short label applied to a segment of data (a phrase, sentence, or turn of talk).
Category/code cluster: A group of related codes that often co-occur or point to the same issue.
Theme: A coherent pattern of meaning that answers your research question and is supported by multiple pieces of evidence.

What “evidence” means in practice

Evidence is not a single strong quote, even if it sounds perfect. Evidence is a set of data extracts that show the pattern appears across participants, contexts, or time points.

In a worksheet, that usually means: quote IDs, participant IDs, and short excerpts that demonstrate the theme and its sub-points.

The theme mapping worksheet (template you can copy)

A theme mapping worksheet is a structured table that forces you to make decisions: what belongs together, what the theme means, and what does not belong. You can build it in Google Sheets, Excel, Notion, Airtable, or inside your qualitative analysis tool.

Worksheet columns

Code cluster name: A working label for the group of codes (you can rename later).
Codes in cluster (and short meanings): List the codes and add a 3–7 word meaning if needed.
Why these codes belong together: Your logic (similar cause, same outcome, same stage, same emotion).
Candidate theme label: A clear, human-readable name.
Theme definition (1–2 sentences): What the theme is really about.
Theme boundaries: What’s in and what’s out (inclusion/exclusion rules).
Subthemes (optional): Useful when one theme has distinct parts.
Evidence required: What you must collect to justify the theme (types of quotes, spread of participants, negative cases).
Best supporting extracts: Quote IDs + short excerpt.
Disconfirming/edge cases: Extracts that challenge the theme or show limits.
Notes / next actions: Merge, split, rename, gather more data, clarify definition.
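
If you keep the worksheet in a script or notebook rather than a spreadsheet, one way to keep the columns consistent is to define them once as a record type. The sketch below is a minimal Python version under that assumption. The class name ThemeRow, its field names, and the to_csv helper are illustrative choices, not part of any analysis tool.

```python
import csv
import io
from dataclasses import dataclass, field, fields


@dataclass
class ThemeRow:
    """One worksheet row; field names mirror the columns above (illustrative)."""
    cluster_name: str
    codes: list[str]
    why_together: str
    theme_label: str = ""
    definition: str = ""
    boundaries_in: list[str] = field(default_factory=list)
    boundaries_out: list[str] = field(default_factory=list)
    subthemes: list[str] = field(default_factory=list)
    evidence_required: list[str] = field(default_factory=list)
    supporting_extracts: list[str] = field(default_factory=list)
    edge_cases: list[str] = field(default_factory=list)
    notes: str = ""


def to_csv(rows: list[ThemeRow]) -> str:
    """Render rows as CSV text; list-valued cells are joined with '; '."""
    names = [f.name for f in fields(ThemeRow)]
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(names)
    for row in rows:
        cells = []
        for name in names:
            value = getattr(row, name)
            cells.append("; ".join(value) if isinstance(value, list) else value)
        writer.writerow(cells)
    return buffer.getvalue()


# A partly filled row: only the first three columns are known at this stage.
row = ThemeRow(
    cluster_name="The process feels longer than it should",
    codes=["Too many steps", "Gave up and called instead"],
    why_together="Drop-off caused by perceived effort",
)
print(to_csv([row]))
```

The later columns default to empty, so you can create a row as soon as you have a cluster and fill in the definition, boundaries, and evidence as the analysis matures. The CSV text can be pasted into Google Sheets or Excel.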

A simple “evidence required” checklist

Coverage: More than one participant/source supports it.
Depth: At least a few extracts explain “why,” not only “what.”
Variation: The theme holds across different contexts (or you state the boundary clearly).
Negative cases: You checked for quotes that contradict it and can explain them.
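
The checklist can be approximated in code if each extract carries a few analyst-assigned tags. This is a rough sketch, not a substitute for reading the data. The Extract fields, the context tags ("first-time", "returning"), and the thresholds for depth and variation are assumptions made for illustration; only the "more than one participant" rule comes directly from the checklist.

```python
from dataclasses import dataclass


@dataclass
class Extract:
    quote_id: str
    participant: str
    text: str
    explains_why: bool = False   # analyst judgment: explains "why", not only "what"
    context: str = ""            # hypothetical context tag, e.g. "first-time"
    disconfirming: bool = False  # contradicts or limits the theme


def evidence_checklist(extracts, min_participants=2, min_why=2, min_contexts=2):
    """Return a pass/fail flag per checklist item (thresholds are assumptions)."""
    supporting = [e for e in extracts if not e.disconfirming]
    contexts = {e.context for e in supporting if e.context}
    return {
        "coverage": len({e.participant for e in supporting}) >= min_participants,
        "depth": sum(e.explains_why for e in supporting) >= min_why,
        "variation": len(contexts) >= min_contexts,
        # Only confirms that at least one negative case was logged,
        # not that the search for them was thorough.
        "negative_cases": any(e.disconfirming for e in extracts),
    }


extracts = [
    Extract("Q1", "P03", "I didn't know which one to pick...",
            explains_why=True, context="first-time"),
    Extract("Q2", "P07", "The terms felt medical.",
            explains_why=True, context="first-time"),
    Extract("Q3", "P11", "I just picked one and it worked fine",
            context="returning", disconfirming=True),
]
result = evidence_checklist(extracts)
print(result)
```

Here the variation check fails because every supporting extract carries the same context tag. That is a prompt to either look for support in other contexts or state the boundary clearly in the worksheet.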

Step-by-step: how to cluster codes into themes

This workflow assumes you already coded your data, either manually or in a tool. If you are still refining codes, do a quick clean-up first so similar codes use the same wording.

1) Clean your code list before you cluster

Merge duplicates (example: “hard to find” and “can’t locate”).
Standardize tense and format (example: use verb-based or noun-based labels, but keep the choice consistent).
Clarify vague codes by adding a short memo (example: what counts as “support issues”?).
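
If your code list lives in an export, the merge step can be sketched as a small lookup table. The mapping below is an assumption: you decide which labels count as duplicates, and the script only applies that decision consistently.

```python
# Analyst-maintained mapping from duplicate labels to the label you keep.
# Keys are lowercase with straight apostrophes; entries are illustrative.
CANONICAL = {
    "can't locate": "hard to find",
    "cannot locate": "hard to find",
}


def normalize(code: str) -> str:
    """Lowercase, straighten apostrophes, collapse spaces, then merge duplicates."""
    key = " ".join(code.lower().replace("\u2019", "'").split())
    return CANONICAL.get(key, key)


raw_codes = ["Hard to find", "can\u2019t locate", "Support issues", "hard to find "]
cleaned = sorted({normalize(code) for code in raw_codes})
print(cleaned)  # ['hard to find', 'support issues']
```

Vague codes such as "support issues" pass through unchanged. They still need a memo written by a person, because no lookup table can decide what the code is supposed to cover.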

2) Print or export coded segments and look for “families”

Clustering works best when you look at codes and their attached excerpts together. A code without context often leads to weak clusters.

Start by grouping codes that share a common idea, such as the same stage of a journey (onboarding), the same driver (uncertainty), or the same outcome (drop-off).

3) Build clusters, then name them as categories first

At this stage, avoid fancy theme names. Use plain category names like “Onboarding confusion,” “Price sensitivity,” or “Trust and credibility.”

When a cluster feels too broad, split it by cause or by moment in time.

4) Turn categories into candidate themes

Write a 1–2 sentence definition of what the cluster is showing.
Ask: “How does this help answer the research question?”
Draft a theme label that would make sense to a non-expert reader.

5) Set theme boundaries (this prevents “everything fits” themes)

Boundaries are rules. They protect your analysis from becoming a catch-all.

Use two lists: what belongs, and what looks similar but does not belong.

6) Test each theme against the full dataset

Pull all extracts for the cluster and read them in one sitting.
Check for internal consistency (do the extracts really match the definition?).
Check for external distinction (is it clearly different from other themes?).
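
Pulling the extracts for one cluster can be sketched as a filter over your coded segments. The example assumes an export with one record per applied code. The quote IDs and the code-to-quote assignments are hypothetical, and the excerpts are borrowed from the clinic booking example later in this guide.

```python
# One record per applied code, so a quote with two codes appears twice.
segments = [
    {"quote_id": "Q01", "participant": "P02", "code": "Too many steps",
     "text": "It was like page after page... I stopped and just called."},
    {"quote_id": "Q01", "participant": "P02", "code": "Gave up and called instead",
     "text": "It was like page after page... I stopped and just called."},
    {"quote_id": "Q02", "participant": "P09", "code": "Too many steps",
     "text": "I kept filling the same info."},
    {"quote_id": "Q03", "participant": "P06", "code": "Error message with no fix",
     "text": "It said error but didn't tell me what to do... so I quit."},
]


def pull_extracts(segments, cluster_codes):
    """Return each extract in the cluster once, in original order."""
    seen = set()
    pulled = []
    for segment in segments:
        if segment["code"] in cluster_codes and segment["quote_id"] not in seen:
            seen.add(segment["quote_id"])
            pulled.append(segment)
    return pulled


cluster = {"Too many steps", "Gave up and called instead"}
pulled = pull_extracts(segments, cluster)
for segment in pulled:
    print(f'{segment["quote_id"]} ({segment["participant"]}): {segment["text"]}')
print("Participants:", sorted({s["participant"] for s in pulled}))
```

The script only assembles the reading pile. Judging whether each extract really matches the theme definition is still done by reading them together.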

7) Capture evidence and “negative cases” in the worksheet

Negative cases do not ruin a theme. They often sharpen it by revealing limits or subgroups.

If a theme only works for a subset (for example, first-time users), add that boundary instead of forcing it to fit everyone.

Worked example: theme mapping worksheet (codes → clusters → themes)

Below is a realistic example using a fictional dataset: interviews with people who tried a new online appointment booking portal for a clinic. The goal is to show the method, not claim findings from a real study.

Research question (example)

“What helps or blocks patients from booking an appointment online?”

Sample code list (excerpt)

Confusing medical terms
Too many steps
Not sure which appointment type to choose
Error message with no fix
Calendar shows no availability
Gave up and called instead
Worried about sharing personal data
Unsure if booking went through
No confirmation text/email
Long wait on phone (comparison)
Helpful receptionist fixes it
Prefer to talk to a person

Theme mapping worksheet (worked rows)

You can copy the structure below into a spreadsheet and add your own quote IDs.

Row 1
- Code cluster name: Choosing the right option feels risky
- Codes in cluster: Confusing medical terms; Not sure which appointment type to choose; Prefer to talk to a person
- Why together: All point to uncertainty and fear of selecting the “wrong” path without guidance.
- Candidate theme label: “Decision anxiety blocks self-service”
- Theme definition: People avoid online booking when they cannot confidently match their need to an appointment type and want reassurance from a human.
- Theme boundaries (in): Confusion about categories; desire for advice; worry about making a mistake.
- Theme boundaries (out): Pure usability friction like slow pages; availability issues like no slots; technical errors.
- Possible subthemes: Language and jargon; Need for triage or guidance.
- Evidence required: Multiple participants describing uncertainty; at least one quote where calling is framed as “safer”; at least one quote showing what guidance would help (examples, tooltips, decision tree).
- Best supporting extracts (example format): P03: “I didn’t know which one to pick… I’d rather call so I don’t book the wrong thing.”; P07: “The terms felt medical. I wasn’t sure what applied to me.”
- Disconfirming/edge cases: P11: “I just picked one and it worked fine,” which may indicate the issue affects first-time users more.
- Notes / next actions: Check if this pattern is strongest among first-time users; consider adding boundary “new patients.”

Row 2
- Code cluster name: The process feels longer than it should
- Codes in cluster: Too many steps; Gave up and called instead
- Why together: These codes describe drop-off caused by perceived effort.
- Candidate theme label: “Effort fatigue leads to drop-off”
- Theme definition: When booking requires many screens or repeated data entry, people abandon and switch channels.
- Theme boundaries (in): Step count complaints; repetition; time/effort language (“took forever,” “kept asking”).
- Theme boundaries (out): Confusion about what to choose (decision anxiety); failures after submission (confirmation); true unavailability of appointments.
- Possible subthemes: Repeated forms; unclear progress indicator.
- Evidence required: Several extracts that mention effort and an action (abandoned, postponed, called); at least one extract that compares online vs phone effort.
- Best supporting extracts (example format): P02: “It was like page after page… I stopped and just called.”; P09: “I kept filling the same info.”
- Disconfirming/edge cases: P05: “It was quick for me,” which may point to differences by device or returning users.
- Notes / next actions: Check device type in memos; split into “mobile friction” vs “overall step count” if needed.

Row 3
- Code cluster name: Breakdowns with no recovery path
- Codes in cluster: Error message with no fix; Calendar shows no availability
- Why together: Both describe dead ends where the user cannot proceed.
- Candidate theme label: “Dead ends destroy momentum”
- Theme definition: People lose trust and stop trying when the system blocks them without giving a clear next step.
- Theme boundaries (in): Errors; empty availability; blocked progress; “nothing happens.”
- Theme boundaries (out): Uncertainty about whether it submitted (confirmation); privacy concerns (also a trust issue, but with a different driver).
- Possible subthemes: Technical errors; perceived lack of capacity.
- Evidence required: Quotes that include both the problem and the impact (stopped, switched channels, frustration); at least one quote describing what a recovery step would look like (alternate dates, waitlist, call-back).
- Best supporting extracts (example format): P06: “It said error but didn’t tell me what to do… so I quit.”; P10: “No times showed at all. I thought the site was broken.”
- Disconfirming/edge cases: If some users interpret “no availability” as real capacity rather than a system issue, note that as a boundary.
- Notes / next actions: Decide whether “no availability” is a usability issue or a real-world constraint; you may split it later.

Row 4
- Code cluster name: Trust hinges on confirmation and privacy
- Codes in cluster: Unsure if booking went through; No confirmation text/email; Worried about sharing personal data
- Why together: All describe uncertainty about whether the system is safe and reliable.
- Candidate theme label: “Low trust increases double-checking and drop-off”
- Theme definition: Without clear proof of success and clear signals of security, people hesitate, repeat steps, or switch to phone to confirm.
- Theme boundaries (in): Confirmation cues; receipt/record concerns; privacy and data sharing worries.
- Theme boundaries (out): Step count fatigue; choosing appointment types; errors and dead ends (unless they trigger trust concerns, which should be noted as overlap).
- Possible subthemes: Confirmation signals; privacy reassurance.
- Evidence required: Extracts that show (1) doubt, (2) the behavior it caused (re-try, call, screenshot), and (3) what reassurance would help (confirmation number, email, secure portal language).
- Best supporting extracts (example format): P01: “I clicked submit and then… nothing. I didn’t know if it worked.”; P08: “I don’t like putting health info online unless I know it’s secure.”
- Disconfirming/edge cases: Users who trust it because they recognize the clinic brand; note brand familiarity as a possible condition.
- Notes / next actions: Consider whether this theme needs a boundary tied to first-time use or brand familiarity.
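
The worked rows can also be checked mechanically against the sample code list to see which codes have not been placed yet. The sketch below copies the code list and the four clusters from this example. The function name audit_assignments is an illustrative choice.

```python
# Sample code list from the worked example.
code_list = [
    "Confusing medical terms",
    "Too many steps",
    "Not sure which appointment type to choose",
    "Error message with no fix",
    "Calendar shows no availability",
    "Gave up and called instead",
    "Worried about sharing personal data",
    "Unsure if booking went through",
    "No confirmation text/email",
    "Long wait on phone (comparison)",
    "Helpful receptionist fixes it",
    "Prefer to talk to a person",
]

# Codes per cluster, copied from Rows 1 to 4.
clusters = {
    "Choosing the right option feels risky": [
        "Confusing medical terms",
        "Not sure which appointment type to choose",
        "Prefer to talk to a person",
    ],
    "The process feels longer than it should": [
        "Too many steps",
        "Gave up and called instead",
    ],
    "Breakdowns with no recovery path": [
        "Error message with no fix",
        "Calendar shows no availability",
    ],
    "Trust hinges on confirmation and privacy": [
        "Unsure if booking went through",
        "No confirmation text/email",
        "Worried about sharing personal data",
    ],
}


def audit_assignments(code_list, clusters):
    """Return (codes in no cluster, codes that appear in several clusters)."""
    usage = {code: [] for code in code_list}
    for cluster_name, codes in clusters.items():
        for code in codes:
            usage.setdefault(code, []).append(cluster_name)
    unassigned = [code for code in code_list if not usage[code]]
    shared = {code: names for code, names in usage.items() if len(names) > 1}
    return unassigned, shared


unassigned, shared = audit_assignments(code_list, clusters)
print("Not yet in a cluster:", unassigned)
print("In more than one cluster:", shared)
```

For these rows, the check reports that "Long wait on phone (comparison)" and "Helpful receptionist fixes it" are not in any cluster yet, and that no code is shared between clusters. Whether those two codes belong in an existing theme or point to a new one is an analytic decision; the check only surfaces the gap.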

What this example shows (and why it matters)

Each theme is broader than a code, but still specific enough to guide action.
Boundaries prevent overlap from turning into confusion.
Evidence requirements keep themes tied to data, not intuition.

Common pitfalls (and how to avoid them)

Most theme problems come from skipping the “definition + boundaries + evidence” step. The worksheet makes that step hard to ignore.

Pitfall 1: Naming themes as topics instead of meanings

Too vague: “Communication”
Better: “Lack of confirmation creates doubt”

Pitfall 2: Letting one theme swallow everything

Catch-all themes often sound like “User experience issues.” That label does not explain what’s happening.

Fix it by splitting by driver (effort vs uncertainty vs trust) or by stage (before booking vs after submission).

Pitfall 3: Building themes from code counts alone

A code appearing often does not automatically make it a theme. You still need a pattern of meaning and a clear link to your question.

Use frequency as a hint, then confirm with extracts that show depth and variation.
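
A quick way to see why counts alone mislead is to put frequency next to participant spread. In this sketch the participant-to-code assignments are made up for illustration; only the code names come from the worked example.

```python
from collections import Counter, defaultdict

# (participant, code) pairs, one per coded segment. Assignments are hypothetical.
coded = [
    ("P02", "Too many steps"),
    ("P02", "Too many steps"),
    ("P02", "Too many steps"),
    ("P02", "Too many steps"),
    ("P01", "No confirmation text/email"),
    ("P04", "No confirmation text/email"),
    ("P08", "No confirmation text/email"),
]


def frequency_and_spread(coded):
    """Map each code to (number of segments, number of distinct participants)."""
    counts = Counter(code for _participant, code in coded)
    people = defaultdict(set)
    for participant, code in coded:
        people[code].add(participant)
    return {code: (counts[code], len(people[code])) for code in counts}


summary = frequency_and_spread(coded)
for code, (segments, participants) in summary.items():
    print(f"{code}: {segments} segments from {participants} participant(s)")
```

In this made-up data, the most frequent code comes from a single participant, while the less frequent one appears across three. The second has the stronger claim to coverage even though its raw count is lower.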

Pitfall 4: Ignoring disconfirming data

If you only collect quotes that support the theme, you will miss important boundaries. Add at least one edge case per theme in your worksheet.

Pitfall 5: Losing the audit trail

When someone asks, “Where did this theme come from?” you should be able to point to the cluster, the definition, and the extracts. Store quote IDs and keep a version history of theme names.

Decision criteria: when to merge, split, or drop a theme

You do not need a perfect theme set on the first pass. Use these checks to decide what to do next.

Merge themes when

The definitions overlap and you cannot write clean boundaries.
The same extracts keep showing up in both themes.
A combined label would be clearer and still specific.
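
The second merge signal, the same extracts showing up in both themes, can be quantified with a simple overlap score. The sketch below uses Jaccard overlap (shared extracts divided by all extracts in either theme). The quote IDs and the review threshold are arbitrary assumptions; a high score is a reason to reread both definitions, not an automatic merge.

```python
def jaccard(a: set, b: set) -> float:
    """Share of extracts that two themes have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical quote IDs attached to two candidate themes.
effort_fatigue = {"Q01", "Q02", "Q03", "Q04"}
dead_ends = {"Q03", "Q04", "Q05"}

REVIEW_THRESHOLD = 0.3  # arbitrary cut-off chosen for illustration

score = jaccard(effort_fatigue, dead_ends)
shared_ids = sorted(effort_fatigue & dead_ends)
print(f"Overlap: {score:.2f}, shared extracts: {shared_ids}")
if score >= REVIEW_THRESHOLD:
    print("Reread both definitions: can you write clean boundaries?")
```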

Split a theme when

It has two different drivers (example: effort vs fear).
It covers two different stages (before action vs after action).
Different participant groups experience it in different ways.

Drop or demote a theme when

You cannot find supporting extracts from more than one or two participants.
It does not help answer the research question.
It is better framed as a background detail or a subtheme.

Common questions

How many codes should go into one theme?
There is no fixed number, but each theme should feel coherent and not require a long list of exceptions. If you need many exceptions, split the theme or tighten the boundary.

How many themes should I end up with?
Enough to answer your question clearly without repeating yourself. If two themes sound similar in plain language, revisit boundaries or merge them.

Can one code belong to more than one theme?
Yes, but do it intentionally. If overlap is common, write boundaries that explain when the code supports Theme A versus Theme B.

What if my themes change after I start writing?
That is normal. Update the worksheet, rename themes, and keep the earlier versions so your audit trail stays clear.

Do I need software to do theme mapping?
No. You can do it with printed excerpts and sticky notes, then capture the final clusters and evidence in a spreadsheet.

How do I show my evidence in a report?
Use 2–4 strong quotes per theme, label them with participant IDs, and include one brief note about an edge case when it changes the interpretation.

What’s the difference between a theme and a finding?
A theme is a pattern you can support with extracts. A finding is the claim you make from that pattern, often linked to implications or recommendations.

Where transcripts fit in (and how to keep your themes grounded)

Theme mapping works best when you can quickly pull accurate excerpts, check context, and trace every theme back to the original words. If you plan to share your analysis with a team, clean transcripts also reduce confusion about what was said.

If you need a dependable way to turn recordings into clear text for coding, GoTranscript offers professional transcription services that can support your research workflow.
