Publishing Quotes Safely: Prevent Re-Identification in Papers and Theses (Guide)

Andrew Russo
Published in Zoom May 21 · May 23, 2026

Publishing quotes safely means sharing only the words you need while removing or changing details that could point to a real person. In papers and theses, the goal is simple: keep the meaning of the quote, but lower the chance that a reader can identify the speaker from names, places, roles, dates, or unique life details.

The safest approach is to review each quote before publication, trim identifying fragments, paraphrase when the exact wording is not essential, and keep a private record of the source transcript ID. This guide shows how to do that in a clear, practical way.

Key takeaways

  • Do not publish raw quotes without checking for re-identification risk.
  • Remove names, places, employers, dates, and rare details that narrow identity.
  • Use paraphrase when a direct quote adds risk but not extra value.
  • Keep the original meaning intact when editing or shortening a quote.
  • Store the source transcript ID in a private log, not in the public paper.
  • Review combinations of details, not just single identifiers.

Why quotes can reveal identity

A quote can expose someone even when you remove their name. A job title, a town, an unusual event, a family detail, or a precise timeline can be enough for a reader to connect the dots.

This risk is called re-identification. In research, it often happens because several harmless details become identifying when they appear together.

  • A named hospital plus a rare role.
  • A small village plus a specific age and event.
  • A niche department plus a public project.
  • An exact date plus a quoted incident.

If your interview or focus group covers sensitive topics, the risk rises. The same is true when your participant group is small, local, senior, or easy to recognize within a field.

How to choose quotes for a paper or thesis

Start with necessity. Use a quote only when the exact words help the reader understand tone, meaning, or evidence better than a summary would.

If a sentence works just as well as a paraphrase, paraphrase it. This is often the best option for high-risk material.

Pick quotes with low identity risk

  • Prefer quotes that express a theme without naming people or places.
  • Avoid quotes with rare events, exact dates, or unique job histories.
  • Skip long quotes when a short excerpt makes the same point.
  • Do not stack several risky quotes from the same participant.

Ask four questions before you include a quote

  • Does this exact wording matter for my analysis?
  • Could someone identify the speaker from this quote alone?
  • Could someone identify the speaker from this quote plus nearby context?
  • Can I shorten, mask, or paraphrase it without changing the meaning?

If you answer yes to the second or third question, revise the quote or replace it with a paraphrase.
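
The four questions can be written down as a small decision helper so that every quote gets the same treatment. This is a minimal sketch, not a prescribed method: the class name, field names, and the three outcome labels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class QuoteReview:
    """Answers to the four pre-inclusion questions (hypothetical field names)."""
    wording_matters: bool            # Q1: does the exact wording matter?
    identifiable_alone: bool         # Q2: identifiable from the quote alone?
    identifiable_with_context: bool  # Q3: identifiable from quote plus context?
    can_edit_safely: bool            # Q4: can it be edited without changing meaning?


def decide(review: QuoteReview) -> str:
    """Return 'quote', 'revise', or 'paraphrase' for one candidate extract."""
    if not review.wording_matters:
        # A summary works just as well, so paraphrase.
        return "paraphrase"
    if review.identifiable_alone or review.identifiable_with_context:
        # Risky quote: revise it if that is possible, otherwise paraphrase.
        return "revise" if review.can_edit_safely else "paraphrase"
    return "quote"


print(decide(QuoteReview(True, False, True, True)))  # revise
```

The helper only records a human judgement. Someone still has to answer each question by reading the quote and its surroundings.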

How to present quotes safely

Safe presentation usually means editing lightly for privacy while protecting meaning. You should be able to explain every change and show that the quote still supports your analysis.

Remove identifying fragments

Cut details that are not needed for your point. This includes direct identifiers and also indirect clues.

  • Names of people, schools, employers, clinics, or towns.
  • Exact ages, dates, and times.
  • Rare job titles or highly specific departments.
  • Unique family details, awards, incidents, or locations.
  • References to public cases, media stories, or named projects.

You can replace removed text with neutral labels in square brackets if that helps clarity.

  • “I reported it to [my manager] the next morning.”
  • “When I was treated at [the hospital], I felt ignored.”
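
If you keep a list of the fragments you decided to mask, the replacement itself can be applied consistently with a short script. The sketch below assumes a hand-built replacement table; the names "Dr. Alvarez" and "Northfield General" are invented placeholders, and a script like this does not find identifiers for you.

```python
import re
from typing import Mapping

# Hypothetical table built during manual review:
# identifying fragment (as a regex) -> neutral label in square brackets.
REPLACEMENTS = {
    r"\bDr\. Alvarez\b": "[my manager]",
    r"\bNorthfield General\b": "[the hospital]",
}


def mask_quote(quote: str, replacements: Mapping[str, str]) -> str:
    """Replace each listed fragment with its neutral label."""
    for pattern, label in replacements.items():
        quote = re.sub(pattern, label, quote, flags=re.IGNORECASE)
    return quote


print(mask_quote("When I was treated at Northfield General, I felt ignored.",
                 REPLACEMENTS))
# When I was treated at [the hospital], I felt ignored.
```

Read every masked quote again afterwards, because a pattern list only covers the fragments you already spotted.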

Avoid excessive specificity

Sometimes each detail seems harmless on its own. Together, they can identify the speaker.

Generalise where possible.

  • Change “the oncology ward in Seville” to “[hospital department]”.
  • Change “my 52nd birthday” to “a recent birthday”.
  • Change “the merger in March 2023” to “an organisational change”.
  • Change “our six-person team” to “our small team”.
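
The same generalising step applies to numbers that appear near a quote, such as an age in a participant description. As one possible illustration, the sketch below turns an exact age into a ten-year band. The function name and the band width are assumptions, not something this guide prescribes.

```python
def age_band(age: int, width: int = 10) -> str:
    """Generalise an exact age into a band such as '50-59'."""
    if age < 0 or width <= 0:
        raise ValueError("age must be >= 0 and width must be > 0")
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


print(age_band(52))  # 50-59
```

A wider band hides more but also tells the reader less, so choose the width with your participant group in mind.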

Paraphrase where needed

Paraphrase when the direct wording adds more risk than value. This works well for sensitive findings, very distinctive speech, or quotes packed with identifying detail.

  • Use plain, faithful language.
  • Keep the participant’s meaning and tone.
  • Do not clean the quote so much that it changes what they said.
  • Mark clearly in your method that some extracts were paraphrased for privacy.

Example:

  • Direct quote: “After the incident at the depot in Alcobendas, everyone in payroll knew it was me because I was the only woman on nights.”
  • Safer presentation: The participant said that after a workplace incident, colleagues could easily identify her because of her role and shift pattern.

Make sure quotes match the transcript meaning

Privacy edits should not distort the evidence. Check the quote against the transcript after every cut, replacement, or paraphrase.

  • Re-read the lines before and after the extract.
  • Confirm that the speaker’s point stays the same.
  • Keep emotion markers only if they matter to analysis.
  • Do not remove qualifiers like “maybe,” “sometimes,” or “I think” if they affect meaning.
  • Do not combine separate parts of a transcript in a way that changes sense.

If you need heavy editing to make a quote safe, that is a sign to paraphrase instead.
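
One of these checks, the rule about qualifiers, is easy to support with a script. The sketch below lists qualifiers that appear in the original extract but not in the edited version. The qualifier list is only the three examples given above and would need extending for real use, and the function name is an assumption.

```python
import re
from typing import List

# Only the examples named in the text; extend for your own data.
QUALIFIERS = ("maybe", "sometimes", "i think")


def dropped_qualifiers(original: str, edited: str) -> List[str]:
    """Return qualifiers present in the original but missing after editing."""
    def present(text: str) -> set:
        lowered = text.lower()
        return {q for q in QUALIFIERS
                if re.search(rf"\b{re.escape(q)}\b", lowered)}

    return sorted(present(original) - present(edited))


print(dropped_qualifiers("I think it sometimes helped.", "It helped."))
# ['i think', 'sometimes']
```

A non-empty result is a prompt to look again. It does not decide whether the meaning changed.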

A practical workflow for quote safety

A simple review process helps you stay consistent across a whole paper or thesis. It also makes supervision, ethics review, and future checking much easier.

Step 1: Rate the quote risk

  • Low risk: broad statement, no unique details.
  • Medium risk: some context clues, but easy to generalise.
  • High risk: rare event, small group, named place, unique role, or sensitive topic.

Use direct quotes mostly for low-risk extracts. Treat medium-risk quotes carefully, and paraphrase most high-risk material.
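
The three levels can be encoded so that the rating is applied the same way to every extract. This is a sketch under stated assumptions: the flag names are invented, and a reviewer sets each flag by hand after reading the quote.

```python
from dataclasses import dataclass


@dataclass
class QuoteFlags:
    """Reviewer-set flags for one quote (hypothetical names)."""
    rare_event: bool = False
    small_group: bool = False
    named_place: bool = False
    unique_role: bool = False
    sensitive_topic: bool = False
    context_clues: bool = False  # clues that are easy to generalise


def rate_risk(flags: QuoteFlags) -> str:
    """Map the flags to 'low', 'medium', or 'high' as described in Step 1."""
    if any([flags.rare_event, flags.small_group, flags.named_place,
            flags.unique_role, flags.sensitive_topic]):
        return "high"
    if flags.context_clues:
        return "medium"
    return "low"


print(rate_risk(QuoteFlags(named_place=True)))  # high
```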

Step 2: Edit for privacy

  • Trim to the shortest useful excerpt.
  • Remove identifying fragments.
  • Generalise precise details.
  • Replace terms with neutral labels where needed.

Step 3: Check meaning

  • Compare the final version with the source transcript.
  • Confirm that the quote still supports the theme you discuss.
  • Ask whether the edited version could mislead a reader.

Step 4: Check context around the quote

A safe quote can become unsafe because of what appears next to it. Participant tables, demographic summaries, and method sections can add clues.

  • Avoid very narrow participant profiles.
  • Do not pair quotes with rare combinations of traits.
  • Be careful with appendix material and acknowledgements.
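
One way to look for rare combinations of traits is to count how many participants share each combination you plan to publish. The sketch below does that with invented data. The threshold of three is an arbitrary assumption, not a standard taken from this guide, and your ethics board or institution may set its own.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def rare_combinations(participants: List[Dict[str, str]],
                      keys: Sequence[str],
                      min_group_size: int = 3) -> Dict[Tuple[str, ...], int]:
    """Return trait combinations shared by fewer than min_group_size people."""
    counts = Counter(tuple(p[k] for k in keys) for p in participants)
    return {combo: n for combo, n in counts.items() if n < min_group_size}


# Invented example data.
people = [
    {"role": "nurse", "age_band": "30-39"},
    {"role": "nurse", "age_band": "30-39"},
    {"role": "nurse", "age_band": "30-39"},
    {"role": "director", "age_band": "50-59"},
]
print(rare_combinations(people, ["role", "age_band"]))
# {('director', '50-59'): 1}
```

Any combination that comes back is a candidate for broader categories or for removal from the table.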

Step 5: Record the source privately

You need an audit trail, but you do not need to expose identities in the paper. Use a private quote log that links each published extract to an internal transcript ID.

Keep this log in a secure file, separate from the thesis or article draft. If your institution has data handling rules, follow them.

Method for documenting the source transcript ID without exposing identities

The aim is simple: you should be able to trace every quote back to the right transcript, while readers cannot trace it to a real person.

Use a two-layer reference system

  • Public label: a broad code shown in the paper, such as P07, Interview 12, or FG3-S2.
  • Private source log: a secure document that maps the public label to the actual transcript ID and file location.

Your paper should show only the public label. The private source log should stay outside the published document.

What to include in the private log

  • Public quote label.
  • Internal transcript ID.
  • Date of interview or recording, if required by your project rules.
  • Location of the source file or repository path.
  • Line numbers or time range for the original extract.
  • Notes on edits made for privacy.

Keep this log separate from consent forms and direct personal data. That separation reduces risk if one file is shared by mistake.
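
The log fields listed above map naturally onto a simple record. The sketch below shows one possible layout written as CSV. The field names, the example ID "T-0042", and the path are all hypothetical, and a real log should live wherever your institution's data handling rules require.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable, TextIO


@dataclass
class QuoteLogEntry:
    """One row of the private quote log (hypothetical field names)."""
    public_label: str         # code shown in the paper, e.g. P07
    transcript_id: str        # internal ID, never published
    source_path: str          # where the source file is stored
    line_range: str           # lines or time range of the original extract
    edit_notes: str           # what was changed for privacy
    interview_date: str = ""  # only if your project rules require it


def write_log(entries: Iterable[QuoteLogEntry], handle: TextIO) -> None:
    """Write the entries as CSV with a header row."""
    names = [f.name for f in fields(QuoteLogEntry)]
    writer = csv.DictWriter(handle, fieldnames=names)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


# In-memory demo; in practice, open a file in a secure location instead.
buffer = io.StringIO()
write_log([QuoteLogEntry("P07", "T-0042", "secure/transcripts/T-0042.txt",
                         "L118-L124", "place name replaced with label")],
          buffer)
print(buffer.getvalue())
```

Only `public_label` ever appears in the paper. Everything else stays in the log.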

What not to include in the public paper

  • Real names or initials.
  • Exact transcript filenames.
  • Detailed recruitment notes.
  • Specific roles if they identify the person.
  • Precise dates, sites, or departments unless essential and approved.
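
Before submission, a quick scan can confirm that no private transcript ID or file name slipped into the manuscript text. The sketch below does a plain case-insensitive search. The function name and the example IDs are assumptions, and the scan only catches strings you already know about.

```python
from typing import Iterable, List


def find_leaks(manuscript: str, private_ids: Iterable[str]) -> List[str]:
    """Return the private IDs or file names that appear in the manuscript."""
    text = manuscript.lower()
    return sorted(pid for pid in set(private_ids) if pid.lower() in text)


draft = "As P07 explained (see file T-0042.txt), the process felt unfair."
print(find_leaks(draft, ["T-0042", "T-0051"]))  # ['T-0042']
```

An empty list means none of the listed strings were found. It does not replace a manual read for roles, dates, or sites.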

If you discuss data protection in your methods section, describe the process at a high level. For example, you can say that you used coded participant labels and stored the source key separately.

If your work involves personal data in the EU or UK context, review the relevant rules under the General Data Protection Regulation (in the UK, the UK GDPR). If your study includes health data in the United States, your institution may also require handling that aligns with HIPAA privacy guidance.

Quote safety checklist

Use this checklist before you submit a paper, thesis, or appendix.

  • Have I used a direct quote only where exact wording matters?
  • Did I remove names, places, employers, and other direct identifiers?
  • Did I reduce precise dates, ages, locations, and roles where possible?
  • Could several details together reveal the speaker?
  • Did I shorten the quote to the minimum needed?
  • If I paraphrased, does it still match the transcript meaning?
  • Did I re-check the lines around the quote in the original transcript?
  • Have I avoided linking the quote to a highly specific participant profile?
  • Does the public label avoid real identities and file names?
  • Is the source transcript ID stored only in a private log?
  • Is the log kept separate from direct identifiers and consent records?
  • Have I applied the same review method to every quote in the document?

Common mistakes to avoid

Many privacy problems come from small choices that seem harmless at the time. These are the mistakes to watch for most often.

  • Using vivid quotes just because they sound good: memorable wording is often memorable because it is distinctive.
  • Leaving in one “minor” detail: a single department name or unusual event can identify a person.
  • Over-editing a quote: if the meaning shifts, the evidence becomes weak or misleading.
  • Publishing too much participant context: tables and quote labels can reveal more than the quote itself.
  • Keeping transcript file names in the manuscript: raw IDs often include dates, locations, or initials.
  • Treating focus groups like one-to-one interviews: group settings can create extra clues about who said what.

A good rule is this: if a participant could recognise themselves easily from the published passage, ask whether someone else could too. If the answer might be yes, reduce the detail or paraphrase.

Common questions

Can I use direct quotes if I already anonymised names?

Yes, but removing names is not enough on its own. You also need to check for indirect identifiers such as role, place, date, event, and rare personal details.

When should I paraphrase instead of quote directly?

Paraphrase when the exact words are not essential to your analysis, or when the wording includes too many identifying clues to edit safely.

Is it acceptable to change small details in a quote?

Yes, if you do it to protect identity and the meaning stays the same. Keep a private note of what you changed and apply the same method consistently.

How should I label participants in a thesis?

Use simple public codes like P01, P02, or FG1-P3. Keep the link between those labels and the real transcript IDs in a separate private log.

Can a demographic table increase re-identification risk?

Yes. A very detailed table can make a quote identifiable even if the quote itself looks safe, so use broad categories and avoid rare combinations.

Should I include transcript file names in appendices?

No. File names often contain dates, locations, initials, or other clues. Use public labels instead and store the real file references privately.

What if my supervisor asks for a traceable audit trail?

Keep a secure source log with the public quote label, internal transcript ID, source location, and edit notes. That gives traceability without exposing identities in the paper.

When you prepare interview material for analysis and publication, accurate transcripts make privacy review much easier. If you need help creating clear, review-ready files, GoTranscript provides professional transcription services that support careful quote selection and safer research workflows.