
File Naming Convention for Interviews and Transcripts (Examples + Rules)

Daniel Chang
Posted in Zoom Apr 11 · 13 Apr, 2026

A good file naming convention for interviews and transcripts makes every file easy to find, sort, share, and audit later.

The most reliable format uses the same core fields in the same order every time: Study ID, Participant ID, Wave/Session, Date, and Version (RAW/CLEAN/CODED), plus an optional file type tag.

This guide gives ready-to-copy rules and examples, plus simple checks that prevent the most common errors.

Key takeaways

  • Use a fixed order: Study → Participant → Wave/Session → Date → Version → Type.
  • Use ISO dates (YYYY-MM-DD) to avoid ambiguity and to sort correctly.
  • Choose one separator (usually underscore) and never mix separators within the same project.
  • Lock your ID formats (zero padding, allowed characters) before data collection starts.
  • Keep file names short, consistent, and readable; store details in a metadata sheet, not the name.

Why a naming convention matters (more than you think)

Interview projects produce many similar files: audio, video, transcripts, coded versions, consent forms, and exports.

If names vary, you lose time searching, risk using the wrong version, and make collaboration harder.

  • Faster retrieval: You can locate files without opening them.
  • Reliable sorting: Correct ordering in folders and spreadsheets.
  • Cleaner handoffs: Teammates and vendors can follow the same structure.
  • Safer analysis: You reduce mix-ups between participants, sessions, and versions.

The robust pattern to use (copy/paste template)

Use this as your default template for both interview recordings and transcripts.

Recommended pattern

{STUDY}-{SITE}_{PID}_W{WAVE}_S{SESSION}_{YYYY-MM-DD}_{VERSION}_{TYPE}

If you do not have sites, waves, or sessions, drop those fields rather than leaving them blank.

Keep the remaining fields in the same order so every filename still “reads” the same way.

Field definitions (what each part means)

  • STUDY: Short study code (letters/numbers), like HCI24 or DIAB.
  • SITE (optional): Location or team code, like NYC or UK01.
  • PID: Participant ID with fixed length, like P001, P027.
  • WAVE: Research wave number, written as W1, W2 (use when you recontact participants over time).
  • SESSION: Session or interview number, written as S01, S02 (use when multiple sessions happen in one wave).
  • YYYY-MM-DD: Interview date using ISO format to avoid ambiguity.
  • VERSION: Use a controlled set: RAW, CLEAN, CODED.
  • TYPE: What the file is, like AUDIO, VIDEO, TRN (transcript), CAPTIONS, NOTES.

ISO 8601 date format is widely used because it sorts naturally and avoids month/day confusion.

For details, see the ISO 8601 date and time format standard.
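The template and field definitions above can be sketched as a small helper that assembles a name in the fixed order and skips any optional field you do not use. This is a minimal illustration, not a required tool: the function name build_filename and its parameters are hypothetical, and it assumes three-digit participant IDs and two-digit session numbers.

```python
from datetime import date
from typing import Optional


def build_filename(
    study: str,
    pid: int,
    interview_date: date,
    version: str,
    file_type: str,
    site: Optional[str] = None,
    wave: Optional[int] = None,
    session: Optional[int] = None,
) -> str:
    """Join the fields in the fixed order, dropping unused optional fields."""
    # The site code, when present, attaches to the study code with a hyphen.
    study_part = f"{study}-{site}" if site else study
    parts = [study_part, f"P{pid:03d}"]
    if wave is not None:
        parts.append(f"W{wave}")
    if session is not None:
        parts.append(f"S{session:02d}")
    # date.isoformat() always yields YYYY-MM-DD.
    parts += [interview_date.isoformat(), version, file_type]
    return "_".join(parts)


print(build_filename("HCI24", 1, date(2026, 4, 13), "RAW", "AUDIO"))
# HCI24_P001_2026-04-13_RAW_AUDIO
print(build_filename("MKTG", 45, date(2026, 3, 5), "CLEAN", "TRN",
                     site="UK01", wave=1, session=1))
# MKTG-UK01_P045_W1_S01_2026-03-05_CLEAN_TRN
```

Generating names from one function, rather than typing them, is one way to keep the field order and padding identical across a team.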

Examples you can adapt (interviews + transcripts)

Below are concrete examples for common setups.

Pick one style and apply it to every file in the same project.

Example set A: Simple study with one session per participant

  • HCI24_P001_2026-04-13_RAW_AUDIO
  • HCI24_P001_2026-04-13_CLEAN_TRN
  • HCI24_P001_2026-04-13_CODED_TRN

Example set B: Multiple waves over time

  • DIAB_P014_W1_2026-01-09_RAW_AUDIO
  • DIAB_P014_W1_2026-01-09_CLEAN_TRN
  • DIAB_P014_W2_2026-03-10_RAW_AUDIO
  • DIAB_P014_W2_2026-03-10_CLEAN_TRN

Example set C: Multiple sessions within a wave

  • EDU25_P203_W1_S01_2026-02-03_RAW_AUDIO
  • EDU25_P203_W1_S01_2026-02-03_CLEAN_TRN
  • EDU25_P203_W1_S02_2026-02-17_RAW_AUDIO
  • EDU25_P203_W1_S02_2026-02-17_CLEAN_TRN

Example set D: With site or team code

  • MKTG-UK01_P045_W1_S01_2026-03-05_RAW_AUDIO
  • MKTG-UK01_P045_W1_S01_2026-03-05_CLEAN_TRN

Example set E: Adding a human-friendly topic tag (use sparingly)

Only add a topic tag if it is stable and non-sensitive.

Keep it short and avoid names, companies, or health details.

  • HCI24_P001_W1_S01_2026-04-13_CLEAN_TRN_ONBOARDING

Rules that keep filenames consistent (and sortable)

Set rules before you collect any interviews, then write them down in one shared place.

Use a short “allowed values” list so everyone makes the same choices.

1) Use one separator and stick to it

  • Recommended: underscore (_) between fields.
  • Use hyphen (-) only inside a field when needed (like ISO dates or a study-site pair).
  • Avoid spaces because they can cause issues in scripts and links.

2) Keep IDs fixed-length with zero padding

P1, P2, P10 will sort incorrectly, so pad them.

  • Use P001–P999 for most studies.
  • Use S01, S02 for sessions.
  • Use W1, W2 (or W01, W02 if you expect many waves).
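To see why padding matters, the short sketch below sorts unpadded and padded IDs as plain text, which is how most file browsers order names. The helper pad_pid is a hypothetical name, and it assumes IDs consist of the letter P followed by digits.

```python
def pad_pid(pid: str, width: int = 3) -> str:
    """Rewrite an ID like 'P7' as 'P007' (assumes 'P' followed by digits)."""
    return f"P{int(pid[1:]):0{width}d}"


unpadded = ["P1", "P2", "P10"]
print(sorted(unpadded))  # ['P1', 'P10', 'P2']

padded = [pad_pid(p) for p in unpadded]
print(sorted(padded))  # ['P001', 'P002', 'P010']
```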

3) Use ISO dates (YYYY-MM-DD)

03-04-2026 can mean March 4 or April 3, depending on the country.

2026-03-04 always means March 4, 2026, and it sorts correctly.
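The ambiguity is easy to reproduce. In this small sketch, the same string parses to two different dates depending on which convention the reader assumes, while ISO strings sort chronologically even when treated as plain text.

```python
from datetime import datetime

raw = "03-04-2026"

# The same text read under two conventions gives two different days.
month_first = datetime.strptime(raw, "%m-%d-%Y").date()
day_first = datetime.strptime(raw, "%d-%m-%Y").date()
print(month_first.isoformat())  # 2026-03-04
print(day_first.isoformat())    # 2026-04-03

# ISO dates sort correctly as ordinary strings.
iso_dates = ["2026-03-10", "2026-01-09", "2025-12-31"]
print(sorted(iso_dates))  # ['2025-12-31', '2026-01-09', '2026-03-10']
```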

4) Restrict characters

Use only letters A–Z, numbers 0–9, underscore, and hyphen.

Avoid / \ : * ? " < > | because Windows does not allow them in file names, and / is the path separator on macOS and Linux.
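A character check like this can run before a file is accepted. The sketch below is one possible approach: the function name disallowed_chars is hypothetical, it checks the name without its extension, and it assumes lowercase letters are acceptable alongside uppercase.

```python
import re

# Letters, digits, underscore, and hyphen (hyphen last so it is literal).
ALLOWED = re.compile(r"[A-Za-z0-9_-]")


def disallowed_chars(stem: str) -> list[str]:
    """Return each distinct character outside the allowed set, in first-seen order."""
    found: list[str] = []
    for ch in stem:
        if not ALLOWED.fullmatch(ch) and ch not in found:
            found.append(ch)
    return found


print(disallowed_chars("HCI24_P001_2026-04-13_RAW_AUDIO"))  # []
print(disallowed_chars("HCI24 P001: draft?"))  # [' ', ':', '?']
```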

5) Control version names (RAW/CLEAN/CODED)

Decide what each version means and do not reuse the labels for other steps.

  • RAW: Original export from recorder or meeting platform, unchanged.
  • CLEAN: Edited transcript for readability and obvious fixes (based on your rules).
  • CODED: Transcript with codes, highlights, or analysis annotations.

6) Track revisions in a separate REV field, and only when you truly revise

If you need revision tracking, add a REV field after VERSION, like REV01, REV02.

Do not mix “version” meanings (RAW/CLEAN/CODED) with “revision number” meanings.

Preventing common errors (and how to fix them)

Most naming problems repeat, so you can prevent them with a few simple checks.

Use the checklist below during intake and before sharing files with others.

Duplicate IDs

  • Cause: Two people assign participant IDs at the same time, or you reuse a test ID.
  • Prevention: Maintain one master ID log (sheet or database) with the next available PID.
  • Fix: Rename all files for the duplicated participant to a new PID and update the master log.
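If the master ID log can be exported as a simple list of assigned IDs, two small helpers can flag duplicates and propose the next free ID. This is a sketch under that assumption; the function names and the sample log are hypothetical.

```python
from collections import Counter


def find_duplicate_pids(assigned: list[str]) -> list[str]:
    """Return IDs that appear more than once in the log."""
    counts = Counter(assigned)
    return sorted(pid for pid, n in counts.items() if n > 1)


def next_pid(assigned: list[str], width: int = 3) -> str:
    """Return the ID after the highest one in use (assumes 'P' plus digits)."""
    highest = max((int(pid[1:]) for pid in assigned), default=0)
    return f"P{highest + 1:0{width}d}"


log = ["P001", "P002", "P002", "P003"]  # made-up sample log
print(find_duplicate_pids(log))  # ['P002']
print(next_pid(log))  # P004
```

Taking the next ID from the highest value in use, rather than filling gaps, avoids accidentally reusing a retired or test ID.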

Ambiguous dates

  • Cause: Using MM-DD-YYYY or DD-MM-YYYY across teams.
  • Prevention: Require ISO dates only: YYYY-MM-DD.
  • Fix: Convert all existing names, then block non-ISO entries in your intake form.

Inconsistent separators

  • Cause: Some files use spaces, others use hyphens, others use underscores.
  • Prevention: Publish one rule: underscores between fields, hyphens only inside a field (ISO dates or a study-site pair).
  • Fix: Batch-rename by pattern and re-export file lists to confirm sorting.
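A batch fix can start as a dry run that only prints the proposed names. The sketch below converts spaces and stray hyphens to underscores while leaving ISO dates untouched. It assumes the project has no study-site codes, because it would also turn the hyphen in a pair like MKTG-UK01 into an underscore. The function name normalize_separators and the sample file names are hypothetical.

```python
import re
from pathlib import PurePath

# The capturing group makes re.split keep the dates in its result.
ISO_DATE = re.compile(r"(\d{4}-\d{2}-\d{2})")


def normalize_separators(filename: str) -> str:
    """Replace runs of spaces, hyphens, and underscores with one underscore."""
    path = PurePath(filename)
    pieces = ISO_DATE.split(path.stem)
    # Even indexes hold the text between dates; odd indexes hold the dates.
    cleaned = [
        piece if i % 2 else re.sub(r"[ _-]+", "_", piece)
        for i, piece in enumerate(pieces)
    ]
    return "".join(cleaned).strip("_") + path.suffix


# Dry run: show old and new names without touching any files.
for old in ["HCI24 P001-2026-04-13 RAW-AUDIO.mp3",
            "DIAB_P014_W1_2026-01-09_CLEAN_TRN.docx"]:
    print(old, "->", normalize_separators(old))
```

Reviewing the printed pairs before renaming anything makes it easier to catch names that the pattern handles badly.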

Missing fields (you can’t tell what a file is)

  • Cause: People name items like Interview1.mp3 during a rush.
  • Prevention: Create a naming “starter” template in your folder as a text file that people can copy.
  • Fix: Use a metadata sheet to reconstruct missing details, then rename.

Mixing identifying information into filenames

  • Cause: Adding participant names or employer names to “make it easier.”
  • Prevention: Ban direct identifiers in file names and store sensitive mapping in a secure location.
  • Fix: Rename immediately and update any shared links or references.

Step-by-step: roll out a naming convention in your team

You can implement this in under an hour if you keep the scope small.

Focus on stable fields and document the decisions.

Step 1: Choose your required fields

  • Always: STUDY, PID, date, version, type.
  • Sometimes: wave, session, site/team.

Step 2: Decide formats and allowed values

  • PID format: P001 (or PT001 if you prefer).
  • Session format: S01.
  • Date format: YYYY-MM-DD.
  • Version list: RAW, CLEAN, CODED.
  • Type list: AUDIO, VIDEO, TRN, NOTES.

Step 3: Create a one-page “Naming Rules” doc

  • Include the template and 5–10 examples that match your project.
  • Add a “Do not use” list (names, spaces, ambiguous dates).
  • Pin it in your project folder and link it in your task tracker.

Step 4: Add a metadata sheet

File names should identify and sort, but they should not carry everything.

Track details like interviewer, language, consent status, and notes in a sheet keyed by the filename.
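As a rough illustration, a metadata sheet can be a CSV file with the filename as its first column. The column names and sample values below are made up for the example, and the sketch writes to an in-memory buffer instead of a real file.

```python
import csv
import io

FIELDS = ["filename", "interviewer", "language", "consent_status", "notes"]

# Made-up sample rows; real values come from your intake process.
rows = [
    {"filename": "HCI24_P001_2026-04-13_RAW_AUDIO", "interviewer": "INT01",
     "language": "en", "consent_status": "signed", "notes": ""},
    {"filename": "HCI24_P001_2026-04-13_CLEAN_TRN", "interviewer": "INT01",
     "language": "en", "consent_status": "signed", "notes": "timestamps removed"},
]


def write_sheet(sheet_rows: list[dict[str, str]]) -> str:
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(sheet_rows)
    return buffer.getvalue()


def load_by_filename(csv_text: str) -> dict[str, dict[str, str]]:
    """Read CSV text back into a lookup keyed by filename."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["filename"]: row for row in reader}


lookup = load_by_filename(write_sheet(rows))
print(lookup["HCI24_P001_2026-04-13_CLEAN_TRN"]["notes"])  # timestamps removed
```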

Step 5: Enforce with lightweight checks

  • When a file arrives, confirm it matches the pattern before it moves to the “processed” folder.
  • If you use scripts, validate with a regular expression and reject mismatches.
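One possible regular expression for the recommended pattern is sketched below. It assumes uppercase study and site codes, three-digit participant IDs, one- or two-digit waves, two-digit sessions, an optional REV field after the version, and the type list CAPTIONS, AUDIO, VIDEO, TRN, NOTES. Adjust it to your own allowed values. It checks the name without its extension, and a name with a trailing topic tag would be rejected unless you extend the pattern.

```python
import re
from datetime import date
from typing import Optional

NAME_RE = re.compile(
    r"^(?P<study>[A-Z0-9]+)(?:-(?P<site>[A-Z0-9]+))?"
    r"_(?P<pid>P\d{3})"
    r"(?:_W(?P<wave>\d{1,2}))?"
    r"(?:_S(?P<session>\d{2}))?"
    r"_(?P<date>\d{4}-\d{2}-\d{2})"
    r"_(?P<version>RAW|CLEAN|CODED)"
    r"(?:_(?P<rev>REV\d{2}))?"
    r"_(?P<type>AUDIO|VIDEO|TRN|CAPTIONS|NOTES)$"
)


def parse_name(stem: str) -> Optional[dict]:
    """Return the parsed fields, or None if the name breaks the convention."""
    match = NAME_RE.match(stem)
    if match is None:
        return None
    try:
        # The regex only checks the shape; this rejects dates like 2026-13-40.
        date.fromisoformat(match["date"])
    except ValueError:
        return None
    return match.groupdict()


print(parse_name("EDU25_P203_W1_S01_2026-02-03_RAW_AUDIO") is not None)  # True
print(parse_name("Interview1"))  # None
print(parse_name("HCI24_P001_2026-13-40_RAW_AUDIO"))  # None
```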

Common questions

Should interview audio and transcripts use the same filename?

Yes, keep the same core fields so anyone can match audio to transcript quickly.

Change only the TYPE field (and the file extension).
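Because the core fields match, recordings and transcripts can be grouped automatically. The sketch below treats everything before the VERSION field as the interview key. The function names and sample file names are hypothetical, and it assumes the version labels RAW, CLEAN, and CODED never appear as another field's value.

```python
from collections import defaultdict
from pathlib import PurePath

VERSIONS = {"RAW", "CLEAN", "CODED"}


def core_key(filename: str) -> str:
    """Return the fields before VERSION: study, participant, wave/session, date."""
    fields = PurePath(filename).stem.split("_")
    for i, field in enumerate(fields):
        if field in VERSIONS:
            return "_".join(fields[:i])
    raise ValueError(f"No version field found in {filename!r}")


def group_by_interview(filenames: list[str]) -> dict[str, list[str]]:
    """Collect all files that belong to the same interview."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in filenames:
        groups[core_key(name)].append(name)
    return dict(groups)


files = [
    "HCI24_P001_2026-04-13_RAW_AUDIO.mp3",
    "HCI24_P001_2026-04-13_CLEAN_TRN.docx",
    "HCI24_P002_2026-04-14_RAW_AUDIO.mp3",
]
for key, members in group_by_interview(files).items():
    print(key, len(members))
# HCI24_P001_2026-04-13 2
# HCI24_P002_2026-04-14 1
```

An interview whose group holds only one file is a quick signal that a recording or transcript is missing.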

What if I do multiple interviews on the same day with the same participant?

Add or use the S{SESSION} field, like S01 and S02.

If sessions do not fit, use a time field like T1430, but keep it optional and consistent.

Do I need both wave and session?

Use WAVE when the study repeats over months or phases.

Use SESSION when you can have multiple interviews within a wave or phase.

How do I name files if I anonymize participants?

Use a non-identifying participant ID like P037 and store the mapping separately.

Avoid names or contact details in both file names and folder names.

What’s the best way to mark “final” transcripts?

Prefer a revision tag like REV02 and keep a short change log in your metadata sheet.

Avoid adding FINAL repeatedly because it often becomes outdated.

Can I use lowercase instead of uppercase?

You can, but choose one style and keep it consistent across the project.

Many teams use uppercase for controlled values like RAW and CLEAN because it is easy to scan.

How should I name captions and subtitles for interview videos?

Use the same core fields and swap the type field to CAPTIONS or SUBS.

If you distribute publicly, you may also need accessibility-focused caption formats, such as those described by the W3C guidance on captions and transcripts.

If you want to turn interviews into consistent, easy-to-manage transcripts (including clean and coded-ready versions), GoTranscript offers options that fit different workflows, including transcription proofreading services and automated transcription.

When you’re ready, you can also use GoTranscript’s professional transcription services to support your interview and transcript pipeline while keeping your naming rules consistent end-to-end.