The best audio/video formats for ELAN projects are ones with constant frame rate video and uncompressed (or simple) audio, because they reduce timecode drift and play smoothly on most computers. In practice, that often means converting field recordings to MP4 (H.264, constant frame rate) for video and WAV (48 kHz) for audio, then verifying sync before annotation.
This guide explains which formats tend to drift, which ones stay stable, and how to standardize media across a project so everyone annotates the same, reproducible files.
Key takeaways
- Use constant frame rate (CFR) video to reduce sync drift in ELAN; avoid variable frame rate clips when you can.
- Standardize audio to WAV, 48 kHz (and keep the same sample rate across the whole project).
- Prefer MP4 (H.264 video + AAC audio) for light, stable review copies, and keep WAV as the audio reference when precision matters.
- Lock a project-wide media spec (sample rate, frame rate, channel count, naming) before the team annotates.
- Always run a quick sync check after conversion (start, middle, end) and document the exact settings for reproducibility.
Why ELAN projects get sync drift (and what “good formats” prevent)
ELAN aligns annotations to a media timeline, so small timing mismatches can snowball into noticeable drift near the end of a long session. Drift can show up as lip movements no longer matching speech, or an audio event occurring a fraction of a second away from your tier boundaries.
Formats reduce drift when they keep time predictable, which usually comes down to a few practical causes you can control.
Common causes of drift in fieldwork media
- Variable frame rate (VFR) video, common in phones and some screen recordings, can play “smoothly” but report time inconsistently across tools.
- Mixed sample rates (44.1 kHz in one file, 48 kHz in another) can complicate alignment when you combine sources.
- Long recordings increase the chance that tiny timing offsets become visible.
- Bluetooth or wireless audio introduces latency, which may be constant (offset) or unstable (drift) depending on the setup.
- Containers/codecs optimized for streaming sometimes trade precise seeking for compression efficiency.
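To see how a small rate mismatch grows over a long session, the arithmetic can be sketched as below. This is an illustrative calculation only, not something ELAN exposes; the function name and the example figures (a 30 fps clip interpreted as 29.97 fps) are assumptions chosen for the example.

```python
def drift_seconds(elapsed_s: float, actual_rate: float, assumed_rate: float) -> float:
    """Timeline error after `elapsed_s` seconds of real time when media
    recorded at `actual_rate` is interpreted as `assumed_rate`.

    Rates can be frames per second or samples per second.
    """
    # Frames (or samples) actually captured in the elapsed time.
    units = elapsed_s * actual_rate
    # Time a player reports if it assumes a different rate for those units.
    reported_s = units / assumed_rate
    return reported_s - elapsed_s


if __name__ == "__main__":
    for minutes in (1, 10, 60):
        error = drift_seconds(minutes * 60, actual_rate=30.0, assumed_rate=29.97)
        print(f"{minutes:>3} min: {error:.3f} s off")
```

A 0.1% rate mismatch is invisible in the first minute but amounts to roughly 3.6 seconds after an hour, which is why drift tends to be noticed at the end of long recordings.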
What “stable” looks like for annotation
- Video: constant frame rate, common frame sizes, widely supported codec.
- Audio: a consistent sample rate (often 48 kHz), simple channel layout (mono/stereo), and minimal processing.
- Project: one agreed “working format” so every annotator sees the same timeline behavior.
Recommended formats for ELAN (and when to use each)
You can think in two layers: a durable “archive master” for long-term preservation, and a fast “working copy” for everyday annotation. Many teams keep both: masters stay untouched; working copies feed ELAN.
Best practice baseline (simple and widely compatible)
- Audio working/master: WAV (PCM), 48 kHz, 16-bit or 24-bit, mono or stereo as recorded.
- Video working copy: MP4 container, H.264 (AVC) video, constant frame rate (match source if possible), AAC audio.
This combo is popular because most operating systems and media frameworks handle it reliably, and it stays lightweight enough for laptops used in field settings.
If you have separate audio and video recorders
- Make the WAV file the “truth” for timing if you need fine-grained segmentation.
- Use the video as a visual reference, but do not rely on a phone’s embedded audio track if you recorded better audio separately.
If the audio is separate, you’ll usually align once (using a clap/slate) and then keep those aligned files fixed for the rest of the project.
When to avoid certain formats
- Avoid editing in-place: repeatedly re-saving compressed video can introduce artifacts and sometimes odd timeline behavior.
- Avoid VFR sources as-is: if your phone recorded VFR, convert to CFR before team annotation.
- Avoid exotic codecs: some camera codecs decode poorly on older laptops, leading to stutter that feels like “sync problems.”
Archive-friendly masters (optional but useful)
- Audio master: WAV (PCM) at the original bit depth and sample rate (often 48 kHz, 24-bit).
- Video master: keep the camera original (even if large) as your master, but create a standardized CFR MP4 for ELAN work.
Keeping originals helps if you later need to prove provenance or redo conversions with new tools.
Media settings that reduce drift and improve performance
Formats matter, but settings matter just as much. Standardizing settings across fieldwork recordings makes timelines more predictable and reduces “it works on my laptop” issues.
Video: constant frame rate, sane resolution, consistent frame rate
- Frame rate: pick one target (commonly 25, 29.97, or 30 fps) and stick to it within a project whenever you can.
- Constant frame rate: convert VFR to CFR for ELAN working files.
- Resolution: downscale working copies if needed (for example, 1080p → 720p) to reduce CPU load without changing timing.
- Keyframes: more frequent keyframes can make seeking smoother in some players, which helps when jumping around during annotation.
Performance problems can look like drift when the video lags behind audio during playback, so lighter working copies often reduce “false drift.”
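The video settings above could be applied with a command-line converter. The sketch below assumes ffmpeg (this guide does not prescribe a tool) and only builds the argument list; the function name, the 25 fps / 720p defaults, and the two-second keyframe interval are illustrative assumptions, not recommendations from ELAN.

```python
def cfr_video_command(src: str, dst: str, fps: int = 25,
                      height: int = 720, keyframe_s: int = 2) -> list[str]:
    """Build an ffmpeg argument list for a CFR H.264 working copy.

    Assumes an ffmpeg build with libx264; nothing is executed here.
    """
    return [
        "ffmpeg",
        "-i", src,
        # fps filter forces a constant frame rate; scale keeps the aspect
        # ratio (-2 picks an even width) while setting the target height.
        "-vf", f"fps={fps},scale=-2:{height}",
        "-c:v", "libx264",
        # One keyframe every `keyframe_s` seconds to make seeking snappier.
        "-g", str(fps * keyframe_s),
        # Widely supported pixel format for H.264 playback.
        "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-ar", "48000",
        dst,
    ]


if __name__ == "__main__":
    print(" ".join(cfr_video_command("session01_original.mov", "session01_working.mp4")))
```

Building the command as a list makes it easy to save alongside the working file, and it can be passed to `subprocess.run(..., check=True)` once you have confirmed the settings on a short test clip.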
Audio: consistent sample rate and channels
- Sample rate: 48 kHz is a common standard for video-related work; keep it consistent across all files used together.
- Channels: choose a rule (mono for lav mic, stereo for ambient) and document it; avoid switching mid-project without a reason.
- Normalize carefully: if you change loudness, keep a copy of the original and log what you did.
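A quick way to confirm that a WAV file matches the audio rules above is to read its header. The sketch below uses Python's standard-library wave module, which reads plain PCM WAV files only; the function name and the default spec (48 kHz, mono, 16-bit) are assumptions for illustration.

```python
import wave


def wav_spec_problems(path: str, sample_rate: int = 48000,
                      channels: int = 1, bit_depth: int = 16) -> list[str]:
    """Return a list of mismatches between a WAV header and the project spec."""
    with wave.open(path, "rb") as wav:
        found_rate = wav.getframerate()
        found_channels = wav.getnchannels()
        found_depth = wav.getsampwidth() * 8  # bytes per sample -> bits

    problems = []
    if found_rate != sample_rate:
        problems.append(f"sample rate is {found_rate} Hz, expected {sample_rate} Hz")
    if found_channels != channels:
        problems.append(f"channel count is {found_channels}, expected {channels}")
    if found_depth != bit_depth:
        problems.append(f"bit depth is {found_depth}, expected {bit_depth}")
    return problems
```

An empty list means the file matches; anything else is worth fixing before the file reaches annotators.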
Naming and structure: prevent “wrong file” sync issues
- Use stable IDs: Project_Speaker_Session_Date_Take.
- Keep media in a dedicated folder next to ELAN files (or use consistent relative paths).
- Never rename files after annotation starts unless you also update the ELAN links.
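A naming rule is easier to enforce when a script can check it. The sketch below assumes one possible concrete form of the Project_Speaker_Session_Date_Take pattern (an ISO date and a two-digit take such as T01); the exact pattern, the example filename, and the function name are hypothetical and should be adapted to your own spec.

```python
import re

# Hypothetical concrete form of Project_Speaker_Session_Date_Take,
# e.g. KAL_SP03_S02_2024-05-17_T01.wav
MEDIA_NAME = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)_"
    r"(?P<speaker>[A-Za-z0-9]+)_"
    r"(?P<session>[A-Za-z0-9]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<take>T\d{2})"
    r"\.(?P<ext>wav|mp4|eaf)$"
)


def parse_media_name(filename: str):
    """Return the name parts as a dict, or None if the name breaks the pattern."""
    match = MEDIA_NAME.match(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for name in ("KAL_SP03_S02_2024-05-17_T01.wav", "final_version2.mp4"):
        print(name, "->", parse_media_name(name))
```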
How to standardize formats across fieldwork recordings (a practical workflow)
A standard workflow helps most when multiple people annotate, or when you collect data across months with different devices. The goal is to turn “whatever the camera gave me” into “the same spec every time.”
Step 1: Decide your project media spec (write it down)
- Video working format: MP4, H.264, CFR, target fps (choose one), target resolution (choose one).
- Audio working format: WAV PCM, 48 kHz, 16/24-bit, channel rule (mono/stereo).
- File naming: exact pattern, including leading zeros for take numbers.
- Folder layout: e.g., /media/original, /media/working, /elan, /docs.
This spec becomes your reference when you convert new sessions, onboard annotators, or revisit data later.
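One way to "write it down" is to keep the spec as a small machine-readable file in the docs folder. The sketch below is an assumption about how that could look; the class name, field names, and default values are illustrative and not an ELAN convention.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MediaSpec:
    """Project-wide media spec; every default here is an example value."""
    video_container: str = "mp4"
    video_codec: str = "h264"
    constant_frame_rate: bool = True
    fps: str = "25"
    height: int = 720
    audio_format: str = "wav-pcm"
    sample_rate_hz: int = 48000
    bit_depth: int = 16
    channels: int = 1
    name_pattern: str = "Project_Speaker_Session_Date_Take"
    folders: tuple = ("media/original", "media/working", "elan", "docs")


def spec_to_json(spec: MediaSpec) -> str:
    """Serialize the spec so it can be stored and shared with the team."""
    return json.dumps(asdict(spec), indent=2)


if __name__ == "__main__":
    print(spec_to_json(MediaSpec()))
```

Because the dataclass is frozen, the spec cannot be changed by accident in a conversion script; a deliberate change means creating a new spec and noting why.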
Step 2: Ingest and protect originals
- Copy originals off cards/phones into an original folder.
- Do not edit originals; create working copies in a separate folder.
- Store a basic text log: what device, date, and any recording notes (mic placement, interruptions).
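The ingest step can be sketched as a copy followed by a checksum comparison, so that you know the protected original is byte-identical to what was on the card. The function names and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large videos do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest_original(src: Path, original_dir: Path) -> str:
    """Copy `src` into `original_dir`, refusing to overwrite, and verify the copy."""
    original_dir.mkdir(parents=True, exist_ok=True)
    dst = original_dir / src.name
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; originals are never overwritten")
    shutil.copy2(src, dst)  # copy2 also preserves file timestamps
    checksum = sha256_of(dst)
    if checksum != sha256_of(src):
        raise IOError(f"copy of {src} does not match the source")
    return checksum
```

The returned checksum is a natural thing to store in the session's text log next to the device and date.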
Step 3: Convert to working files (audio + video)
- Convert phone or camera video to your CFR MP4 working format.
- Extract or convert the best audio track to WAV (or convert your external recorder WAV to the project sample rate if needed).
- If you must combine audio and video into one file, do it once, then freeze that file for annotation.
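For the audio side of this step, a minimal sketch of an extraction command is shown below. It assumes ffmpeg as the converter (this guide does not name one) and only builds the argument list; the function name and defaults are illustrative.

```python
# ffmpeg PCM encoder names for the two bit depths used in this guide.
PCM_CODECS = {16: "pcm_s16le", 24: "pcm_s24le"}


def wav_extract_command(src: str, dst: str, sample_rate: int = 48000,
                        bit_depth: int = 16, channels: int = 1) -> list[str]:
    """Build an ffmpeg argument list that writes a PCM WAV at the project spec."""
    if bit_depth not in PCM_CODECS:
        raise ValueError(f"unsupported bit depth: {bit_depth}")
    return [
        "ffmpeg",
        "-i", src,
        "-vn",                        # drop the video stream, keep audio only
        "-c:a", PCM_CODECS[bit_depth],
        "-ar", str(sample_rate),      # resample to the project sample rate
        "-ac", str(channels),         # enforce the project channel rule
        dst,
    ]


if __name__ == "__main__":
    print(" ".join(wav_extract_command("session01_original.mov", "session01.wav")))
```

The same function works for an external recorder file that only needs resampling, since `-vn` has no effect when there is no video stream.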
Step 4: Verify sync before annotation
- Check sync at three points: start, middle, end.
- Use a visible/clear cue if available (clap, slate, door close).
- Confirm that ELAN plays media smoothly without dropped frames on the target laptop(s).
Step 5: Document settings for reproducibility
- Record the conversion settings (codec, fps, sample rate, bit depth, channel count).
- Record the tool and version you used for conversion.
- Keep a “media manifest” listing each original file and its working derivative(s).
Reproducibility is not only for publication; it helps you fix issues without guessing later.
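A media manifest can be as simple as a CSV file with one row per working derivative. The sketch below is one possible layout; the column names and the function name are assumptions, not a standard.

```python
import csv

# Hypothetical column set covering the items listed in this step.
MANIFEST_FIELDS = [
    "original", "working", "codec", "fps", "cfr",
    "sample_rate_hz", "bit_depth", "channels",
    "tool", "tool_version", "converted_on", "offset_ms", "notes",
]


def write_manifest(rows: list[dict], fh) -> None:
    """Write manifest rows to an open text file; missing fields stay blank."""
    writer = csv.DictWriter(fh, fieldnames=MANIFEST_FIELDS, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


if __name__ == "__main__":
    import sys
    write_manifest(
        [{"original": "clip.mov", "working": "clip_working.mp4",
          "codec": "h264", "fps": "25", "cfr": "yes"}],
        sys.stdout,
    )
```

When writing to a real file, open it with `newline=""` as the csv module documentation recommends, so line endings stay consistent across operating systems.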
Conversion + verification checklist (copy/paste)
Use this checklist per session, or add it to your project’s SOP. It focuses on reducing sync/drift risk and catching problems early.
A. Before conversion
- Confirm source details: device model, recording app, stated fps, stated sample rate (if available).
- Identify VFR risk: phone video, screen recording, or “HDR/low light” modes often produce VFR.
- Collect sync cue: note if you have a clap/slate; if not, choose a sharp audio event early in the recording.
- Decide the “truth” audio: embedded camera audio or external recorder audio.
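For the VFR check in this list, one rough heuristic is to compare the two frame rates that ffprobe reports for a video stream: `r_frame_rate` and `avg_frame_rate`. The sketch below assumes ffprobe is your inspection tool and that you have captured its JSON output; a clear mismatch suggests VFR but does not prove it, and the 1% tolerance is an arbitrary assumption.

```python
import json
from fractions import Fraction

# Arguments that ask ffprobe for both rates as JSON; append the media path.
FFPROBE_ARGS = [
    "ffprobe", "-v", "error", "-select_streams", "v:0",
    "-show_entries", "stream=r_frame_rate,avg_frame_rate", "-of", "json",
]


def _rate(text: str):
    """Parse a rate such as '30000/1001'; return None for unusable values."""
    num, _, den = text.partition("/")
    if not num.isdigit() or not den.isdigit() or int(den) == 0 or int(num) == 0:
        return None
    return Fraction(int(num), int(den))


def looks_variable(ffprobe_json: str, tolerance: float = 0.01) -> bool:
    """Heuristic: True when the two reported rates disagree or cannot be read."""
    stream = json.loads(ffprobe_json)["streams"][0]
    base = _rate(stream.get("r_frame_rate", ""))
    average = _rate(stream.get("avg_frame_rate", ""))
    if base is None or average is None:
        return True  # cannot confirm CFR, so treat the file as a risk
    return abs(base - average) / base > tolerance
```

Treating unreadable values as a risk errs on the side of converting to CFR, which matches the advice to make CFR mandatory for the working set.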
B. Convert to working formats
- Video working: MP4 (H.264), set constant frame rate, choose target fps, choose resolution.
- Audio working: WAV (PCM), set 48 kHz (or your project standard), choose bit depth, keep channel layout consistent.
- Keep originals: do not overwrite; write working files to a separate folder.
- Export logs: save conversion logs or commands if your tool supports it.
C. Verify sync (start/middle/end)
- Start: do lips match speech, and does the clap sound land on the video frame where the hands meet?
- Middle: jump to a clear consonant (like “t/k/p”) and confirm alignment.
- End: verify alignment again; drift often shows up here first.
- Playback performance: watch for stutter; if video stutters, try a lower-res working copy.
D. If sync is off, diagnose quickly
- Constant offset (same everywhere): apply a one-time shift or re-mux with corrected offset.
- Drift (worse over time): suspect VFR, mismatched frame rate, or sample rate issues; reconvert to CFR and standard sample rate.
- Random jumps: suspect corrupted media, edit points, or playback performance problems; make a new working copy.
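The three cases above can be told apart from the offsets you measure at the start, middle, and end checks. The sketch below is a simplified classifier; the function name, the labels, and the 40 ms tolerance (one frame at 25 fps) are assumptions for illustration.

```python
def classify_sync(offsets_ms: list[float], tolerance_ms: float = 40.0) -> str:
    """Classify audio/video offsets measured at start, middle, and end.

    Positive or negative values are fine as long as the sign convention
    (for example, audio minus video) is the same for all three.
    """
    start, middle, end = offsets_ms

    if all(abs(value) <= tolerance_ms for value in offsets_ms):
        return "in sync"
    # Same offset everywhere: fixable with a one-time shift.
    if max(offsets_ms) - min(offsets_ms) <= tolerance_ms:
        return "constant offset"
    # Offset grows steadily in one direction: typical drift.
    if start <= middle <= end or start >= middle >= end:
        return "drift"
    # Anything else: offsets jump around without a trend.
    return "irregular"


if __name__ == "__main__":
    print(classify_sync([0, 150, 300]))
```

Three measurements are a coarse sample, so an "irregular" result is best read as a prompt to inspect the file more closely rather than as a diagnosis.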
E. Document for reproducibility
- Media manifest: original filename → working filename(s).
- Settings: fps, CFR, codec, sample rate, bit depth, channels.
- Tooling: converter name + version; date of conversion.
- Notes: any manual offsets applied and why.
Pitfalls to avoid when preparing ELAN media
Most sync problems come from a few repeat mistakes. Avoiding them saves hours of re-annotation.
1) Mixing multiple “working copies” across annotators
- If two people annotate different conversions of the same recording, the timelines may not match.
- Fix: store one canonical working file per session, and share it along with the .eaf.
2) Converting only some sessions to CFR
- Projects feel stable until you hit the one VFR session that drifts.
- Fix: treat CFR conversion as mandatory for the working set.
3) “Improving” audio without tracking changes
- Time-stretch tools change duration by design, and noise reduction or silence trimming can also shift timing if applied carelessly.
- Fix: keep processing minimal for annotation, and log any changes you do make.
4) Renaming/moving media after ELAN files are created
- ELAN links can break, leading to confusion and accidental relinking to the wrong file.
- Fix: finalize names first, then start annotation, and keep a stable folder layout.
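Before moving or renaming anything, it can help to list which media files an .eaf actually points to. The sketch below assumes the .eaf header stores links in MEDIA_DESCRIPTOR elements with MEDIA_URL and RELATIVE_MEDIA_URL attributes; check this against a file from your own ELAN version before relying on it. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def media_links(eaf_path: str) -> list[dict]:
    """List the media links stored in an ELAN .eaf file.

    Assumes links live in MEDIA_DESCRIPTOR elements (see the note above).
    """
    root = ET.parse(eaf_path).getroot()
    links = []
    for descriptor in root.iter("MEDIA_DESCRIPTOR"):
        links.append({
            "media_url": descriptor.get("MEDIA_URL", ""),
            "relative_url": descriptor.get("RELATIVE_MEDIA_URL", ""),
            "mime_type": descriptor.get("MIME_TYPE", ""),
        })
    return links


if __name__ == "__main__":
    import sys
    for link in media_links(sys.argv[1]):
        print(link)
```

Comparing this list with the media manifest is a quick way to confirm that every annotator's .eaf points at the canonical working files.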
Decision guide: choose your ELAN working format in 60 seconds
If you want a quick rule set, use the one that matches your situation.
- You recorded on phones: convert video to MP4 H.264 CFR; extract/convert audio to WAV 48 kHz; verify end-of-file sync.
- You recorded video + separate audio recorder: keep external WAV as the main audio; create a CFR MP4 for visual reference; align once using a clap.
- You have low-power laptops: create lower-res CFR MP4 working copies (audio stays WAV) to avoid playback stutter.
- You need to share with collaborators: standardize naming, keep one canonical working set, and include a manifest and settings sheet.
Common questions
What is the best audio format for ELAN?
WAV (PCM) is a safe choice for ELAN work because it is simple, widely supported, and keeps timing stable. Standardize the sample rate (often 48 kHz) across your project.
What video format works best in ELAN?
MP4 with H.264 video is commonly compatible and efficient, especially when you ensure the video is constant frame rate. If you see drift or stutter, create a new CFR working copy at a lower resolution.
Do I need to keep the original camera files?
Keeping originals is a good practice because it preserves provenance and lets you redo conversions later. You can annotate using standardized working copies while storing originals separately.
How can I tell if my video is variable frame rate?
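Phone recordings and screen captures often use variable frame rate. A media inspection tool can usually confirm it; MediaInfo, for example, reports a frame rate mode for each video stream. If you cannot inspect the file, treat end-of-file drift or inconsistent playback across machines as a sign of VFR and convert to CFR for the working set.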
Why does sync look fine at the start but bad at the end?
That pattern usually signals drift, often from variable frame rate video or mismatched timing assumptions between audio and video. Convert the video to constant frame rate and confirm your audio sample rate matches your project standard.
Should I embed WAV audio into the MP4?
MP4 files typically carry AAC audio rather than the uncompressed PCM audio stored in WAV files, so many teams keep a separate WAV as the high-quality audio reference. If you embed audio into video for convenience, keep the separate WAV too and document which file is authoritative for annotation.
What should I document so someone else can reproduce my media prep?
At minimum: original filenames, working filenames, codec/container, frame rate (and whether it is CFR), resolution, audio sample rate/bit depth/channels, the conversion tool + version, and any offsets you applied.
When transcription and captions help ELAN workflows
Many ELAN projects start faster when you have a clean text base to annotate and align, especially for long interviews or multi-speaker sessions. A transcript can also help you spot repeated sync issues because you can compare where words fall against the timeline.
- If you want machine-first speed, you can start with automated transcription and then correct terms and names.
- If you already have a draft transcript, transcription proofreading services can help clean it up before you import or reference it in annotation.
- If you need text that follows spoken audio for accessibility or sharing, closed caption services can complement your ELAN deliverables.
If you’d like support turning recordings into consistent, usable text assets for your research pipeline, GoTranscript offers professional transcription services that can fit alongside your ELAN media-prep standards.