AssemblyAI’s May Updates: Reasoning, Streaming & PII (Full Transcript)

May recap: LLM Gateway reasoning + Gemini 3.5 Flash, JSON repair, big streaming diarization gains with per-word labels, continuous partials, and streaming PII redaction.

[00:00:00] Speaker 1: Hi everyone, Mark from Assembly here. It's been another busy month here at AssemblyAI, so let me get you caught up to speed on everything that we shipped in May.

Let's start with LLM Gateway, which got a ton of love this month. The biggest addition is chain-of-thought reasoning. Now you can turn on reasoning for any supported model with a single parameter. Just pass a reasoning effort level of low, medium or high, and LLM Gateway handles all the provider-specific differences for you, whether you're on Claude, Gemini or an OpenAI model. We also added Gemini 3.5 Flash to the lineup and shipped JSON repair post-processing, so malformed JSON gets fixed before it even reaches your app.

Next up, and this is a big deal, we shipped a major upgrade to our streaming speaker diarization. Accuracy is way up across the board: we cut false-alarm speakers by 66% and phantom turns by 60%. But the headline feature is per-word speaker labels. If you look into the API response, you'll see that each word carries a speaker label. So when someone cuts in mid-sentence, you'll actually catch it, and words the model isn't confident about get tagged as unknown instead of being lumped in with the wrong speaker. This is live now on both the US and EU regions, with no code changes needed to get the accuracy gains.

Staying on streaming, we also launched continuous partials. By default, Universal 3 Pro emits one partial near the start of a turn and more around silence, but with continuous partials on, you get a steady stream of mid-turn transcripts roughly every 3 seconds, even when there's no pause. This is perfect for long monologues, like when a caller is reading out a credit card number or a long address. And you can even toggle it on and off mid-session, so you can switch it on just for the moments where you really need it.

The Playground has also gotten some serious attention. The Voice Agent Playground now plays a sample for every one of our 34 voices.
And on top of that, we've shipped a bunch of improvements around the Playground and the Dashboard. My personal favorite is that you can now share a voice agent that you created on our Playground with a public link.

And last but not least, PII Redaction is now live for streaming. Just set Redact PII to true in your initial connection, and our API will automatically detect and remove sensitive information like names, phone numbers, and credit card numbers in real time. It works across all our streaming models, and redaction is applied to final turns, so unredacted text never reaches your client. My credit card number is 4220-1230-4551-2121. And just a callout that if you have Redact PII enabled on your streaming request, we'll disable partial transcripts by default, so that we don't leak any sensitive information in the partial transcripts.

As always, the best way to keep a pulse on everything we ship is our changelog. But that's all I had for this month, and I'll see you again on the next video. Bye.
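
The final-turn redaction behavior described in the transcript can be illustrated with a small Python sketch. This is not AssemblyAI's implementation or API: the `Turn` type and the `redact_card_numbers` and `forward_to_client` functions are hypothetical names, the `[CREDIT_CARD_NUMBER]` placeholder is made up, and a simple regular expression stands in for whatever detection the real service performs. It only shows the idea that, with redaction on, partial text is withheld and final text is redacted before delivery.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: a 16-digit card number in four groups of four, optionally
# separated by hyphens or spaces. Real PII detection is far broader.
CARD_PATTERN = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")


@dataclass
class Turn:
    """Hypothetical transcript turn: its text and whether it is final."""
    text: str
    is_final: bool


def redact_card_numbers(text: str) -> str:
    """Replace anything that looks like a card number with a placeholder."""
    return CARD_PATTERN.sub("[CREDIT_CARD_NUMBER]", text)


def forward_to_client(turn: Turn, redact_pii: bool) -> Optional[str]:
    """Return the text a client would receive, or None if it is withheld."""
    if not redact_pii:
        return turn.text
    if not turn.is_final:
        # Partials are withheld so unredacted text cannot leak.
        return None
    return redact_card_numbers(turn.text)


partial = Turn("My credit card number is 4220-1230", is_final=False)
final = Turn("My credit card number is 4220-1230-4551-2121.", is_final=True)

partial_out = forward_to_client(partial, redact_pii=True)
final_out = forward_to_client(final, redact_pii=True)
```

In this sketch the partial turn is dropped entirely and only the redacted final turn is delivered, which mirrors the default described above where partials are disabled when redaction is enabled.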

AI Insights
Summary
Mark from AssemblyAI recaps May product updates: LLM Gateway adds one-parameter Chain-of-Thought reasoning across providers, Gemini 3.5 Flash support, and JSON repair. Streaming speaker diarization gets a major accuracy upgrade with fewer false speakers/phantom turns and new per-word speaker labels plus "unknown" tagging. Streaming also gains continuous partial transcripts every ~3 seconds, toggleable mid-session. Playground/Dashboard updates include samples for all 34 voices and shareable public links for voice agents. Streaming PII Redaction launches to remove sensitive data in real time on final turns, with partials disabled by default to avoid leaks; users are pointed to the changelog for ongoing updates.
Title
AssemblyAI May Ship Recap: LLM Gateway, Streaming, Playground
Keywords
AssemblyAI
LLM Gateway
chain-of-thought reasoning
reasoning effort
Claude
Gemini
OpenAI
Gemini 3.5 Flash
JSON repair
streaming speaker diarization
per-word speaker labels
continuous partials
Universal 3 Pro
Voice Agent Playground
Dashboard
public link sharing
PII redaction
streaming redaction
partial transcripts
changelog
Key Takeaways
  • LLM Gateway now supports configurable reasoning (low/medium/high) across multiple model providers via a single parameter.
  • Gemini 3.5 Flash is available in LLM Gateway, and malformed JSON can be auto-repaired before reaching applications.
  • Streaming speaker diarization accuracy improved significantly, including per-word speaker labels and an "unknown" tag for low-confidence words.
  • Continuous partial transcripts provide mid-turn updates about every 3 seconds and can be toggled during a session.
  • Playground improvements include voice samples for all 34 voices and shareable public links for voice agents.
  • Streaming PII Redaction is live; it redacts sensitive info on final turns and disables partials by default to prevent leakage.
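
As a rough illustration of why per-word speaker labels matter, the Python sketch below groups consecutive words by their label so that a mid-sentence interruption becomes its own segment. The transcript does not show the response schema, so the `text` and `speaker` keys, the literal `"unknown"` value, and the sample words are all assumptions made for this example.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def group_words_by_speaker(words: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Merge consecutive words with the same speaker label into segments."""
    segments = []
    for speaker, run in groupby(words, key=lambda w: w["speaker"]):
        segments.append((speaker, " ".join(w["text"] for w in run)))
    return segments


# Made-up sample: speaker B cuts in mid-sentence, and one word is
# low-confidence, so it carries the (assumed) "unknown" label.
words = [
    {"text": "So", "speaker": "A"},
    {"text": "the", "speaker": "A"},
    {"text": "invoice", "speaker": "A"},
    {"text": "Sorry", "speaker": "B"},
    {"text": "which", "speaker": "B"},
    {"text": "one", "speaker": "unknown"},
    {"text": "is", "speaker": "A"},
    {"text": "overdue", "speaker": "A"},
]

segments = group_words_by_speaker(words)
for speaker, text in segments:
    print(f"{speaker}: {text}")
```

Keeping `"unknown"` as its own segment, rather than merging it into a neighbor, preserves the distinction the update describes: uncertain words are not attributed to the wrong speaker.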
Sentiments
Positive: Upbeat product-update tone highlighting multiple shipped improvements, major accuracy gains, and new features with emphasis on ease of adoption and safety.