Deepgram Unveils Enhanced Speaker Labeling Feature

Discover Deepgram's latest speaker labeling feature, which is easy to implement and produces accurate multi-speaker transcripts from audio files.

Create speaker-labeled transcripts with Google Colab (Deepgram Studios)

Added on 01/29/2025

Speaker 1: Hello Internet. Deepgram is back with another update. See, last time, we showcased our brand new AI speech recognition model, Nova. Link to that video is in the description. However, this time we're unveiling something a little different. It's a feature that many of us techies need, but is hard to find. That, of course, is speaker labeling, otherwise known as diarization. See, an AI can transcribe spoken words, sure, but training an AI to recognize different voices in a multi-person conversation is an entirely different story. However, Deepgram has your back. The latest version of our speaker labeling feature is here, and it's better than ever. So instead of your transcripts looking like this, they can now look like this. Isn't that much better?

And the best part? We've made it as simple as possible for you to implement this. Of course, you can check out our SDKs and our user-friendly docs to learn all you need to know. But here at Deepgram, we like to go the extra mile, so we've written a Python notebook for you. Link in the description. Using it is actually pretty easy. Just copy the notebook, upload any audio you want, and then run the cells. All the instructions that you need are already written inside the notebook. Follow those instructions and bam, you should have an output that looks like this. A speaker-labeled transcript appears for you in seconds.

But alright, if you're interested in what the code really says, let's break it down. This first cell simply installs dependencies. After all, you can't transcribe audio with Deepgram's models without first installing Deepgram itself. This second cell contains no code, but it does remind you to upload the audio of your choice into the notebook. Here I'm using a podcast excerpt.

And this next cell is where the magic happens. There's only three variables you need to fill out first though. Number one, plug in your Deepgram API key here. To get one, just sign up for Deepgram. With every sign-up, you'll receive up to 45,000 minutes of transcription for free. Alright, now that you have your API key, we'll have to specify, number two, the MIME type. In this case, my audio is in M4A format. But if you're using an MP3 or a WAV or any of these other formats on screen, you should be golden. Finally, I'm specifying the directory that my audios are in. Here, both my code and my audio files live in the current directory, so a simple dot should suffice. And now, I'm all set.

By running this cell, Deepgram's API will now transcribe every audio file in the directory I specified whose MIME type matches the MIME type I specified. The output is a beautiful, pretty-printed JSON that looks like this. You can delve into the shape of this JSON if you wish, but if you simply want to extract a speaker-labeled transcript from this nested structure, all you have to do is run the final cell. Here, we've written a LeetCode-style function that will parse every JSON file in the directory that you specified in the previous cell. Then it'll output a TXT file containing your speaker-labeled transcript. And voila, you're done. Here's what my output looks like.
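Editor's sketch: the notebook itself is not reproduced on this page, so the following is only an illustration of the file-selection step described above (pick every audio file in a directory whose MIME type matches the one you specified). The function name `audio_files_matching` and the extension-to-MIME table are assumptions made for this example, not names taken from Deepgram's notebook.

```python
from pathlib import Path

# Hypothetical extension-to-MIME table for illustration only;
# extend it with whatever formats you actually use.
AUDIO_MIME_TYPES = {
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
    ".m4a": "audio/mp4",
    ".flac": "audio/flac",
}


def audio_files_matching(directory: str, mime_type: str) -> list[Path]:
    """Return files in `directory` whose extension maps to `mime_type`, sorted by name."""
    matches = []
    for path in Path(directory).iterdir():
        # Compare on the lowercased extension so "TALK.M4A" also matches.
        if path.is_file() and AUDIO_MIME_TYPES.get(path.suffix.lower()) == mime_type:
            matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # "." means the current directory, as in the walkthrough above.
    for audio in audio_files_matching(".", "audio/mp4"):
        print(audio.name)
```

Each matching file would then be sent to the transcription API with the chosen MIME type; that request is left out here because its exact form depends on the SDK version you install.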

Speaker 2: You know, the first week I found it worked like 85% of the time, then it dropped down to about 66% of the time. So whatever it is, even if it's only 85%, that's still not enough for you to take your other cards out of your wallet. For me, I think I'm hovering around 5%. That's awful. You're not saving time at all. No. I really doubt that, because I can't imagine you trying something 20 times. What would inspire me to keep trying it? Well, anger. That's true. That's true.

Speaker 1: That's true. And here's the output that our friends at NASA got when they transcribed the audio of the first all-female spacewalk.

Speaker 3: My pleasure working with you this morning, and I'm working on getting that EV hatch open, and I can report it's opened and disposed. Thank you, Drew. Thank you so much. Okay. On your GCMs, take your power switches to back. Stagger switch throws and expect a warning tone.
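Editor's sketch: the final-cell step described earlier, turning the nested JSON into a speaker-labeled transcript, might look roughly like the following. It assumes the response has the shape `results.channels[0].alternatives[0].words`, with each word carrying a `speaker` index (present when diarization is requested) and optionally a `punctuated_word`. The function name `speaker_labeled_transcript` and the sample data are invented for illustration and are not the notebook's actual code.

```python
def speaker_labeled_transcript(response: dict) -> str:
    """Group consecutive words by speaker into 'Speaker N: ...' lines."""
    # Assumed response shape; adjust the path if your JSON differs.
    words = response["results"]["channels"][0]["alternatives"][0]["words"]

    lines = []
    current_speaker = None
    current_words = []
    for word in words:
        speaker = word.get("speaker", 0)
        # A change of speaker closes the current line and starts a new one.
        if current_words and speaker != current_speaker:
            lines.append(f"Speaker {current_speaker}: {' '.join(current_words)}")
            current_words = []
        current_speaker = speaker
        # Prefer the punctuated form when the response includes it.
        current_words.append(word.get("punctuated_word", word["word"]))
    if current_words:
        lines.append(f"Speaker {current_speaker}: {' '.join(current_words)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data in the assumed shape.
    sample = {"results": {"channels": [{"alternatives": [{"words": [
        {"word": "hello", "punctuated_word": "Hello.", "speaker": 0},
        {"word": "hi", "punctuated_word": "Hi", "speaker": 1},
        {"word": "there", "punctuated_word": "there.", "speaker": 1},
    ]}]}]}}
    print(speaker_labeled_transcript(sample))
```

To mirror the walkthrough, you could load each JSON file in the directory with `json.load`, pass it to this function, and write the returned string to a TXT file of the same name.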

Speaker 1: And now that you know how to diarize transcripts, go forth and get creative. For some inspiration, here's what we've seen done in the past. Call centers can use speaker labeling to distinguish between their employees and customers for every given phone call. Then topic detection or sentiment analysis can be applied to see what customers speak about the most and how they feel about those topics. Podcasters, meanwhile, have used speaker labeling to create closed captions for their content. And of course, we here at Deepgram Studios have created videos in which we use our diarization feature to create some fun analyses. For example, in this video, link in the description, we analyze just how much famous late-night talk show hosts really let their guests speak. And that's only the beginning of what you can do with AI speech recognition. Truly, the sky and your own creativity are the limit. And if you have a fun demo that you'd like to showcase, hit us up on our socials, whether that's on Twitter, TikTok, LinkedIn, or even in the comment section below. Here are some satisfied Deepgram customers that we've featured in the past. Their code is quite impressive. Check it out. Link in the description, as usual. And as always, follow Deepgram for more AI content.
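Editor's sketch: an analysis like the talk-show example mentioned above (how much each person gets to speak) can be approximated from diarized word timings. The code below assumes each word is a dict with `speaker`, `start`, and `end` fields, with times in seconds; the function name `talk_time_share` is hypothetical, and this is not the method used in the video.

```python
from collections import defaultdict


def talk_time_share(words: list[dict]) -> dict[int, float]:
    """Return each speaker's fraction of total spoken time, based on word durations."""
    seconds = defaultdict(float)
    for word in words:
        # Sum the duration of every word attributed to this speaker.
        seconds[word["speaker"]] += word["end"] - word["start"]

    total = sum(seconds.values())
    if total == 0:
        return {}
    return {speaker: spoken / total for speaker, spoken in seconds.items()}


if __name__ == "__main__":
    # Made-up timings: speaker 0 talks for 3 seconds, speaker 1 for 1 second.
    words = [
        {"speaker": 0, "start": 0.0, "end": 2.0},
        {"speaker": 1, "start": 2.0, "end": 3.0},
        {"speaker": 0, "start": 3.0, "end": 4.0},
    ]
    for speaker, share in sorted(talk_time_share(words).items()):
        print(f"Speaker {speaker}: {share:.0%}")
```

Summing word durations ignores pauses between words, so it measures time spent actually speaking and not time spent holding the floor.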
