Introducing Universal 2: Enhanced Speech Recognition
Explore Universal 2, our advanced model with improved accuracy in rare words, structure, and alphanumerics. Test it with tools like sentiment analysis.
Universal-2 The Most Powerful Speech-to-Text Ever Demo Tutorial
Added on 01/29/2025

Speaker 1: In the world of speech recognition, accuracy isn't just a metric, it's everything. Today we're proud to announce Universal 2, our most accurate speech-to-text model yet, which has been trained on over 12.5 million hours of audio data.

Here's what's improved in Universal 2. There is a 24% improvement in the recognition of rare words like names, brands, and locations. There's also a 15% improvement in transcript structure, with proper punctuation and casing across things like emails, dates, and dollar amounts. And there's a 21% increase in detecting alphanumerics, so higher accuracy across critical data like phone numbers, zip codes, and other numerical identifiers. Here's how Universal 2 performs across those three key areas when compared to other speech-to-text models. Universal 2 has the lowest word error rate across these three key areas.

Now let's see Universal 2 in action. In the description box below, you'll see a link to this Google Colab so you too can try this out and see how Universal 2 can be deployed. The very first thing we're doing is importing AssemblyAI and defining our AssemblyAI API key. You can also check out the link in the description box below to get your free AssemblyAI API key. This code snippet right here does speech recognition with AssemblyAI. We're using this audio file on hand, but feel free to replace it with whatever audio file you want. We're also using the AssemblyAI transcriber object and its transcribe function, which we pass the audio file to. Once we do that, we simply print out the transcript.

With Universal 2, you can also do speaker diarization in just a few lines of code. So here's exactly how you would go about doing it. The main thing is, of course, to set speaker labels to true in our transcription config object.
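The transcription and speaker-label steps described here can be sketched roughly as follows. This is a minimal sketch, assuming the `assemblyai` Python SDK is installed (`pip install assemblyai`); it is not the code from the Colab notebook shown in the video. The helper names `format_utterances` and `transcribe_with_speakers`, the audio path, and the API key placeholder are all illustrative.

```python
def format_utterances(utterances):
    """Render each utterance as 'Speaker <label>: <text>'."""
    return [f"Speaker {u.speaker}: {u.text}" for u in utterances]


def transcribe_with_speakers(audio, api_key):
    """Transcribe `audio` (a local path or URL) with speaker labels enabled."""
    # Imported lazily so format_utterances can be used without the SDK installed.
    import assemblyai as aai

    aai.settings.api_key = api_key
    # speaker_labels=True asks the service to attribute each utterance to a speaker.
    config = aai.TranscriptionConfig(speaker_labels=True)
    transcript = aai.Transcriber().transcribe(audio, config=config)
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)
    return transcript


# Example usage (needs a real API key and network access; values are placeholders):
# transcript = transcribe_with_speakers("./audio.mp3", "YOUR_API_KEY")
# print(transcript.text)
# print("\n".join(format_utterances(transcript.utterances)))
```

Keeping the formatting in its own small function means the printed layout can be checked without making an API call.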
Once you do that, you also print out each speaker as well as what they're uttering. And this is exactly what our printed-out transcript would look like.

Universal 2 also enables you to do a wide range of audio intelligence tasks at high accuracy. So things like sentiment analysis, summarization, PII redaction, and many more. So here's an example of how you would do summarization with AssemblyAI. All you would have to do is modify the transcription config, set summarization to true, select a summarization model, in this case informative, and then also set the summarization type. Once you print out your transcript summary, this is exactly what it would look like. We have a summary in bullet points.

Next up is sentiment analysis. Similarly, you would turn on the sentiment analysis model by setting it to true in the transcription config. And upon printing it out, you can also print out things like the text, the sentiment, as well as the confidence score and the timestamp at which that word was uttered. So here's exactly what your transcript, when it's printed out, would look like.

To find out more about Universal 2 and all the major improvements, check out the link in the description box below. And to learn more about all the audio intelligence tasks that you can use AssemblyAI with, check out our documentation page.
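The summarization and sentiment analysis configurations described above might look something like this. This is a hedged sketch assuming the `assemblyai` Python SDK, not the notebook code from the video. The function names, the audio path, and the API key placeholder are illustrative, and the choice of the bullets summary type is inferred from the "summary in bullet points" the speaker mentions.

```python
def format_sentiments(results):
    """Render one line per sentiment result: time range, label, confidence, text."""
    lines = []
    for r in results:
        # The SDK may return an enum for the label; plain strings pass through unchanged.
        label = getattr(r.sentiment, "value", r.sentiment)
        # Timestamps are assumed to be in milliseconds, so convert to seconds.
        lines.append(
            f"[{r.start / 1000:.2f}s-{r.end / 1000:.2f}s] "
            f"{label} ({r.confidence:.2f}): {r.text}"
        )
    return lines


def transcribe_summary(audio, api_key):
    """Return a bullet-point summary of `audio` using the informative model."""
    import assemblyai as aai  # lazy import: only needed for real API calls

    aai.settings.api_key = api_key
    config = aai.TranscriptionConfig(
        summarization=True,
        summary_model=aai.SummarizationModel.informative,
        summary_type=aai.SummarizationType.bullets,
    )
    return aai.Transcriber().transcribe(audio, config=config).summary


def transcribe_sentiments(audio, api_key):
    """Return the sentiment analysis results for `audio`."""
    import assemblyai as aai

    aai.settings.api_key = api_key
    config = aai.TranscriptionConfig(sentiment_analysis=True)
    return aai.Transcriber().transcribe(audio, config=config).sentiment_analysis


# Example usage (needs a real API key and network access; values are placeholders):
# print(transcribe_summary("./audio.mp3", "YOUR_API_KEY"))
# print("\n".join(format_sentiments(transcribe_sentiments("./audio.mp3", "YOUR_API_KEY"))))
```

The two features are requested in separate calls here to mirror the walkthrough, which enables them one at a time.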
