[00:00:00] Speaker 1: In this video, you're going to learn how to transcribe any audio or video file using ElevenLabs, whether it's a podcast, a meeting recording, a YouTube video, or even a full-length film. You can get a highly accurate transcription with timestamps, speaker labels, and even entity detection in just a few clicks, and I'm going to walk you through the entire process step by step.

Inside ElevenLabs you can find a speech-to-text tool powered by their latest model, Scribe v2, which is one of the most accurate transcription models available in the world right now. It supports over 90 languages, it can handle files up to 10 hours long, and it does speaker diarization, which means it can tell who's speaking when, even with up to 32 different speakers. It also picks up non-speech audio events like laughter or applause, which is really useful depending on what you're transcribing. So let's begin and I'll show you how it all works. If you want to follow along, you can click the first link in the description below.

Inside ElevenLabs, to begin transcribing your content, all we want to do is click on Speech-to-Text, and here we can start transcribing our video or audio by clicking Transcribe files in the top right. As you can see, all I have to do is drag and drop my file. So let's say we had a YouTube video with multiple speakers: I could drag it in and drop it. As you can see, we've got 30 minutes of audio, 240 megabytes, and I can even preview the audio if I want to. If I wanted to upload something else, I could go ahead.

Now, this video is in English, and I could leave it up to the AI to detect the language or I could select it myself. Below that, we can also choose to tag audio events. So if there are things like clapping, laughter, or footsteps, Scribe v2 will detect them in our audio and tag each audio event with a timestamp.
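To make speaker labels and audio-event tags more concrete, here is a minimal sketch of how timestamped tokens could be grouped into speaker turns. The `Token` fields, the `"audio_event"` label, and the sample data are all hypothetical and chosen for illustration; they are not taken from ElevenLabs' actual output format.

```python
from dataclasses import dataclass


@dataclass
class Token:
    # Hypothetical shape for one transcribed item (not the real response format).
    text: str
    start: float  # seconds from the beginning of the file
    end: float    # seconds from the beginning of the file
    speaker: str
    kind: str = "word"  # "word" or "audio_event" (illustrative labels)


def group_turns(tokens):
    """Merge consecutive tokens from the same speaker into one turn."""
    turns = []
    for tok in tokens:
        # Audio events such as applause are shown in parentheses.
        text = f"({tok.text})" if tok.kind == "audio_event" else tok.text
        if turns and turns[-1]["speaker"] == tok.speaker:
            turns[-1]["text"] += " " + text
            turns[-1]["end"] = tok.end
        else:
            turns.append({"speaker": tok.speaker, "start": tok.start,
                          "end": tok.end, "text": text})
    return turns


def fmt(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# Made-up sample data: an introduction, applause, then a second speaker.
sample = [
    Token("Please", 0.0, 0.4, "speaker_0"),
    Token("welcome", 0.4, 0.9, "speaker_0"),
    Token("applause", 1.0, 3.0, "speaker_0", "audio_event"),
    Token("Thank", 3.2, 3.5, "speaker_1"),
    Token("you", 3.5, 3.8, "speaker_1"),
]

for turn in group_turns(sample):
    print(f"[{fmt(turn['start'])}] {turn['speaker']}: {turn['text']}")
```

Each printed line corresponds to one speaker turn, which is roughly how a diarized transcript reads when you open it in an editor.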
We could then choose to include subtitles, and we can add key terms. For example, if you have specific brand names or unique names that you want to make sure the AI gets right, you can add those right here. So, for example, we could add ElevenLabs. Next, we simply click Upload files and, as you can see, our file begins uploading. Once it's done, ElevenLabs will transcribe it using Scribe v2.

While it's processing, let's transcribe some more audio. This time I'm going to drag in the audio track from one of my YouTube videos. So imagine I wanted to turn the YouTube video into a blog post, or have a transcript so my viewers can follow along: I can do so. Once again I can go through the same options. This time I don't want to tag audio events, but I do want to include subtitles, and we're also going to include ElevenLabs as one of the key terms. Then I simply click upload.

And we can see that the 30-minute video has just finished transcribing. Let's click on it so we can take a look. Here we have the full transcription of our video. At the top we've got the first line, where the person is introduced onto the scene. We then have the music while the speakers sit down on the stage, and then, as you can see, we're switching back and forth between the two speakers. I can go in and edit this text just like I would anywhere else, and I can even follow along with the exact video. If we want, we can run a spell check just to make sure that all of the transcription is correct.

After running a spell check, we can see that it's removed the stutter from "which" and turned it into just "which", and likewise with "pleasure" above. So we can accept those and then preview our video. As you can see, we now have an accurate transcription, and we can click on Export. Here we can export it as a text file, a PDF, a DOCX, JSON, HTML, and even SRT or VTT.
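For readers unfamiliar with the two subtitle formats mentioned here, the following sketch shows how timestamped segments map to SRT and VTT cues. It is an independent illustration of the file formats, not ElevenLabs' export code; the function names and the `(start, end, text)` segment shape are assumptions made for this example.

```python
def timestamp(seconds, ms_sep):
    """Format seconds as HH:MM:SS<sep>mmm.

    SRT separates milliseconds with a comma, WebVTT with a period.
    """
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{ms_sep}{ms:03d}"


def to_srt(segments):
    """Render (start, end, text) tuples as numbered SRT cues."""
    cues = [
        f"{i}\n{timestamp(start, ',')} --> {timestamp(end, ',')}\n{text}\n"
        for i, (start, end, text) in enumerate(segments, start=1)
    ]
    return "\n".join(cues)


def to_vtt(segments):
    """Render (start, end, text) tuples as WebVTT cues after the file header."""
    cues = [
        f"{timestamp(start, '.')} --> {timestamp(end, '.')}\n{text}\n"
        for start, end, text in segments
    ]
    return "WEBVTT\n\n" + "\n".join(cues)


# Made-up segments, with times in seconds.
segments = [(0.0, 2.5, "Hello"), (2.5, 4.0, "World")]
print(to_srt(segments))
print(to_vtt(segments))
```

Both formats need start and end times for each cue, so subtitle export depends on having that timing information available.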
And if we want to export this transcript as SRT or VTT, we need to have opted in to subtitles when transcribing. So let's go back to my YouTube video, which should now be finished, as we can see right here. If I open this up, in the top right I can click Export, and here we can download it as SRT or VTT. And that is how to transcribe your content.

If we go back, the last thing I want to show you is Scribe Realtime v2, which is the fastest, most accurate transcription model in the world. If I click Try the demo and then click Transcribe, we are now talking on camera and, as you can see, it's transcribing what I'm saying in real time, straight into ElevenLabs. This is a transcription API that you can connect to any product or tool that you're building to get real-time transcription for your live content.

If you have any questions about how to transcribe your audio or video with ElevenLabs, let us know in the comment section down below. And if you enjoyed this video and want to see more, please hit that like button and don't forget to subscribe. Thanks for watching.