Maximizing YouTube's Automatic Captions for Accurate Transcriptions
Discover how to leverage YouTube's automatic captions for transcription tasks. Learn tips for optimizing audio quality and using Tunes to Tube for audio files.
File
Semi-Automatic YouTube Transcription
Added on 09/06/2024

Speaker 1: Hi there, my fellow linguists. Recently I've searched for and tested automatic transcription software, including the top players in the field, such as Nuance, Speechmatics and others. However, the results turned out to be grossly disappointing, even though I was testing a simple audio file in English. Then I discovered the unexpected prowess of YouTube's automatic captions, which is surprising because most of this software is based on the same Google API. Anyway, I hope this short tutorial will be useful for those of you who, like me, are facing the thankless job of transcribing hours of audio in various languages.
So, for starters, let's go over some basics. Automatic captions on YouTube are available in English, Dutch, French, German, Italian, Japanese, Korean, Portuguese, Russian and Spanish. A word of warning though: the captions won't work if, for example, the audio is too complex to process, if the language in the video is not yet supported, obviously, or if the video is too long, although in my experience it worked for longer files, like 30 minutes, 60 minutes, etc. The video should have good sound quality. If there is a long period of silence at the beginning of the video, make sure to chop it off, otherwise the captions are not going to appear. Overlapping speech from multiple speakers can also be a problem, although in my experience the automatic captions worked perfectly well with two speakers.
Now, since YouTube only processes video, what do you do if you have just audio files? Here is a simple solution: the Tunes to Tube service, where you can combine any picture with your mp3 file to upload a video to YouTube. Let's see. So you sign in with Google and allow. Good. So now you can upload any files you want, like an image and an mp3. Okay, so here I've uploaded a random mp3 file I found online and the picture. If you don't have a picture lying around somewhere on your desktop, you can create a background image with this service.
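For readers who would rather prepare the video locally, the same idea (one still picture plus an mp3, with optional trimming of silence at the start) can be sketched with ffmpeg. This is not what the speaker does in the video, which uses the Tunes to Tube service; it is a minimal alternative that assumes ffmpeg is installed, and the function name and file names are hypothetical.

```python
import shlex


def build_still_video_command(image: str, audio: str, output: str,
                              trim_leading_silence: bool = False) -> list[str]:
    """Build an ffmpeg argument list that pairs one still image with an audio file.

    Assumption: ffmpeg is available on PATH. Nothing is executed here;
    the caller decides whether to run the command.
    """
    cmd = [
        "ffmpeg",
        "-loop", "1",      # repeat the single image for the whole video
        "-i", image,
        "-i", audio,
    ]
    if trim_leading_silence:
        # silenceremove drops audio at the start until it rises above the threshold
        cmd += ["-af", "silenceremove=start_periods=1:start_threshold=-50dB"]
    cmd += [
        "-c:v", "libx264",
        "-tune", "stillimage",
        "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",       # stop encoding when the audio ends
        output,
    ]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, printed as a shell command for inspection.
    print(shlex.join(build_still_video_command(
        "cover.jpg", "interview.mp3", "interview.mp4",
        trim_leading_silence=True)))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to produce the file, which is then uploaded to YouTube like any other video.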
Then you have to verify that you're not a robot and create the video for your YouTube account. It might take some time. And here we go. So after a couple of minutes your video should be online, so you just refresh the page. And don't forget to then go to Video Manager and change the video from public to private. Let's face it, for data protection you need to be the only one who can view it. For those of you who would like to transcribe a video, it's even easier. You just upload the video from inside your account and you're done.
The last step is just waiting, because once the video is uploaded, automatic captions are added, but it might take some time. In my experience, for example, this 8-minute video took maybe up to 5 minutes. I have also tried with 10-12 minute videos and those took up to an hour. So basically it depends. So you go into Settings, choose the auto-generated subtitles and then see what happens. You can see it's pretty great. So while Speechmatics and Nuance gave me 20 to 30 percent correct transcriptions, YouTube sometimes went up to 70 percent. In this case you can see it's nearly 100%.
Now what do you do next? You choose the subtitles option and they usually ask you to set the language that is most spoken in the video, which I did here. And you can see the automatic captions are ready for you to go through. You do that by clicking on Edit, and the best part is that it's time-aligned. So it's really easy to just go over and see if that was correct. So then you try to find the place where the captions were not right and just edit that. For example, here probably he meant Ray as a surname. So you would change that and you would go like this till the end. Then you can save your changes and here you're going to have your semi-automatic kind of transcription, which you can just save and download. You can download it in different formats and then easily open it in Microsoft Word. So here is how it looks, and it's very nice and handy. So good luck with that and all the best. Bye.
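If the captions are downloaded in a time-aligned format, the cue numbers and timestamps usually have to be removed before the text reads like a normal transcript in a word processor. The sketch below assumes an SRT-style file (a counter line, a timing line, then the caption text, with blank lines between cues); the function name is hypothetical and other caption formats would need different handling.

```python
import re

# Matches an SRT timing line such as "00:00:02,500 --> 00:00:05,000"
TIMING_LINE = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_text(srt: str) -> str:
    """Return the caption text of an SRT-style string as one running paragraph."""
    pieces = []
    # Cues are separated by blank lines
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        # Drop the cue counter if the block starts with one
        if lines and lines[0].isdigit():
            lines = lines[1:]
        # Drop the timing line that follows the counter
        if lines and TIMING_LINE.match(lines[0]):
            lines = lines[1:]
        pieces.extend(lines)
    return " ".join(pieces)


if __name__ == "__main__":
    sample = (
        "1\n00:00:00,000 --> 00:00:02,500\nHi there, my fellow\nlinguists.\n\n"
        "2\n00:00:02,500 --> 00:00:05,000\nRecently I tested software.\n"
    )
    print(srt_to_text(sample))
```

The plain text can then be pasted into a document, while the original caption file keeps the timestamps in case they are needed later.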

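The speaker quotes accuracy figures (20 to 30 percent, up to 70 percent, nearly 100%) without saying how they were measured. One common way to put a number on this is word-level accuracy: one minus the word edit distance between the automatic captions and a corrected reference, divided by the reference length. The sketch below is an assumption about how such a figure could be computed, not the speaker's method, and the function name is hypothetical.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = 1 - (word-level edit distance / number of reference words).

    The comparison is case-insensitive and splits on whitespace only,
    so punctuation attached to a word counts as part of that word.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Levenshtein distance over words, keeping one row of the table at a time
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from the captions
                            curr[j - 1] + 1,      # extra word in the captions
                            prev[j - 1] + cost))  # match or substitution
        prev = curr

    # Clamp at zero: very noisy captions can contain more errors than reference words
    return max(0.0, 1.0 - prev[-1] / len(ref))


if __name__ == "__main__":
    corrected = "here probably he meant Ray as a surname"
    automatic = "here probably he meant rey as a surname"
    print(f"{word_accuracy(corrected, automatic):.1%}")
```

Comparing the edited captions against the automatic ones in this way gives a rough percentage for a given recording.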