Unlock Speech-to-Text with Whisper in Google Colab
Learn to transcribe and translate audio using Whisper AI. No coding skills required; just follow along on Google Colab for free access to powerful AI tools.
TURN ANY SPEECH INTO TEXT WITH AI (GOOGLE COLAB) 3 MINUTE WHISPER AI TUTORIAL
Added on 01/29/2025

Speaker 1: Hello everyone and welcome to the coding branch. Today we'll be learning how to use a speech-to-text AI tool called Whisper. Whisper is incredibly powerful because you can transcribe an audio file into a readable text file and not only that but it's trained on over 95 languages and you can translate as well. All of that for four easy payments of $0.

But first, why did the Java developer quit his job? Because he didn't get a raise. Alright, now that the crappy programmer jokes are out of the way, if you enjoy learning how to use AI tools, subscribe to the channel. I'll be breaking down how to use all of the latest AI tools so you can build the next Skynet.

I'll be hosting the code in this video on Google Colab. Google Colab is extremely handy because you can run Python code in the cloud for free and since I'll be setting everything up for you, you don't even have to know how to code. I'll leave a link in the description below so you can get access to it. If you would like to further read into this, I'll also leave a link in the description below for the GitHub that explains what I'm about to show you and more. I've created a simple script here that today is going to enable us to upload an audio file and read it as a text file. So we'll get a good understanding of the transcribed version of the AI. In a later video, I'll show you how to use the translate portion of it.

Let's start by going to the Google Colab. Then we'll go up here to the top left-hand corner, hit runtime, hit change runtime type, and make sure you have GPU chosen. If not, you'll probably get an error. Once you have that, it's as easy as clicking play to install the packages.

Then we're on to the hardest part and that is finding an audio file to upload. For this, I actually read Satoshi Nakamoto's announcement email of Bitcoin. If you haven't heard of Bitcoin, you've probably been living under a rock, but if you'd like to see some Bitcoin-related programming on the channel, let me know in the comments down below. But for now, I'm just going to upload the audio file. Once you've got your file uploaded, you can either type the directory in here just after the word whisper, or to make it easy on you, you can simply right-click on the file you just uploaded and hit copy path. Then just paste it here.

Once you're past the hardest part, just hit play again and voila. Let the AI do some magic and once it's done, you can check back into this side panel where you can see it's generated a bunch of files, including a JSON, an SRT, a TSV, a TXT, and a VTT. If you'd like to just read the file you just created, click on the TXT file. To download it, just right-click on the file and hit download. Either way, you'll see the AI did an excellent job at transcribing the entire paper, brought to you in part by Satoshi Nakamoto.

It's that easy. If you have any questions about the process, let me know in the comments down below. And if you enjoy this type of shorthanded content, leave a like and don't forget to subscribe on your way out. Stay foolish and stay hungry.
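
The notebook itself is not shown in the transcript, so the following is only a rough Python sketch of the steps described above, not the presenter's actual script. It assumes the open-source `openai-whisper` package (installed in a Colab cell with something like `!pip install -U openai-whisper`) and its `whisper` command-line tool with the `--model` and `--output_dir` options. The helper names, the `base` model choice, and the audio path are hypothetical and used only for illustration. The output file names assume a recent Whisper release that names each output after the audio file's stem; older releases may keep the original extension in the name.

```python
import shlex
from pathlib import Path

# The five formats the video shows appearing in the Colab side panel.
OUTPUT_FORMATS = ("json", "srt", "tsv", "txt", "vtt")


def gpu_available():
    """Report whether PyTorch can see a CUDA GPU (False if torch is missing)."""
    try:
        import torch  # preinstalled on Colab; may be absent elsewhere
    except ImportError:
        return False
    return torch.cuda.is_available()


def build_whisper_command(audio_path, model="base", output_dir="."):
    """Build the argument list for the `whisper` command-line tool.

    `audio_path` is the value pasted from Colab's "Copy path" menu entry.
    """
    return ["whisper", str(audio_path), "--model", model,
            "--output_dir", str(output_dir)]


def expected_outputs(audio_path, output_dir="."):
    """List the files Whisper is expected to write for one audio file.

    Assumption: outputs are named <audio stem>.<format>.
    """
    stem = Path(audio_path).stem
    return [Path(output_dir) / f"{stem}.{ext}" for ext in OUTPUT_FORMATS]


if __name__ == "__main__":
    audio = "/content/bitcoin_announcement.mp3"  # hypothetical uploaded file
    if not gpu_available():
        print("No GPU detected: check Runtime > Change runtime type.")
    # In a Colab cell, the printed command would be run with a leading "!".
    print(shlex.join(build_whisper_command(audio, output_dir="/content")))
    for path in expected_outputs(audio, "/content"):
        print(path)
```

Building the command as a list and printing it with `shlex.join` keeps paths that contain spaces correctly quoted, which is an easy thing to get wrong when pasting a copied path by hand.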
