Install Whisper.ai on Google Drive via Colab Tutorial

Learn to install Whisper.ai for transcribing audio via Google Colaboratory. Follow the steps for setup, uploading files, and exporting the transcript in different formats.

Whisper AI on Google Colaboratory: How to Install and Use It Online

Added on 01/29/2025

Speaker 1: In this video, I'm going to show you how you can install Whisper.ai on your Google Drive with the help of Google Colaboratory. So let's get started. Before starting the tutorial, I'm going to explain what Whisper.ai is. If you already know what Whisper.ai is, then skip to this timestamp to begin the tutorial.

So Whisper.ai is an advanced technology that can convert speech into text. It is developed by OpenAI, a company that specializes in artificial intelligence. Whisper.ai works by taking an audio file and splitting it into 30-second chunks, which are then converted into a log-Mel spectrogram. This data is then fed into an encoder, which uses a machine learning algorithm to convert the audio data into text. As of now, Whisper.ai can recognize almost 99 languages, including English. It is completely open source, meaning that anyone can use it or modify it for their own purposes.

So the first thing you want to do is open up your Google Drive. If you don't have a Gmail account, you can create a new one. Once you are in your Google Drive, click on this New button, click on More, and then click on Connect more apps. In here, search for Google Colaboratory and click on the first option that says Colaboratory. Now click on Install, click on Continue, and after installation it will say Google Colaboratory was connected to Google Drive. Click on OK. Now you can close this window and click on that New button again. Go to More, and inside there you will see Google Colaboratory. Click on it and it will redirect you to Google Colab.

At first glance, Google Colab can seem overwhelming, but do not worry, it is not as hard as it looks. So before installing Whisper.ai, there are two things you need to do. First, click on Runtime and then choose Change runtime type. Select GPU in here and click on Save. Once that's done, rename your Google Colab file so that you can find it again easily.
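As a rough illustration of the 30-second chunking mentioned above, the sketch below computes where each chunk would start and end for a recording of a given length. This is a simplified, fixed-boundary picture and not Whisper's actual code; the function name `chunk_boundaries` is made up for this example. In the real model, a final chunk shorter than 30 seconds is padded before the spectrogram is computed.

```python
def chunk_boundaries(total_seconds, chunk_seconds=30.0):
    """Return (start, end) times in seconds for consecutive fixed-size chunks.

    Hypothetical helper for illustration only; it is not part of Whisper.
    """
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and chunk_seconds > 0")
    boundaries = []
    start = 0.0
    while start < total_seconds:
        # The last chunk may be shorter than chunk_seconds.
        end = min(start + chunk_seconds, total_seconds)
        boundaries.append((start, end))
        start += chunk_seconds
    return boundaries


# A 75-second recording yields two full chunks and one 15-second remainder.
for start, end in chunk_boundaries(75.0):
    print(f"{start:5.1f}s to {end:5.1f}s")
```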
Now if you go down to the description of this video, you will find three lines of code. Copy the first two lines of code and paste them here. What these lines do is this: the first command installs Whisper.ai from GitHub, and the second checks for updates and then installs a package named ffmpeg. Once you have copied and pasted them, click on this play button right here. It will begin installing Whisper.ai on your Google Colab file. It will take around 20 to 25 seconds. My process was done within 19 seconds.

So after you install Whisper.ai, click on this folder icon right here, and in here you can upload your audio files. I'm going to upload this test audio file that I have recorded. It will give a warning saying this runtime's files will be deleted when this runtime is terminated. That means whenever you close this Colaboratory window, it will cancel out and reset everything to its default, so you have to re-upload your files and rerun all this code. So click on OK.

Once your file is uploaded, click on this plus Code button, and in here copy the third line from the description and paste it in this section within the colons. Write your audio file name with the extension. What this line of code does is call Whisper.ai, give it this audio file, and tell it to use the medium model to transcribe this audio file into text and subtitle formats.

The reason I used the medium model is that there are a total of five models in Whisper.ai: tiny, base, small, medium, and large. The higher you go, the more time it will take to transcribe or translate your audio files, and the model also increases in size. From my experience, and I have seen a lot of other people use the same medium model, it is fast and yet really accurate.
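The three lines from the video description are not included in this transcript. For the install step, the Whisper README documents `pip install git+https://github.com/openai/whisper.git`, which matches the "install from GitHub" step described here, though whether the video uses exactly that line is an assumption. The sketch below only shows how a transcription command with a chosen model could be put together. It assumes the `whisper` command-line tool and its `--model` option; the helper name `build_transcribe_command` and the file name are made up for this example.

```python
import shlex

# The five model sizes named in the video, smallest to largest.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def build_transcribe_command(audio_file, model="medium"):
    """Build the argument list for a Whisper CLI call (hypothetical helper)."""
    if model not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {model!r}")
    return ["whisper", audio_file, "--model", model]


# shlex.join quotes the file name if it contains spaces.
command = build_transcribe_command("test audio.mp3")
print(shlex.join(command))
```

In a Colab cell, the printed line could be run by putting an exclamation mark in front of it.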
In this command, what you can also do to make it work a little faster is this: if your audio files are in English, then after the model name medium you can put a dot and then en, referring to the English-only model. If your audio recordings are in different languages like Spanish, Portuguese, French, or any other language, then make sure to leave it as just the medium model. Once you've done that, click on this run button right here. Now it will download the medium English model and start transcribing the audio file. This process does take some time, depending on your audio length and the model you use.

So the transcribing process for my audio file is done, and it matches completely. Now, before closing the Colab, on the left-hand side you can see that it has produced some files. All of these are different types of files that it has transcribed and exported from my audio file. If you want to use this as your video subtitles, then you can download the SRT or the VTT file, or if you just want to export it as JSON, TSV, or plain text, you can also download those by clicking on these three dots and choosing the Download option. You can see that it has transcribed my audio file and also set the timestamps of where my text should be seen on the video.

Now, before you close this, make sure to download the files that you need. And always remember: if you want to upload another file, you can always do so, but if you close this, then you have to rerun the commands you ran at the start, because it resets everything to its defaults.

Thank you for watching this video. If this video has helped you, then make sure to leave a like and subscribe to the channel. If you have any questions, leave them in the comment section and I'll try my best to reply as fast as I can. And again, thank you for watching.
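To make the model naming and the exported files a little more concrete, here is a small sketch. It assumes that the English-only variants exist for the tiny, base, small, and medium sizes (the large model has none), and that each output file is named after the audio file with a different extension. The helper names `model_name` and `expected_outputs` are made up for this example.

```python
from pathlib import Path

# Sizes that have an English-only ".en" variant; "large" does not.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}

# The five export formats mentioned in the video.
OUTPUT_EXTENSIONS = ("txt", "vtt", "srt", "tsv", "json")


def model_name(size, english_only):
    """Return e.g. 'medium.en' for English audio, or the plain size otherwise."""
    if english_only and size in ENGLISH_ONLY_SIZES:
        return f"{size}.en"
    return size


def expected_outputs(audio_file):
    """List the output file names assumed to appear next to the audio file."""
    stem = Path(audio_file).stem
    return [f"{stem}.{ext}" for ext in OUTPUT_EXTENSIONS]


print(model_name("medium", english_only=True))
print(expected_outputs("test.mp3"))
```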