Easy Audio Transcription with Whisper AI Tutorial (Full Transcript)

Learn to transcribe audio using Whisper AI on Google Colab, effortlessly and efficiently, without straining your computer. Step-by-step guide included!

Speaker 1: In this video we are going to transcribe an audio track into text using artificial intelligence. Maybe you need to go back over a lesson, transcribe the audio track of your podcast, or create subtitles for your video. For all these uses, the tool we will talk about in this video is perfect. Let's talk about Whisper AI, created by a company you may know, OpenAI. Have you ever heard of it? In this case we will use Whisper AI in a very lightweight way: we will not install it as an application on the computer, because that weighs the computer down and requires a good processor. Instead we will run it on Google Colab, in the cloud, without using our own processor. But don't worry, I will show you step by step how to do this. It is very simple, so I will share the screen and we will do it together. The tool is free and open source, so you can use it indefinitely.
I am Igor Papos. On this channel I often create practical tutorials like this: AI guides, digital marketing tips, and other material for creators, professionals and entrepreneurs. Subscribe to the channel to stay updated, and of course I also leave you the links to my Telegram channel and social media, so take a look. Without further ado, let's share the screen and go through the guide step by step.
Friends, first of all, if you don't have a Google account, create one; it's free. If you already have a Google account, go to Drive, then New, More, Connect more apps, find Google Colaboratory, this one, and install it. Once this is done, if you go to New, More, you can open this application, which is now installed, in a kind of new workspace, ok? Perfect. So here, don't be scared: it looks complex, but you will see that it is fairly simple.
We give this file a new name, something like "audio file transcription", just to keep track of it. After that, to make this workspace faster, let's go here to Runtime, Change runtime type, and get some help from a GPU, that is, the hardware accelerator, and save, ok? At this point we are ready to install Whisper, though not on our local computer: we will do it on Colab.
Now let's go get some code, which I'll leave you later. Here it is, I paste this line in here. Basically, what does it do? It takes Whisper directly from GitHub, and then it also installs FFmpeg, which basically handles the audio, and even video, files for the transcription, if I'm not mistaken. As I said, this is not installed locally but directly on Colab, and at this point all we have to do is press Play, Run, here, and let it run for a moment. How long it takes depends on the speed of both the connection and the GPU it runs on, but it doesn't take very long, just a few seconds. Ok, here we are, it finished my transcription, sorry, my installation, and it took 1 minute and 24 seconds, so a little more than a minute.
Ok, at this point, let's go here to the folder icon, where we can upload our audio track. You can do it with drag and drop or, alternatively, upload it directly from here. I have my clip here. Ok, it warns that uploaded files will be deleted when the runtime ends, and ok, I upload it. Perfect, my audio track is loaded, so imagine your podcast episode, your university lecture that you want to transcribe, or anything else.
At this point, I need to add another small line of code so that the audio can actually be transcribed. I insert this piece of code here, which I also leave in the description. This is what makes the transcription happen. Pay attention to one thing: as you can see, here, between the quotes, there is the file name.
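For reference, the two cells described here might look like the sketch below. The exact lines used in the video are in its description, not in this transcript, so the commands are assumptions based on the commonly used Whisper setup (pip install from the GitHub repository, FFmpeg via apt-get, then the `whisper` command with `--model`). The helper names `build_transcribe_command` and `as_colab_line` and the file name `my clip.mp3` are hypothetical.

```python
import shlex

# Setup commands typically run once in a Colab cell.
# Assumed, not taken from the video: the exact line is not in the transcript.
SETUP_COMMANDS = [
    ["pip", "install", "git+https://github.com/openai/whisper.git"],
    ["apt-get", "install", "-y", "ffmpeg"],
]


def build_transcribe_command(audio_file, model="medium"):
    """Return the whisper command-line invocation for one audio file."""
    return ["whisper", audio_file, "--model", model]


def as_colab_line(command):
    """Render a command as a Colab cell line; the leading '!' runs it in the shell."""
    # shlex.join quotes arguments that contain spaces, such as file names.
    return "!" + shlex.join(command)


if __name__ == "__main__":
    # Print the lines you would paste into Colab cells.
    for command in SETUP_COMMANDS:
        print(as_colab_line(command))
    print(as_colab_line(build_transcribe_command("my clip.mp3")))
```

Building the command as a list and quoting it at the end keeps a file name with spaces inside a single pair of quotes, which is the detail the speaker points out.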
In my case, the file is called Digital Marketing Clip.mp3, so I put that here, but if your file is called Pippo, you will have to write Pippo.mp3. As for the model, there are several you can choose, ranging from the lightest to the heaviest. The lighter the model, the less precise it is; the heavier the model, the more precise, but also the heavier the processing. In this case we leave Medium, which has always worked quite well for me, ok?
At this point, we can start the transcription. So we press Run here and wait for the processing. By the way, while it's working, you see that it has already identified the language automatically, so there is no need to specify it in this case. There are also ways to get the transcription in another language: you upload audio in Italian, say, and it transcribes it but gives you the text translated into English. It doesn't dub the audio, of course; it translates the text. Ok, ok. Here it is: the processing, the transcription, is finished.
At this point it should also have generated some files next to the audio. You see, I refreshed because this section had not updated. There are the TXT files, but also the SRT files for subtitles, and I can download all of them directly from here. I'll show you what they look like. I go here, Download. In this case I'm downloading the TXT, and I open it. Let's have a look. So, here: "What are currently the most requested figures from companies in the field of digital marketing? There are several acronyms out there, which are in quotes, we understand well, however, what are the key figures that companies are actually looking for and that companies are willing to pay more for. We'll talk about it in this video."
So, I want you to notice something. Usually, transcriptions come out a little rough in terms of punctuation, pauses, and so on. Here, however, everything is perfect.
Even the spacing, for example, I'll show you: the space after a comma, the space between a period and a new sentence, the precision with the commas and the punctuation. Really, in my opinion, this is the most effective transcription system you can find on the market. Well, friends, I hope this tutorial has been useful to you. I also leave you another video that is not about audio transcription, but about transcribing and organizing voice notes with another very interesting tool. I leave you the video here. Subscribe, of course, to my channel to stay updated. You will find all the links to the newsletter, Telegram and my social media. See you next time. Bye.
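The model choice, the translation option, and the SRT subtitle output mentioned in the walkthrough can also be driven from Python instead of the command line. The following is a minimal sketch, assuming the openai-whisper package is installed in the runtime; it is not the code used in the video. The helper names `srt_timestamp`, `segments_to_srt` and `transcribe_to_srt` are hypothetical.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Turn segments (dicts with 'start', 'end', 'text') into SRT subtitle text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = srt_timestamp(segment["start"])
        end = srt_timestamp(segment["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


def transcribe_to_srt(audio_path, model_name="medium", translate=False):
    """Transcribe (or translate into English) an audio file and return SRT text."""
    # Imported here so the helpers above work even without the package installed.
    import whisper

    # Model names range from light to heavy, e.g. "tiny", "base", "small",
    # "medium", "large"; heavier models are slower but usually more accurate.
    model = whisper.load_model(model_name)
    task = "translate" if translate else "transcribe"
    result = model.transcribe(audio_path, task=task)
    return segments_to_srt(result["segments"])
```

A call such as `transcribe_to_srt("my_clip.mp3", translate=True)`, where the file name is a placeholder, would return English subtitle text that can be saved with an `.srt` extension.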
