Install Whisper Large Turbo Model Offline on Windows
Learn to install OpenAI's Whisper Large Turbo model offline using Node.js and npm on Windows, enabling transcription in the browser with a simple setup.
Whisper Turbo on Windows - Easy Tutorial for AI Transcription Locally
Added on 01/29/2025

Speaker 1: Hi everyone, and welcome to the channel. In this video, we are continuing our coverage of the newly released OpenAI Whisper Large Turbo model, which is also available on Hugging Face. We are going to install it locally on the Windows operating system, running it totally offline in the browser, where you can upload your own files and, as I showed you in another video, even do live transcription. I will walk you through a step-by-step tutorial on how to do that. If you are interested in learning how to download and install it fully on your local system through the API or through code, or how to run it in the browser on Linux, please search my channel; I have done heaps of videos on it in the last two days, plus I have also covered various other Whisper models. Now, if you don't know what the Whisper model is: Whisper, as this page mentions, is a model for automatic speech recognition, or ASR, and it is also quite good at speech translation. You can also do real-time live transcription, as I showed you in another video. This newly released model is a fine-tuned version of the previous model with a slight degradation in quality, but because of its smaller size it is quite performant and fast. You can run it in your browser with the help of Transformers.js and achieve a roughly 10x real-time factor, which means you can transcribe 120 seconds of audio in around 12 seconds or so. So how does it work? It primarily uses the ONNX format for the model. ONNX stands for Open Neural Network Exchange, and it is an open standard format for representing trained models so they are interoperable and optimized for edge devices, CPUs, and commodity hardware. And of course, you can't download and load mega-heavy models into your browser; you need something lightweight. That is why, if you go to the Files and versions tab of the ONNX version on Hugging Face, you will see that the model files are quite small. Whereas if you go to the original model and check its Files and versions, you will see that the tensors are around 1.62 GB, here no file is more than a few hundred MB; there is nothing in the GBs, which is quite good. If you go into the ONNX folder, again, everything is in MBs, and there are various variants of it: if you want, you can go with a higher one, but you don't have to, and I believe this browser demo just uses the 200 or 300 MB file from this list. And that is what we are going to see shortly in our video. Before I do that, let me introduce you to the sponsor of the video, AgentQL. It is a structured query language that enables you to do a lot of good things. For example, you can turn any web page into a data source with its Python SDK, and you can also use their live debugging tool to scrape and interact with web content. It works on any page, it is quite resilient and reusable, and it structures the output according to the shape of your own queries. It is quite a robust alternative to fragile XPath and DOM/CSS selectors, so do check them out. Okay, with that said and done, let's go and try to get this thing installed. There are two prerequisites, by the way, which you need to meet in order to install this Whisper Large v3 Turbo model.
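[Editor's note] To make the browser pipeline described above concrete, here is a minimal sketch of loading the ONNX build of the model with Transformers.js. The package name (@xenova/transformers) and the model id (onnx-community/whisper-large-v3-turbo) are assumptions based on the description in the video, not identifiers confirmed on screen, so check the repo you clone for the exact names it uses.

```javascript
// Minimal sketch: transcribe audio in the browser with Transformers.js.
// Assumed: the @xenova/transformers package and the
// onnx-community/whisper-large-v3-turbo ONNX checkpoint.
import { pipeline } from '@xenova/transformers';

// Downloads the small ONNX weights once and caches them in the browser,
// so subsequent runs work without re-downloading.
const transcriber = await pipeline(
  'automatic-speech-recognition',
  'onnx-community/whisper-large-v3-turbo',
);

// Accepts a URL (or a Float32Array of PCM samples) and returns the transcript.
const output = await transcriber('https://example.com/sample.wav', {
  chunk_length_s: 30,        // process long audio in 30-second chunks
  return_timestamps: true,   // include per-segment timestamps
});

console.log(output.text);
```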
First and foremost, you should have Node and npm installed, because behind the scenes it is using Transformers.js to run in your browser through JavaScript; that is why you need Node and npm. To get them installed, just go to nodejs.org. On the download page, click the green button, choose the prebuilt installer, and then click download; it will download to your local system. Let's wait for it. If you click on the top right, you will see that the MSI has already been downloaded. I have clicked on it; just click next, and then the usual Windows stuff where you click next, next and next. I'm just going with all the default options. It not only installs Node but also installs npm, and that is all we need to do. And that is done. Here it tells you what it is doing; just click on it and press any key on your keyboard. It is installing the additional tools here, which shouldn't take too long, so let's wait for it to finish. Node is installed. Next up, you need to make sure that you have installed Git. If you don't know how to do it, just search for Git for Windows, download it, and again, next, next, next; it will install Git for you. Next, let's open PowerShell, or maybe, I think, the best way is to open it as an administrator so that we won't encounter any issues. Run as administrator, and that is done. Okay, let me deactivate my previous conda environment and clear the screen. Let's create a folder on our desktop. Okay, so I'm now in this webgpu directory, which I have just created on my desktop. Next, let's git clone this whisper-web GitHub repo. It shouldn't take too long. That is done; let's cd into it, and now we are in the whisper-web directory. Here we need to perform the installation with Node, so before that, make sure Node is installed by running node --version and npm --version. And, of course, we have already seen that Git is working, so this is my git version. Let's clear the screen, and now let's use npm to install everything from this repo. Let me run it and wait; it is going to take a bit of time. All the packages are installed. Next, let's run it with npm run dev, and you can see that the application is now running on my localhost. Let's go to the Chrome browser and open it. From here, as you can see, you can give it a URL to any audio file, upload a file from your local system, or even record. So, for example, if I go to From file, play this gfk1 file and click on Transcribe Audio, it transcribes it. There you go, how quick that is. Very quick and very easy. Similarly, you can provide a URL or, as I said, record from your microphone. And as we saw earlier in my other video, you can export the result as text and as JSON. Also, if you want, you can record by clicking on Record and then Start Recording. I'm not sure if my microphone is detected because I'm on a VM, but let me try in front of you. That's not doing anything, because my microphone is attached to my local laptop, not to the VM, and that is where I'm recording this video. But if you are running it on your local system, this should work natively, as I already showed you in my other video, which I just did today. I'll just click here.
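[Editor's note] For reference, here are the command-line steps narrated above, collected into one PowerShell snippet. The repository URL (https://github.com/xenova/whisper-web) is an assumption based on the "whisper-web" name mentioned in the video, so substitute the URL shown on screen if it differs.

```powershell
# Verify the prerequisites installed earlier
node --version
npm --version
git --version

# Create a working folder on the desktop and move into it
cd ~\Desktop
mkdir webgpu
cd webgpu

# Clone the whisper-web repo (URL assumed; use the one shown in the video)
git clone https://github.com/xenova/whisper-web.git
cd whisper-web

# Install the dependencies and start the dev server
npm install
npm run dev
# Then open the localhost URL printed in the terminal in your browser
```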
I think it is not published yet, so it will be published in 15 minutes; there I have done the live transcription with the microphone, so you should be able to see it shortly. So that's it, guys. I hope you enjoyed it. Let me know what you think. If you like the content, please consider subscribing to the channel, and if you're already subscribed, please share it with your network, as it helps a lot. Thank you for watching.
