Explore OpenAI Whisper API and Build a Web App
Learn about OpenAI's Whisper API, create a web interface, and deploy it on Replit to build transcription and translation apps efficiently.
File
OpenAI Whisper Tutorial Audio to Text Translator Website Project
Added on 01/29/2025

Speaker 1: In today's video, I will tell you about OpenAI's Whisper API and show you how it works. After that, we are going to build a web interface for it on Replit, so that you can see these APIs in action. So let's go to the computer screen and get started.

Intro. Guys, before moving forward, I want to tell you about Replit Bounties. If you go to the Bounties section on Replit, you will find a lot of projects that you can complete to earn bounties. Do check them out; it can be very exciting to connect with people, and you can earn Cycles, which translate to real money.

Alright guys, I am on my computer screen now, and I am going to go to Replit.com. After logging in, I will create a new Repl. I am using Replit because it makes things very easy for me. I will click on Create Repl, choose a Python template, and name it OpenAI Whisper. Before we talk about Whisper itself, I would like to take you to OpenAI and log in there, so that you can clearly understand what OpenAI offers and why we should use it.

I have logged in to my OpenAI account, and the first thing I am going to do is create an API key. I will name it Whisper, create the secret key, and copy it. One thing I would like to tell you: this particular key will not work for you, so don't try to use mine. You will have to upgrade your own account, and that should not be a problem, because I don't think $4 or $5 is too much if you actually want to learn AI and build things. Now I will keep my secret key safe. As soon as I try to paste it into Replit, I am prompted to press Tab to store it as a secret, so I press Tab and give it a name, OpenAI key. It is now saved as a secret, and whenever I want to use it, I can. Replit also inserts a snippet showing how to read it: import os, and then my_secret = os.environ[...] with the secret's name. That is how I can use the environment variable.

I would like to repeat that to use OpenAI's hosted API you will have to upgrade to a paid account. Yes, I know I used to say you don't need to, back when they gave free trial credits, but the trials are mostly over now. I don't think $5, which is around 300 to 400 rupees, will be a problem, so do upgrade, because what we are going to build is seriously powerful. That said, the OpenAI API key is not needed for the open-source Whisper itself; we only need it if we use ChatGPT or the other hosted APIs. So now I will search for "Whisper OpenAI GitHub", and you can see the repository here: github.com/openai/whisper.
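As an aside, the snippet Replit inserts for reading a stored secret looks roughly like this; the secret name OPENAI_KEY is just illustrative, so use whatever name you gave your own secret.

    import os

    # Read the OpenAI API key that was saved as a Replit secret (an environment variable).
    # "OPENAI_KEY" is a placeholder name; substitute the name of your own secret.
    my_secret = os.environ['OPENAI_KEY']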
Now, their documentation is very straightforward in my opinion, and Whisper is such a powerful thing that you will also wonder why you didn't know about it before. First of all, let me show you a quick demo. I will copy the Python usage example from the README and paste it here, and before that we will install the package with pip, step by step. So I will install it from GitHub: I come to my shell, paste the command, and hit Enter, and pip starts installing Whisper.

While that installs, what is OpenAI Whisper? It is a model for transcription and translation. You have some audio and you want its transcript, that is, the text of whatever was said; Whisper can give you that. Along with that, it can translate speech into another language. Suppose I say in Hindi, "Whisper is an automatic speech recognition system"; Whisper can turn that into English. It is so well trained that it understands which words to render how in English, and the output does not feel like machine-translated text the way Google Translate sometimes does. Whisper can genuinely amaze you, and that is what I am going to show you in detail today.

The install is taking a while, so in the meantime I am going to record something. I will say a very simple thing: "Hello, my name is Harry. And I am here to tell you about Whisper API. OpenAI Whisper is an amazing thing. And it is something that you will definitely use if you can understand how powerful it is and how to use it." And I will stop it. By the way, this is the sound recorder that comes with Windows by default; if you are on Windows 11, you will know it. It saves the file automatically, and I can locate it with right-click, Show in folder. There it is, Recording 2. I will rename it to audio.mp3 and upload it to Replit. You can of course do the same thing locally, but I am using Replit here. So basically I am going to transcribe and then translate this audio.

Back on the OpenAI site, if I click on API, then Docs, and search for Whisper, I find "Transcribing audio with Whisper". Meanwhile all the packages have finished installing, so Whisper is ready. Next I will also install the openai package.
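The Python usage being copied from the Whisper README looks roughly like this (audio.mp3 is the recording uploaded above):

    import whisper

    # Download and load the small "base" checkpoint, then transcribe the uploaded recording.
    model = whisper.load_model("base")
    result = model.transcribe("audio.mp3")
    print(result["text"])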
If I do pip install openai, the openai package will also be installed. But since I am referring to this repository, I will follow it exactly. So I move the imports to the top, and you can see that I want to transcribe audio.mp3 and print the result. Let's hit Run. Notice that I have done almost nothing: I recorded an audio file and put it here, imported os, imported whisper, loaded Whisper's base model, transcribed the audio with it, and printed the resulting text. Let's see how it performs; my OpenAI API key has not been used at all yet. It complains about a missing dependency, so I do pip install numpy and run it again. Now it downloads the base model of Whisper, but it says "No such file or directory: ffmpeg", because Whisper needs ffmpeg to read the audio. So I simply go to Packages, search for ffmpeg, and install it. Packages in Replit are basically used to add this kind of functionality to your environment; on Windows or Linux you would have to install ffmpeg manually, but here you just install the package. Another way to install ffmpeg on Replit is to type ffmpeg in the shell, select ffmpeg.bin, and hit Enter; it adds ffmpeg to the replit.nix file, installs it for you, and reloads the shell. Let's type ffmpeg in the shell to check, and yes, ffmpeg is working now.

So I come back to main.py and run python3 main.py. If you have a reasonably good laptop, you can do the same thing on it. My Whisper model is running, and again, the OpenAI API key is not required here; we may need it for the ChatGPT API or for GPT-4 or GPT-3, but not for this at all. And here is the output: "Hello, my name is Adi and I am here to tell you about Whisper API. OpenAI Whisper is an amazing thing. And it is something that you will definitely use if you can understand how powerful it is and how to use it." Wow, this is amazing. So this is how you can use Whisper without OpenAI's hosted API.

Now, if you go to the OpenAI documentation and copy the example code there, we can also use the hosted API; and yes, I know I keep repeating that. Let me quickly do pip install openai so that the openai package is installed. Earlier we were using the whisper package, now we are using the openai package. As you can see, I have imported openai and written audio_file = open("audio.mp3", "rb"). That's it.
This audio.mp3 is already sitting here. Then transcript = openai.Audio.transcribe("whisper-1", audio_file), and that is about it; I do not need to do anything else. Let's run it. I have already installed ffmpeg, and this package behaves a little differently from the whisper package, so keep that in mind. Now it says it needs an API key. Why? Because it is actually using OpenAI's hardware; with the old local Whisper model, the OpenAI API was not required. So I import os and set openai.api_key = os.getenv(...) with the name of my secret, the OpenAI key I stored earlier. I also have not printed anything yet, so I print the transcript. Now when I run it, I get my transcript, because my OpenAI key is in my secrets and I did not have to do anything else. So whose hardware are we using? OpenAI's hardware, which is why it asks for the API key. And look at the text: "Hello, my name is Ari and I am here to tell you about Whisper API. OpenAI Whisper is an amazing thing and it is something that you will definitely use if you can understand how powerful it is and how to use it." Wow, how well they have fine-tuned this; it is almost exactly accurate. The local model we used before made a small mistake too, writing something like "my name is Alice", if I am not wrong. The local model runs on our hardware, while this one runs on OpenAI's hardware using your API key, and since OpenAI built and fine-tuned the model, they know its caveats, what is right and what is wrong. They did a great job.

Now here is the really nice part: I can also translate. I can translate into English, so if I say something in Hindi, that audio can be translated into English. Let's try it. I record another clip, saying in Hindi, "Hi, my name is Harry. And today I will teach you how to use OpenAI." That's it. It is saved as Recording 2; I do Show in folder, rename it Hindi.mp3, and upload it to Replit by dropping it here. Then, instead of audio.mp3, I comment that line out (so you can keep both versions of the code), copy the translation example from the docs (we really just need these two lines), and set the audio file to Hindi.mp3, still with the whisper-1 model. Let's see the result. I spoke in Hindi, and it gives me: "Hi, my name is Harry. And today I will teach you how to use OpenAI." That is exactly what I said, translated into English. This is really very impressive.

Now let's use all of this and turn this Repl into a Flask app. To do that, I will create a file named main_flask.py, and if I want, I can use Ghostwriter. I start typing the import: from flask import...
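Before moving on to the Flask app, here is roughly what the two hosted calls above look like with the pre-1.0 openai Python library used in this video; the secret name OPENAI_KEY and the file names are taken from this walkthrough, so adjust them to your setup.

    import os
    import openai

    # The hosted whisper-1 model runs on OpenAI's hardware, so it needs a paid API key.
    openai.api_key = os.getenv("OPENAI_KEY")

    # Transcription: English speech in audio.mp3 -> English text
    with open("audio.mp3", "rb") as audio_file:
        transcript = openai.Audio.transcribe("whisper-1", audio_file)
    print(transcript.text)

    # Translation: Hindi speech in Hindi.mp3 -> English text
    with open("Hindi.mp3", "rb") as hindi_file:
        translation = openai.Audio.translate("whisper-1", hindi_file)
    print(translation.text)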
And a lot of imports will be suggested, but there is a better way. I click on Generate and write: "Give me a minimal Flask template which serves a form that takes an audio file as an input and uploads it to the static folder." Let's see what it produces. Oh wow. It has written from flask import Flask, request, redirect and so on. I will check it rather than just running it and wondering why it does not work; but first I accept the suggestion and see whether the server runs. It does, and the uploaded-file handling is there. I will have to understand this code a little, and I will definitely explain it to you. I rename the old file to old_main.py and name this one main.py, then run python main.py. It opens port 5000 and tells me to use 0.0.0.0 to open the port to web traffic, so I set host="0.0.0.0". You have to do these small things manually; Ghostwriter will not do everything for you, but it has still done a lot. You could ask why I am even sitting here, but I am sitting here to explain all of this to you, and I hope that helps.

Now I choose audio.mp3 and click Upload. It says there is no static directory, so I create one and try again, and yes, "file uploaded successfully"; audio.mp3 appears in static. Let's come to the code and understand how it works. The / endpoint accepts both GET and POST; os is imported and an upload folder is configured. If a POST comes in, the file is uploaded; if a GET comes in, the inline HTML template with the form is returned. You might want to improve this template with Bootstrap and so on, but for now I will leave it as it is. When the method is POST, I grab the file and save it, and instead of redirecting, I can do one more thing: serve the transcript of this file using jsonify. Right now it returns url_for for an uploaded_file route and redirects there, so after "file uploaded successfully" it goes to an uploads/...mp3 URL and shows Not Found; changing the folder name to static just brings me back to the upload page. So I remove that route entirely, and instead, as soon as the file is uploaded, I will transcribe the file referred to by the saved filename with OpenAI. That code is already written in the old file, so let's bring it into main.py. I had imported os twice, so I keep just one import.
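The exact code Ghostwriter generated is not reproduced here, but a minimal upload-to-static Flask template along the lines described above might look like this sketch:

    import os
    from flask import Flask, request

    app = Flask(__name__)
    app.config['UPLOAD_FOLDER'] = 'static'

    # Simple inline form served on GET requests.
    FORM_HTML = '''
    <form method="post" enctype="multipart/form-data">
      <input type="file" name="file">
      <input type="submit" value="Upload">
    </form>
    '''

    @app.route('/', methods=['GET', 'POST'])
    def upload_file():
        if request.method == 'POST':
            f = request.files['file']
            # The static/ folder must already exist, otherwise saving fails (as seen above).
            f.save(os.path.join(app.config['UPLOAD_FOLDER'], f.filename))
            return 'File uploaded successfully'
        return FORM_HTML

    if __name__ == '__main__':
        # host 0.0.0.0 makes the dev server reachable from the web on Replit.
        app.run(host='0.0.0.0', port=5000)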
After doing this, I take the two lines that produce the transcript, and yes, I am doing Hindi to English, not English to Hindi. In main.py, where the return redirect was, I comment it out; I do not want to redirect, I want to return the transcript. So I import jsonify and return jsonify(transcript). That should give me the transcript of the uploaded audio. There is a small problem; a stray return and a leftover triple-quoted template broke the run, so I clean that up (I do not know where that extra return came from), and now we are good to go. I upload Hindi.mp3; the file is uploaded first, and then I wait for the transcript. And here it is: "Hi, my name is Harry. And today I will teach you how to use OpenAI." Amazing. I want you to make this web interface better; I think you can.

Now, what if I want the output in another language? The docs say that the transcriptions and translations endpoints currently support a specific list of languages. So instead of digging through the documentation, I asked Ghostwriter to "change this code to translate to Bengali", and it added a target_language='bn' argument. I press Ctrl+Z, edit the prompt again, and this time accept the suggestion, so the code now sets the target language to Bengali. Let's run it again (after removing yet another stray return) and upload Hindi.mp3. But even with target_language='bn', it still tells me "my name is Hari and I will teach you how to use OpenAI" in English. So let me see what is wrong. Maybe the parameter should be language rather than target_language. Let me try running python old_main.py; it says I have not provided an API key, which is because of how I set the environment variable as a Replit secret, but it will work inside the app. So I change target_language to language and upload my file again; sorry, not my API, my file.
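Putting those changes together, the working Hindi-to-English version of main.py (before the Bengali experiments) looks roughly like this sketch, again assuming the pre-1.0 openai library and an OPENAI_KEY secret:

    import os
    import openai
    from flask import Flask, request, jsonify

    openai.api_key = os.getenv("OPENAI_KEY")

    app = Flask(__name__)
    app.config['UPLOAD_FOLDER'] = 'static'

    FORM_HTML = '''
    <form method="post" enctype="multipart/form-data">
      <input type="file" name="file">
      <input type="submit" value="Upload">
    </form>
    '''

    @app.route('/', methods=['GET', 'POST'])
    def upload_and_transcribe():
        if request.method == 'POST':
            f = request.files['file']
            path = os.path.join(app.config['UPLOAD_FOLDER'], f.filename)
            f.save(path)
            # Speech in any language goes in; English text comes out of the translation endpoint.
            with open(path, "rb") as audio_file:
                transcript = openai.Audio.translate("whisper-1", audio_file)
            return jsonify(transcript)
        return FORM_HTML

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)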
Similarly, as soon as Hindi.mp3 is uploaded, it throws an invalid request error: the translation language value is "not a valid enumeration member; permitted: 'en'". So the translation endpoint only outputs English. If you want Bengali or any other language, it seems you will have to use GPT-4 on top of it. So I keep the Whisper translation as it is, and instead of asking Whisper for Bengali directly, I add a chat completion: response = openai.ChatCompletion.create(...). I go to the API reference, then to the chat completion section (not the plain completions), because I am looking for a Python example of a GPT-4 chat completion. And I will use GPT-4, because if I am not using GPT-4, then what is the point? In the messages list, just as in the docs' translation example, there is a system role and then a user role. I copy the messages, change the example's French to Bengali so the system prompt says you have to translate into Bengali, then add another object with a comma, make its role user, and make its content my transcript. Finally I return jsonify(response).

Let's run it. It says there is a problem; I forgot to write messages= before the list, and the role keys need to be in double quotes. As soon as I fix that and run it, it does run. I open a new tab, choose the Hindi audio again, and upload it; if something goes wrong, we will fix it. Now it says the value "is not of type string", an invalid request error, because I passed the whole transcript object instead of its text. So I change it to transcript.text, which makes sense, and let's continue. This time I hope it runs, and as soon as it does, oh wow, it actually gave me the output in Bengali. If anyone here knows Bengali, confirm in the comments whether it is written correctly. The model is GPT-4, so it should be right, not wrong. We can do this for other languages too: make the language an input, and it will work like a charm. I just hope you guys are having fun with videos like this.
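A rough reconstruction of the pieces assembled in this step, with the pre-1.0 openai library; the file path and the system-prompt wording are illustrative, not the exact text typed in the video.

    import os
    import openai

    openai.api_key = os.getenv("OPENAI_KEY")

    # Step 1: Whisper turns the Hindi audio into English text.
    with open("static/Hindi.mp3", "rb") as audio_file:
        transcript = openai.Audio.translate("whisper-1", audio_file)

    # Step 2: GPT-4 translates that English text into Bengali.
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You are a helpful assistant that translates English to Bengali."},
            # Pass the plain text, not the whole transcript object, otherwise the API
            # complains that the content "is not of type string".
            {"role": "user", "content": transcript.text},
        ],
    )
    print(response.choices[0].message.content)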
And you can fix its UI very nicely; tell me in the comment section whether this app actually works for you. One more thing: if anyone wants to take this further, I invite you to add a language input, then choose a file, upload it, and have it translated into that language. How much fun would that be, an audio translated into any other language. Now let me show you how to upgrade the app so that we can take a language input. Just as I showed you how to read the file from request.files, I write one very simple line: language = request.form['language']. For the HTML, I might as well render a template instead of the inline string, so I return render_template('index.html'), create a folder named templates, create index.html inside it, and paste the form string there. Then I ask Ghostwriter to edit it: "Use styles to make it beautiful. Also add an input tag with name language in the form." I am really excited to see whether it does this well, and if it does, how well. It starts adding a lot of styles, which might make it look good; time will tell. And there it is, name="language". I am impressed.

So let's run the app; I prefer to open it in a new tab. The server starts, but I get an internal error because I did not import render_template. After importing it, it says index.html is not found; I had put the templates folder inside static, so I move it outside, stop, and run again. Oh wow, it works. But the styles are not applied and it is still using the old index.html, and I have to check why. The reason is that I did not click "accept suggestion" last time, which is why my code was never updated. So I prompt Ghostwriter again: "add an input with name language in the form, also style this form using css, add a navbar, make it responsive." I am using Ghostwriter because it is built into Replit, it writes very clean code, and I like it a lot; I think behind the scenes it uses GPT-4. You can use any AI you want. This time I accept the suggestion and restart. Now let me see what the problem is: everything is vertically centered and the navbar is not sticking to the top, so let's look at the styles it added and see why.
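Stepping back from the styling for a moment, the language-input wiring described above looks roughly like this sketch; the transcription and GPT calls from earlier would go where the comment indicates.

    import os
    from flask import Flask, request, render_template, jsonify

    app = Flask(__name__)
    app.config['UPLOAD_FOLDER'] = 'static'

    @app.route('/', methods=['GET', 'POST'])
    def index():
        if request.method == 'POST':
            language = request.form['language']   # value of <input name="language"> in the form
            f = request.files['file']
            path = os.path.join(app.config['UPLOAD_FOLDER'], f.filename)
            f.save(path)
            # ...translate the audio at `path` and pass `language` into the GPT prompt here...
            return jsonify({"language": language, "file": f.filename})
        # The form now lives in templates/index.html (the templates/ folder, not static/).
        return render_template('index.html')

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)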
The body has display: flex, which does not look right to me; at the very least, if display: flex is on the body, then justify-content: center should not be there. So I remove justify-content: center from the body, add a * rule with margin: 0 and padding: 0, and restart. Along with that, I make the anchor text inside the nav white and set text-decoration: none. Now it looks good. The UI can be made even better, and you could put the styles in a separate style.css or add an About page if you do not have one, but I will leave that to you.

Let's see if it works. I type French in the language field, choose Hindi.mp3, and click Upload. It gives me Bengali text, and I have to see why. Okay, I found it: in main.py I had hardcoded Bengali. I should make the prompt an f-string and interpolate the language variable I took from the form above. The prompt is just a string inside the messages dictionary, so this should work. Let's restart everything, write French, and run it again. (I actually prefer running it in the Replit webview, but many people like to see it full screen, so it is a matter of choice.) I click Upload, and you can see it has put the French text right there. Again, the UI can be made better, with some vertical spacing and nicer buttons, and if you prompt properly it will be better; I could write the CSS myself, but I will leave that to you.

Now we will deploy this app using Replit Deployments. To deploy it, first I stop the app; if I reload the page now, it definitely will not work. Then I come here and click "Deploy your project". I will deploy it on a Reserved VM, click "Set up your deployment", choose 0.5 vCPU and 2 GB RAM, and move forward with Purchase and deploy. It asks which secrets I need: I need the OpenAI key, because I am using it here. The build command is optional, and the run command is python3 main.py. Then I choose to deploy it as a web server, and the deployment starts: it prepares, builds, and eventually deploys. It takes some time, so let's wait for it.
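On the f-string fix mentioned above, the prompt construction might be pulled into a small helper like this; the function name is just illustrative.

    def build_translation_messages(transcript_text, language):
        # Interpolate the language typed into the form instead of hardcoding "Bengali".
        return [
            {"role": "system",
             "content": f"You are a helpful assistant that translates English to {language}."},
            {"role": "user", "content": transcript_text},
        ]

    # Example: build_translation_messages("Hi, my name is Harry...", "French")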
And it definitely will deploy, so while I am waiting, let me show you one more thing. If you search for "openai whisper github", you can run Whisper locally without an OpenAI key; that is what I want to show you. On the repository I click Code, it asks me to sign in, so I just download it as a ZIP, extract whisper-main, and open that folder in VS Code. I only want to show you how to install it on your computer; the install instructions are given right there, and in fact you do not have to clone the repository at all. You simply run pip install openai-whisper, and Whisper installs. Once it is installed, it works like a command-line utility: for example, whisper japanese.wav --language Japanese, and whisper --help shows you all the options. You can translate, you can use a bigger model with --model medium or --model large (though you will need more resources), and there are settings for using a GPU that you can explore. The documentation also shows the accuracy for each language, for example how well it does on Kannada, and of course you can get the output in English too. And finally, you can combine all of this with GPT-4 through the GPT-4 API. Sky is the limit.

It is taking some time to deploy, so while it deploys, let me tell you about one more interesting tool. This tool is so good that you will clap; write in the comment section how you liked it. For example, take my video "Is DPN the new VPN?"; I play it and copy its URL. In a new tab I search for Whisper JAX, and the Hugging Face Space is sanchit-gandhi/whisper-jax. Big shoutout to the creator of this Space. You have three options here: record something with the microphone, upload an audio file, or paste a YouTube URL. I try the microphone first: "Hi, my name is Harry and I am a very good guy." I stop recording and check that the audio came through; yes, it did. This was my Hindi audio, so I submit it for translation and it converts it into English for me: "Hi, my name is Harry and I am a very good boy." Wow, but that was a small wow. Now for the big one: I paste the YouTube URL, check "return timestamps", and submit. You will be shocked by the result: it returns the full transcript with timestamps. And finally I can go to ChatGPT and ask it to generate a proper SRT from this. But first, let me show you what it generated.
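As an aside on the local install above: if you prefer Python over the CLI, the same options map onto keyword arguments of transcribe, roughly like this sketch (japanese.wav is a placeholder file name).

    import whisper

    # Larger checkpoints ("medium", "large") are more accurate but need more RAM, ideally a GPU.
    model = whisper.load_model("medium")

    # language is the spoken language of the input; task="translate" asks for English output,
    # while task="transcribe" keeps the original language.
    result = model.transcribe("japanese.wav", language="ja", task="translate")
    print(result["text"])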
Coming back to what Whisper JAX generated: I will show it to you by adding it to my YouTube video. It says, "Guys, today we have talked about a device which I have never seen before." And what did I actually say there? "We will talk about a device which I have never seen before." So it captured what I said about never having seen this device, and you can see it comes with full timestamps that you can copy from here. These timestamps were generated using Whisper's large model, and you can run that on your own computer as well, without the OpenAI API. When you do this, you will really enjoy it, I am telling you. Sky is the limit. I will close this now and come back to my Deployments tab to see whether the Replit deployment has finished. Do check out all these tools and use them. You can copy this output and prompt ChatGPT to produce a proper SRT file for you, and it will; otherwise, you can write Python code, or have ChatGPT write it, to convert the output to SRT or some other subtitle format. So, sky is the limit. I hope you have understood how all of these things work.

I am waiting for my Repl to be deployed; I will give it some time and come back when it is done. And as you can see, it has been deployed, and I also got the deployment URL here. If you visit this URL directly, you will reach the app, and our UI is working. One thing I would like to tell you: I have used my OpenAI key here, but I will delete it from my OpenAI account, so it will not work for you. You will have to add your own key, so please go ahead and do that. Do not use mine, because it will get exhausted and I cannot keep recharging it again and again. So get a paid account if you are going to use this. And if anyone is still on an OpenAI free trial, which most of you will not be, because the trials ended a long time ago, I will tell you to recharge for $5; I can assure you that you will like it a lot, all the restrictions will be removed, and you will be able to make such good projects. It is going to be worth it. If you learn all these things, you will have a lot of fun. So I hope you guys liked this video. I will put all the relevant links in the description below. That's it for now, guys. Thank you so much for watching this video, and I will see you next time. Bye.
