Learn Speech to Text Conversion Using Python and Google API

Discover how to convert live speech and audio files to text using Python libraries and Google API. Step-by-step guide with code examples and application demo.

Speech to text using Python | Auto transcription | Speech Recognition

Added on 09/06/2024

Speaker 1: Hello friends, welcome to Coders Palace. In this video, I will explain how to convert speech to text using Python and the Google API. Basically, we will be writing code to implement a very popular technology known as speech recognition. But before starting, make sure that you have the following Python libraries installed on your system: SpeechRecognition, PyAudio and PortAudio. I will explain how to convert live speech to text, as well as how to convert an audio file to text.

Firstly, I will start with how to convert live speech to text. Since I have all the required libraries installed on my system, I will directly start writing the code. The first thing we need to do is import the speech recognition library, so I'll write import speech_recognition as sr. Then we'll make use of the Recognizer class of the speech recognition library and refer to it using the variable r, so we will write r = sr.Recognizer(). Now the next job is to capture whatever we are saying. For that we will write with sr.Microphone() as source, and we will store whatever we are saying in a variable audio, which will be equal to r.listen(source), where source is nothing but our microphone.

So now we have captured whatever we said in the variable audio. Now we have to pass this to the Google API, and the Google API will generate the output for us, which will be the text that it predicts. We will make use of a try block and print the output of the Google API. Let's say here we will write "System predicts", and now we will call the Google API with r.recognize_google and pass this audio to it. In case something goes wrong, we'll make use of an except block to handle it, and let's say we'll print "Something went wrong".

So now we are all set to test it. Now I will speak some sentences and we'll see how these few lines of code are able to convert our sentences to text. We'll run it. Okay, I'll save it. Hello, hello, hello, check. So friends, you can see that it is working 100% properly, thanks to the accuracy of Google. We will try some more sentences. Google, how are you? Yeah, you can see again that just by writing a few lines of code we are able to develop something which is actually very nice.

So friends, as I told you, in this video I will also explain how to convert an audio file to text. For that we just have to add a few more lines. We will import the sys library of Python, and since we are making use of an audio file, we need to store it somewhere. Let's say we will store it in a variable, file name, and pass our audio file as the first argument, so sys.argv[1]. And here, instead of Microphone, we will have to write AudioFile, and here, file name.

So friends, now we are all set to test with the audio file. I will make use of the command line for this. We'll run python and then speech to text; this is our code, and this is the audio file. Make sure that you have your audio file in .wav format. This one I have recorded, and I have spoken something like: hello, hello, check, hello, hello, hello, check, check, check. So let's see whether Google is able to predict it, and whether our system is working properly or not. Yeah, you can see that it is working properly.

And friends, using Qt, which is a framework for developing applications, I have developed an application for converting speech to text. I am making use of the same code which I have explained in this video. So this is the application that I have developed. Here you can see that we can click on Get Started, and then here is an option for Help.
It basically tells us how to use this application. We can try it out. Hello, hello, hello, check. See, it is working properly. We can also import an audio file. I have it on my desktop, our audio file test1.wav. Yeah, so we can see again that it is working properly. We can also save the text in this window. For that I will just click on Save Text and choose wherever I want to save it. Let's say I will save it on the desktop, we'll name it text only, and click Save. Now we'll go to the desktop and see whether we have that text or not. Yeah, we can see here that this is our text. We'll open it using Notepad. Let's see. Yeah, so whatever was there on the video has been saved to this file. We can also clear our window using this Clear button. So friends, you can see that this application is working properly.

And friends, in case you want to use this application, I have uploaded it to my GitHub, the link for which I have given in the description. Just go to this link and clone it. After cloning, unpack it, and after unpacking just click on this file, speech_recognition.exe, and that's it, you will be able to use it.

Suppose you want it for your college project or something like that and you want it to be yours. Here I have a developer info section; on clicking it we get information about the developer, and here I have given my information. If you want your name to be here, you just have to pay me $5. I will change it to your name, I will also provide you with the source code, and I will also remove this Coders Palace product from here so that this product will be yours. So friends, thanks for watching. Please do like, subscribe and share, and bye-bye till next time.
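
Editor's note: the code dictated in the video is never shown in the transcript, so the following is only a minimal sketch of what the two variants (microphone and audio file) might look like when combined, not the speaker's actual script. It assumes the SpeechRecognition package (and PyAudio for microphone input) is installed. The helper names choose_source, transcribe and main are hypothetical, and using r.record for the file case is an assumption; the speaker only says to swap Microphone for AudioFile.

```python
import sys


def choose_source(argv):
    """Pick the input: a WAV path given as the first argument, else the microphone."""
    if len(argv) > 1:
        return ("file", argv[1])
    return ("microphone", None)


def transcribe(kind, path=None):
    """Capture audio from the chosen source and return the text Google predicts."""
    # Imported here so choose_source can be used even where the library is missing.
    import speech_recognition as sr

    r = sr.Recognizer()
    if kind == "file":
        # sr.AudioFile takes the place of sr.Microphone, as described in the video.
        with sr.AudioFile(path) as source:
            # Assumption: record() reads the whole file; the video keeps r.listen.
            audio = r.record(source)
    else:
        with sr.Microphone() as source:
            # listen() records from the microphone until it detects a pause.
            audio = r.listen(source)

    # Sends the captured audio to Google's web speech API and returns a string.
    return r.recognize_google(audio)


def main(argv=None):
    """Print the prediction, or a generic message on failure, as in the video."""
    argv = sys.argv if argv is None else argv
    kind, path = choose_source(argv)
    try:
        print("System predicts:", transcribe(kind, path))
    except Exception:
        # The library raises sr.UnknownValueError (speech not understood) and
        # sr.RequestError (API unreachable); the video handles everything at once.
        print("Something went wrong")
```

Calling main() with no command-line argument would listen on the microphone; passing a .wav path as the first argument would transcribe that file instead. Catching sr.UnknownValueError and sr.RequestError separately would give more useful messages than the single catch-all used here.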