Convert Audio to Text Using Azure Speech API
Learn to convert audio files to text with Azure and Postman, and explore language conversion features. Ideal for beginners and tech enthusiasts.
Using Postman to Interact with Azure Speech API: Convert Audio to Text
Added on 01/29/2025

Speaker 1: Hello everyone. So, this video is about how you can convert an audio file to text using the Azure Speech API. We are not going to discuss anything theoretical here; the idea is just to showcase how we can use Postman to make a call to the REST API. So, the very first thing we need is to create a resource in the Azure portal for the Speech API. So, click on Speech Services. Click on Create. Here you need to provide the resource group, the region which is closest to you, and a unique name for your speech service instance. Once this is done, the last thing you need to do is select the pricing tier. You can go for either the free or the standard one, whichever works best for you. And if you're not sure, you can click on this hyperlink and it will provide you more details about the pricing. So, once all these fields are populated, you need to click on Review plus Create. Once it is created, you will see the instance like this, and you will see that there are endpoints and keys generated for you.
Now, next, let's go to Postman. Here, the very first thing we need to do before we can interact with the API is generate the token. For generating the token, we need to send a POST request, and here we need to provide the URL. So, this is the URL, and here we need to replace this particular part with the region in which the instance was created. We have created the instance in West US, so I'm mentioning West US here. If you're not sure which region your instance belongs to, you can go back to your instance and have a look at this. So, this is the one which we need to replace. Okay. So, let's go back to Postman. Okay. So, I have replaced this with West US. Another thing is we need to update the header. In the header, we need to add the key which was generated in our Azure portal. So, let's quickly go to the Azure portal again and grab this key.
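As a rough companion to the Postman steps above, the token request could be built in Python along these lines. The video does not read out the URL or the header name, so the `issueToken` endpoint and the `Ocp-Apim-Subscription-Key` header below are taken from Azure's public REST documentation and should be treated as assumptions. The function names are hypothetical, and the region is written in its identifier form (`westus` for West US).

```python
import urllib.request


def build_token_request(region: str, subscription_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks for an access token.

    Assumption: the token endpoint follows the pattern documented by Azure,
    https://<region>.api.cognitive.microsoft.com/sts/v1.0/issueToken
    """
    url = f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the key travels in the header
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
        method="POST",
    )


def fetch_token(region: str, subscription_key: str) -> str:
    """Send the request and return the token, which comes back as plain text."""
    request = build_token_request(region, subscription_key)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_token("westus", "<your key>")` would correspond to clicking Send in Postman; the key placeholder stands for the value copied from the Azure portal.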
And I will paste it over here. So, now we are done with the initial setup. Next, we need to click on Send. Sending the POST request will generate a token here. So, let's quickly grab this token, as we will need it.
Now we will open another tab for the second POST request, and we need the actual URL where we are going to make our API call. So, this is the URL, and here again we need to replace the region. I will set it to West US. The next thing we need here is to define the key and the content type. For defining the key, I will go to the header and add the key here. The value of the key is the one which we have just grabbed from our Azure portal, so I will quickly paste it over here. Next, we need to provide the content type. For the content type, we can type this, and it would be an audio type because we are trying to convert an audio file to text. Here, do make sure of the format you are specifying. As of now, the Speech API works with WAV files, so that's the reason I'm picking this one.
Now, the next thing is the authorization header. Here we need to select the bearer token, grab this value, and paste it over here. So, we are done with the initial things. The one thing which is remaining is the parameter: which language do you want to convert it to? So, here we need to define the language parameter. Let's say you want to see the text in English, so here I have defined the parameter language. And the last thing we need is to provide the input file. Our input file will be here in binary format, so I will quickly select that file. sample.wav is my file. I can also play that file for you so that you will get to know what's in it.

Speaker 2: But what if somebody decides to break it? Be careful that you keep adequate coverage.

Speaker 1: So, this is the WAV file, which is recorded in English. So, let's quickly go ahead and click on Send. It will take a few seconds, and then we will get a response. Here at the bottom, you can see that the recognition status is Success. And this is the text which the speaker was speaking: "But if somebody decides to break it, be careful that you keep adequate coverage." So, this is what is in the file. And this is the duration. Now, let's say you want to convert the same text to some other language. I can take another language, such as Hindi. If you click on Send, it will take the same English audio and write your text in Hindi. So, let's give it a few more seconds here. The speech will still remain the same; the only difference is that the text will be shown in a different language. Now, you can see that the text is still the same, like "decide to break you before you keep quiet" and all those. So, the only idea behind this is to show how you can convert your audio and how you can generate text out of your audio file. That was the intention behind this video, and I hope you enjoyed it. Thank you.
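For readers who prefer code to Postman, the recognition call walked through above might look roughly like the sketch below. None of the literal values are spoken in the video: the endpoint path, the `audio/wav` content type string, and the `RecognitionStatus`, `DisplayText`, and `Duration` response fields are assumptions based on Azure's documented speech-to-text REST API for short audio, and the function names are hypothetical. Azure's documentation describes the `language` query parameter as the language being recognized, given as a locale code such as `en-US` or `hi-IN`. The video sets both the key header and the bearer token; this sketch sends only the bearer token, which the Azure documentation lists as sufficient on its own.

```python
import json
import urllib.parse
import urllib.request


def build_recognition_request(region: str, token: str, audio_bytes: bytes,
                              language: str = "en-US") -> urllib.request.Request:
    """Build (but do not send) the POST request that submits WAV audio.

    Assumption: endpoint pattern and headers follow Azure's documented
    short-audio speech-to-text REST API.
    """
    query = urllib.parse.urlencode({"language": language})
    url = (f"https://{region}.stt.speech.microsoft.com"
           f"/speech/recognition/conversation/cognitiveservices/v1?{query}")
    headers = {
        "Authorization": f"Bearer {token}",  # token from the earlier step
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    # The audio file is sent as the raw binary body, as in Postman's "binary" option.
    return urllib.request.Request(url, data=audio_bytes, headers=headers, method="POST")


def parse_recognition_response(body: str) -> dict:
    """Pick out the fields shown in the video: status, text, and duration."""
    result = json.loads(body)
    return {
        "status": result.get("RecognitionStatus"),
        "text": result.get("DisplayText"),
        "duration": result.get("Duration"),
    }
```

To try it, read `sample.wav` with `open("sample.wav", "rb").read()`, pass the bytes and the token to `build_recognition_request`, send the result with `urllib.request.urlopen`, and hand the decoded response body to `parse_recognition_response`. Changing `language` to `hi-IN` mirrors the Hindi run at the end of the video.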
