Demo: Google Cloud Speech-to-Text API Setup
Learn how to enable the Google Cloud Speech-to-Text API and use it to transcribe audio files with high accuracy. Follow the steps for setup and execution.
File
Demo of Google Speech to text Tutorial-33 TheEducationByte
Added on 01/29/2025

Speaker 1: In this lecture, we'll see a quick demo of the Speech-to-Text API. Go to cloud.google.com and make sure you're in the right account. If not, click on the top right corner to change your account. Then click on Go to Console.

We need to enable the Speech-to-Text API. Go to APIs and Services, then Library. Search for Speech, and you can see the Cloud Speech-to-Text API. There's another one, Text-to-Speech, as well. Just for your info, it's the reverse. Click on Enable.

Okay. You can do this either via the gcloud SDK, which is more useful because you can do things programmatically, or by clicking Activate Cloud Shell. We're going to use the gcloud SDK that is built into Cloud Shell.

The first step is to copy an existing speech file into this Cloud Shell. Google provides it as a sample, and the steps are in the resources section. This is the exact command to copy. The file is cloud-samples-tests/speech/brooklyn.flac. I'm just going to run this command here, which copies that file into our current directory. Click on Authorize. Okay, it has copied the file.

Similarly, you can upload your own file to Google Cloud Storage, make the permission public, and copy it here the same way. I'll leave that as an exercise for you to figure out.

If you want, you can also click on Download File. Before you download the file, you need to know its path, so let's run pwd. This is the path and this is the file name, so the full path is the directory, then a slash, then the file name. If I go back here and run ls, this is the file name. Copy the path, click on Download File, paste the path in, and click Download. Let me download it here and play it for you.
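The setup steps above can be sketched as a small Python helper that assembles the commands rather than running them. This is only an illustration: the speaker enables the API through the console, so the `gcloud services enable` line is an alternative, not what the demo does. The exact copy command is not spoken in the recording, so the `gsutil cp` form and the bucket path `gs://cloud-samples-tests/speech/brooklyn.flac` are assumptions reconstructed from the file name read aloud. The function names and the `/home/user` directory are hypothetical.

```python
import shlex

# Assumption: the file named in the recording is this public sample object.
SAMPLE_URI = "gs://cloud-samples-tests/speech/brooklyn.flac"


def enable_api_command(service="speech.googleapis.com"):
    """Return the argv that enables an API from the command line
    (a CLI alternative to clicking Enable in the console)."""
    return ["gcloud", "services", "enable", service]


def copy_sample_command(uri=SAMPLE_URI, destination="."):
    """Return the argv that copies a Cloud Storage object into a directory."""
    return ["gsutil", "cp", uri, destination]


def local_path(working_dir, uri=SAMPLE_URI):
    """Join the directory printed by pwd with the object's file name,
    which is the path the Download File dialog asks for."""
    filename = uri.rsplit("/", 1)[-1]
    return working_dir.rstrip("/") + "/" + filename


# Print the commands so they can be pasted into Cloud Shell.
for argv in (enable_api_command(), copy_sample_command()):
    print(shlex.join(argv))

# Hypothetical working directory, standing in for the real pwd output.
print(local_path("/home/user"))
```

Building the commands as argument lists keeps quoting explicit, and the same lists could be handed to `subprocess.run` on a machine where the gcloud SDK is installed and authorized.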
Okay, I'm not sure if that was audible enough, but it said, "How old is the Brooklyn Bridge?" Now we can simply call the Speech-to-Text API to transcribe this file. The command is very simple: it's gcloud ml speech recognize, then the path name, and the language code we're going to use is en-US, English US. Notice that we're passing the Google Cloud Storage path directly. The only reason we copied the file over to our shell was to hear what's in it. Let's paste the command here and run it, and there you have the output. I'm just making this bigger. So simple, right? The transcript is "how old is the Brooklyn Bridge" and the confidence level is 98%. That's pretty cool, isn't it? The API is 98% confident that this is the speech we heard, and it's exactly the same. So that's pretty amazing.

Quick bit of cleanup. We want to remove the file that we copied. Type rm, a space, and bro, then press Tab, and it should complete the file name. If there are multiple files starting with bro, keep pressing Tab and it will cycle between them. Remove the file, and then you can exit out of the Cloud Shell. It's always good practice to go back to the project and check whether any resources have been created. Nothing's been created, so that's good. So that's a quick demo.
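As a rough sketch of the recognize step, the Python below builds the `gcloud ml speech recognize` command and reads a transcript and confidence out of a JSON response. The sample response is illustrative, not captured from the demo: it follows the results/alternatives layout the command normally prints, with the transcript and the rounded 98% confidence quoted in the recording. The function names and the bucket path are assumptions.

```python
import json

# Assumption: the public sample object used in the demo.
SAMPLE_URI = "gs://cloud-samples-tests/speech/brooklyn.flac"

# Illustrative response only; the confidence is the rounded value from the demo.
SAMPLE_RESPONSE = """
{
  "results": [
    {
      "alternatives": [
        {"confidence": 0.98, "transcript": "how old is the Brooklyn Bridge"}
      ]
    }
  ]
}
"""


def recognize_command(uri, language_code="en-US"):
    """Return the argv for transcribing an audio file with gcloud."""
    return ["gcloud", "ml", "speech", "recognize", uri,
            f"--language-code={language_code}"]


def top_alternatives(response_text):
    """Return (transcript, confidence) for the first alternative of each result."""
    response = json.loads(response_text)
    pairs = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            first = alternatives[0]
            pairs.append((first.get("transcript", ""), first.get("confidence", 0.0)))
    return pairs


print(" ".join(recognize_command(SAMPLE_URI)))
for transcript, confidence in top_alternatives(SAMPLE_RESPONSE):
    print(f"{transcript} ({confidence:.0%})")
```

Reading only the first alternative of each result mirrors what the demo shows on screen, a single transcript with one confidence value. Longer audio would return several results, one per segment.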
