Convert Speech to Text in Python Using IBM Watson
Learn how to use the IBM Watson SDK in Python to convert speech to text. Set up your account, configure the service, and transform audio files into text.
IBM Watson Speech to Text in Python Convert a speech into text in Python
Added on 01/29/2025

Speaker 1: Hello techies, in this video, we are going to convert your speech to text in Python with the help of IBM Watson's Speech-to-Text Software Development Kit. This is your friend Heflin Steven Raj, and I am going to show the demo in this video. Without any delay, let's jump into IBM Cloud. Go to cloud.ibm.com, then log in to your IBM Cloud account. If you don't have an IBM Cloud account, you can create a new one. As soon as you log in to IBM Cloud, you will land on the IBM Cloud dashboard.
Now we are going to provision our very first IBM Cloud Speech-to-Text API. To do so, go to the search bar, search for speech-to-text, then click on Speech-to-Text. This will take you to the creation page of the IBM Watson Speech-to-Text Service. Why are we using IBM Cloud over other cloud providers in the market? The main advantage of using IBM Cloud is that it offers a free plan. In this free plan, you can get free access to almost all the IBM Cloud services forever. Scroll down, enter a name for the IBM Watson Speech-to-Text Service, then click on the Create button. Then go to Service credentials, click on New credential, then click on the Add button. That's it. We successfully set up the IBM Watson Speech-to-Text Service.
Now we are going to use this IBM Watson Speech-to-Text Service in Python to convert your speech into text. To do so, I'm going to use Jupyter Notebooks. You can go with any Python IDE. To use the IBM Watson Speech-to-Text Service in Python, we need two IBM Python modules, that is ibm_watson and ibm_cloud_sdk_core. To install the first one in your system, enter pip install ibm_watson. To install the IBM Cloud SDK Core in your system, enter pip install ibm_cloud_sdk_core. Now let's import Speech-to-Text version 1 from IBM Watson: from ibm_watson import SpeechToTextV1. Now let's import the IAM Authenticator from the IBM Cloud SDK Core: from ibm_cloud_sdk_core.authenticators import IAMAuthenticator.
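The install and import steps just described can be sketched as follows. The hyphenated names ibm-watson and ibm-cloud-sdk-core are the PyPI spellings (pip treats hyphens and underscores in package names as equivalent). The helper missing_packages is a hypothetical name added for this example and is not part of either SDK; it only checks that both packages can be found before the imports run.

```python
# Run once in a terminal (or prefix with "!" in a Jupyter cell):
#   pip install ibm-watson
#   pip install ibm-cloud-sdk-core
import importlib.util

# Import name -> PyPI distribution name for the two SDK packages.
REQUIRED = {
    "ibm_watson": "ibm-watson",
    "ibm_cloud_sdk_core": "ibm-cloud-sdk-core",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of any required modules that cannot be found."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Install first: pip install " + " ".join(missing))
else:
    # The two imports used in the video.
    from ibm_watson import SpeechToTextV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    print("SDK imports succeeded")
```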
With the help of the API key, we can access our IBM Watson Speech-to-Text Service in Python. To do so, I'm going to define a variable, then call IAMAuthenticator, then enter the API key. You can find your API key in IBM Cloud. Go to IBM Cloud, click on Service Credential 1, then copy the API key. Now we are going to configure our Speech-to-Text Service in Python. To do so, I'm going to define a variable, then call SpeechToTextV1. Then for the authenticator argument, I'm going to pass the authenticator we just created from our API key. Now we are going to set a service URL for our IBM Watson Speech-to-Text Service. To do so, call set_service_url on our IBM Watson Speech-to-Text Service. You can find your service URL in IBM Cloud. Back in IBM Cloud, click on Service Credential 1, then copy the URL. That's it. We successfully configured our IBM Watson Speech-to-Text Service in Python. Now we are ready to convert the speech into text in Python with the help of the IBM Watson Speech-to-Text Service.
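A minimal sketch of the configuration just described. The environment variable names STT_APIKEY and STT_URL and the helpers load_credentials and build_speech_to_text are assumptions made for this example; the video pastes the key and URL straight into the notebook. Reading them from the environment simply keeps the key out of the code.

```python
import os


def load_credentials(env=os.environ):
    """Read the API key and service URL copied from the IBM Cloud credential."""
    api_key = env.get("STT_APIKEY")   # hypothetical variable name
    service_url = env.get("STT_URL")  # hypothetical variable name
    if not api_key or not service_url:
        raise ValueError("Set STT_APIKEY and STT_URL before running")
    return api_key, service_url


def build_speech_to_text(api_key, service_url):
    """Create a configured Speech to Text client, following the video's steps."""
    # Imported here so this module still loads where the SDK is not installed.
    from ibm_watson import SpeechToTextV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    # Step 1: wrap the API key in an IAM authenticator.
    authenticator = IAMAuthenticator(api_key)
    # Step 2: pass the authenticator (not the raw key) to the service client.
    speech_to_text = SpeechToTextV1(authenticator=authenticator)
    # Step 3: point the client at the URL shown in the service credential.
    speech_to_text.set_service_url(service_url)
    return speech_to_text
```

With both variables set, the client could then be created with speech_to_text = build_speech_to_text(*load_credentials()).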

Speaker 2: This is the audio file that we are going to convert into text.

Speaker 1: To do so, let's open the audio file in Python: with open, enter the name of the file, then the mode of the file. Now I'm going to define a variable to store the result. Then call .recognize on our IBM Watson Speech-to-Text Service. Then pass the audio file, then pass the audio format, that is audio/mp3. Then finally, get the result. Now let's have a look at our result. That's it. We successfully converted the audio file into text. "Hello techies, this is Heflin Steven Raj. In this video, I'm going to convert your speech into text." That is the result for the audio. Now let's play the audio once again.
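The recognition step just described might look like the sketch below. The function names transcribe_file and extract_transcript are hypothetical helpers added for this example, and the speech_to_text argument is assumed to be an already configured SpeechToTextV1 client. The result is read as a dictionary whose "results" list holds "alternatives", each with a "transcript" string, which is the layout the service documents for recognize responses.

```python
def transcribe_file(speech_to_text, path, content_type="audio/mp3"):
    """Send one audio file to the service and return the parsed JSON result."""
    # Audio has to be opened in binary mode ("rb"), not text mode.
    with open(path, "rb") as audio_file:
        response = speech_to_text.recognize(
            audio=audio_file,
            content_type=content_type,
        )
    # get_result() returns the response body as a Python dictionary.
    return response.get_result()


def extract_transcript(result):
    """Join the top alternative of every result segment into one string."""
    segments = []
    for segment in result.get("results", []):
        alternatives = segment.get("alternatives", [])
        if alternatives:
            # The first alternative is the service's best guess for the segment.
            segments.append(alternatives[0]["transcript"].strip())
    return " ".join(segments)
```

A call such as extract_transcript(transcribe_file(speech_to_text, "audio.mp3")) would then give the plain text, where "audio.mp3" is a placeholder file name.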

Speaker 2: Hello techies. This is Heflin Steven Raj. In this video, I am going to convert a speech into text.

Speaker 1: That's it. We successfully converted the audio into text with the help of the IBM Watson Speech-to-Text Service in Python. With this, we have come to the end of the video. I hope the video will be useful for you to convert your audio file into text in Python with the help of the IBM Watson Speech-to-Text Service. For more videos like this, please subscribe to my YouTube channel. You can contact me on social media and visit my portfolio website. The links will be given in the description section. Thanks for watching. Have a great day.
