Efficient Batch Transcription with Webhooks and AssemblyAI

Learn to create a batch transcriber using webhooks and AssemblyAI, improving speed and efficiency when processing multiple audio files at once.

Transcribe Multiple Files Simultaneously using Webhooks with AssemblyAI

Added on 01/29/2025

Speaker 1: Imagine you had a directory full of audio files like this and you wanted to transcribe all of them using AssemblyAI's API. You might have done something like this: loop through the audio directory and transcribe each file one by one. You might have felt that this method of transcribing multiple files was slow, and that's because transcriber.transcribe waits for each file to finish transcribing before proceeding to the next. That means you only transcribe one file at a time. Instead, today we'll be building a batch transcribing app that uses webhooks with AssemblyAI to transcribe many files at a time. In fact, we'll do this while still retaining the file name of each file.
Let me show you how it'll work. First we spin up our server, and then in a new terminal we run our submitter script. And there you have it: our transcriptions are starting to come in. You can see them here in the transcripts folder. Let's dive in.
Today we'll be covering using webhooks together with AssemblyAI. But first, what even are webhooks? Webhooks are basically automated messages sent from an app to a server when something happens. You can think of one like a text message notifying you that a delivery has reached your house. In the same way, apps send webhook messages to servers to trigger follow-up actions. For example, AssemblyAI will send a webhook message to your server once your transcript is complete. Webhook messages are HTTP requests that contain a message, or payload.
The main advantage of using webhooks with AssemblyAI is that you no longer need to repeatedly check whether your transcript is complete. Instead, AssemblyAI's server will send your server a message notifying you once your transcript is done processing. Another great benefit of webhooks is that you can include custom parameters, such as a file name, to label your transcript requests. This will be useful for our demo later.
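To make the idea of "an HTTP request that carries a payload" concrete, here is a minimal sketch using only the Python standard library. It is not the app built in this tutorial: the receiver is a throwaway local server, the deliver function plays the role of the sending app, and the payload fields and values are made up for illustration.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # payloads delivered to our endpoint


class WebhookHandler(BaseHTTPRequestHandler):
    """Receiver side: accept a POST, decode its JSON body, acknowledge with 200."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def deliver(port, payload):
    """Sender side: POST a JSON payload to the receiver and return the HTTP status."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("POST", "/", body=json.dumps(payload),
                 headers={"Content-Type": "application/json"})
    status = conn.getresponse().status
    conn.close()
    return status


server = HTTPServer(("127.0.0.1", 0), WebhookHandler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Illustrative payload only; the field values are invented for this sketch.
status = deliver(server.server_port, {"status": "completed", "transcript_id": "example-id"})

server.shutdown()
server.server_close()
print(status, received)
```

The receiver never asks whether the work is done. It simply waits, and the sender pushes the message when something happens, which is the difference from polling.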
In today's demo I'll be using FastAPI, ngrok and AssemblyAI to build our batch transcriber. FastAPI is a modern and fast framework for building APIs with Python. Ngrok is an easy way for developers to expose their development servers to the public Internet, and AssemblyAI is the best choice for automatic speech recognition.
So how our app will work is that we have a directory full of audio files, which we submit to AssemblyAI with a Python script. We will use the public URL that ngrok provides to set the webhook URL of each transcript request. AssemblyAI will then transcribe our files and send a webhook message to our public URL, which will trigger a function in our FastAPI server to retrieve those transcripts and save them as text files.
Before we build our application, it helps to understand what webhook messages from AssemblyAI look like. To do this we can use a website called webhook.site. It's an easy way to inspect any incoming HTTP requests, and it provides you with a unique URL that you can specify as the webhook URL in your transcript requests to AssemblyAI. Let's go ahead and give that a try. What I'll do is create a config, specify it as an aai.TranscriptionConfig, and then set its webhook URL to the URL that webhook.site provided us, as a string. Then I'll transcribe just one file so that we can test and see. All right, let's give it a try. We can switch over to webhook.site to see if our server received any webhook messages. There we have it: a webhook message saying that our transcript is complete and giving us the transcript ID to retrieve our transcription with. Great.
Now let's make a similar request, but this time we'll specify the name of the file we're trying to transcribe in the custom parameters of our URL. We do this by appending a slash, a question mark, and then a file name parameter set to the name of the file as a string, just as an example. Let's run it and see what happens.
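As a rough sketch of the custom-parameter step just described, the file name can be attached to the webhook URL as a query string and read back on the receiving side. The parameter spelling filename and the base URL below are assumptions made for illustration; the transcript only says "file name equals", and webhook.site generates its own unique URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_webhook_url(base_url: str, filename: str) -> str:
    """Append the file name as a query parameter, percent-encoding it as needed."""
    return f"{base_url.rstrip('/')}/?{urlencode({'filename': filename})}"


def filename_from_url(url: str) -> str:
    """Recover the file name from the query string on the receiving side."""
    return parse_qs(urlsplit(url).query)["filename"][0]


# Hypothetical base URL standing in for the one webhook.site hands out.
url = build_webhook_url("https://webhook.site/your-unique-id", "my recording.mp3")
print(url)
print(filename_from_url(url))
```

Using urlencode rather than plain string concatenation means file names containing spaces or other special characters should survive the round trip.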
Great, it's come in, and here you can see in the query strings that we have the file name, Spanish.mp4. Perfect. Now that you have a basic understanding of what webhook messages from AssemblyAI look like, let's build a simple app that programmatically receives these webhook messages from AssemblyAI, retrieves the transcript, and then saves the transcript into a text file.
First we'll have to create a new virtual environment. Then we activate our virtual environment and install our requirements. In case you are curious, the reason we installed fastapi[all] is that we need other dependencies, such as uvicorn, to run our server later.
Great. Let's set up our server. We import FastAPI as well as AssemblyAI, and then we create our app. We will need an endpoint that accepts POST requests, and then we'll define an asynchronous function to retrieve our transcripts. Like I mentioned before, we're expecting a file name in the query parameter, and we want to specify the body as a result. To do this we need to create a model for the result and import BaseModel. Our result contains two things: a status and a transcript ID. When we receive the webhook, we want to check that the transcript is completed and then use AssemblyAI to get it by ID. Next we write the transcript to a text file. And don't forget to include your AssemblyAI API key in your script. Nicely done.
Now we can run our server to check that it's working. If all goes well, you should see that uvicorn is running on your localhost on port 8000. Nice.
Now let's write our submitter script. Before we start, you'll want to sign up for your free ngrok account. You can do so from the link in the description. Once you get your auth token, include it in your script, and then we'll specify our ngrok token. Then we'll create a listener on port 8000 and pass it our token.
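Before going further with the submitter, the server-side logic described above (check the status, fetch the transcript by ID, write it to a text file) can be sketched independently of the web framework. This is not the tutorial's actual endpoint: handle_webhook, fetch_text, and the output naming scheme are hypothetical names chosen for this sketch. In the real app the function would presumably be called from the FastAPI POST endpoint, with fetch_text wrapping the AssemblyAI get-by-ID lookup; here a fake is injected so the example runs without an API key.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional


def handle_webhook(
    status: str,
    transcript_id: str,
    filename: str,
    fetch_text: Callable[[str], str],
    out_dir: Path = Path("transcripts"),
) -> Optional[Path]:
    """Save a completed transcript as <filename>.txt; ignore any other status."""
    if status != "completed":
        return None
    # Keep only the last path component so a crafted name cannot escape out_dir.
    safe_name = Path(filename).name
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{safe_name}.txt"
    out_path.write_text(fetch_text(transcript_id), encoding="utf-8")
    return out_path


def fake_fetch(transcript_id: str) -> str:
    """Stand-in for the real lookup (an assumption of this sketch)."""
    return "hola mundo"


with tempfile.TemporaryDirectory() as tmp:
    saved = handle_webhook("completed", "example-id", "Spanish.mp4", fake_fetch, Path(tmp))
    saved_name = saved.name
    saved_text = saved.read_text(encoding="utf-8")
    skipped = handle_webhook("error", "example-id", "Spanish.mp4", fake_fetch, Path(tmp))

print(saved_name, saved_text, skipped)
```

Because the file name arrives in a query string that anyone could craft, stripping it down to its final path component is a cheap safeguard before using it to build an output path.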
Then we extract the public URL using listener.url(), and just for debugging we'll print the public URL. Next we'll want to submit each file to AssemblyAI. To do that we import os, loop through each file in os.listdir for the audio directory, and call transcriber.submit on each file. Don't forget to import assemblyai, of course. We want to specify the webhook URL to be the public URL from the previous step, so we'll create a config and set its webhook URL to the public URL. Since we want to include the file name as well, we'll convert this to an f-string and include the file name as a custom parameter. Then we'll include the config in our submit call. Lastly, because we don't want our listener to close once our files are done submitting, we'll open a loop that waits for user input and then breaks.
All right, once you're ready, let's run our script. I forgot to specify my AssemblyAI API key, so let's give that another try. Okay, I also forgot to include the audio directory in the file path, but now that I've added that in, this should be fixed. All right, let's run our script, and we should start seeing webhook messages coming in. And there you have it: webhook messages are coming in and files are being saved in our transcripts directory. That's how you build a batch transcriber. Great job.
In conclusion, today we have learned what webhooks are, what the benefits of using them are, and how to use them together with AssemblyAI. We've also built a small app demonstrating the use of webhooks together with custom parameters. I hope you found that helpful. All the code used in this tutorial is in the description, and if you have any questions at all, feel free to reach out to us in our Discord community or in the comments below. I'm happy to answer any questions that you might have. That's all from me today. Bye.
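The submitter loop from this walkthrough can be sketched as follows. It is an approximation, not the tutorial's script: submit_directory and the submit callable are hypothetical, the filename parameter spelling is assumed, and the public URL is a placeholder for whatever ngrok returns. In the real script, submit would presumably wrap transcriber.submit with a config whose webhook URL is the one built here. Note the os.path.join call, which covers the "forgot to include the audio directory" mistake mentioned above.

```python
import os
import tempfile
from typing import Callable, List
from urllib.parse import urlencode


def submit_directory(
    audio_dir: str,
    public_url: str,
    submit: Callable[[str, str], None],
) -> List[str]:
    """Submit every file in audio_dir without waiting for any of them to finish."""
    submitted = []
    for name in sorted(os.listdir(audio_dir)):
        # os.listdir returns bare names, so join with the directory first.
        path = os.path.join(audio_dir, name)
        if not os.path.isfile(path):
            continue
        webhook_url = f"{public_url.rstrip('/')}/?{urlencode({'filename': name})}"
        submit(path, webhook_url)  # expected to return as soon as the job is queued
        submitted.append(name)
    return submitted


calls = []  # (file name, webhook URL) pairs seen by the fake submit function

with tempfile.TemporaryDirectory() as audio_dir:
    for name in ("a.mp3", "b.wav"):
        open(os.path.join(audio_dir, name), "wb").close()  # empty placeholder files
    names = submit_directory(
        audio_dir,
        "https://example.ngrok.app",  # placeholder public URL
        lambda path, url: calls.append((os.path.basename(path), url)),
    )

print(names)
print(calls)
```

Passing submit in as a parameter keeps the loop testable without network access, and each file carries its own webhook URL so the server can tell the results apart when they arrive out of order.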