Using Assembly AI to Identify Speakers in Audio

Learn to use the Assembly AI API for speaker identification in audio: upload a file, fetch the transcript, and tell the speakers apart.

Assembly AI Speech to Text API (Part 1) - Extract Speakers and Transcription

Added on 01/29/2025

Speaker 1: In this Bubble tutorial video, I'm going to show you step one of how you can use the Assembly AI API to extract the different speakers, and the text that they say, from audio. So we're going to be using the API to upload an audio file and then get a transcript back, but know who said what in the transcript. But before I launch into that, did you know that we have videos that you cannot find on YouTube, exclusively available to our members at PlanetNoCode.com?

This is going to pick up on some earlier videos where I was using the Assembly AI API, so if you need a bit of a recap on each of the individual steps, you can go back and check out those videos. But I am going to be explaining what's going on here, which is that I'm in the Bubble API Connector and I've added an API called Assembly AI. I've added my API key into the authorization field, private key in header, and I'm making a POST request to the Assembly AI API. And this is the endpoint here. It is an action, so that I can run it in a workflow, and I'm sending the body as JSON. Within the body, I've got one parameter that I've made dynamic: I have to provide Assembly AI with a publicly accessible audio or video file for them to be able to fetch and turn into a transcript. So I've uploaded an audio file to the Bubble app storage, and here is the link directly to it.

The only step that I've really done differently from my earlier Assembly AI videos is that I've added this value into the body: speaker_labels set to true. If I initialize this call, and this will serve as a good recap of how the Assembly AI API works, I get back an ID. Check out my other videos for how you can get this all automated, running with a webhook. Right now I'm just doing it in the API Connector to demonstrate all of the steps. So I'm going to copy the ID, because this is the unique identifier for the transcript.
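For readers working outside Bubble, the request described above can be sketched in Python. This is a minimal sketch, not the setup shown in the video: the endpoint URL and the `audio_url` field name are taken from Assembly AI's public documentation rather than from the transcript, and the function names are hypothetical. Only `speaker_labels` is named in the video.

```python
import json
import urllib.request

# Assumption: endpoint URL from Assembly AI's public docs; the transcript
# only says "this is the endpoint here" without spelling it out.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a transcription job."""
    body = {
        # Assumption: "audio_url" is the documented field name for the file link.
        "audio_url": audio_url,   # must be publicly reachable by the API
        "speaker_labels": True,   # the extra value added in this video
    }
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "authorization": api_key,           # private key sent in a header
            "content-type": "application/json",  # body is sent as JSON
        },
        method="POST",
    )


def submit_transcript(api_key: str, audio_url: str) -> str:
    """Send the request and return the transcript ID from the JSON response."""
    request = build_transcript_request(api_key, audio_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]
```

Calling `submit_transcript(api_key, audio_url)` with a real key and a public file link would return the ID that is copied at the end of this step. Building the request is kept separate from sending it so the payload can be inspected without a network call.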
So once Assembly AI has finished processing the transcript, either you provide them with a webhook, which I've demonstrated in other videos, or you go and look for the transcript using this ID. For this, I'm just going to look for the transcript. So I'm going to go down to my "get process transcript ID" call. This is all covered in the Assembly AI documentation, but I've laid it out here in the Bubble API Connector. I'm going to paste the ID in there and then initialize the call, and this is where I get back my transcript. You can see here that my transcript starts with, "Hello, my name is Bob, I'm speaker one." And then someone else says, "Hello, my name is Emma, I'm speaker two." If I scroll down to utterances, I can see that it begins to group them. So in utterance number one I have, "Hello, my name is Bob, I'm speaker one." Bubble only shows you one example, but if I go to raw data (oh, it's going to be a long way down, where is it?) we then have utterance two: "Hello, my name is Emma, I'm speaker two." So in part one, I'm just showing how to get the response back that contains the data, in JSON, for identifying different speakers. Stay tuned for part two, where I'm going to show you how you can start to process this through the Bubble database, get the different parts of your conversation, and display them.
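The lookup and the utterances grouping described above can be sketched the same way. This is an illustration under stated assumptions: the GET URL pattern and the `speaker` and `text` field names inside each utterance come from Assembly AI's public documentation, not from the transcript, and the sample response is hand-written to mirror the two utterances read out in the video. The function names and the `"A"`/`"B"` speaker labels in the sample are illustrative.

```python
import json
import urllib.request


def fetch_transcript(api_key: str, transcript_id: str) -> dict:
    """Fetch a transcript by ID (assumed URL pattern from the public docs)."""
    url = f"https://api.assemblyai.com/v2/transcript/{transcript_id}"
    request = urllib.request.Request(url, headers={"authorization": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def speaker_lines(transcript: dict) -> list[tuple[str, str]]:
    """Return (speaker, text) pairs, one per utterance, in spoken order."""
    # "utterances" may be missing or null while the job is still processing.
    utterances = transcript.get("utterances") or []
    return [(u["speaker"], u["text"]) for u in utterances]


# Hand-written sample shaped like the response discussed in the video.
SAMPLE_RESPONSE = {
    "status": "completed",
    "utterances": [
        {"speaker": "A", "text": "Hello, my name is Bob, I'm speaker one."},
        {"speaker": "B", "text": "Hello, my name is Emma, I'm speaker two."},
    ],
}

if __name__ == "__main__":
    for speaker, text in speaker_lines(SAMPLE_RESPONSE):
        print(f"Speaker {speaker}: {text}")
```

Keeping `speaker_lines` free of network calls means it can be tried against a saved response before wiring it to `fetch_transcript`.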

