Using Assembly AI to Identify Speakers in Audio
Learn to use Assembly AI API for speaker identification in audio. Upload files, fetch transcripts, and distinguish speakers effortlessly.
Assembly AI Speech to Text API (Part 1) - Extract Speakers and Transcription
Added on 01/29/2025

Speaker 1: In this Bubble tutorial video, I'm going to show you step one of how you can use the Assembly AI API to extract different speakers, and the text that they say, from audio. So we're going to be using the API to upload an audio file and then get a transcript back, but know who said what in the transcript. But before I launch into that, did you know that we have got videos that you cannot find on YouTube, exclusively available to our members at PlanetNoCode.com?

This is going to pick up on some earlier videos where I was using the Assembly AI API, so if you need a bit of a recap on each of the individual steps, you can go back and check out those videos. But I am going to be explaining what's going on here, which is that I'm in the Bubble API Connector and I've added in an API called Assembly AI. I've added my API key into the authorization field, private key in header, and I'm making a POST request to the Assembly AI API. And this is the endpoint here. It is an action, so that I can run it in a workflow, and I'm sending the body as JSON. Within the body, I've got one body parameter that I've made dynamic: I have to provide Assembly AI with a public, openly accessible audio file or video file for them to be able to fetch and turn into a transcript. So I've uploaded an audio file to the Bubble app storage, and here is the link directly to it.

The only step that I've really done differently from my earlier Assembly AI videos is that I've added this value into the body: speaker_labels is true. And so if I initialize this call, and this will serve as a good recap of how the Assembly AI API works, I get back an ID. Check out my other videos for how you can get this all automated and running with a webhook, but right now I'm just doing it in the API Connector to demonstrate all of the steps. So I'm going to copy the ID, because this is the unique identifier for the transcript.
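For readers following along outside Bubble, the request described here can be sketched in Python. This is a minimal sketch, not the setup shown in the video: the endpoint URL is taken from Assembly AI's public documentation (the video shows it on screen but does not read it out), and the function names build_transcript_request and submit_transcript are made up for illustration.

```python
import json
import urllib.request

# Assumption: endpoint as published in Assembly AI's public docs.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that asks for a transcript with speaker labels."""
    payload = {
        "audio_url": audio_url,    # must be a publicly reachable audio or video file
        "speaker_labels": True,    # the one parameter added in this video
    }
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,          # API key sent as a private header
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_transcript(audio_url: str, api_key: str) -> str:
    """Send the request and return the transcript ID from the JSON response."""
    request = build_transcript_request(audio_url, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]
```

Calling submit_transcript with a real audio link and API key would return the ID that is copied in the next step. Building the request separately from sending it makes the payload easy to inspect before any network call is made.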
So once Assembly AI has finished processing the transcript, either you provide them with a webhook, which I've demonstrated in other videos, or you go and look for the transcript using this ID. For this, I'm just going to look for the transcript. So I'm going to go down to my get process transcript ID call. This is all covered in the Assembly AI documentation, but I've laid it out here in the Bubble API Connector. So I'm going to paste the ID in there and then initialize the call. And this is where I get back my transcript. You can see here that my transcript starts with, "Hello, my name is Bob, I'm speaker one." And then someone else says, "Hello, my name is Emma, I'm speaker two."

So if I scroll down to utterances, I can see that it begins to group them. In utterance number one, I have, "Hello, my name is Bob, I'm speaker one." Bubble only shows you one example, but if I go to raw data, oh, it's going to be a long way down. Where is it? We then have utterance two, "Hello, my name is Emma, I'm speaker two." So in part one, I'm just showing how to get the response back that contains the data in JSON for identifying different speakers. Stay tuned for part two, where I'm going to show you how you can start to process this through the Bubble database, get the different parts of your conversation, and display them.
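The lookup by ID and the utterances grouping can be sketched in Python as follows. This is a rough illustration under assumptions: the GET URL and the status, utterances, speaker, and text field names follow Assembly AI's public documentation, the function names are hypothetical, and the sample dictionary is hand-written to mirror the two utterances heard in the video rather than copied from a real response.

```python
import json
import urllib.request


def fetch_transcript(transcript_id: str, api_key: str) -> dict:
    """Fetch the transcript JSON for an ID returned by the earlier POST call."""
    # Assumption: GET endpoint as published in Assembly AI's public docs.
    url = f"https://api.assemblyai.com/v2/transcript/{transcript_id}"
    request = urllib.request.Request(url, headers={"Authorization": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def speaker_turns(transcript: dict) -> list:
    """Return (speaker, text) pairs, or an empty list if processing is not done."""
    if transcript.get("status") != "completed":
        return []
    return [(u["speaker"], u["text"]) for u in transcript.get("utterances") or []]


# Hand-written sample shaped like the response described above (not real API output).
sample = {
    "status": "completed",
    "utterances": [
        {"speaker": "A", "text": "Hello, my name is Bob, I'm speaker one."},
        {"speaker": "B", "text": "Hello, my name is Emma, I'm speaker two."},
    ],
}

for speaker, text in speaker_turns(sample):
    print(f"Speaker {speaker}: {text}")
```

Checking the status field before reading utterances matters when polling by ID instead of using a webhook, because a transcript that is still queued or processing has no utterances to read yet.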
