Exploring Transcription with AssemblyAI & Langflow
Learn how to transcribe audio with AssemblyAI in Langflow: convert speech into text, generate subtitles, and run follow-up tasks such as summaries through the API.
How to Use AssemblyAI in Langflow for Automated Transcription and More
Added on 01/29/2025

Speaker 1: AssemblyAI is one of the leaders in transcription services. You can convert speech into text, they have many different products available on their platform, and we can use AssemblyAI within Langflow. To get started, first make an account with AssemblyAI. Once you do, you will be provided with an API key, which we're going to need in Langflow.
Back in Langflow, there are a few different components available. In this example, we are using the Start Transcript component. Within this component, we can provide a file. I uploaded a one-hour talk by Andrej Karpathy, Intro to Large Language Models, as an MP3 file. After adding the API key and the audio file, we can select the model we want to use for the transcription. There are two options available here, best and nano. After you select the model, you can either turn language detection on or leave the defaults. I left everything at the default and then started the task.
Once we run the flow, we get a transcript ID. By connecting this component to the AssemblyAI Poll Transcript component, we can get the results. If we look at the results from this component, there are quite a lot of fields. The most important one is the text of the transcript. As you can see, it's quite a large file, and all of it was converted from speech to text by AssemblyAI. It just took a few seconds. We can also see word-level timestamps if needed: what was spoken, its start and end time, and the confidence. If there are multiple speakers, it also identifies the speakers for us. We can also see the utterances at different times. So there are the individual words, the full text, and some additional information available here.
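As a rough sketch of the start-then-poll pattern described above (submit the audio, receive a transcript ID, then check that ID until the job finishes), the Python below may help. It is not the Langflow component's actual code. The helper names `poll_transcript` and `low_confidence_words` are hypothetical, and `fetch` is a placeholder for whatever performs the real lookup, for example a GET to `https://api.assemblyai.com/v2/transcript/{id}` with the API key in the `Authorization` header. The field names (`status`, `text`, `words`, `start`, `end`, `confidence`) are assumed to match the transcript JSON, with times in milliseconds.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


def poll_transcript(
    fetch: Callable[[str], Dict[str, Any]],
    transcript_id: str,
    interval_s: float = 3.0,
    max_attempts: int = 200,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch(transcript_id) until the job reports 'completed' or 'error'.

    `fetch` is a hypothetical callable standing in for the real HTTP request.
    """
    for _ in range(max_attempts):
        result = fetch(transcript_id)
        status = result.get("status")
        if status == "completed":
            return result
        if status == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        # Any other status (assumed: 'queued' or 'processing') means wait and retry.
        sleep(interval_s)
    raise TimeoutError(
        f"transcript {transcript_id} not ready after {max_attempts} polls"
    )


def low_confidence_words(
    result: Dict[str, Any], threshold: float = 0.5
) -> List[Tuple[str, int, int]]:
    """Return (text, start_ms, end_ms) for words whose confidence is below threshold."""
    return [
        (word["text"], word["start"], word["end"])
        for word in result.get("words", [])
        if word["confidence"] < threshold
    ]
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of any HTTP library, so it can be tried with canned responses before an API key is involved.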
Now we can use this data for many different things. One is to parse the transcript, so we can look at the full transcript of this video or, in this case, this MP3 file. We can also run it to get subtitles. This could be used for any service where we want to add subtitles in different formats: both SRT and VTT are available. I ran it for SRT, and the output is basically the timestamps along with the sentences spoken during those timestamps. We can see that it goes on for the full length of the audio file. If needed, we can also convert that to VTT.
The last thing is that if you have credits available in your AssemblyAI account, you can also generate a summary of the audio or do some additional tasks. For example, in our case, we could ask it to create a summary of the transcript. We could also ask it to create a blog post or perhaps an essay from the transcript. So we can get creative, since the transcript of the file is now available and we can use that text for many different purposes.
The flow and the components should be available in the store. Be sure to add your API key in all of these components, wherever it asks for one. If you don't, it might throw some errors. There are also some additional components available, which you can check out based on your use case. Give it a try and let us know if you find it helpful.
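To make the difference between the two subtitle formats mentioned above concrete, here is a minimal sketch that builds SRT and VTT text from timed sentences. It illustrates the file formats only and is not how AssemblyAI or the Langflow subtitle component produces them. The names `Cue`, `format_timestamp`, `to_srt`, and `to_vtt` are hypothetical, and each cue is assumed to be a `(start_ms, end_ms, text)` tuple.

```python
from typing import Iterable, Tuple

Cue = Tuple[int, int, str]  # (start_ms, end_ms, text), an assumed input shape


def format_timestamp(ms: int, separator: str) -> str:
    """Render milliseconds as HH:MM:SS<separator>mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{separator}{millis:03d}"


def to_srt(cues: Iterable[Cue]) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues: Iterable[Cue]) -> str:
    """WebVTT: 'WEBVTT' header, period before the milliseconds, no cue numbers needed."""
    blocks = ["WEBVTT\n"]
    for start, end, text in cues:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [(0, 1500, "Hi"), (1500, 3000, "there")]
    print(to_srt(sample))
    print(to_vtt(sample))
```

The two formats carry the same timing information, which is why converting from SRT to VTT afterwards is mostly a matter of changing the header and the millisecond separator.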
