Speaker 1: Okay, I really wanted to present this demo. In 2.6, our team from Mconf contributed live captions using Google Web Speech, which is a really simple API from Google to integrate: every browser sends its audio to Google. This works only on Chrome. The audio goes to Google's servers, they transcribe it with their AI magic and send it back, and then BigBlueButton has the result for each user's audio and we can do live captions. This is all good, but it uses Google, which is a proprietary system, and it's also really painful for people all around Europe, and actually in Brazil with our new legislation, because of data privacy. So the idea is: let's use one of the newer open-source AI systems so that we can host them ourselves, and that's what we did. We didn't do it before because this technology is really new. But now, Arthur, who is kind of shy, he's somewhere around here, really helped a lot, and we took something called Vosk, which is a transcription server that actually uses Kaldi in the background. What we're doing technically is really simple: we intercept the call on FreeSWITCH, the audio server that BBB uses, with something called mod_audio_fork, we send the audio directly to Vosk, and Vosk sends us back JSON messages with the transcription information. It's not perfect, as I guess you're seeing, especially when you're saying lots of the nonsense made-up words that programmers use. Oh, no, it stopped. Okay, it's back. This happens with Google as well, so be kind with me. So this is running Vosk, and it's transcribing my audio in real time. This would work for all the users in the session; we can get their audio streams independently, so that's pretty cool. And here on the side, I'm not sure why it stopped, but the idea is that most transcription systems give you partial results, which, as you see, can change over time, because the AI may become more confident in a better transcription the more you speak. We display those partial results, and when it's fairly sure that you finished the sentence, maybe because you stopped talking for a while, or maybe because it just thinks the sentence should end there, it gives you a final result. On the side is an Etherpad, which is actually an old system we had for transcriptions, where a transcriber would type them manually, so this is kind of a mashup of both systems: when we have the final result, it's printed, actually appended, to the Etherpad. It's a live demo, so something went wrong here, but until a few moments ago it was sending the messages there. And the idea is that after the messages have passed, someone could just edit them and fix bad transcriptions and so on. But I think that's it.
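[Editor's note] The flow described above (stream audio in, receive JSON with partial hypotheses that later settle into final results, append the finals to a pad) can be sketched against a stock vosk-server WebSocket endpoint. This is a minimal illustration, not the actual BBB/Mconf glue code: it reads a local raw 16 kHz mono PCM file instead of a mod_audio_fork stream, the ws://localhost:2700 endpoint is vosk-server's default, and append_to_pad() is a hypothetical stand-in for the Etherpad append.

```python
# Minimal sketch of the Vosk streaming flow, assuming a vosk-server
# instance on ws://localhost:2700 and a raw 16 kHz / 16-bit mono PCM
# file. File name and append_to_pad() are illustrative placeholders.
import asyncio
import json
import websockets


def append_to_pad(sentence: str) -> None:
    # Stand-in for appending a finalized sentence to the Etherpad.
    print("FINAL:", sentence)


async def transcribe(path: str, url: str = "ws://localhost:2700") -> None:
    async with websockets.connect(url) as ws:
        # Tell the server the sample rate of the incoming PCM stream.
        await ws.send(json.dumps({"config": {"sample_rate": 16000}}))

        with open(path, "rb") as audio:
            while chunk := audio.read(4000):
                await ws.send(chunk)  # stream raw audio bytes
                msg = json.loads(await ws.recv())
                if "partial" in msg:
                    # Partial hypothesis: may still change as more audio arrives.
                    print("partial:", msg["partial"])
                elif msg.get("text"):
                    # Final result: the recognizer considers the utterance done.
                    append_to_pad(msg["text"])

        await ws.send(json.dumps({"eof": 1}))  # flush the last utterance
        last = json.loads(await ws.recv())
        if last.get("text"):
            append_to_pad(last["text"])


asyncio.run(transcribe("meeting-audio.raw"))
```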
Going forward, we're making the components that glue this together configurable, so we could support other transcription servers. The other candidate is OpenAI Whisper, which, if we get lucky, we can get working before the end of the afternoon, because it's mostly a matter of doing the handshake and sending the data in the correct format, so it shouldn't be too hard. But that's the demo. That's it.
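[Editor's note] The "configurable backend" point above can be illustrated with a small interface sketch: if the glue code only depends on a handshake message and a normalized partial/final message shape, a Whisper-based server becomes another implementation of the same interface. Class and method names here are illustrative assumptions, not the actual BBB/Mconf component.

```python
# Sketch of a pluggable transcription backend, assuming the glue code
# only needs a session-open handshake and normalized partial/final
# messages. All names are illustrative, not BBB's real code.
from abc import ABC, abstractmethod


class TranscriptionBackend(ABC):
    @abstractmethod
    def handshake(self, sample_rate: int) -> dict:
        """Return the provider-specific session-open message."""

    @abstractmethod
    def parse_message(self, raw: dict) -> tuple[str, str]:
        """Normalize a provider message into ("partial" | "final", text)."""


class VoskBackend(TranscriptionBackend):
    def handshake(self, sample_rate: int) -> dict:
        return {"config": {"sample_rate": sample_rate}}

    def parse_message(self, raw: dict) -> tuple[str, str]:
        if "partial" in raw:
            return ("partial", raw["partial"])
        return ("final", raw.get("text", ""))


# A Whisper-based backend would implement the same two methods,
# translating its own wire format into the same message kinds.
backend = VoskBackend()
print(backend.handshake(16000))
print(backend.parse_message({"partial": "hello wor"}))
print(backend.parse_message({"text": "hello world"}))
```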