Discover Riverside's Free AI Transcription Tool: Accurate, Fast, and Easy to Use
Explore Riverside's new free AI transcription tool that supports over 100 languages. Learn how it works, its accuracy, and how to use it for your audio and video files.
How to transcribe audio to text for FREE - Riverside's new AI transcription tool
Added on 09/07/2024
Speaker 1: Riverside just released a completely free transcription tool using AI where you can upload any audio or video and it can transcribe it in over a hundred different languages. This tool is completely free. It is built on top of Whisper, which is another tool from OpenAI, the same company that built ChatGPT, GPT-3, GPT-4. Later in this video, we're going to see how accurate it is. We're going to take an output from the transcripts and compare it to a transcript that has been proofread and see how the accuracy matches up. But for now, let's jump into the tool and just get an overview of how it works.
So if you head over to riverside.fm/transcription, you can see this is a super simple interface. It's really cool. You actually don't even need an account or anything. You just drag and drop your file. So I'm going to do an old podcast episode of ours, which is actually an interview with Jarvis about GPT-3 from a year or two ago when they were releasing tools for it. And so we just drag and drop the file and it'll do audio or video files and then we'll click start transcribing and it will upload the file. This file is a little under five gigabytes, so you can see it's just taking a little bit of time. This is also about a 50 minute interview. I haven't seen any listings on this if there are actually any limits, file size limits or duration limits, but I imagine there's some sort of limit. But this is pretty much like an average size interview and a pretty big file and it's uploading it in the background. So let's just fast forward through this once the upload is done.
All right. So the file uploaded, that took about, I don't know, four minutes, five minutes, and now it is transcribing. It is processing it in real time as we can see how it works. So it's actually processing the audio in real time. We can see the text and we can see the timestamps that it's generating as it's going. Now this is sort of kind of for looks right now.
This isn't anything that we're going to do anything with right now. We're just going to let it process and we can see how it's going in real time. So we're just going to let this finish transcribing and then we're going to come back in and see what we can do with the finalized transcript.
All right. So actually I had just stopped recording. I thought it was going to keep going because it just says eight minutes, 45 seconds, but now it's saying the transcript's ready. So I don't know if this other part is just kind of for show of like it processing in real time, but on the bottom there was a little percentage thing. And so that moved way quicker. So it was about an hour interview and it transcribed the whole thing in one to two minutes. We could check the footage to see how long it took and put the number up here.
Okay. So what can we do with this? One thing is we can copy it to a clipboard and paste it wherever we want. And if we do it this way, we see that we have time code markers, but we don't have speaker labels. So this video was two different speakers, me and the person I was interviewing. So the one thing that this transcript won't do is it won't identify speakers. You won't get speaker labels. So if you need that, you have to load it into something else that can do speaker identification like Descript. I'll have a link here on another tutorial that we've got on how to import transcripts from a separate source into Descript. So you can check that video out here on how to do that, but it does have time codes. So if you just need to kind of save this as a reference, put it in something that you can search later, great option. You could do that here if you want to throw it in Evernote or Notion or something else.
Other options we've got: we can click download and then we can download this in two different formats. One is a text file that is a transcript, and this is going to be slightly different than what we just saw with copy and paste.
And then a little pop-up to try Riverside, which is totally fair because this tool has been completely free so far. By the way, if you do want to sign up for Riverside, put in the code Joey30 to save 30% off whatever plan you choose. You're welcome.
All right. So if we open up that text file, we can see we are literally just getting a raw output of the text. No time code markers, nothing identifying it. It does say speaker up here at the top, but it's not identifying when the speaker changes. It is just literally the raw output of the text. No paragraph breaks, just text.
And then the other cool thing is also we could download an SRT file, which is captions. And so this is something that we would use if we wanted to caption a final video and import it into YouTube. I would have to do a comparison, but I'm going to guess that this Whisper version from OpenAI is much more accurate than what you would find in YouTube's auto transcription, which is notoriously not very accurate. So if you want to get your stuff transcribed for captioning, you could download the SRT file and then upload it to wherever you are hosting your video: YouTube, Vimeo, any platform. They all support SRT files. LinkedIn, Facebook, they all support it.
Now a couple of downsides of this free service. So it is free. So like you can't really complain about it, but one, you can't preview back the audio and test its accuracy. You're going to have to load it into another source. Again, you can do that in Descript to check and see how accurate the audio is. Also, you can't change any of the text. If you see any inaccuracies as it's transcribing, you can't change anything. You've got to take your text, put it somewhere else and then change it. Again, you can do that in Descript. But yeah, I mean, this is a phenomenal tool for being free. You just load your audio in here. Also the other thing that is a little bit weird is if you are a Riverside user, you don't have to be logged in. This is completely separate.
There's no way to push your podcast episodes and get this transcribed here. I would imagine that this is going to be integrated in the backend of Riverside sometime soon. As I'm recording this, there is a Riverside announcement next week where they're releasing a new version of their app. So I would expect to see some sort of integration with this transcription built into the app. But as of right now, as I'm recording this, it's a completely separate thing from the Riverside app.
All right. And so lastly, let's see how accurate this transcript is. I'm going to go to this free online comparison tool. So I'm going to take this text that has already been proofread that we used in the episode, paste it here. And then I'm going to take the output from Riverside, which was using Whisper, which is from OpenAI, paste it here. And let's see how the two compare. And looking at this, it is, I mean, look, so the overall similarity says 88%. Seems pretty accurate. And just kind of scanning this, most of the differences look like punctuation points, like the period was added or a comma was added. The other issues I'm seeing are like proper words. Ironically, it got GPT wrong and said it was GTB3, which is funny because it's built on a system that created GPT-3. Some of the proper names, Austin Distal, Distal. So that's another sort of con, if you could add a con, was that you can't upload your own custom vocab library so that it is aware of specific proper words.
But yeah, I mean, for free, for running through this, from what I can see, it looks like most of these differences are just different capitalizations, different punctuations, a comma where a comma wasn't added, and a handful of some proper words. I mean, look, it got California, it got other things correct. Got spelling of the letters correct, did not get the website correct, but that's okay. Got Jarvis correct. Got Austin correct. SEO got correct. I mean, I'd say did a great job.
Any type of auto transcript you get, you're going to want to have to proofread it anyways to make sure it is accurate. But for this, for free, awesome starting point, definitely some of the most accurate transcriptions I've seen. Look at this one. It got Entrepreneur Alliance with the proper capitalization correct. That's cool. Super impressive and super handy. So go check that out: riverside.fm/transcription. 100% free. Upload your audio, upload your video, get it transcribed and use it however you need to use it.
Thanks for watching. If you found this useful, make sure you give this video a thumbs up and hit the subscribe button for more useful videos like this one. Also, we've got a whole course on how to use Riverside if you are doing podcasting and video podcasting, which also you should be doing video podcasting because YouTube is only getting bigger with podcasts. So be sure to check that out. Thanks for watching. I'll catch you in the next episode.
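Editor's note: the accuracy check described in the video (pasting a proofread transcript and the tool's output into a comparison tool) can be roughly reproduced offline. The sketch below is only an illustration using Python's standard `difflib`; it is not the comparison tool used in the video, and its score will not necessarily match the 88% shown there. The function names `normalize` and `similarity` and the sample sentences are hypothetical. Because the speaker notes that most differences were punctuation and capitalization, the sketch can optionally ignore both before comparing.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and keep only runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity(reference: str, candidate: str, ignore_formatting: bool = False) -> float:
    """Return a word-level similarity ratio between 0.0 and 1.0.

    With ignore_formatting=True, punctuation and capitalization differences
    are removed before the comparison.
    """
    if ignore_formatting:
        ref_tokens, cand_tokens = normalize(reference), normalize(candidate)
    else:
        ref_tokens, cand_tokens = reference.split(), candidate.split()
    matcher = difflib.SequenceMatcher(None, ref_tokens, cand_tokens, autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    # Made-up sample sentences, not taken from the actual episode.
    proofread = "Today we talk about GPT-3, Jarvis, and SEO."
    automatic = "today we talk about GTB3 Jarvis and SEO"

    print(f"raw similarity:        {similarity(proofread, automatic):.2f}")
    print(f"ignoring punctuation:  {similarity(proofread, automatic, True):.2f}")
```

The gap between the two scores gives a rough sense of how much of the mismatch is formatting and how much is genuinely wrong words, such as misheard proper names.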
