Local AI Tool: Easy Transcription with Whisper AI

Discover how to install and use Whisper AI locally to transcribe audio and video securely without uploading data. Perfect for efficient note-taking.

Whisper AI - Artificial Intelligence to Never Take Notes Again

Added on 01/29/2025

Speaker 1: Hey there, and welcome back to the Sigma Engineering channel. The last few weeks I've been busy with meetings, constant meetings, and lots of note-taking. Luckily for me, I had a really cool AI tool that could transcribe most of my notes for me. It was easy, private, and all done locally on my machine, with no uploading of any content that was proprietary, confidential, etc. I gotta tell you, it's awesome, and I want to share it with you all today and show you how to install it and use it. We're gonna be using some GitHub programs that utilize Whisper AI, so I'll leave a link in the description below to where you can find these programs as we move forward.
So before we get into the install process, let's talk about how these different programs work. Basically, they use Whisper AI from OpenAI as the backbone of the program, and Whisper AI is awesome ASR tech. ASR stands for automatic speech recognition. This software allows a computer to convert spoken language into written text: it uses advanced algorithms and machine learning to analyze audio recordings and transcribe what it's hearing into a text format, and that is what these programs do.
So in the paper, just some highlights: the paper shows that using a lot of supervised data for training is really important. Whisper uses a diverse data set that contains a large number of transcribed audio samples. Also, on the note of zero-shot transfer, Whisper's approach involves training the model on one language and then applying it to understand and transcribe other languages without any additional language-specific training. This is called zero-shot transfer. It's also trained to be resistant to background noise and other distractions, and it's trained on different languages, so it can understand multiple languages without needing separate training for each language. Whisper can also accurately transcribe long speeches or conversations, even if they're hours long.
So basically it kind of truncates different parts of your speech, separates them out into little clips, and then puts it all together. And you know, when it comes to Whisper, accuracy is almost on par with human transcribers. I think they said it was at about the 80% accuracy level, which was pretty good. And so unlike other methods that use complex techniques, Whisper relies on straightforward supervised training and zero-shot transfer to achieve its impressive results.
So let's get back to the installation of transcribe-anything. The GitHub page should have the requirements to install the software. You are gonna need Python. I recommend version 3.10, and I showed how to install that in previous videos. It's very straightforward, but I'll put a link in the description below so you can get Python installed. If you want to do this, you just copy the URL, go to some folder on your desktop or on your computer, click on the Explorer icon in the link bar, type in cmd, and then just say git clone and the link to transcribe-anything. It should download everything you need, and you're gonna see everything in this folder, and then you're also gonna see the Python setup file. So if I hit cmd again here, I'm just gonna say pip install transcribe-anything. These instructions are on the GitHub page, and it will start installing the software on your computer. Once installed, just type in transcribe-anything, put in a link such as a link to a YouTube video, and then let it do its thing. What it's doing now is it's going to take that YouTube video, interpret what the audio is saying, and convert it to a text format.
Now that the transcription is complete, you're going to see a folder, and it'll say text and, you know, whatever the subject was, or file or video. If you open that folder, there should be a text file within it, and this is the output of the transcription itself. And there you have it.
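The install-and-run steps described in this part of the video could be scripted roughly as follows. This is a minimal Python sketch and not something shown in the video. It assumes transcribe-anything has already been installed with pip and is on the PATH, that the tool takes a URL or file path as its single positional argument (as the speaker describes), and that output lands in a folder whose name starts with `text_` (an assumption based on the speaker's description of a folder that "says text"). The helper names `build_command`, `find_transcripts`, and `transcribe` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(source: str) -> list[str]:
    """Return the argv for transcribing a URL or a local file.

    Only the positional source argument is used; no optional flags are assumed.
    """
    return ["transcribe-anything", source]


def find_transcripts(workdir: Path) -> list[Path]:
    """List .txt files inside folders named text_* directly under workdir.

    The text_* folder pattern is an assumption; adjust it if your output differs.
    """
    return sorted(
        txt
        for folder in workdir.glob("text_*")
        if folder.is_dir()
        for txt in folder.rglob("*.txt")
    )


def transcribe(source: str, workdir: Path = Path(".")) -> list[Path]:
    """Run the CLI inside workdir and return any transcript files found afterwards."""
    if shutil.which("transcribe-anything") is None:
        raise RuntimeError(
            "transcribe-anything not found; install it with: pip install transcribe-anything"
        )
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(source), cwd=workdir, check=True)
    return find_transcripts(workdir)
```

Calling `transcribe(url)` with a video link would then run the tool and return the paths of any text files it finds, which can be opened like any other file.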
I bet you didn't think you were going to get Rickrolled in this video. So now, if you have local files, audio files or video files, you can just place them in your transcribe-anything folder, and then in your command window just say transcribe-anything, and I'm going to use this test file. It's test.mp3, and then it will transcribe that mp3 file for me. So if you have a note you took on your iPhone, one of those voice memos, you can put it through this program and transfer that into text.
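For a folder of local recordings, such as voice memos copied from a phone, the same command could be run once per file. The following is a small Python sketch under stated assumptions: the extension list is illustrative (iPhone voice memos are typically .m4a files), the function names `find_media` and `transcribe_folder` are hypothetical, and transcribe-anything is assumed to take the file name as its only argument, as in the speaker's test.mp3 example.

```python
import subprocess
from pathlib import Path

# Illustrative set of audio/video extensions; extend it as needed.
MEDIA_EXTENSIONS = {".mp3", ".m4a", ".wav", ".mp4"}


def find_media(folder: Path) -> list[Path]:
    """Return media files directly inside folder, matched by extension (case-insensitive)."""
    return sorted(
        p
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
    )


def transcribe_folder(folder: Path) -> None:
    """Run transcribe-anything once per media file, from inside the folder."""
    for media in find_media(folder):
        # Equivalent to typing: transcribe-anything test.mp3
        subprocess.run(["transcribe-anything", media.name], cwd=folder, check=True)
```

Running the tool from inside the folder mirrors what the speaker does in the command window, so any output appears next to the recordings.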

Speaker 2: Hi, my name is Steven. I will read any text you type here.

Speaker 1: And everything checks out there. Now for the final part of the video let's put this program through its paces on one final ultimate test.

Speaker 3: This is the Micro Machine Man presenting the most midget miniature motorcade of Micro Machines. Each one has dramatic details, terrific trim, precision paint jobs, plus incredible Micro Machine Pocket Playsets. There's a police station, fire station, restaurant, service station and more. Perfect pocket portables to take anyplace. And there are many miniature playsets to play with and each one comes with its own special edition Micro Machine vehicle and fun fantastic features that miraculously move. Raise the boat lift at the airport marina. Man the gun turret at the army base. Clean your car at the car wash. Raise the toll bridge. And these playsets fit together to form a Micro Machine world. Micro Machine Pocket Playsets so tremendously tiny, so perfectly precise, so dazzlingly detailed you'll want to pocket them all. Micro Machines and Micro Machine Pocket Playsets sold separately from Galoob. The smaller they are, the better they are.

Speaker 1: For this one, we're going to use a program called Buzz. This utilizes Whisper AI, and I'll leave a link to the GitHub repository in the description below. I think this one is pretty good for Windows users, Mac users, and Linux users; there's a one-click installation, so there's nothing complicated here. And it has an easy drag-and-drop UI where you can upload your recordings, anything you need. So overall, this is a much simpler way to utilize Whisper AI on your computer. Once it's done installing, just open it up and run Buzz on your PC. Once you start it up, you'll see a UI where you can import your files, and we're going to import that video we just saw and see how it does. Once the file is selected, just choose the model you want. Tiny will work for this small video, and I want to export a text file. Once it is in the queue, you'll see the progress. And once it's complete, you can either double-click on it or find the text file in the directory where you chose your media file. So again, you can double-click on it, you can see how it broke up all the transcription, and you can export from here as a text file. Again, it already put one in the core directory where I chose my file. And as we can see, here is the transcription in all its glory. Buzz works really well, utilizes Whisper AI, and does great audio or video to text. So it's great for taking a meeting and making notes out of it. And from here, you know, you can copy and paste these notes into a local LLM, such as Llama 2. If you're running Oobabooga, then you can use that. And I'll talk about that in a future video, where you can run an LLM like ChatGPT on your local computer and keep everything private, so there's no data going on the web. Hope you enjoyed this video. If you did, please like and subscribe. It helps the channel. Look forward to seeing you next time. Thanks. Bye.
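Before pasting a long transcript into a local LLM, it may help to split it into pieces that fit the model's context window. The Python sketch below is illustrative only: the 2,000-character limit and the prompt wording are arbitrary assumptions, not values from the video or from any particular model, and the function names `split_transcript` and `build_prompt` are hypothetical.

```python
import re


def split_transcript(text: str, max_chars: int = 2000) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own oversized chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def build_prompt(chunk: str) -> str:
    """Wrap one chunk in a note-taking instruction (the wording is an assumption)."""
    return "Summarize the following meeting transcript as bullet-point notes:\n\n" + chunk
```

Each chunk returned by `split_transcript` can be passed through `build_prompt` and pasted into the local model one at a time, so everything stays on the local machine.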
