Improving a LibriSpeech Model with Better Language Models
Reduce word error rate by rescoring with an RNNLM in Kaldi, using either the basic Kaldi RNNLM or a PyTorch-based model.
File
Dan Kaldi 9 How to improve WER in LibriSpech model
Added on 01/29/2025

Speaker 1: Hello, this is Daniel Povey, and today we're going to ask him: we trained a LibriSpeech model using Kaldi's scripts. What is the next step? What can we do now to improve its word error rate?

Speaker 2: Hmm. Well, when you ask that question, I'm going to assume that you trained to the very end of run.sh, the chain system. So already that's a pretty good system. But if you want to improve the word error rate further, I think the main thing you can do is use a better language model. The default decoding in Kaldi is, I think, with a four-gram language model; that script should be testing with a four-gram. That's as good as you can get from an n-gram language model with graph-based decoding. But you can improve on that by rescoring with an RNNLM. There are some scripts in there to rescore with an RNNLM. This is a Kaldi-based RNNLM, not one of those PyTorch-based transformers or something. So it's a pretty basic RNNLM; these days people can do better. And we do have some scripts somewhere in Kaldi with which you can run a PyTorch-based RNNLM. But I would recommend using the Kaldi one for now, simply because there are fewer things that can go wrong.

Speaker 1: Will we do rescoring with this new RNNLM?

Speaker 2: Yeah, you'll do lattice rescoring. We don't normally decode with the RNNLM in the first pass. So you decode the entire utterance and then you rescore the lattice.

Speaker 1: Okay. Thank you.

Speaker 2: Okay. Bye. Bye.
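
The two-pass idea described above (decode first with the n-gram model, then rescore with a stronger language model) can be illustrated with a toy sketch. This is not Kaldi code and does not use any Kaldi API: Kaldi rescores lattices, while this sketch rescores a short n-best list, which is a simplification. The `Hypothesis` class, the function names, the interpolation weight, and all the costs are made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One candidate transcript with hypothetical costs (lower is better)."""
    words: str
    acoustic_cost: float  # negative log-likelihood from the acoustic model
    ngram_cost: float     # negative log-probability under the first-pass n-gram LM
    rnnlm_cost: float     # negative log-probability under the second-pass RNNLM


def first_pass_cost(h: Hypothesis, lm_scale: float = 1.0) -> float:
    """Total cost when only the n-gram LM is used (first-pass decoding)."""
    return h.acoustic_cost + lm_scale * h.ngram_cost


def rescored_cost(h: Hypothesis, rnnlm_weight: float = 0.5,
                  lm_scale: float = 1.0) -> float:
    """Total cost after interpolating the n-gram and RNNLM costs.

    The acoustic cost is unchanged; only the language model part is replaced
    by a weighted mix of the two LM costs (an assumed interpolation scheme).
    """
    lm_cost = (1.0 - rnnlm_weight) * h.ngram_cost + rnnlm_weight * h.rnnlm_cost
    return h.acoustic_cost + lm_scale * lm_cost


def best(hyps, cost_fn) -> Hypothesis:
    """Return the hypothesis with the lowest total cost."""
    return min(hyps, key=cost_fn)


# Made-up n-best list standing in for the alternatives stored in a lattice.
nbest = [
    Hypothesis("i scream for ice cream", 100.0, 20.0, 30.0),
    Hypothesis("ice cream for ice cream", 101.0, 22.0, 18.0),
]

print("first pass:", best(nbest, first_pass_cost).words)
print("rescored:  ", best(nbest, rescored_cost).words)
```

With these invented numbers the first pass prefers the first hypothesis, and the rescoring pass switches to the second because the stronger language model assigns it a much lower cost. A lattice stores many more alternatives compactly, but the principle of keeping the acoustic scores and swapping in better language model scores is the same.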
