Speaker 1: Hello all, my name is Krish Naik and welcome to my YouTube channel. I hope you guys like this new setup; I'll probably be recording all my upcoming videos like this. In this particular video, we are going to implement sequence-to-sequence learning for machine translation, and I'll show you how to take up the dataset and how to do everything as we go ahead. For this, please remember, guys, you need to be very good with the basics: you need to know RNNs, LSTM RNNs, and bidirectional RNNs. This video is mainly meant to show you, and make you understand, how the code for language translation is actually implemented. I'm going to refer to this particular blog; it is an amazing blog, and it is available to everyone. You can just read and understand the whole thing if you have your basics right.

Again, as I said, we are doing machine translation. First of all, for where we get the dataset, just click on this link. In this blog they have done English to French translation, so an English to French dataset is already provided. They have taken this pair because there is a huge dataset for it: more than 100K records of different English and French sentence pairs. We are going to use this, and I'll also run the whole thing and show it to you.

Now, let me go to the architecture and make you understand everything we have to do. For that, I'm just going to quickly erase all of this so that I can explain everything. If you have seen my encoders and decoders video on sequence-to-sequence learning, you know there are two main components: one is the encoder, and the other is the decoder. In the encoder, we just provide our words, all the processing happens, and a context vector W is created. Once this context vector is created, it is passed to the decoder layer, and with the help of that we create our output. One important thing to note here is that in the encoder we do not use the per-timestep outputs; we skip them, because we are really only interested in getting the context vector.

So what I'm going to do is make you understand how that code is implemented. If you just copy and paste the code, it will work, but we still need to understand how to design these encoders and decoders. One more thing to consider: we are passing characters like A, B, C, because this is specifically character-to-character translation. When I pass a character, we also need to take care of how it is passed: it has to be passed in the form of a vector. In this particular example, we are going to send these characters as one-hot representations. What does a one-hot representation mean? Suppose you have 71 characters.
In English, specifically in this particular example, there are 71 unique characters. Now, if I want to represent A as a vector, I will have 71 features, and at whichever position A is present I'll put a one; all the remaining positions will be zero. That is how A is represented, and we do the same for every character. And remember, guys, please make sure you have your basic concepts of LSTMs and RNNs and know how to implement them; that will make it much easier for me to make you understand.

Now, what are we going to do first of all? Remember, we are going to turn the sentences into three NumPy arrays: the encoder input data, the decoder input data, and the decoder target data. What does this mean? We obviously need to provide the input data, and as you know, in an LSTM the input has to be provided in three dimensions. The decoder input data holds the target sentences that are fed to the decoder, and the decoder target data is the same sequence shifted one timestep ahead; the context vectors generated by the encoder are what we use to initialize the decoder.

For the next step, you can see they have given a 3D array of shape: the number of pairs, which is how many sentences I have; the maximum sentence length; and the number of English characters. If you really want to give the input as a one-hot representation, we have to consider that last value, the total number of unique characters present, because we are going to one-hot vectorize the English sentences.

Let me just go back. This is how my dataset looks, guys: this is my English sentence, this is my French sentence, this is my English sentence, this is my French sentence, and so on; you have a huge amount of data like this. We need to remove the extra part; it is not required, so we will do some processing for that. Currently on my system, guys, I have TensorFlow 2.0, and whenever you have TensorFlow 2.0, Keras is integrated within it. If you don't have TensorFlow 2.0 or greater, just remove the tensorflow prefix from the imports and write from keras.models import Model and from keras.layers import Input, LSTM, Dense instead.

Now we are going to initialize some variables: the batch size, the number of epochs, and a latent dimension, which is the dimensionality of the feature space we are going to consider. For the number of samples to train on, we are going to take 10,000; here we are just specifying a value so that we train on that much data. After this, I have provided my dataset path. In this step, we are reading the input and the target data. The input data is all my English sentences, the target data is all my output sentences, and for each sentence we also collect all the unique characters. That is what we are doing.
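For readers following along, here is a minimal sketch of this setup and data-reading step, modeled on the Keras sequence-to-sequence blog example the video follows. The filename fra.txt is an assumption (it depends on where you saved the downloaded dataset), and the [:2] slice is there in case your copy of the file carries an extra attribution column:

```python
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM, Dense

batch_size = 64      # batch size for training
epochs = 100         # number of epochs to train for
latent_dim = 256     # dimensionality of the encoding (feature) space
num_samples = 10000  # number of sentence pairs to train on

data_path = "fra.txt"  # assumption: tab-separated English-French pairs

input_texts, target_texts = [], []
input_characters, target_characters = set(), set()
with open(data_path, "r", encoding="utf-8") as f:
    lines = f.read().split("\n")
for line in lines[: min(num_samples, len(lines) - 1)]:
    input_text, target_text = line.split("\t")[:2]
    # '\t' marks the start of a target sentence and '\n' marks its end,
    # so the decoder can tell where a sentence begins and finishes
    target_text = "\t" + target_text + "\n"
    input_texts.append(input_text)
    target_texts.append(target_text)
    input_characters.update(input_text)    # collect unique English characters
    target_characters.update(target_text)  # collect unique French characters
```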
So here you can see there is something like input characters and target characters. The input characters give you all the unique English characters; if I go and print input_characters, these are all my English characters, you can see them over here. Guys, this is a simple for loop, so I think you will be able to understand it. Before the input and target characters, you can see that, first of all, you open the file, then you traverse through each line and split it into the input text and the target text. If you really want to see input_text, here is all my English input text, and similarly, if you go and see target_text, you will see all the target texts. And remember, the '\n' is appended because we need to indicate where each specific sentence ends; only then, while training our neural networks, our LSTM layers, will we be able to know that this is the end of that sentence.

So we have got the input text and target text, and similarly we have got the input characters and target characters. The target characters, if you see, guys, are all the French characters that I have in my dataset; you can see them here. And if you really want to see the length, it will probably be somewhere around, I guess, 93.

So the next step is to take these input characters and target characters and compute the number of encoder tokens, that is, the length of the input characters, which you can see I'm printing here. I'm also finding out the maximum encoder sequence length and the maximum decoder sequence length, so all these values are there. The number of samples says how many sentences I have. The number of unique input tokens indicates how many unique characters I have in my English data, based on the dataset that I have. 93 is the length of the target characters. Then, what is the maximum length of an input sentence? Out of all my English sentences, the maximum length is somewhere around 16 characters. Similarly, the maximum sequence length for the output, the French sentences, is 59.

Now, in the next step, I'm assigning a token, that is, an index like 0, 1, 2, 3, to each character based on the sorted list, and that is what this specific code does. So if I execute this and show you the input token index: for however many English characters are present, it starts assigning indexes in sorted order. Here you can see that the space gets 0, this character gets 1, the dollar sign gets 2, then 3, 4, and so on. The last character is given index 70, so the total number of characters here is 71. Similarly, if I go and see the target token index, you will see the last index is around 92, which gives 93 characters. Now, the next step is to show you how we can do the one-hot representation using NumPy.
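As a rough sketch of this token-index step (again following the blog code the video refers to), the sorted characters get integer indexes, and the counts come out as discussed: around 71 English tokens, 93 French tokens, and maximum lengths of about 16 and 59 on this dataset:

```python
input_characters = sorted(list(input_characters))
target_characters = sorted(list(target_characters))
num_encoder_tokens = len(input_characters)    # unique English characters (~71 here)
num_decoder_tokens = len(target_characters)   # unique French characters (~93 here)
max_encoder_seq_length = max(len(txt) for txt in input_texts)   # ~16 here
max_decoder_seq_length = max(len(txt) for txt in target_texts)  # ~59 here

# Assign an index (0, 1, 2, ...) to every character, in sorted order
input_token_index = {char: i for i, char in enumerate(input_characters)}
target_token_index = {char: i for i, char in enumerate(target_characters)}
```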
Initially, for the encoder input data, I am going to use np.zeros: I'm putting in all-zero values based on the dimensions I've given here. Remember the first dimension: if I go back to that block, it says I have to put the number of pairs, which means the total number of sentences; I'm going to put that. The second parameter is the maximum English sentence length, which I also know. Then comes the number of English characters, which I know as well. So here I put max_encoder_seq_length, the maximum length of the input sentences, and num_encoder_tokens, which tells me how many unique English characters are present. Here are those values: the max sequence length for English is somewhere around 16, and the number of unique input tokens is 71; that is why I use num_encoder_tokens here.

Similarly, I have to do the same thing for the decoder input data, and for the decoder target data as well. Always remember, the decoder input data will have the same structure, but instead of using the English values I'll be using the French ones: max_decoder_seq_length and num_decoder_tokens will be taken from the French characters. So what will the French lengths be? If I go and see over here, the decoder sequence length, which is for the French we are trying to predict, is somewhere around 59, and the number of unique tokens present in the French characters is 93, so I'm going to use that. Similarly, the decoder target data also uses the information of the decoder, that is, the French characters.

Now, this particular code will help us do the one-hot representation. It is simple, guys; let me quickly show it in front of you. First of all, I'm just going to erase this. Suppose I have the characters W, O, R, D; suppose I just have these four characters. Now suppose at the first position I have the character W; then W will be given the value one and all the remaining characters will be given the value zero. That is what a one-hot representation says, and that is what I'm doing over here. But remember, for English, as we saw at the top, there are somewhere around 71 unique tokens, and the maximum sentence length is around 16. So considering the length of 16 and the 71 unique tokens, wherever a particular token is present, that position becomes one and all the rest become zero. That is what this whole code is doing.

You have to understand this particular code, guys; just try to print things out. Here you can see for t, char in enumerate(input_text). What is input_text? I'll just show you: input_text is the sentences, one by one, like this.
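Here is a minimal sketch of the np.zeros allocation and the one-hot filling loop being described, following the same blog code; note how the target data is the decoder input shifted one timestep ahead:

```python
encoder_input_data = np.zeros(
    (len(input_texts), max_encoder_seq_length, num_encoder_tokens),
    dtype="float32")
decoder_input_data = np.zeros(
    (len(input_texts), max_decoder_seq_length, num_decoder_tokens),
    dtype="float32")
decoder_target_data = np.zeros(
    (len(input_texts), max_decoder_seq_length, num_decoder_tokens),
    dtype="float32")

for i, (input_text, target_text) in enumerate(zip(input_texts, target_texts)):
    for t, char in enumerate(input_text):
        # one-hot: set this character's index to 1, everything else stays 0
        encoder_input_data[i, t, input_token_index[char]] = 1.0
    for t, char in enumerate(target_text):
        decoder_input_data[i, t, target_token_index[char]] = 1.0
        if t > 0:
            # decoder_target_data is one step ahead of decoder_input_data
            # and does not include the start character '\t'
            decoder_target_data[i, t - 1, target_token_index[char]] = 1.0
```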
What this loop is doing is replacing, wherever a particular character is present, that position with a one. Okay, guys, I'll not deep dive further into the code; just try to go line by line and understand it, it will always be beneficial to you. But remember, this whole block is doing the one-hot representation.

After you do this, what we are going to do is start creating our LSTM layers. The LSTM layer initially asks for the encoder input shape, and I give it as num_encoder_tokens. Then in the LSTM I have taken my latent dimension, which I initialized earlier as 256; this is the size of the hidden representation we carry across the timesteps. And then we set return_state to true. One very, very important thing is happening here, so just understand it. If I go to my WordPad: remember, I am not taking all these per-timestep outputs. I don't want those outputs, because that is not how an encoder-decoder actually works. So what am I actually doing here? See, first of all I initialize my LSTM layer. Now, when this encoder takes the encoder input, it gives us three values: one is the encoder output, which we have to skip because we don't require it; one is state_h, which is the hidden state; and one is state_c, the cell state. If I go back over here, guys, I will be requiring this hidden state information, that is, whatever I'm getting at the end after the complete set of timesteps. So I don't want the outputs; you can see that I'm just taking state_h and state_c and putting them inside my encoder states variable.

Now, similarly, what do I do with respect to the decoder inputs? For the decoder inputs, I know my number of decoder tokens, I absolutely know that. Again, I've used the same latent dimension and created the LSTM like that. But here, now, we are much more concerned with getting the output itself; I don't have to worry about the other things. So here I'm focusing on getting the decoder outputs. The decoder outputs mean this information; this is the layer that I'm creating in my second step. In the first layer, I wanted the state values, and I took them and stored them in a variable; from the decoder LSTM, I'm just interested in the decoder output. Then I'm using a dense layer, and finally I'm getting all the outputs with respect to that.

Now I will take all these values and use them to create a model. I give these values inside a list: you can see my encoder inputs and decoder inputs. And note that the decoder input data and the decoder target data are almost the same, just offset by one timestep. Then finally I'm getting my decoder output. After that, I'm compiling and fitting the data with some validation split and a hundred epochs; you can see that. This will probably be able to give you around 87 percent accuracy. And remember, guys, this will also work on your local laptop, and it will train properly.
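For reference, here is a sketch of the encoder-decoder model definition and training step just walked through, with the same structure as the blog code (the rmsprop optimizer and categorical cross-entropy loss follow that example; the accuracy metric is an assumption to match the 87 percent figure mentioned):

```python
# Encoder: keep only the final states (the context vector) and
# discard the per-timestep outputs
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = LSTM(latent_dim, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]  # state_h: hidden state, state_c: cell state

# Decoder: initialized with the encoder states; here we do want the
# full output sequence, which goes through a softmax dense layer
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation="softmax")
decoder_outputs = decoder_dense(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=batch_size, epochs=epochs, validation_split=0.2)
```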
Within about 15 minutes it will be able to finish the hundred epochs, since this is a character-to-character implementation. Okay, so that is how my model got trained. Now, one assignment I really want to give you guys is this: please try to understand the code where I'm actually generating the sentences. You can see that this part is what generates the sentences, so just try to understand it; a rough sketch of it is also included below for reference. I know it will be a little bit difficult, but I want you to read the blog and understand how we are doing the sampling and how we are generating the output. I've explained everything up to how the model is actually trained; that part is pretty simple. Refer to the blog. And the reason I'm telling you this: if you don't know about LSTMs, don't come here first of all, because otherwise you will not be able to understand it. In the next model, which I'm going to discuss in the video about attention models, we again need to make some architecture changes: instead of using this encoder over here, it will be changed to a bidirectional LSTM.

So yes, guys, this was all about this particular video. Please do let me know if you have any queries; I would be very, very happy to help you out. But please give it a try: try to understand this code, which is actually translating your sentences. This is what I want you all to explore. It is hardly around 15 to 20 lines of code, but I really want to make you understand it, and I think if you have understood this, the other part will be pretty easy. In my next deep learning video, we will be coming with attention models, and we'll discuss how they actually work. I'll see you in the next video. Have a great day. Thank you. Bye bye.
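For the assignment mentioned above, here is a minimal sketch of that sampling (inference) step, following the approach in the blog the video is based on: separate encoder and decoder models are rebuilt from the trained layers, and the translation is generated one character at a time.

```python
# Inference: encode the input to get the context states, then decode
# character by character, feeding each prediction back in
encoder_model = Model(encoder_inputs, encoder_states)

decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs,
                      [decoder_outputs, state_h, state_c])

reverse_target_char_index = {i: c for c, i in target_token_index.items()}

def decode_sequence(input_seq):
    # Encode the input sentence into the context (state) vectors
    states_value = encoder_model.predict(input_seq)
    # Start decoding with the '\t' start-of-sequence character
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, target_token_index["\t"]] = 1.0
    decoded_sentence = ""
    while True:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)
        # Greedy sampling: take the most likely next character
        sampled_index = int(np.argmax(output_tokens[0, -1, :]))
        sampled_char = reverse_target_char_index[sampled_index]
        decoded_sentence += sampled_char
        # Stop at the '\n' end marker or when the sentence gets too long
        if sampled_char == "\n" or len(decoded_sentence) > max_decoder_seq_length:
            break
        # Feed the sampled character back in and carry the states forward
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_index] = 1.0
        states_value = [h, c]
    return decoded_sentence
```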