Understanding Translation Memory: Key Concepts and Benefits for Translators (Full Transcript)

Explore the fundamentals of translation memory, its components, and how it enhances translation efficiency, quality, and consistency for translators.

Speaker 1: Ciao, I'm Massimo from SDL, and today I'm here to talk about translation memory. So what is a translation memory? It's a language-pair database that stores segments of text which have been previously translated so they can be recalled for later use. I don't know how much sense this made to you, so we're going to try to explain all of this. A few keywords here. Database: hopefully you know what that is, but it's a place where things get stored. We are going to talk about segments, obviously text, and recalling, so bringing that back. And I'm explaining this because a lot of the people in this industry are not native English speakers, so I just want to make sure that our terminology is as clear as possible.
What is a segment? This is a question that comes up, and especially if you're new to the industry, new to translation, you might actually not know what a segment is. You might figure it out. In a way, a segment can be different things, and this is what gets stored inside the translation memory. It could be a sentence, and this is what you probably relate to the most. It could be a phrase, and actually there's not much difference between a sentence and a phrase. I think it's a bit of semantics, it's terminology, whatever word works best for you. It could be a paragraph, and a paragraph is a little bit different from a sentence. Typically, a paragraph is a combination of sentences, and traditionally in the translation memory you store those two, sentences and phrases. You can also choose to store entire paragraphs, but obviously the more you store in one segment, the harder it is to match in the future. We're going to look a little bit more at what that means, but keep that thought of storing sentences. So a segment ends after a full stop or after a semicolon, for example. If you have a very long paragraph with three phrases or sentences inside, you can store it as a whole, but you might decide not to do that.
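As an illustration of the sentence-level segmentation described above, here is a minimal Python sketch. It assumes a very simple rule: a segment ends after a full stop, semicolon, question mark or exclamation mark followed by whitespace. The function name `split_into_segments` is hypothetical, and real CAT tools use configurable and far more careful rules (abbreviations, numbers and so on), so this shows the idea only.

```python
import re

# Assumed rule (not taken from any specific CAT tool): a segment ends after
# ".", ";", "!" or "?" when that character is followed by whitespace.
_SEGMENT_BREAK = re.compile(r"(?<=[.;!?])\s+")


def split_into_segments(paragraph: str) -> list[str]:
    """Split one paragraph into sentence-level segments."""
    parts = _SEGMENT_BREAK.split(paragraph.strip())
    return [part for part in parts if part]


paragraph = "Press the power button. Wait for the light; then open the lid."
print(split_into_segments(paragraph))
# ['Press the power button.', 'Wait for the light;', 'then open the lid.']
```

A heading with no closing punctuation comes back as a single segment, which matches the idea that titles and headings are stored as segments too.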
Headings, of course, a title, the beginning of a document: that is also what gets stored. So in essence, a segment is a finite set of words that gets stored in a translation memory.
The second piece is, what is a translation unit? A translation memory is made up of translation units, and sometimes people call this a TU. So what's a translation unit? It's, in essence, the building block of a translation memory. The translation memory stores the source text, so what you've got to translate, and the translation of that sentence. So typically, if you look at a translation memory, you'll find many translation units, and each translation unit is going to be something like: "Hello, I'm Massimo." "Ciao, sono Massimo." So you have the English and its translation. That's what a translation unit is. And what then is going to happen is that the translation memory can search all these translation units, and as you translate, it will find similar sentences that you've translated before and suggest their translations to you.
So you might wonder, how does this help? It helps in many different ways. The classic is that you never have to translate the same sentence again. There are different benefits to that. First of all, re-translating the same things is boring, so you don't have to do it again. Sometimes you have to watch out, though. Sometimes you have translated something, and you do actually need a different translation because of a different context. So you always need to check those translations. But in essence, when you've translated something, you don't want to translate it again. It does save you time, but it's also about quality. You might have spent a lot of time translating a sentence and getting it just right. Not just right for you, but also you might have a client.
There might be somebody else who reviewed that translation, and at the end of the process, everybody was 100% happy with that translation. You want to make sure that that translation is the one that gets used again. So it's about quality and consistency and making sure that the best translation is reused again and again.
Obviously, all of these translations get stored for future use. So you're building an asset, in essence. You're building this database made out of translation units, which contain all the sentences you have previously translated, and it's something you build for future use. There are people who've built up millions of words. It's not a joke. There are translators who have a million translation units. Can you imagine? And they can reuse them, which saves them a lot of time and also ensures consistency and quality. One of the nice things about a translation memory is the more you add, the more you can recall. And at the end of the day, the interesting piece is that you're actually not doing more work. You're translating as you were translating before. The difference is that you're storing that translation for the future, and it just happens automatically. It's really quite easy to add to and build a translation memory.
It can get big pretty quickly. Think that a typical translator translates 2,000 new words a day; that's what the industry sort of agrees on. Of course, it can be a lot less than that. If you have a marketing slogan of 10 words to translate, it might actually take you half a day to translate that slogan. But for typical content, commercial content, instruction manuals, technical documentation, the average is 2,000 words a day. Those are 2,000 words a day that you can store in the database for future use. So it does help you translate faster, because you have stored all of this.
But I really want to stress that the speed that it offers you is also very much about giving you more time for when you are translating new content, so that you can create the best possible translation for it. And of course, it takes away the monotony of translating the same sentence again and again. For one marketing slogan, there could be one document where the same sentence is repeated hundreds of times. And so that's where a translation memory can really make a difference.
So lots of information so far: segments, translation units, and so on. But hopefully you got the concept of having your translations stored in translation units.
But how does it all work? Once you get your new text to translate, a CAT tool, and we have a video on CAT tools, is going to look at your entire database, look at all those translation units, and try to find something that matches. And it happens instantly. It's very, very quick. So you're not sitting there waiting for the translation memory to find a translation. It's pretty much instant.
So I just used the word match. And this is the essence of a translation memory. What it's trying to do is match the new sentence you have to translate with the sentences you translated before. And different things can happen. There are four types of matches, so let's look at them.
The first one is the context match. This is the highest quality match. This is where not only have you translated that sentence before, but the translation memory can recognise that you translated that sentence and the sentences around it. So in that case, the system can say, hmm, not only did I translate this before, but what's around it is also the same. So that's a context match, and the chances are that it is going to be a really good match.
The second match is the 100% match. This is something that you might hear quite a lot in this industry. This is the more traditional match.
It's something you translated before, in the same format as well. If something was bold, it needs to also be bold, or you might have to fix the formatting, for example. And so that's the 100% match. In this case, the context is not taken into consideration. It's just looking at that sentence. And if what's around it is not the same, the system will just say, this is a 100% match. Typically, you need to check a 100% match, because there might be surprises. If any of you speak different languages, think of masculine and feminine, or maybe something was a heading and something was in the middle of a sentence: you might actually need to use different translations. So the 100% match is going to be pretty accurate, but you do need to cast an eye over it.
Then we go down a level to what we call a fuzzy match. And in fact, the whole industry calls it a fuzzy match. That's something below a 100% match. Typically, you might hear about a 70%, 80%, 90%, 95%, 99% match. So that's clearly a sentence which is similar, but not identical. The difference could be one word. It could be a formatting difference. So various little things, which means that the translation memory will suggest that translation, and then you can fix it: add the missing word, replace a word, restructure the sentence a little bit to make it work.
One of the newest matches is the fragment match. We talked about paragraphs and sentences and so on. Within a sentence, there might be fragments. So you might have a sentence with 10 words, but there might be three or four words in it that repeat themselves. You might be translating an entire sentence, and you might have that feeling that you have translated those three or four words before. A fragment match will look for those pieces and say, you know, you actually did translate those four words. And here's how you translated them.
And then you can decide whether to use them or not.
You know, what I said earlier about the length of sentences and paragraphs: obviously, the bigger the sentence, if you have a very long sentence with long wording, the harder it might be to get a match. If you are in a situation where you have short sentences, it's easier to get a match. The statistics are in your favour. So that's why typically what you store is sentences. Even if there's an entire paragraph with three sentences, you store at the individual sentence level rather than the entire paragraph. So you give yourself a better chance to get a match.
One question that gets asked often is, can I work with more than one translation memory at the same time? So what do other people do? Over the years, I've seen all sorts of different approaches. People might store everything they translate, no matter what client, whether it's marketing or legal. Everything goes into one translation memory, and you work from there. But other people create different translation memories for different clients or different types of content. You might still want to be able to use them all, because you never know where a match might come from. So of course, you can create as many translation memories as you like and use them all at the same time. And in fact, a CAT tool can be quite smart and prioritise one translation memory over another. So if you've got a marketing translation memory and you're translating a marketing document, but you also have the instruction manual from the same client, you might turn them both on because you might find matches in both. But the marketing translation memory is the one that has top priority over the technical documentation.
When you're starting with translation memory, I told you that you can build it pretty fast.
You know, you start translating your 2,000 new words a day and the translation memory will build up. But you might have been a translator for quite some time and you might have a bunch of documents that you want to import into the translation memory. So can you actually make a translation memory from your previous work? The answer is that you can. There is a tool called Alignment. It is a tool, a feature, a functionality that allows you to create a translation memory from your existing translations. So you might have a document in English and a document in Italian, and you translated it manually. You can take those documents and import them into a translation memory. What you're doing is aligning them. You're trying to align the English sentences with the Italian sentences. So you're going to get a side-by-side view where you see all these sentences. You know, languages are tricky. Maybe three English sentences turn into one long Italian sentence. Coming from Italy, I know: I had to learn to chop my sentences more. When I moved to the UK, my sentences were that long. I had to really force myself to be a bit shorter. So with Alignment, you can manipulate and try to align all the English with the Italian, in my case, or whatever your language pair is.
One good point about translation memory is that it can store any language. There is no constraint. Any language can be imported and worked with in a translation memory. So no matter what language combination you translate in, everything can be imported into a translation memory.
So hopefully you've got a bit of a sense of what a translation memory is and what's inside. You hopefully now know what a segment is and how all of that works together to give you suggestions as you translate.
And one of the nice things about translation memory is that even if you're starting from scratch, either you build it pretty quickly as soon as you start doing your first translation, or you can even import what you've done before. So you create this asset that is going to last you into the future.
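To make the match types from the talk concrete, here is a minimal Python sketch of a translation memory lookup. Everything in it is an assumption for illustration rather than how any particular CAT tool works: the `TranslationUnit` class and `lookup` function are hypothetical names, "context" is reduced to the single preceding source segment, similarity is measured with Python's standard `difflib`, and the 70% fuzzy threshold is simply the lowest figure mentioned in the talk. Formatting and fragment matches are left out.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass(frozen=True)
class TranslationUnit:
    """One source segment, its translation, and (optionally) the source
    segment that came just before it when it was stored."""
    source: str
    target: str
    prev_source: Optional[str] = None


FUZZY_THRESHOLD = 70  # assumed minimum percentage for a fuzzy match


def similarity(a: str, b: str) -> int:
    """Character-level similarity as a whole percentage (0-100)."""
    return int(SequenceMatcher(None, a, b).ratio() * 100)


def lookup(memory: list[TranslationUnit], segment: str,
           prev_segment: Optional[str] = None):
    """Return (match_type, percent, unit) for the best match, or None."""
    best = None
    for unit in memory:
        if unit.source == segment:
            # Same sentence and same preceding sentence: context match.
            if prev_segment is not None and unit.prev_source == prev_segment:
                return ("context", 100, unit)
            best = ("100%", 100, unit)
        elif best is None or best[0] == "fuzzy":
            # Only consider fuzzy candidates while no exact match is known.
            score = similarity(unit.source, segment)
            if score >= FUZZY_THRESHOLD and (best is None or score > best[1]):
                best = ("fuzzy", score, unit)
    return best


memory = [
    TranslationUnit("Hello, I'm Massimo.", "Ciao, sono Massimo."),
    TranslationUnit("Press the red button.", "Premi il pulsante rosso.",
                    prev_source="Open the lid."),
]

print(lookup(memory, "Press the red button.", prev_segment="Open the lid."))
print(lookup(memory, "Press the red button."))
print(lookup(memory, "Press the green button."))
print(lookup(memory, "Invoice total: 42 EUR"))
```

With this small memory, the first lookup is reported as a context match, the second as a 100% match, the third as a fuzzy match whose suggestion the translator would still need to fix, and the last finds nothing. In every case the suggestion is only a starting point, since even a 100% match may need a different translation in a different context.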
