Creating Subtitles with Amazon Transcribe: A No-Code Guide for Video Content
Learn how to use Amazon Transcribe to create subtitles for your videos without any coding or advanced AWS knowledge. Perfect for video creators!
Amazon Transcribe Video Snacks Creating Video Subtitles Without Writing Any Code
Added on 09/30/2024

Speaker 1: Hi, everyone. I'm Jason O'Malley. I'm a partner solutions architect here at Amazon Web Services. I have a background in media and entertainment. And today in this Video Snacks episode, I'm going to walk you through how to use Amazon Transcribe to create closed caption and subtitle outputs for video content with no coding and no advanced AWS machine learning knowledge required. So let's jump in. Now, many people will use the terms closed captions and subtitles interchangeably to describe the transcribed text at the bottom of the video player window. Within the Amazon Transcribe service, these files are called subtitle files. And that is how we will refer to them for the remainder of this video. Now, two of the most common file formats for subtitles are SRT, which stands for SubRip Text, and VTT, which stands for Web Video Text Tracks Format. These text files will typically be uploaded to the video manager of your destination platform and become selectable within the video player. Now, if you are new to Amazon Transcribe, this is an AWS service that makes it easy for customers to convert speech to text. Using automatic speech recognition technology, customers can choose to use Amazon Transcribe for a variety of business applications, including transcription of voice-based customer service calls, text-based content analysis of audio and video content, and even the generation of subtitles for audio and video content, which is what we will focus on here. Now, having subtitle creation built right into Amazon Transcribe is a very helpful feature for video creators because it can accelerate a subtitle creation workflow that traditionally has been a time-consuming, manual process. Video content without subtitles risks creating a negative experience for viewers who are non-native speakers and those who are consuming the content in an environment where sound is not available.
Now, one primary difference between closed captions and subtitles is that closed captions also feature speaker identification, sound effects, and music descriptions. These are not added automatically as part of this workflow, but they can be manually added during the review and editing phase, which we will describe at the end of this video. Now, Amazon Transcribe can help solve many of these problems, and I will walk you through the steps involved to create these video subtitles now. The first thing you're going to do is log in to the Amazon Web Services console. Make sure you have your desired region selected. For this demo, we will create an Amazon S3 source bucket for the media file and a logically separated destination bucket for the resulting subtitle and transcription outputs. To do that, we will navigate to the Amazon S3 console. You will then select Create Bucket. You will need to give your bucket a globally unique name. We can leave the default settings here. It is recommended to enable server-side encryption and versioning on your buckets, and we will repeat the same process for the output S3 bucket. We will give that another unique name and copy our previous settings. It's worth noting that at the time of this recording, the maximum input size for files into Amazon Transcribe is 2 gigabytes. If you have a file that exceeds that amount, I recommend checking out AWS Elemental MediaConvert, which is a media conversion service that you can use to create an audio-only output with a much smaller file size. Please see that service's documentation for more information. Now, here on my desktop, I have our demo video, and it happens to be the Amazon Transcribe service video from our YouTube channel because transcribing the Amazon Transcribe video was the most meta idea I could come up with. Here's a quick sample of that video.

Speaker 2: Amazon Transcribe takes a huge leap forward using deep learning technology to quickly and accurately convert live or recorded speech into text at a fraction of the cost.
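
The walkthrough itself needs no code, but the 2 gigabyte input limit mentioned above is easy to check before uploading. The following is a minimal sketch, assuming the limit is interpreted as 2 GiB and that it still applies (the speaker notes it was the limit "at the time of this recording"). The function name `fits_transcribe_limit` is hypothetical and not part of any AWS tool.

```python
import os

# Assumption: the "2 gigabytes" quoted in the video is treated as 2 GiB.
# The limit may have changed since the recording; check the current documentation.
MAX_INPUT_BYTES = 2 * 1024**3


def fits_transcribe_limit(path: str, limit: int = MAX_INPUT_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit


def describe(path: str) -> str:
    """Give a short, human-readable verdict for a local media file."""
    size_mib = os.path.getsize(path) / 1024**2
    if fits_transcribe_limit(path):
        return f"{path}: {size_mib:.1f} MiB, within the assumed limit"
    return f"{path}: {size_mib:.1f} MiB, consider an audio-only conversion first"
```

Calling `describe("demo.mp4")` on a local file (the file name is only an example) reports whether it can be uploaded as-is or whether an audio-only version, as the speaker suggests, is the safer route.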

Speaker 1: Now, as you can see, the video is playing locally on my desktop, but I do not have any subtitles, so I'm going to need to turn to Amazon Transcribe to create some. Now, to upload the video that is to be subtitled, select it on your local computer and drag it into the S3 console. I will use the default settings, the file uploads, and now you have your file ready in Amazon S3. From here, we will head over to Amazon Transcribe to start the transcription job. First, navigate to the left rail to select Transcription Jobs. Then, click on Create Job. This job name will show up in the job queue, and it will also become the resulting file name when the object is created in S3. For language settings, my source video is US-based English, so I will select that. Now, the custom language models that you'll see here are great because they can be trained to improve the transcription accuracy for your specific use case. So, for example, you can provide Amazon Transcribe with industry-specific terms or acronyms it might not otherwise recognize. To create a custom language model, users can upload a large amount of domain-specific text as training data, and then transcription text as tuning data. We actually have an entire Amazon Transcribe Video Snacks episode dedicated to custom language models, so I'm not going to dive any deeper for this demo, but I do encourage you to watch that Amazon Transcribe Video Snacks episode as well. We will skip down to the input data. Users will have the option to either copy and paste the S3 location or browse for the object. I'll click here to browse S3 and select our file. Now, under output data, users will have the option to either use an Amazon Transcribe managed bucket, in which case the output will be removed after 90 days, or use a customer-managed S3 bucket. In that case, the data will be retained for as long as the customer chooses. Since we created our own bucket, we will choose this option.
Skipping down to the bottom, this is our big moment. We see our options for subtitle file formats, .srt and .vtt, and the great news is selecting .srt or .vtt outputs in your Amazon Transcribe job configuration settings adds no additional cost, so I will select both. This section of the configuration will also prompt you to select a starting index value for the subtitle sequence so that it will start with either a 0 or a 1. It will be important to check with your destination platforms to see what formats, .srt or .vtt, are accepted and whether an index value of 0 or 1 is preferred. If your platform does not specify a start index, then it is recommended to select 1 as it is more common. When you are finished, click Next. For step 2 of the job creation, you will see audio identification, alternative results, automatic content redaction, vocabulary filtering, and custom vocabulary. For subtitle creators in particular, I recommend you check out custom vocabulary as an option to explore in the future. The way custom vocabulary works is it allows our users to improve transcription accuracy for one or more specific words. This topic is another one we cover in an Amazon Transcribe Video Snacks episode, so I'm not going to dive any deeper in this demo, but I encourage you to watch that Video Snack as well to learn more. You can click Create Job and the speech-to-text transcription job will begin. When the transcript is finished, you will see the indicator turn green. We can find the output data location by selecting the blue link. This will take us to our bucket where you will see the output files with the job name and the suffix of .srt or .vtt. With the blue box selected, the option to download will become available. Choose a location on your desktop to save the file. The resulting output is a plain text file, and that means that any text editor can open the file to preview and edit. I will use my built-in text editor on my desktop.
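
For readers who later want to script the console choices described above, the same settings can be sketched as a request for the Transcribe `StartTranscriptionJob` API. This is only an illustration of how the subtitle formats and the start index map onto that request, not the workflow shown in the video. The job name, bucket names, and the helper names `build_subtitle_job_request` and `start_job` are hypothetical, and `start_job` assumes the `boto3` package and AWS credentials are available.

```python
ALLOWED_FORMATS = {"srt", "vtt"}


def build_subtitle_job_request(job_name, media_uri, output_bucket,
                               language_code="en-US",
                               formats=("srt", "vtt"), start_index=1):
    """Build keyword arguments for a transcription job with subtitle outputs."""
    unknown = set(formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"unsupported subtitle formats: {sorted(unknown)}")
    if start_index not in (0, 1):
        raise ValueError("start index must be 0 or 1")
    return {
        "TranscriptionJobName": job_name,           # also becomes the output file name
        "LanguageCode": language_code,              # US English in the demo
        "Media": {"MediaFileUri": media_uri},       # S3 location of the source video
        "OutputBucketName": output_bucket,          # customer-managed output bucket
        "Subtitles": {
            "Formats": list(formats),               # .srt and/or .vtt
            "OutputStartIndex": start_index,        # 0 or 1, per destination platform
        },
    }


def start_job(request):
    """Submit the request; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the builder above works without it
    return boto3.client("transcribe").start_transcription_job(**request)


# Hypothetical names, for illustration only.
request = build_subtitle_job_request(
    "demo-subtitles",
    "s3://example-input-bucket/demo.mp4",
    "example-output-bucket",
)
```

Keeping the request builder separate from the API call means the settings can be reviewed or validated before anything is sent to AWS.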
As you'll see side by side, .srt and .vtt are very similar with just some subtle differences. In both, you will see a series of numbers, time codes, and text. The numbers indicate the order in which the subtitles will appear on screen. The time code indicates when during playback the player will display the text, and the text is the transcript. Within the text editor, you can review the transcript and make any wording changes. When you finish making changes, you can save your file, and just like that, your subtitles will be updated. It's that simple. Now that you have your subtitle file on your desktop, you can preview the file. I happen to have the open-source VLC player here, and VLC has a helpful feature that allows the application to automatically detect a subtitle file if it's in the same folder as the source video and shares the same name. I'll open it in VLC, and what do you know? There is our video with subtitles.

Speaker 2: Amazon Transcribe takes a huge leap forward using deep learning technology to quickly and accurately convert live or recorded speech into text at a fraction of the cost.

Speaker 1: And after you try this subtitling workflow, if you want to take it to the next level, then I suggest you check out the Content Localization on AWS solution. This is an application built upon the AWS Media Insights Engine, a developer framework for building applications that process videos, images, audio, and text with machine learning services on AWS. The Content Localization on AWS solution has an infrastructure-as-code CloudFormation template that can be deployed into your AWS account in just a few clicks. Once deployed, this solution enables you to not only subtitle your videos but also translate them, thanks to the integration with Amazon Translate. And best of all, the subtitles can be previewed and edited right within your browser and then downloaded to your computer afterward. Just search for Content Localization on AWS solution and look for the deployment guide to get started. And that is the entire process. You have now created a subtitle file with Amazon Transcribe, and you can now upload the subtitle file with your video to the destination platform, all without writing any code and with no advanced AWS machine learning knowledge required. Good luck in your subtitle creation process, and I encourage you to look for other ways to use the power of Amazon Transcribe to accelerate your day-to-day processes. I am Jason O'Malley, and thanks again for watching this Amazon Transcribe Video Snack. We'll see you on the next one.
