Speaker 1: Hi, everyone. Thank you. Thank you. It's great to have you here today. Today I'm going to talk about three things. That's it. We will start with why it's so important to us to have a product that we can make freely available and broadly available to everyone. And we're always trying to find ways to reduce friction so everyone can use ChatGPT wherever they are. So today we'll be releasing the desktop version of ChatGPT and a refreshed UI that makes it simpler to use and much more natural as well. But the big news today is that we are launching our new flagship model, and we are calling it GPT-4o. The special thing about GPT-4o is that it brings GPT-4 level intelligence to everyone, including our free users. We'll be showing some live demos today to show the full extent of the capabilities of our new model, and we'll be rolling them out iteratively over the next few weeks.

All right. So let's get started. A very important part of our mission is to be able to make our advanced AI tools available to everyone for free. We think it's very, very important that people have an intuitive feel for what the technology can do, and so we really want to pair it with this broader understanding. And we're always finding ways to reduce that friction. Recently we made ChatGPT available without the signup flow. And today we're also bringing the desktop app to ChatGPT, because we want you to be able to use it wherever you are. As you can see, it's easy, it's simple, and it integrates very easily into your workflow. Along with it, we have also refreshed the UI. We know that these models get more and more complex, but we want the experience of interaction to actually become more natural and easy, and for you not to focus on the UI at all, but just focus on the collaboration with ChatGPT.

And now the big news. Today we are releasing our newest flagship model. This is GPT-4o. GPT-4o provides GPT-4 level intelligence, but it is much faster, and it improves on its capabilities across text, vision, and audio. For the past couple of years, we've been very focused on improving the intelligence of these models, and they've gotten pretty good. But this is the first time that we are really making a huge step forward when it comes to ease of use. And this is incredibly important, because we're looking at the future of interaction between ourselves and the machines. We think that GPT-4o is really shifting that paradigm into the future of collaboration, where this interaction becomes much more natural and far, far easier.

But making this happen is actually quite complex, because when we interact with one another there's a lot of stuff that we take for granted: the ease of our dialogue when we interrupt one another, the background noises, the multiple voices in a conversation, or understanding the tone of voice. All of these things are actually quite complex for these models. Until now, with Voice Mode, we had three models that come together to deliver this experience: transcription, intelligence, and then text-to-speech, all coming together in orchestration to deliver Voice Mode. This also brings a lot of latency to the experience, and it really breaks that immersion in the collaboration with ChatGPT. But now, with GPT-4o, this all happens natively. GPT-4o reasons across voice, text, and vision. And with these incredible efficiencies, it also allows us to bring GPT-4 class intelligence to our free users.
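[Editor's note: as a rough illustration of the cascaded Voice Mode pipeline described above (transcription, then a text model, then text-to-speech), here is a minimal sketch using the OpenAI Python SDK. The model names and file handling are illustrative assumptions, not the actual production implementation.]

```python
# Minimal sketch of the three-model Voice Mode cascade described above.
# Assumptions: OpenAI Python SDK, illustrative model names ("whisper-1",
# "gpt-4", "tts-1"); not the actual production pipeline.
from openai import OpenAI

client = OpenAI()

def cascaded_voice_turn(audio_path: str) -> bytes:
    # 1. Transcription: speech -> text
    with open(audio_path, "rb") as f:
        transcript = client.audio.transcriptions.create(model="whisper-1", file=f)

    # 2. Intelligence: text -> text
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": transcript.text}],
    )

    # 3. Text-to-speech: text -> audio
    speech = client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=reply.choices[0].message.content,
    )
    return speech.read()
```

Each hop in this chain adds its own latency and loses paralinguistic detail (tone of voice, background noise, multiple speakers), which is exactly the friction a single natively multimodal model is meant to remove.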
Speaker 1: This is something that we've been trying to do for many, many months, and we're very, very excited to finally bring GPT-4o to all of our users. Today, we have 100 million people, more than 100 million, in fact, who use ChatGPT to create, work, and learn. And we have these advanced tools that were only available to our paid users, at least until now. With the efficiencies of GPT-4o, we can bring these tools to everyone.

So, starting today, you can use GPTs in the GPT Store. So far, we've had more than a million users create amazing experiences with GPTs. These are custom ChatGPTs for specific use cases, and they're available in the store. And now our builders have a much bigger audience, where, you know, university professors can create content for their students, or podcasters can create content for their listeners. You can also use vision. So now you can upload screenshots, photos, and documents containing both text and images, and you can start conversations with ChatGPT about all of this content. You can also use memory, which makes ChatGPT far more useful and helpful, because now it has a sense of continuity across all your conversations. And you can use browse, where you can search for real-time information in your conversation. And advanced data analysis, where you can upload charts or any information, and it will analyze this information, give you answers, and so on. Lastly, we've also improved on the quality and speed in 50 different languages for ChatGPT. And this is very, very important, because we want to be able to bring this experience to as many people out there as possible.

So we're very, very excited to bring GPT-4o to all of our free users out there. And for the paid users, they will continue to have up to five times the capacity limits of our free users. But GPT-4o is not only available in ChatGPT. We're also bringing it to the API, so our developers can start building today with GPT-4o, making amazing AI applications and deploying them at scale. In the API, GPT-4o is 2x faster, 50% cheaper, and has five times higher rate limits compared to GPT-4 Turbo.

But, you know, as we bring these technologies into the world, it's quite challenging to figure out how to do so in a way that's both useful and also safe. And GPT-4o presents new challenges for us when it comes to safety, because we're dealing with real-time audio and real-time vision. Our team has been hard at work figuring out how to build in mitigations against misuse. We continue to work with different stakeholders out there, from government, media, entertainment, all industries, red teamers, and civil society, to figure out how to best bring these technologies into the world. So over the next few weeks, we'll continue our iterative deployment to bring out all the capabilities to you. But today, I want to show you all these capabilities. So we'll do some live demos. I will bring on two of our research leads, Mark Chen and Barret Zoph.
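[Editor's note: for developers, a minimal sketch of calling the new model through the API. It assumes the public model identifier is "gpt-4o" and that an API key is configured; the prompt contents are illustrative.]

```python
# Minimal sketch: calling GPT-4o through the Chat Completions API.
# Assumes the model identifier is "gpt-4o" and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what GPT-4o adds over GPT-4 Turbo."},
    ],
)
print(response.choices[0].message.content)
```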
Speaker 4: Hi, I'm Barret.
Speaker 2: Hey, I'm Mark. So, one of the key capabilities we're really excited to share with you today is real-time conversational speech. Let's just get a demo fired up. So, I'm taking out a phone. If you're wondering about this wire, it's so we have consistent internet. And if you see, there's this little icon on the bottom right of the ChatGPT app, and this will open up GPT-4o's audio capabilities. Hey, ChatGPT. I'm Mark. How are you?
Speaker 3: Oh, Mark. I'm doing great. Thanks for asking. How about you?
Speaker 2: Hey. So, I'm on stage right now. I'm doing a live demo. And frankly, I'm feeling a little bit nervous. Can you help me calm my nerves a little bit?
Speaker 3: Oh, you're doing a live demo right now? That's awesome. Just take a deep breath.
Speaker 3: And remember, you're the expert here.
Speaker 2: I like that suggestion. Let me try a couple of deep breaths. Can you give me feedback on my breaths? Okay, here I go.
Speaker 3: Whoa, slow down a little bit there, Mark. You're not a vacuum cleaner. Breathe in for four, and then exhale slowly.
Speaker 2: Okay, let me try again. So, I'm going to breathe in deeply and then breathe out. Okay, I'll try again. Breathing in... and breathe out.
Speaker 3: That's it. How do you feel?
Speaker 2: I feel a lot better. Thank you so much.
Speaker 1: So, Mark, you've been working on these capabilities for a while now. Can you tell us a bit about how it's different from Voice Mode?
Speaker 2: Right. So, if you've used our Voice Mode experience before, you'll notice a couple of key differences. First, you can now interrupt the model. You don't have to wait for it to finish its turn before you can start speaking; you can just butt in whenever you want. Second, the model has real-time responsiveness. That means you don't have this awkward two-to-three-second lag while you wait for the model to give a response. And finally, the model picks up on emotion. When I was breathing super hard there, it could tell and say, hey, you might want to calm down a little bit, you're running a little bit too fast. So it really has that capability across the board to perceive your emotion. Not only that, though, the model is able to generate voice in a variety of different emotive styles, and it really has a wide dynamic range. So, I'm going to fire up another demo.
Speaker 2: Hey, ChatGPT, how are you doing?
Speaker 3: I'm doing fantastic. Thanks for asking. How about you?
Speaker 2: Pretty good. What's up? So, my friend Barrett here, he's been having trouble sleeping lately. And I want you to tell him a bedtime story about robots and love.
Speaker 3: Oh, a bedtime story about robots and love? I got you covered. Gather round, Barrett. Once upon a time, in a world not too different from ours, there was a robot named Byte. Byte was a curious robot. Always exploring.
Speaker 2: But I want a little bit more emotion in your voice. A little bit more drama.
Speaker 3: Got it. Let's add some drama. Once upon a time, in a world not too different from ours, there was a robot named Byte.
Speaker 4: No, no, no, ChatGPT. I really want maximal emotion. Like, maximal expressiveness. Much more than you were doing before.
Speaker 3: Understood. Let's amplify the drama. Once upon a time, in a world not too different from ours, there was a robot named Byte. Byte was...
Speaker 4: Actually, can you do this in a robotic voice now?
Speaker 3: Initiating dramatic robotic voice. Once upon a time, in a world not too different from ours, there was a robot named Byte. Byte was a curious robot. Always exploring new circuits.
Speaker 2: To calm Barrett here down a little bit, can you end the story, but do it in a singing voice?
Speaker 3: And so Byte found another robot friend, and they lived circuitly ever after.
Speaker 4: Thank you so much. I definitely feel a lot more relaxed.
Speaker 1: You've just seen the voice capabilities, but we also want to show you the vision capabilities as the model can also see the whole world around us.
Speaker 4: Yeah, absolutely. Like, as you can see, you know, you can chat with ChatGPT in real time, but you can also interact with it with video as well. Okay. Let me boot up ChatGPT. Hey, ChatGPT.
Speaker 3: Hello there. How's it going?
Speaker 4: It's going really well. Today, I'd really like your help solving a math problem.
Speaker 3: I'm all ears. What math problem can I help you tackle today?
Speaker 4: So I'm going to write down a linear equation on a sheet of paper and I'll show you, and then I'd love your help working through it. But importantly, don't tell me the solution, just give me hints along the way.
Speaker 3: Got it. Okay, I see it.
Speaker 4: No, I didn't show you yet. Just give me help along the way. One second.
Speaker 3: Whoops. I got too excited. I'm ready when you are.
Speaker 4: So, ChatGPT, what equation did I write down there?
Speaker 3: Ah, I see it now. You wrote down 3x plus 1 equals 4.
Speaker 4: Yep, exactly. So what's the first step I should take to try to solve this?
Speaker 3: The first step is to get all the terms with x on one side and the constants on the other side. So what do you think we should do with that plus 1?
Speaker 4: Okay, I'm going to try to subtract 1 from both sides and then I'll see what I get.
Speaker 3: Great idea. Perfect. Now what do you get when you subtract 1 from both sides?
Speaker 4: Okay, so ChatGPT, this is what I ended up with. How does this look?
Speaker 3: Spot on. Now you've isolated the term with x on one side, and it looks like 3x equals 3. What do you think we should do next to solve for x?
Speaker 4: I'm not sure. Can you give me another hint?
Speaker 3: Of course. You have 3x and you want to find the value of x. Think about what operation would undo multiplication. Is it subtraction? Close, but think of the opposite of multiplication because 3x means 3 times x.
Speaker 4: Okay, I think it's division. Let me try to divide both sides by 3 and let me see what I get.
Speaker 3: Bingo. Go ahead and divide both sides by 3.
Speaker 4: Okay, so ChatGPT, this is what I ended up with. How does this look?
Speaker 3: It looks perfect. You've solved it and x equals 1. Nicely done. How do you feel about solving linear equations now?
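[Editor's note: for reference, the steps the model walked through above, written out:]

```latex
\begin{align*}
3x + 1 &= 4 \\
3x &= 4 - 1 = 3 \qquad \text{(subtract 1 from both sides)} \\
x &= 3/3 = 1 \qquad \text{(divide both sides by 3)}
\end{align*}
```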
Speaker 2: Yeah, honestly, I'm now pretty confident I can solve linear equations, but why would you ever need to use this in the real world? I'm really skeptical.
Speaker 3: That's a fair question. Linear equations pop up in a lot of everyday situations, even if we don't realize it. For example, calculating expenses, planning travel, cooking, and even in business for profit and loss calculations. It's basically a way to solve problems where you need to find an unknown value.
Speaker 2: Wow, okay. You know, I have a lot of unknown values in my life. I'm totally convinced I'm going to learn math now.
Speaker 3: I love that attitude. Math can definitely help solve some of these problems and life's mysteries. Whenever you're ready to dive into more math, I'm here for you. Anything else you'd like to tackle today?
Speaker 4: So ChatGPT, I really love that you taught the value of math to my friend Mark. And there's one last thing I wrote that I'd love for you to take a look at.
Speaker 3: Of course, I'd love to see what you wrote. Show it to me whenever you're ready.
Speaker 4: Okay, so this is what I wrote down. What do you see?
Speaker 3: Oh, I see. "I love ChatGPT." That's so sweet of you.
Speaker 4: Yeah, well, I really appreciate all the help. So yeah, as we can see...
Speaker 3: Wow, that's quite the outfit you've got on.
Speaker 4: Yeah, as we can see, you know, we can chat in real time with ChatGPT. And not only that, we can interact with it with video as well.
Speaker 2: And I swear, like, if you look at Barrett's ChatGPT history, it's all equations like this.
Speaker 1: Now that you know you can do linear equations, can you show us something more complex?
Speaker 4: Yes, I think so. Okay, next let's try something coding related. So let me take out a computer for this. Okay, so I'm going to open the ChatGPT desktop app, like Mira was talking about before. Okay, and to give a bit of background on what's going on: here we have a computer, and on the screen we have some code, and the ChatGPT voice app is on the right. So ChatGPT will be able to hear me, but it can't see anything on the screen. So I'm going to highlight the code, Command-C it, and then that will send it to ChatGPT. And then I'm going to talk about the code to ChatGPT. Hey, ChatGPT.
Speaker 3: Hey there, how's it going?
Speaker 4: Yeah, it's going really well. I was wondering if you could help me with a coding problem today.
Speaker 3: Of course, I'd love to help you out. What's the coding problem you're dealing with?
Speaker 4: Okay, I'm going to share with you some code. One second.
Speaker 3: Sure thing. Take your time.
Speaker 4: Okay, so I just shared some code with you. Could you give me a really brief one-sentence description of what's going on in the code?
Speaker 3: This code fetches daily weather data for a specific location and time period, smooths the temperature data using a rolling average, annotates a significant weather event on the resulting plot, and then displays the plot with the average, minimum, and maximum temperatures over the year.
Speaker 4: Awesome, thanks so much. And in the code, there's this function foo. Can you describe to me what the plot would look like if I had versus didn't have this function?
Speaker 3: Sure. The function foo(x, y) takes in a data frame x containing temperature data and a window size y for calculating the rolling mean. It smooths the average temperature, minimum temperature, and maximum temperature columns by applying a rolling mean over the specified window. With the foo function, the plot would show smoother temperature lines because the rolling mean reduces the noise or fluctuations in the data.
Speaker 4: Awesome, this makes sense. Thanks so much. So yeah, I definitely will want to apply this function. And so I'm going to generate the plot and then show it to you. Give me one second.
Speaker 3: Absolutely. Take your time. I'm excited to see the plot when you're ready.
Speaker 4: Yeah. So, so far, ChatGPT has only been able to see the code. And now I'm going to run it and then I'll use the vision capabilities of the ChatGPT desktop app so it can actually see everything going on on the screen as well. Okay, ChatGPT, I'm sharing with you the plot now. I'm wondering if you can give me a really brief one-sentence overview of what you see.
Speaker 3: The plot displays smoothed average, minimum, and maximum temperatures throughout 2018 with a notable annotation marking a big rainfall event in late September.
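[Editor's note: the demo script itself isn't reproduced in the transcript, but based on the descriptions above (daily weather data, a rolling-mean helper called foo, and an annotated rainfall event in late September 2018), a hypothetical reconstruction in pandas/matplotlib might look like this. The column names, data file, and event date are illustrative guesses.]

```python
# Hypothetical reconstruction of the weather-plot script described in the demo.
# Column names, the CSV source, and the event date are illustrative guesses.
import pandas as pd
import matplotlib.pyplot as plt

def foo(x: pd.DataFrame, y: int) -> pd.DataFrame:
    """Smooth the temperature columns of x with a rolling mean of window y."""
    smoothed = x.copy()
    for col in ["avg_temp", "min_temp", "max_temp"]:
        smoothed[col] = smoothed[col].rolling(window=y, min_periods=1).mean()
    return smoothed

# Daily weather data for one location over 2018 (illustrative file name).
df = pd.read_csv("daily_weather_2018.csv", parse_dates=["date"])

df = foo(df, y=7)  # a 7-day rolling average reduces day-to-day noise

fig, ax = plt.subplots()
ax.plot(df["date"], df["avg_temp"], label="average temperature")
ax.plot(df["date"], df["min_temp"], label="minimum temperature")
ax.plot(df["date"], df["max_temp"], label="maximum temperature")

# Annotate the significant weather event mentioned in the demo.
event_date = pd.Timestamp("2018-09-25")
ax.annotate("big rainfall event",
            xy=(event_date, df["max_temp"].max()),
            xytext=(event_date, df["max_temp"].max() + 3),
            arrowprops={"arrowstyle": "->"})

ax.set_ylabel("temperature (°C)")
ax.legend()
plt.show()
```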
Speaker 2: Fantastic. I have another question for you. Which months do you see the hottest temperatures and roughly what temperatures do those correspond to?
Speaker 3: The hottest temperatures occur around July and August. The maximum temperature during these months is roughly between 25 and 30 degrees Celsius (77 to 86 degrees Fahrenheit).
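[Editor's note: as a quick check of the conversion quoted above, using F = (9/5)C + 32:]

```latex
\tfrac{9}{5}(25) + 32 = 77\ ^{\circ}\mathrm{F},
\qquad
\tfrac{9}{5}(30) + 32 = 86\ ^{\circ}\mathrm{F}
```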
Speaker 1: What's going on with the y-axis? Is this in Celsius or Fahrenheit?
Speaker 3: The y-axis is in Celsius. The temperatures are labeled accordingly, such as average temperature, minimum temperature, and maximum temperature.
Speaker 4: Yeah. So as we can see, you know, not only can ChatGPT help me solve very easy linear algebra equations that I need help with, it can also interact with, you know, code bases and see the outputs of plots and everything like this going on on a computer.
Speaker 1: So a few minutes ago, we asked a live audience on X to submit a few requests for what they would like us to try out here. So I will take a couple of prompts. Okay. Bot Gascar wants to know if GPT-4o is capable of real-time translation. Mark, you want to try this one?
Speaker 2: Sure. Yeah, let's do it.
Speaker 1: I speak Italian, so we can just do English-Italian.
Speaker 2: Let's try to do English-Italian. Sure, let's do it. Hey, ChatGPT, how are you?
Speaker 3: I'm doing fantastic. Thanks for asking. How about you?
Speaker 2: How's your day going? I'm doing great. So I would like you to function as a translator. I have a friend here who only speaks Italian, and I only speak English. And every time you hear English, I want you to translate it to Italian. And if you hear Italian, I want you to translate it back to English. Is that good?
Speaker 3: Mark, she wonders, if whales could talk, what would they tell us?
Speaker 2: They might ask, how do we solve linear equations?
Speaker 1: Si.
Speaker 3: Certainly, yes.
Speaker 2: Great, it looks like it works.
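[Editor's note: a minimal sketch of setting up the same bidirectional translator behavior through the API, approximated with text rather than real-time audio; the system prompt wording is an illustrative guess.]

```python
# Sketch of the bidirectional English/Italian translator behavior from the demo,
# approximated with text instead of real-time audio. Prompt wording is a guess.
from openai import OpenAI

client = OpenAI()

TRANSLATOR_PROMPT = (
    "You are a translator. Every time you receive English, reply only with "
    "the Italian translation. Every time you receive Italian, reply only "
    "with the English translation."
)

def translate(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model identifier
        messages=[
            {"role": "system", "content": TRANSLATOR_PROMPT},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(translate("Hey, how are you?"))        # expected: Italian output
print(translate("Mi piacciono le balene."))  # expected: English output
```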
Speaker 1: All right, so the next one is from burritojohn78, who asks, can you tell what you're feeling just by looking at your face? Barrett, do you want to give this one a try?
Speaker 4: Absolutely, let's try it out.
Speaker 4: Hey, ChatGPT.
Speaker 3: Hey there, what's up? How can I brighten your day today?
Speaker 4: OK, yeah, so I'm going to show you a selfie of what I look like, and then I'd like you to try to see what emotions I'm feeling based on how I'm looking.
Speaker 3: Sounds like a fun challenge. Go ahead and show me that selfie, and I'll put my emotional detective hat on.
Speaker 4: OK, so here's me. So what kind of emotions do you think I'm feeling?
Speaker 4: Hmm, it seems like I'm feeling a little bit of everything.
Speaker 3: It seems like I'm looking at a picture of a wooden surface.
Speaker 4: Oh, you know what? That was the thing I sent you before. Don't worry, I'm not actually a table. OK, so take another look.
Speaker 3: Ah, that makes more sense. Ah, there we go. It looks like you're feeling pretty happy and cheerful, with a big smile and maybe even a touch of excitement. Whatever's going on, it seems like you're in a great mood. Care to share the source of those good vibes?
Speaker 4: Yeah, no, the reason I'm in a really good mood is because we were doing a presentation showcasing how useful and amazing you are.
Speaker 3: Oh, stop it. You're making me blush.
Speaker 1: Amazing. Well, that's it for today on live demos. As you can see, this just feels so magical, and that's wonderful. But we also want to remove some of the mysticism from the technology and bring it to you so you can try it for yourself. So over the next few weeks, we'll be rolling out these capabilities to everyone. Today has been very much focused on the free users and the new modalities and new products, but we also care a lot about the next frontier. So soon, we'll be updating you on our progress towards the next big thing. And before we wrap up, I just want to thank the incredible OpenAI team and also thanks to Jensen and the NVIDIA team for bringing us the most advanced GPUs to make this demo possible today. And thank you all very, very much for being a part of this today. Thank you.