Speaker 1: Hey guys, welcome to the show. Today, we're gonna do the ultimate AI comparison. We have IBM Watson, the super AI computer that was slapping people up on Jeopardy back in the day, about 10 years ago. And we have Google, the creme de la creme of AI, the superhero. We're gonna be comparing them both in speech to text. We are using the online speech-to-text solutions that they have, the APIs. And we're gonna be testing them out. We're gonna be testing them out in accuracy, documentation, ease of use, support, pricing, everything you need to know. And I'll start off the show with accuracy because accuracy is probably what you wanna hear. And to be honest with you, they are both pathetic in 2021, not pathetic. I mean, yeah, they're pathetic, they're rubbish. Okay, I'm really harsh. But yeah, there were too many mistakes. Like the fifth word was just a mistake. Like, so it was very sad. I give the edge to IBM, but really, Google is just as bad as IBM, and IBM is just as bad as Google. But I do give the edge to IBM. So IBM Watson wins on speech. Let's just jump in and talk about it. So IBM Watson is saying it's not Dr. Nora's clinic, it's Delta North's clinic. Over in Google, it's managed to say doctor, but it's still Dr. North. It doesn't know how to say Nora or understand it. And it's also got clinic. So I'm giving one point to Google, got that right. Next up, we're gonna see what it says. You may know me as Dr. Nora, the cosmetic GP. So Watson says, you may know, now he has talked to North, which was like 63. Google is saying you may know me as dachshund, which is a dog, offensive. Google, come on, you're meant to be PC-centric, number one. Or the cosmetics CC. So I'm gonna give Google a point because it's better than which was like 63. Mistakes, but one point there. But did you know?
Speaker 2: But did you know that I also have FSRA?
Speaker 1: But did you know that I also have? So Google has said, but did you know that FSRA, it jumps straight into the FS business and it missed out that I also have. It missed a complete sentence out. Whereas Watson is saying the thing that I also have FSRET. This one says FSRA, it's meant to be an H. So I'm giving one point each because Watson says I also have. It's got that and Google completely missed that out. But it's got, but did you know, it's got that part right. To the long term, the hormonal and the non-hormonal options. Google is saying option to the long time, the hormonal and non-hormonal options. So Google says there isn't a the, and Watson is saying there is a the. Google is saying it should be long time. Watson is saying long term. Let's see how it's gonna play out.
Speaker 2: To you, ranging from the short term option to the long term, the hormonal and the non.
Speaker 1: So there's definitely a the and there's definitely a long term, not long time. Google, Skynet, you failed. Anyway, so I'm giving the accuracy award to Watson. But you know, they're all a bit here and there. Maybe if you had a different voice, it would be slightly different. But you guys wanna know how much does this cost? What is better? So Google, they give you 60 minutes a month free. For free, 60 minutes a month. Watson, on the other hand, I signed up five years ago and they haven't changed their policy but they have given you more minutes. So now when I signed up this time around, they give me 600 minutes a month. So it looks like I gotta say that Watson is winning. Watson is winning that battle. But do not be deterred because after the free trial, how much does it cost? Google says 0.024. Watson says 0.028. But Watson also has a second tier which says 0.01. So maybe Watson could be cheaper if I understood it. Speaking of which, the documentation of Watson, it is a bit all over the place. That's the magical 0.01, right? Magical, I don't know what that means. I'm trying to understand it. And learn more about changing plans. Yeah, I wanna know. Tell me more about changing plans. Oh, sorry, this content is available, isn't available. Boom. Whereas if you go on Google, oh my God. They have a documentation city in any language you want. They got gcloud, C#, Go, Java, Node.js, PHP. They're documentation city. One negative about that is they're kind of up to date. So the examples they give require like a newer version of Node.js. You should be using a newer version of Node.js anyway. They have a version 10 or above which is actually five years old. It's not that bad. But pretty much the API of IBM's solution hasn't been updated since I signed up five years ago or four years ago. I can't remember if it's been that long. Whereas Google's, they're constantly updating their documentation. You probably will have to update your code every five years. 
It's not like there and then. And there's so much to get lost. You probably might find it harder to get it going. I mean with IBM's documentation, usage wise, you just make a curl request and it gives you the results back. Google, yeah, it's a bit more complicated, a bit more involved. It's not like complicated. I'll show you exactly both usages. So it's cool, Watson. You just make a curl request. You add in your API key. You specify the content type. Here is mp3 chunked. Sending my mp3 file and it's running. I'm using the API recognize. So far it's been 20 seconds. Google would have been done by now, 20 seconds. Okay, 45 seconds it took. That was Watson doing an mp3. It was slightly faster with FLAC. That one took 40 seconds. But I guess your mileage may vary. So just scheduling that it's going to be two times slower than Google. Google on the other hand, I've just downloaded their scripts. So they got a nice Node.js script. I'll go to the bottom. I'll give you a little vi, speech.js and you have to use npm to import the library. Pretty much just gives you lots of options on the screen. Let me make it slightly bigger so you can see. And pretty much when you call it, let me just exit. You can call it in several different ways. So this is the enhanced model. It spits out a bunch of nonsense. So forget that one. You can sync words and that'll tell you the words that are being used in the timestamp or the standard one is just sync. You run that for about 20 seconds and here I'm saying like another server, new server from Google. I'll allow that. Let's go another one. It makes a million calls to millions of servers. Go on, just go on. Yeah, I'll run that again. So I'm using version 12 Node. You need to use 10 and above. speech.js sync. I'm specifying the file name. E is the encoding, mp3 and in about 20 seconds, you should get the results. And there it is. That's the results. Boom. Hello and welcome to Dr. North. Oh, super simple. 
You just copy the example and it works but it's a lot more involved. Too many lines of code. When I'm uploading the data to Google, there's like 5 or 10 different servers that I'm uploading and talking to. API key here, that over there. IBM is kind of like just one and done. Next up, we're going to be talking about which has the best features and obviously Google has the best features. It's got too many features. I do like how simple Watson is. It gives you a confidence rating and it just tells you the sentence and you're done. You move on. Google has like two different models. It's got an enhanced model which by the way is nonsense. Home Hong Kong. Yeah, off off Thursday. That was weird. It just didn't work for me. I don't know what that's about. Maybe I'm using it completely wrong. It's got a standard model. That one works. You can do word timestamps. It's kind of cool, kind of fun. Speed wise, Google was twice as fast. Yeah, and the last thing I've got to say is requirements. IBM Watson allows you to just use the speech to text without giving your credit card details. Google requires you to give payment details. It says as soon as you sign up that you can cancel auto subscription, but I went to try to cancel my auto billing and they wouldn't let me. It's kind of like fixed maybe because I tried using PayPal or something like that. Whereas Watson, you don't need to give them a credit card to use their service. So the free plan is a free plan. With Google, it's a free plan, but you need to give your credit card. So yeah, they both support mp3 files. I'd say IBM is probably easier to get up and running. You get free if you're a freeloader. If you're just a student just wanting to get up and running IBM Watson is cool because you get more minutes. Whereas if you're a paying user, then Google makes more sense. Actually, maybe for a student Google makes more sense anyway because you can put it on your CV and you know, your Google Cloud experience. You got a job. 
Here you go. Here's the keys to the car. Drive the company away. And probably Google overall could be cheaper long term because it's 0.024 in Australian and IBM is 0.028, slightly more expensive, even though IBM does say 0.01 somewhere. I don't understand how that works. Long-term probably IBM won't be as supported as Google. It's been five years and IBM's still on version one. Same, nothing's changed. All they've done is just up the minutes. But yeah, that was my experience of online speech-to-text. Next, we're going to be checking out offline solutions, running it on your own processing power. Will it work out cheaper? Will it be better? Probably will be better. So we'll find out about that. Let me know what speech-to-text AIs you guys are using out there in the world. And I hope you guys found this video useful. Thanks for watching.
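Editor's note: the pricing comparison above (60 free minutes and 0.024 for Google, 600 free minutes and 0.028 for Watson) can be turned into a rough break-even estimate. The sketch below is only illustrative and rests on assumptions the speaker does not confirm: that both figures are per-minute rates in the same currency, that the free minutes still apply each month once you are paying, and that Watson's unexplained 0.01 tier is ignored. The function and variable names are made up for this example.

```python
# Rough cost model for the two plans quoted in the transcript.
# ASSUMPTIONS (not confirmed by the speaker): rates are per minute, in the
# same currency; the free allowance renews monthly and is deducted before
# billing; Watson's "0.01" second tier is left out because it is unexplained.

GOOGLE = {"free_minutes": 60, "rate_per_minute": 0.024}
WATSON = {"free_minutes": 600, "rate_per_minute": 0.028}


def monthly_cost(minutes_used, free_minutes, rate_per_minute):
    """Cost for one month: only minutes beyond the free allowance are billed."""
    billable = max(0, minutes_used - free_minutes)
    return billable * rate_per_minute


def break_even_minutes(plan_a, plan_b):
    """Monthly usage at which two plans cost the same.

    Solves rate_a * (m - free_a) == rate_b * (m - free_b) for m. The result
    is only meaningful when m exceeds both free allowances and the two
    rates differ.
    """
    rate_a, free_a = plan_a["rate_per_minute"], plan_a["free_minutes"]
    rate_b, free_b = plan_b["rate_per_minute"], plan_b["free_minutes"]
    return (rate_a * free_a - rate_b * free_b) / (rate_a - rate_b)


if __name__ == "__main__":
    for minutes in (100, 1000, 5000):
        google = monthly_cost(minutes, **GOOGLE)
        watson = monthly_cost(minutes, **WATSON)
        print(f"{minutes:>5} min/month  Google: {google:8.2f}  Watson: {watson:8.2f}")
    print(f"Break-even: about {break_even_minutes(GOOGLE, WATSON):.0f} minutes/month")
```

Under these assumptions the two plans cross at roughly 3,840 minutes a month: below that, Watson's larger free allowance keeps it cheaper, and above it Google's lower rate wins, which fits the speaker's guess that Google could be cheaper long term for heavy users.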