[00:00:00] Speaker 1: Hi everyone, I'm Naima from AssemblyAI, and today I'm going to show you a demo of Farza's Clicky. Clicky already runs on AssemblyAI's Universal 3 Pro streaming model, but today I'm going to show you an addition I made to it using Universal 3.5, which Assembly just launched last week. Universal 3.5 is Assembly's most accurate transcription model, and it's the best choice to power Clicky because Clicky operates by taking a screenshot of what's on the page in order to walk you through what you're asking about. In this demo, that screenshot works with Universal 3.5: the terms on the page are fed into the transcription model's key terms prompting so that it better understands what it's hearing. All of this works together to allow the model to completely understand what you're asking about. Even if they're tricky words about code, or things that might not be in the dictionary, the transcription model is still able to point you in the right direction, which is perfect for this product's use case. So for Clicky, I'm able to constantly update which words the model is looking for and anticipating based on whatever is on the screen. To demonstrate that, I'm going to show you how the model is able to understand some random words from this page and allow Clicky to walk you through them. So here I'm hitting control option. Clicky, explain to me what pcm_s16le is and where it is in the code.
[00:01:21] Speaker 2: You can see it right there on line 464. It's being passed as the value for the encoding query parameter in the WebSocket URL. pcm_s16le stands for Pulse Code Modulation, signed.
[00:01:34] Speaker 1: So I'm going to cut it off there, but because it understood what was on the page, it knew exactly what I was talking about and it knew to anticipate that word. To show this even further, I created this widget, which lets you see all the key terms that the model is constantly updating in the prompt as I'm scrolling. So it's looking at all the different things on the page and anticipating those words in the transcription. And as you can see here, many other models would struggle to transcribe something like audio PCM16 data. Is that one word? Is that many words? What does it mean? But because it's already in the list of key terms that the model is looking for, right here, when I say, Clicky, point me to where audio PCM16 data is and what is it doing in the code? It knows what to look for and it can give you exactly what you're asking for.
[00:02:21] Speaker 2: So you can see it right there on line 191. It's being created by calling the converter on the incoming audio buffer. Then line 192 checks that it's not.
[00:02:30] Speaker 1: I'll cut it off again, but here at AssemblyAI, we love Clicky and we're so excited to see what everyone builds with Universal 3.5.
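[Editor's sketch] The demo describes refreshing the transcription model's key terms from whatever text is currently on screen. The Python below is a minimal sketch of that idea only, not Clicky's or AssemblyAI's actual code. The function names, the heuristic for what counts as a "technical" token, the `max_terms` cap of 50, and the identifier spelling `audioPCM16Data` (a guess at the spoken "audio PCM16 data") are all assumptions made for illustration. How the resulting list is sent to the transcription model is left out, because the demo does not show it.

```python
import re

# Hypothetical helpers for illustration; not part of Clicky or any AssemblyAI SDK.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def looks_technical(token: str) -> bool:
    """Heuristic (an assumption): keep tokens unlikely to be ordinary dictionary words."""
    has_digit = any(c.isdigit() for c in token)
    has_inner_underscore = "_" in token.strip("_")
    has_inner_capital = any(c.isupper() for c in token[1:])
    return len(token) >= 4 and (has_digit or has_inner_underscore or has_inner_capital)


def extract_key_terms(screen_text: str, max_terms: int = 50) -> list[str]:
    """Return unique technical-looking tokens in first-seen order, capped at max_terms."""
    seen: set[str] = set()
    terms: list[str] = []
    for token in IDENTIFIER.findall(screen_text):
        if token in seen or not looks_technical(token):
            continue
        seen.add(token)
        terms.append(token)
        if len(terms) == max_terms:
            break
    return terms


if __name__ == "__main__":
    # Stand-in for text recovered from a screenshot of the code on screen.
    visible = "let audioPCM16Data = converter.convert(buffer)\nencoding=pcm_s16le"
    print(extract_key_terms(visible))
```

Re-running `extract_key_terms` each time the visible text changes (for example, on scroll) would give an updated list like the one shown in the widget. A real implementation would likely need a better filter than this heuristic.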