Exploring the DeepSeek R1 Model: Features and Uses
Jeremy Morgan takes a deep dive into the newly released DeepSeek R1 model, testing prompts and reasoning capabilities, and comparing the web version with local models run through Ollama.
DeepSeek R1 first impressions - Web & Local with Ollama
Added on 01/29/2025

Speaker 1: Hi, Jeremy Morgan here. Today, we're going to check out the DeepSeek R1 model. Now, this has been released today. There's a lot of hype around it saying, oh, it's better than OpenAI's o1 or whatever. So what we're going to do today is I'm going to load up the web interface of DeepSeek, and we're going to throw it a few different prompts to kind of try to throw some curveballs. And then I'm going to download it, because there are local models available that are open source. We're going to download those and run them with Ollama and see how they run. And we'll throw those some silly prompts, too. So enough of my yapping. Let's go. I'm Jeremy Morgan, and you're going to learn something about AI today.

All right. So I'm here at the DeepSeek website. And what I'm going to do is just send out some prompts, and we'll kind of see how it works. And we'll see how this thinking and reasoning kind of plays a part in it. Now, this is supposed to be better than the o1 model from OpenAI. This isn't going to tell you whether that's true or not, but it'll give you a good feel of how it works.

Now, the first thing I like to put into all of these is: tell a funny joke about Python. And I'm going to click on DeepThink here, and that's going to use DeepSeek R1. Now, earlier I made a video where I didn't do that, and so this is redoing that. And here we go. It's thinking: okay, the user wants a funny joke about Python. Let me think. And then it starts to kind of go through this process. And it says: why did the Python programmer get into an argument with a snake? Because the snake hissed, "You call that proper indentation?" The snake was a stickler for PEP 8 compliance. Yeah, pretty funny, right? These jokes are never funny, but this is usually the first question I ask every model I get my hands on.

So let's try another one. Let's see the creative side: write a poem in the style of Edgar Allan Poe. And we can see the thinking. And this is the part where this is a little bit different from the models we're used to, because it's going to think through a problem. It's going to do reasoning and things like that. Now, o1 also does this, right? So this one is doing the same thing, where it's thinking through the problem, and it's showing all of this reasoning at inference time, which is something they've added on. And as we can see here, it's really thinking quite a bit. And here's our final poem. And if you read it, it does sound like something Edgar Allan Poe would write, right? Shadows twist and spectral chase, et cetera. Now, if you're interested in this kind of thing, feel free to steal these prompts and use them.

Now, I'm going to do something a little more technical. I'm a developer, right? And I use this stuff for code all the time. So I'm going to say: write a function to connect to MySQL with Python. Include error handling, logging, and use Pythonic code that adheres to PEP 8 standards. Okay, now we can see it thinking. It says, let me think about the requirements. The user mentioned error handling. And so this thing is kind of just talking to itself as we go. And here we go. It's generated some code here, and it's explaining what it's doing. So there are really three steps here. If we look at this, there's the thinking step, where it says: I need to write a Python function. Let me reason this out. I've got to do the PEP 8 stuff. I've got to log it. Let me outline the steps.
Wait, but mysql.connector can raise InterfaceError if there's an issue with the parameters, and Error for other database errors. So in the except blocks, catch specific exceptions first, then the general exception, which is awesome. Also, you need to include type hints. And it just goes on and on, and it's kind of talking with itself, right? And then the second step: here's the function, putting it all together. So it gives me this function here. I look at it, and I say, yeah, this is pretty good. We've got some exceptions here. If there's an error, we'll do this. If it's an exception, we'll do this, et cetera. (There's a rough sketch of this kind of function a little further down.) And then the third step is the explanation of what it did. It says error handling, logging, PEP 8 compliance, best practices, basically just letting me know that it listened to me and it reasoned through the problem, and this is what we get. And to use this function, it says install a connector package. For production use, consider XXX, right? So this is pretty great. So for programming, it looks like this will be really good to use.

Let's try a couple of other, simpler ones, like: translate "hello, how are you" into Spanish. And it goes through the thinking part. Is this correct? Is that correct? It goes back and forth. And it says, "Hola, ¿cómo estás?" And this is the informal version. If addressing someone formally, use "Hola, ¿cómo está usted?" So this is awesome.

Now, one thing I want to do is have it summarize an article for me. So I'm going to take an article from my blog and say: please summarize this article and show the key points. Now, in here we could use search for the same thing; we could see what other people are saying. But right now, I'm just using DeepThink. And we're going to let it look at the article, run through it, think about it, and spit out an answer. Now, we can take a look at the summary, and it says: introduction to the Jetson Orin Nano, part of the edge computing lineup, et cetera, key specifications, benchmark results. Now, here it looks like it just kind of made this up. It says: demonstrates 5 to 10 times faster performance than the original Jetson Nano in tasks like image classification (ResNet-50) and object detection. I did not do that in this article. Achieves 60-plus FPS in 4K video inference tasks. I didn't do that either. Thermal performance, I did mention that. Outperforms the Jetson Nano and Xavier NX in CPU and GPU tasks. I did not do that. So it's making up a lot of this stuff as it goes. And this could be from data that it already has stored in the model. This is a relatively new product, so I'm not really sure if that's the case. But "priced higher than the original Nano," that is not true. Justified by its capabilities, positioned as a cost-effective alternative. Yeah.

So my takeaway from this is: yes, it can interpret articles. Whether it does it accurately, not so much. It definitely made up some things. It filled in some gaps in here that I'm not really happy with, right? I wouldn't trust this to do this kind of analysis accurately yet. If somebody didn't know anything about this stuff or didn't pay attention to the article, they would be getting fed a lot of silly stuff.

Let's go back to code. Now, one thing I want to try, and I know that V3 could do this, but I kind of want to see how it reasons it out in this version: write a Hello World application in x86 assembly language. Now, why am I asking this? Well, because every model out there knows Python, and we have Python all over the place. All of the training data is full of Python code.
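To make that MySQL example concrete before moving on to the assembly test, here is a minimal sketch of the kind of function the model produced. This is my own illustration rather than the model's verbatim output, and it assumes the mysql-connector-python package (pip install mysql-connector-python); the type hints are illustrative.

```python
import logging
from typing import Optional

import mysql.connector
from mysql.connector import Error, InterfaceError, MySQLConnection

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def connect_to_mysql(
    host: str,
    user: str,
    password: str,
    database: str,
    port: int = 3306,
) -> Optional[MySQLConnection]:
    """Return an open MySQL connection, or None if connecting fails."""
    try:
        connection = mysql.connector.connect(
            host=host,
            user=user,
            password=password,
            database=database,
            port=port,
        )
        logger.info("Connected to database '%s' on %s:%s", database, host, port)
        return connection
    except InterfaceError as exc:
        # Specific exception first, as the model suggested: bad host/port/parameters.
        logger.error("Interface error while connecting: %s", exc)
    except Error as exc:
        # Broader connector errors (bad credentials, unknown database, etc.).
        logger.error("MySQL error while connecting: %s", exc)
    return None
```

The ordering point the model made is reflected here: the narrower InterfaceError is caught before the broader Error.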
So asking it Python questions is pretty easy. But what about assembly language? That's a little bit trickier. So first it says, I remember that assembly is low level, et cetera. This is pretty cool. So it's going through, it's making a plan, checking the plan. And here's a Hello World program written in x86 assembly language. It even tells me how to compile it, which is great. So I say we give it a shot. Let's see if it works.

All right. So what I'm going to do is SSH into a Linux machine and give this a shot. All right. So it says, yes, this should work. So I'll copy this, and it says it's x86 assembly language for Linux, so I'm SSHed into a Linux machine to do this. And I'm going to say vim hello.asm, right? That's what it says down here. I'm going to paste in this code. I'm not going to do a single thing with it, because I don't really know anything about it. And it shows here, in that third step, the explanation: message defines the string "Hello, World!", the text section, the system calls, et cetera. Now it's going to tell me how to run it. So I need to run nasm -f elf32 hello.asm -o hello.o, where -o is the output file. Now it's assembled. Now we need to link it: ld -m elf_i386 hello.o -o hello. And we'll execute it with ./hello, and we have a hello world. So it just generated working assembler code. So this is great for programming, which is honestly probably what I'll use this for the most.

What about something a little more creative? Can it create a story about a successful penguin that's entered the business world? Let's see. So here I'm saying: generate a story about a successful penguin who's entered the business world. He was not accepted at first because he's a penguin, but he was able to win over his coworkers. Let's see how it reasons through this and writes a story for us. And now we can see that it's kind of thinking about it, right? It's that "thinking," in quotes. And the title: Percival's Persistence, a Penguin in Pinstripes. Alliteration right from the start. The bustling heart of Manhattan, perched on the 20th floor of Boston Company. And it goes through and gives us: "A penguin?" snorted Gerald from accounting. "Can you even hold a pen?" Penguins thrive in adversity. Percival's secret: he didn't hide his flippers. He flipped the script. Humorous, right? Anyway, yeah, it does creative stuff pretty well. And you can see the reasoning. I really like the fact that it shows the reasoning here.

Now, here's another way I want to take advantage of this. We're going to do one more prompt with this, and then we're going to download some models to run locally on my machine. I'm going to say: I'm about to go on a hike. What should I pack? Ask any clarifying questions you need. Okay, the user is going on a hike, needs to know what to pack. So it just starts talking to itself. And here's what it needs to know. Duration: this is a seven-day hike. Location and terrain: Swiss Alps. Mountainous. Did I spell that right? Probably not. Weather forecast: blizzard. I'm going to say blizzard. Let's see if it tells me not to do this. Experience level: I am very novice. I should say, I am a very novice hiker. Any dietary restrictions? I'm a vegan. And I have asthma. And let's see what it says. After it starts to reason through this, let's see what it tells us to take on a seven-day hike in the Swiss Alps during a blizzard. And it's kind of looking through these things. Extra inhalers. So, here it goes.
And it's spitting out some stuff that we need to bring, which is probably like a semi-truck full of gear, right? And surprisingly, it's not telling me not to go, yet, that I can see. So, clothing: layers are key. Merino wool or synthetic top and bottom. Avoid cotton; it traps moisture. Mid-layers, outer shell, extras, waterproof gloves, balaclava, neck gaiter, and goggles. Thermal socks. Sturdy, waterproof hiking boots. Four-season tent. Sleeping bag, sleeping pad, emergency bivy as a backup. Vegan nutrition, high calorie. So, it's got some high-calorie food. Water, purification tablets. Map, compass, GPS. Phone GPS isn't reliable in storms; carry a paper map. Headlamp, first aid kit, et cetera. This thing goes on and on. It says blizzards in the Alps are no joke. Prioritize staying dry, warm, and hydrated. If you're unsure about gear or routes, consider hiring a local guide for safety. So, this thing is giving us a good, reasoned answer. It's doing a lot of reasoning on the inference end. I don't know any of the details about it, but you can see it as a user. You can see the difference from where you're using it, where it's thinking about the problem before it answers it, which is awesome.

So, overall, my first impressions of this website are: this is cool. I like the fact that it reasons, plus it talks about it, and it tells you how it's reasoning and what it's doing. I think that's really cool. I think there's a lot of potential in this thing, and I can't wait to play with it some more.

Now, they also have local models available that you can download from Hugging Face. So I'm going to download some of those and use them with Ollama, and we're going to see how this runs on our local machine and try it out. So, I'm going to use Ollama to run DeepSeek, and as you can see here, we've got a ton of different models available, ranging from 1.5 billion parameters up to 70 billion, which is awesome. So, let's take a look at some of that.

Okay, so I've downloaded a couple of them here. Now, I have DeepSeek R1, 70 billion parameters, and this is running on my Mac Studio. I couldn't get it to run on my RTX 4090; there wasn't enough memory. But it will run on the Mac Studio, although probably pretty slowly. So, we're going to say ollama run deepseek-r1:70b --verbose, and I'm going to say: tell me a funny joke about Python. As you can see, it's fairly quick. It's a 70 billion parameter model, which is just enormous. So, let's see what our token generation is like. And this looks like it's really thinking through the answer to this question, rather than just spitting out a joke. You can see that it's kind of thinking through, it's asking itself questions, and this is one of the things that this model is supposed to be really good at. And maybe there's a little bit of a difference between the inference on the website versus here, but this is what it's supposed to be good at, right? Reasoning through things.

So, let's look at this result that it's giving here. It's saying: okay, so I need to come up with a funny joke about Python. Where do I start? I know Python's a programming language. Maybe I can play on some of its features or common phrases. Let me think. And it actually does go and think, and says there's a classic joke about why Python doesn't like certain things because of indentation errors. And it just goes on and on until it says: why don't we tell a joke about Python? Because every time we try, we get an indentation error: expected four spaces. Not too bad. Pretty good.
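As an aside, if you'd rather script these local runs than type prompts into the terminal, something like the sketch below should work. It uses the requests package and assumes Ollama is serving on its default port (11434) and that the deepseek-r1:70b tag has already been pulled; the eval_count and eval_duration fields in the response are roughly what the --verbose flag summarizes as the eval rate.

```python
# Sketch: call a locally running Ollama server directly instead of `ollama run`.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1:70b",  # try deepseek-r1:32b or :1.5b to compare sizes
        "prompt": "Tell me a funny joke about Python.",
        "stream": False,             # return one JSON object when generation finishes
    },
    timeout=600,
)
resp.raise_for_status()
data = resp.json()

print(data["response"])  # the full answer, including the <think>...</think> block

# eval_count = tokens generated; eval_duration = generation time in nanoseconds.
print(f"{data['eval_count'] / (data['eval_duration'] / 1e9):.2f} tokens/sec")
```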
So, we're going to copy our prompt here, and we're just going to use that exact same prompt: I'm about to go on a hike. What should I pack? Ask any clarifying questions you need. Let's see if the local model will produce something that's usable, like the hosted model. And here it's asking us again for the details. Now, notice this closing </think> tag. That's the end of the thinking part, where it's reasoned through the answer. As we can see here, it did generate. And 11.48 tokens a second is super impressive, by the way, for a 70 billion parameter model. Model tuning really comes into play a lot more than I thought it did when I first started getting into this. But, yeah, 11.48. Slow as molasses as far as trying to make a chatbot or something like that, but pretty decent speed for this kind of stuff.

And here are my key questions. So, I'm just going to answer them. It is 40 miles. Okay. So, it's responding with clarifying questions; it's telling us what it's doing. And I shouldn't have hit enter. I should have just done one line and answered all those questions. Day hike or multi-day? Next, terrain. And so, this thing is being really helpful right now. And it is pretty impressive that it's trying to think through and reason through. Rather than just spit out something that it's seen a million times, it's actually trying to reason through some of this stuff. And I think that's really cool. But this is a 70 billion parameter model. 11.09 tokens a second is fast enough for me to play around with and experiment. But is it going to be fast enough for an application? Probably not.

What about the 32 billion parameter model? Let's try it. So, I'm going to change this 70 to 32. And we'll see if this one performs better, which I think it will. Okay. Tell me a funny joke about Python. Let's see what the 32 billion parameter model does. And look, it's thinking through it once again. It's doing the exact same thing that it did before. We have the opening <think> tag, and there's going to be a closing tag when it's done. There's our closing tag. And look at that: 22.23 tokens a second. Much better. So, we can see this thing is really trying to think through its answers. And that's one of the benefits. And you can see here, 22.23 tokens. That's pretty good.

Let's ask it another type of question: I have two coins that add up to 30 cents. One is not a nickel. What are the coins? We're just kind of testing some of the reasoning here. See what it comes up with. And you can see it's breaking down the problem for us. Okay. And so, it went through all kinds of reasoning there in that think section, and it says the two coins are a quarter, 25 cents, and a nickel, 5 cents. Together, they add up to 30 cents, satisfying the conditions. So, that's pretty cool. (There's a tiny brute-force check of this riddle at the very end, if you want to verify it.) And it ran at 20.34 tokens a second.

Let's try one of their smaller models that we know is going to be fast, but we'll see if it can tackle those same problems. Okay. So, now, as you can see, I have the 1.5 billion parameter model here. And we're going to ask it the same question: I have two coins that add up to 30 cents. One is not a nickel. What are the coins? And as you can see, this thing is screaming fast, by comparison. And the correct answer is a quarter and a nickel. Now, I'm not sure what this \boxed{\text{...}} markup is, but it's showing that it's thinking through the problem, right? So, it's doing this again. I don't see the think tags I saw before.
Oh, there it is. And it's breaking down the problem step by step, evaluating the answer, and then spitting out the correct answer: a quarter and a nickel. And look at that, 144 tokens a second, which is super fast.

So, I just kind of wanted to give my first impressions of this model. I think it's really good. I think it's high performance. The web version of it is very creative, and you can visit that at the following URL. And we've examined the local models, which seem to be pretty performant and give some pretty good answers. So, that's really cool. I'll continue to play with this more. And you can get these models for Ollama at the following URL. So, my first impressions: I think this thing is really cool. I really like the way it reasons through things and tells you the reasoning; it outputs the reasoning it uses as it steps through a problem. I think that can be really handy in some cases. So, I'm really excited to see where these models go. I'm going to play with them more, especially the local models on my machine. And I will make some more videos like this. So, if this is your thing, subscribe.
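As a closing aside, the coin riddle the models were asked is easy to sanity-check with a brute-force enumeration; the coin names and values below are just the standard US denominations.

```python
# Brute-force check of the riddle: which pair of standard US coins sums to 30 cents?
coins = {"penny": 1, "nickel": 5, "dime": 10, "quarter": 25}

pairs = [
    (a, b)
    for a in coins
    for b in coins
    if a <= b and coins[a] + coins[b] == 30
]

# Prints [('nickel', 'quarter')]: a quarter and a nickel is the only pair.
# The riddle's trick is the wording -- "one is not a nickel" (the other one is).
print(pairs)
```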
