Speaker 1: I've installed the new DeepSeek R1 reasoning model, the large language model that rivals OpenAI's o1 model, which right now I'm paying $200 a month for. This is completely local, running on my computer, and it's private. I could literally turn off the Wi-Fi right now. So in this video, I want to show you exactly how to install it step by step, and I'll make it as non-technical as possible. Now, I'm also making a video comparing DeepSeek R1 versus OpenAI o1, which I'll release in a couple of days. But in the meantime, I'm going to show you exactly how to install it. So if you go to deepseek.com, you can actually use their chatbot. If you click Start Right Now, it brings you to this page, which is available at chat.deepseek.com, and you can create an account and use it there. But that's not what I want to show you here. I want to show you how to install it locally on your own device, so you're not using a website. If you click DeepThink, that's the R1 reasoning model. If you just have a normal chat with it, it uses a different model, not R1. It's called DeepSeek V3, and that's been available since December. It also has the power of search, so you can combine this DeepThink R1 model with search here too. Again, I'll cover this in a deeper-dive video about the model. Right now, let's go ahead and install it locally. Okay, it's only going to take a couple of steps to install it on your computer, and then a couple more steps to get a really nice user interface so it looks like ChatGPT. The very first step of the process is to install something called Ollama on our computer, at ollama.com. This works for Mac and for PC. Just press Download right over here, choose your operating system from these three, and then press Download. If you're using Windows, it needs to be Windows 10 or later. Okay, once you unzip and install Ollama, it's going to give you this pop-up right here.
I'm going to press Next. It's going to say Install the command line, and then you can press Install right over here. I'm going to type in a passcode. Then it's going to say Run your first model, and it gives you a little command here, just three words, to use as your first model. Right now, Llama 3.3 is out, so I could actually change that to 3.3. So I'm going to copy this, and I'm going to press Finish right here. Now, before I install R1, I'm going to open up and search for the Terminal app over here. This part is optional, but I'm going to install Llama 3.3, so I'm going to type in 3.3 here. It says ollama run llama3.3. I'm going to press Enter. This is going to install Llama 3.3, which is obviously not R1, but this way we have another open-source model that is not a reasoning model, so we can go back and forth in the rest of this process. While this is installing, let me show you R1. By the way, Llama 3.3 is 42 gigabytes in size, so make sure you have that kind of space on your computer if you're going to do this. If you don't have that much space, skip this and just install R1, which we're going to install pretty much exactly the same way. Now we're on step two. Now that we have Ollama installed, we're going to go back to the ollama.com website, click on Models, and we'll see this DeepSeek R1 model available right here. Let me click on it. R1 comes in multiple different weights and sizes, okay? If I click this, you'll see that the smallest version is 1.5 billion parameters. That's not going to be very good, but it's going to be very fast and very small; it's only one gigabyte. By default, it's set to the 7B model, and then you can go all the way up to the 671 billion parameter model, which is 400 gigabytes. And no computer is going to be able to handle that, at least no consumer-grade computer.
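For reference, the terminal steps described above come down to a few Ollama commands. This is a sketch; the exact model tag (`llama3.3`) and download size come from the Ollama website:

```shell
# Download Llama 3.3 (~42 GB) and start an interactive chat when it's done
ollama run llama3.3

# Or just download the weights without opening a chat
ollama pull llama3.3

# See which models are installed locally and how much space they use
ollama list
```

`ollama run` drops you into a chat prompt inside the terminal; type `/bye` to exit back to the shell.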
And depending on how good your computer is, you may be able to go above that 7B model, but I usually recommend for these types of things: start with the 7B model and then see how the speed is, because it's going to require a pretty beefy GPU, your graphics card, in order to run this properly. Now, I also got this new GPU. This is called the GeForce RTX 5090, the top-of-the-line GPU available right now from NVIDIA. This is not sponsored by them, but they flew me out to CES, and I made a different video about the different things they rolled out there. But I'm going to try to build a new PC and install this in there so I can take advantage of some of these other models and show you the local install of those. So I'm going to start with the 7B model myself. In order to install that, and we're just on step two, installing the large language model using Ollama, I'm going to copy this right over here. It says ollama run deepseek-r1. That's the command I need. And if you want to install the other models, all you have to do is click on one here, and you'll see it just adds that to the end of the command, so it adds 14b. You can install multiple models too, and when we get to the user interface we're going to use for this, we'll be able to choose between the different models. But right now, just get started with one. I'll copy the command from here, which by default is the 7B parameter model. Okay, so I restarted Terminal so I could start fresh. I did install Llama 3.3 before, but I'm going to paste ollama run deepseek-r1 again, which is what I got from that Ollama website, and press Enter. Now it's going to go ahead and install that. This version of it, the 7B version, the smaller one, is 4.7 gigabytes. Okay, and that took about a minute here. That depends on the speed of your internet connection, obviously, because it's downloading it.
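The size variants described above are selected with a tag after the model name. These are the tags shown on the Ollama model page; sizes are approximate:

```shell
# The default tag pulls the 7B model (~4.7 GB)
ollama run deepseek-r1

# Other sizes are picked with an explicit tag
ollama run deepseek-r1:1.5b   # smallest and fastest, ~1 GB
ollama run deepseek-r1:14b
ollama run deepseek-r1:32b    # ~19 GB
```

Each tag is downloaded once and cached, so you can keep several sizes installed side by side and pick between them later in the UI.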
And right here, I technically now have deepseek-r1 running on my computer. This is how I can use it inside of the Terminal app, and I could literally ask it anything. It does its thinking right here: How can I assist you? Okay, so I'll ask it how many r's are in strawberry. And this is how it works: it creates that thinking bracket. If you've used ChatGPT, o1 works in a similar way. Now, it actually didn't give me a complete answer, so let me try again. And this time: there are three letter r's in the word strawberry, and it kind of broke down its thinking process here. Now, obviously, this is all you need to do to install it locally, but it's not very nice, right? We need a couple more steps to install a user interface that looks a lot more like ChatGPT or the DeepSeek chatbot I showed you in the beginning. The first thing we need to install is something called Docker. This is at docker.com. And then, again, we can download this for our computer, and I'll move it to my Applications folder. Now, I just need to search for it on my computer after it's installed and double-click to open it. And it just needs to be open; I literally don't have to do anything else. I'm not even going to create an account, so I'm going to skip this. Okay, and then you should see this page where your running containers show up, and there shouldn't be a container just yet. We don't have to do anything with Docker; it just needs to be installed, opened, and minimized. So I'm going to minimize it now. Now, I only have one more step. First, just to recap real quick so we're on the same page: we went to the Ollama website and downloaded Ollama. We went to the Models tab, went to DeepSeek R1, copied the command over, and ran it inside of the Terminal app. That's all we did. Then we went to the Docker website, downloaded Docker from there, depending on what machine we have, and installed and minimized it.
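Before moving on, you can sanity-check the recap above from the terminal. A quick way to confirm each piece is in place (assuming both tools are on your PATH):

```shell
ollama --version   # prints a version number if the Ollama CLI is installed
ollama list        # deepseek-r1 should appear in this list of local models
docker info        # succeeds only if the Docker daemon is actually running
```

If `docker info` errors out, Docker Desktop isn't running yet; open it and try again before the next step.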
Now, the last thing we need to do is install something called Open WebUI. And this is what it's going to look like. It's going to run Llama 3.3, it's going to run DeepSeek R1, any model, and we can pick and choose which model we want depending on what we're doing with it. And I'm going to show you exactly how to install it. Now, this page is docs.openwebui.com. I'll include this in the description of this video with all the steps, so all the links are going to be in one place. And all I have to do right here, it says: if Ollama is on your computer, use this command. And it gives you this command right over here. So that's all we have to copy. I'm going to copy this command, and let's open up Terminal one more time. I'm actually going to terminate this Terminal session and start a new one. Okay, and then I'll go ahead and paste that command over here and press Enter. And it looks like nothing happened, but it did create something inside of Docker. So let me open up Docker, which I had minimized over here. And you can see I now have something called open-webui. This is a container, and it gives me this port number right here, which is going to open up a tab inside of a browser, but it's offline. This is actually not going to be online right here. And then from this page, we do need to sign up for Open WebUI to get that UI. After that, your computer can be offline; it doesn't have to be connected to the internet to use it. I already have an account, so I'm going to sign in. Okay, now that brings us to this page right here. And this is where I'm going to show you how we can use R1. The way we do that is right up here: if you click this dropdown, you'll see deepseek-r1:latest, the 7B model, right over here. I also have different Llama models installed, older models I've had for a while, but if I click on this right now, it's going to use R1.
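For reference, the "Ollama is on your computer" command from the Open WebUI docs looks roughly like the sketch below at the time of writing; copy the exact command from docs.openwebui.com, since flags and the image tag can change between releases:

```shell
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```

The `-p 3000:8080` mapping is what makes the interface show up at localhost:3000 in your browser, and the `-v` volume keeps your account and chat history across container restarts.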
If I set it as default, every single time I use it, it's going to use R1. But most of the time, I actually probably want to use something else, like Llama 3.3, which is what I installed in the beginning of this video. Maybe that could be my default, because it's more broad and not a reasoning model, and then I'd use R1 when I need it. But in this case, since it's a DeepSeek R1 video, let's go ahead and use it right over here. Okay, let me give it this prompt: Give me 10 challenges only a reasoning AI model can solve. And I'm going to press send. I'm not going to edit it, so you see the speed of it right here. You see that it's almost instant, because it's the smaller 7B model that I'm using. The bigger the model, the better the answers, but the slower it runs, depending on your computer. And you can see it always puts its reasoning in these thinking brackets: it starts a bracket and then it ends a bracket. So this part is actually not your answer; this is how it's thinking through it. And then down here, this is going to be your answer. So it gave me 10 different things: problem solving, ethical decision-making, strategy games, mathematical problem solving. This just gives you an idea of the different things it can actually do. And one of my favorite reasons for using Open WebUI is this right here, right on top. I could choose this model right here, but I can actually select a secondary model. So I press the plus sign and select another model. Since I installed Llama 3.3, I can run those side by side and ask the same exact prompt again. Okay, and look what it does now: it actually runs both models side by side. You can see Llama 3.3 is like 10 times the size; that was the much bigger one, 42 gigabytes, and this was under five. And you can see how much slower that is. And this is why I probably need a much better computer to run these models locally as they get bigger.
Because, I mean, it's been what, like 10 seconds, and I'm still not getting a response out of Llama, but R1 went to work right away. Okay, now it's finally going. It's very slow, typing out kind of one word at a time, but I'm running this on a laptop right now, so the fact that Llama 3.3 is even running is kind of surprising, and you can see how fast R1 is. And then I can install more models. So let's say I want to install a bigger model: since the 7B worked well, let's try 32B. At this point, it's super simple. I just go back to the Models tab on the Ollama website, click on 32b in the dropdown, copy this over, open Terminal, paste it in, and press Enter, and then it goes to work. This one is going to be 19 gigabytes, so it's going to take much longer than the first version I installed. And as soon as it's done, let me show you what happens inside of Open WebUI. And if you look at my URL, by the way, I'm in the browser, in Chrome right here, but I'm not on the web; I'm on localhost:3000. This is running locally, right now, on my computer. Now, while that's downloading, I also want to show you this, because we recently revamped our entire e-learning platform for AI, and we actually merged with Futurepedia, which is one of the leading AI tool libraries and newsletters, and now it's kind of an all-in-one platform. And we have a course specifically for this right here: AI-Powered Private Chatbot. This one is about an hour, but it really dives into all the different things you can do with running Llama models, Mistral models, different models. Now, I'm going to add a little section about running R1 models, but it also focuses a lot more on the Open WebUI interface, on actually giving your knowledge base to it, all kinds of different things you can do inside of that. So that's available here, and this is all one subscription.
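The step of adding the larger variant alongside the smaller one is, again, just one more pull. A sketch of managing the two side by side:

```shell
# Download the 32B variant (~19 GB); the 7B one stays installed
ollama pull deepseek-r1:32b

# Both tags now appear here, and in the Open WebUI model dropdown
ollama list

# If disk space gets tight later, a variant can be removed again
ollama rm deepseek-r1:32b
```

After the pull finishes, refresh the localhost:3000 tab so Open WebUI picks up the new model in its dropdown.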
So not only do you get this, you get all these different courses; a NotebookLM course comes out tomorrow. All of these are going to be available under the same bundle. And we still have a free trial available right now, so if you want, you can actually watch a course for free and see if you like the platform. There's a whole community section, all kinds of different things you can explore once you become a member, just to see if it's a good fit for you. And then you can complete courses and get certifications and things like that, all in this platform. I'll link this below as well. Okay, this one just finished installing. I'm not going to use it inside of Terminal; I'm actually going to close Terminal here. Now, if I click this dropdown, it's not going to appear just yet, but if I refresh this tab, then, when I click down here, I get two different R1 models. So now we have a 32B model and we have that 7B model. I'm going to click on this one instead this time. And let me actually start a new chat. Okay, let me choose the other one here, and in this case, I'm going to run the other one; I just want to show you the speed of it. So this is 7B versus 32B. I'll paste the same prompt, which is a relatively simple prompt, right? It's not really doing any heavy reasoning. Okay, that was not bad. It actually kind of worked. Wow, look at this, it's really surprising that the 32B model is actually working. Well, not exactly as fast, but I mean, it's not like Llama 3.3, right, which took 15 seconds to even start? All right, not bad. So I could definitely use this on my computer. And once I build that PC with the NVIDIA GPU, I could probably run the 70B model, if this laptop handles the 32B model like this. Now, in the next video, I'm going to use their chatbot and compare R1 versus OpenAI o1, which right now I'm paying $200 a month for to get unlimited access to. So we'll go ahead and compare that.
I have about 10 prompts ready for that. And I'll post that as soon as it's ready. Thanks so much for watching. All the links are in the description. I'll see you next time.