Speaker 1: Hi, welcome to another video. So OpenAI's Operator has been launched. It is basically their own implementation of Claude's computer use. It's funny, because this is only available on their $200 plan. It's good and all, but I think it's pretty evident that they are losing hope as open source emerges. Everyone knows now that what they charge for models like o1 is absurdly high, and this new Operator is okay, but it's not worth $200 for sure. So I thought I'd show you how you can use DeepSeek R1 as a similar kind of AI agent that is not only open source, but also free.
So how am I going to do it? Since OpenAI's Operator only works on the web and cannot do any desktop tasks, I'll use Browser Use. What's Browser Use, you may ask? Well, it's an open source AI agent that can control a browser, and if you have seen my videos, then you'll know that it performs really well, and it doesn't even use vision. It does everything by scraping the page's code and performing Playwright actions on it. The basic version is just the Python package, but there's also the new web UI, which works pretty well for our tasks.
Now for the model, there are multiple free providers that you can use with it that will be pretty good. There's Groq's Llama 3.3, which performs well and is super fast, and it's free with some rate limits. There's also the GitHub Models API, which gives you access to GPT-4o with some rate limits as well, and all of them are good. But one thing I noticed in OpenAI's blog post is that Operator can reason, and that makes me think that the model that should perform most similarly is none other than DeepSeek's R1. Now DeepSeek's R1 is pretty cheap. You can probably generate about 200 million tokens, which is, trust me, a lot of tokens, and still be under the $200 price tag of OpenAI's Pro plan, which is just insane.
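As a rough check on that figure, here is a minimal sketch of the budget arithmetic. The flat price of $1 per million tokens is an assumption implied by the numbers quoted in the video (about 200 million tokens for $200), not a published rate; real pricing varies by provider and usually differs for input and output tokens. The function name is hypothetical.

```python
def tokens_for_budget(budget_usd: float, price_per_million_usd: float) -> int:
    """Return how many tokens a budget buys at a flat per-million-token price."""
    if price_per_million_usd <= 0:
        raise ValueError("price_per_million_usd must be positive")
    return int(budget_usd / price_per_million_usd * 1_000_000)


# Assumption: a blended rate of $1 per million tokens, inferred from the
# video's "200 million tokens for under $200" estimate.
ASSUMED_PRICE_PER_MILLION_USD = 1.00

pro_plan_equivalent = tokens_for_budget(200, ASSUMED_PRICE_PER_MILLION_USD)
print(f"{pro_plan_equivalent:,} tokens")
```

Swapping in a provider's actual per-million rate shows how far a given budget or credit grant would stretch.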
Anyway, remember that I said the word free. Well, that's true. There are multiple providers that now offer the R1 model for free, like Together, with about $10 of free credits, as well as Hyperbolic, with $10 of free credits, and probably some others. But one more provider I saw that is literally giving out $100 of free credits is Kluster. Yes, they have the DeepSeek R1 model, and they are giving $100 of free credits, which means that you can use the model a bunch of times. You can generate about 100 million tokens, which is pretty insane, and you can use it with anything like Cline or Aider, or in our case, Browser Use. So this is what I'll be using. Now let's set it up and use it.
But before we do that, let me tell you about today's sponsor, NinjaChat. NinjaChat is an all-in-one AI platform that gives you access to more than 10 models like Claude 3.5 Sonnet, GPT-4o, Gemini, and even image generation models like Flux and video generation models like Kling, and much more, all in one place, for a price that's even cheaper than one ChatGPT membership, starting at only $11. Not just that, they have a bunch of AI tools that can help you use these models in intricate ways. They have also recently added an Artifacts feature to their platform that allows you to generate code, preview it, and share it with others using preview links, which is great. It can even run Python code and create charts. You can check them out through the link in the description, and make sure to use my coupon code KING25 to get an additional 25% off these already great deals.
Now let's get everything set up. First of all, you'll need to navigate to this repo. There are two ways to set it up. The first way is the Docker option. It basically keeps everything sandboxed, and it's easy to spin up. However, if you're a tweaker like me who likes to change API providers in between, then I'd recommend running it locally, without Docker.
Now even locally, they recommend using uv to create a virtual environment and everything, and I also recommend that, but you only live once, so I'll show you the easiest way without it. First, just get it cloned, and then you'll need to get into the folder. Once you have done that, just get the packages installed with this command, and now we can start it with this command. Now, just head over to localhost and the port it shows you, and this is the interface that you'll see.
So here, you can easily start using it. You have multiple tabs here. First, you have the max run steps, which lets you set the maximum number of steps you want the agent to take. If you're using a vision model, then you can also enable the vision capability here. Then, you have the provider option, where you can set the provider you want to use. Here, you can just put in the API base URL, the model name, and the API key. The best experience that I have had is with Gemini 2.0 Flash, and it's free as well. So you can use that, but I'll show you how you can use it with DeepSeek R1. So just fill it in like this over here. I wouldn't recommend DeepSeek R1 as much for this, because I feel that it makes Browser Use a lot slower because of the chain of thought. So you can use something like Llama 3.3 70B as well, via the same provider, or even Gemini, which is insanely good. So, there's that.
You can also set the browser settings here, like whether you want to use headless mode, whether to keep the browser open even after the task is done, what browser resolution you want, and stuff like that. There's also the enable recording option, which will basically record the whole session, which is also great. Now, apart from this, you have a bunch of other stuff as well, but the main tab is the run agent option. Here, you can just enter what you want it to do.
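Since Browser Use is also a Python package, the settings described above can be expressed in code as well. The following is a minimal sketch under stated assumptions, not the web UI's actual implementation. It assumes the `browser-use` and `langchain-openai` packages are installed, that the provider exposes an OpenAI-compatible endpoint, and that the installed Browser Use version accepts a LangChain chat model (newer releases may differ). The `AgentSettings` class, the base URL, the model name, and the environment variable name are hypothetical placeholders.

```python
import os
from dataclasses import dataclass


@dataclass
class AgentSettings:
    """Hypothetical container mirroring the web UI fields; defaults are placeholders."""
    base_url: str = "https://api.example.com/v1"  # placeholder OpenAI-compatible endpoint
    model: str = "deepseek-r1"                    # placeholder model identifier
    api_key_env: str = "PROVIDER_API_KEY"         # placeholder environment variable name
    max_steps: int = 25                           # cap on how many steps the agent may take
    use_vision: bool = False                      # R1 is text-only, so vision stays off


def llm_kwargs(settings: AgentSettings) -> dict:
    """Build keyword arguments for an OpenAI-compatible chat client."""
    return {
        "base_url": settings.base_url,
        "model": settings.model,
        "api_key": os.environ.get(settings.api_key_env, ""),
    }


async def run_task(task: str, settings: AgentSettings):
    """Run one browser task; requires browser-use and langchain-openai."""
    # Imported lazily so the helpers above work without these packages installed.
    from browser_use import Agent
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(**llm_kwargs(settings))
    agent = Agent(task=task, llm=llm, use_vision=settings.use_vision)
    return await agent.run(max_steps=settings.max_steps)
```

To try it, replace the placeholders with your provider's values, import `asyncio`, and call `asyncio.run(run_task("your task here", AgentSettings()))`. Switching providers then only means changing the base URL, model name, and API key, which is the same thing the provider tab does.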
Let's ask it to go to Best Buy, search for MacBook Air, and add it to my cart. This is fairly complex, and OpenAI actually shows something quite similar in their demos. So, I'm going to do just that. Let's ask it. Okay, now it's doing that. I'm pretty sure that it will do this quite easily, because Browser Use is really good. So, it first goes to Best Buy. Then, if we wait a bit, it will mark each element on the page, and if we wait a bit more, it will start to type in the search box. Then, it will check the page again, and now it will search for it. Once we have the results here, it will click the product and open it up. So, now it will try to add it to the cart. If we wait a bit, then it's now done. Okay, so it did this pretty well, which is just amazing. I mean, what else do you want? It's so good, and you can implement this super easily with anything you want, because Browser Use is basically just a library.
Also, although DeepSeek R1 is good, it can be a little finicky at times with Browser Use. So, I'd recommend using Gemini 2.0 Flash, as it's free with very generous rate limits, and it will perform very similarly, if not better, because it also has vision capabilities. So, I'd recommend that for regular usage. You can just set that up here if needed. So, that's great. I think you guys shouldn't pay even a cent for OpenAI's Operator, because this is so good with both Gemini and R1. So, make sure that you use it instead of OpenAI's Operator. Overall, it's pretty cool. Anyway, share your thoughts below and subscribe to the channel. You can also donate via the Super Thanks option, or join the channel as well and get some perks. I'll see you in the next video. Bye.