Speaker 1: It purchased a book autonomously on my account. This is crazy. We literally have AI agents that can take actions on the web. Wow, this is the future. My name is David Ondrej and here's how to build your own ChatGPT Operator locally on your machine.
OpenAI just released their first AI agent, named Operator, which can browse the web and execute actions for you. However, most people have no idea that there is an open source framework that's actually better than Operator, and you don't have to pay $200 a month for it. So in this video, I'm gonna show you step by step how to set up your own local operator, even if you are a complete beginner.
Now, how is this even possible? How is there a free alternative to OpenAI's Operator that costs $200 a month? Well, it's all thanks to this open source framework called Browser Use. Now, usually open source alternatives are behind what the AI companies make, right? However, the year 2025 is different. First, DeepSeek R1 completely destroyed o1, and now Browser Use is destroying ChatGPT Operator.
Now, let's address the OpenAI problem, because at first glance, ChatGPT Operator looks like a great AI agent, right? It can control a browser, allowing you to book flights, order groceries, make reservations and all that. However, it's hidden behind a $200 a month subscription. Not only that, it's not even available in most countries, and they said it won't be available in Europe for months. And also, it's too careful. It's like asking you for confirmation after every other step. That's not what autonomous AI agents should look like. Browser Use, on the other hand, is open source, meaning you can run it for free on your computer with any model you want.
Now, some of you are probably thinking, "But David, this is not fair. You know, ChatGPT Pro has other stuff, such as unlimited access to OpenAI o1, right? Maybe that is worth $200 a month." However, DeepSeek R1 just defeated o1 on many different benchmarks.
And on top of that, you can actually see the reasoning tokens, which OpenAI doesn't let you see. And that's why I've decided to add DeepSeek R1 into Vectal. And you can use it with your tasks: "Tell me step by step how to complete task seven," which in this case is obviously "Set up Anthropic API." And unlike ChatGPT, you can actually see the reasoning tokens of the model. So if you want unlimited access to DeepSeek R1, as well as all the other features Vectal has to offer, just go to vectal.ai and give it a shot.
With that being said, let's get to building our Browser Use operator. First, we need to create an empty project inside of Cursor. So open up Cursor or Windsurf or VS Code, doesn't matter what you use. Click on Open Project and just create an empty folder. Now we need to open the terminal. So go to Terminal at the top or press Command+J to open up a new terminal. Okay, we can mark this as complete.
Now let's clone the GitHub repo. So this is the link. I'm gonna also put it in the description. Okay, so all we need to do here is go to Code and copy this link right here, Copy URL to clipboard, then switch back to Cursor, type in git clone, and then press Ctrl+V or Command+V to paste in the URL. Hit Enter. And this will automatically clone the entire GitHub repository into your project right here. So if you open your files, you can see it's all here. That's the beauty of open source projects. That's step three done.
Next step is actually to cd into the web-ui folder. So let's do clear and type in cd. And then we have to go into this folder. So web-ui, boom. And you should see it right here. So let's clear that.
Next, we should install the requirements. First, I'm gonna actually activate a conda environment. Side note, you technically do not need conda. It's just good to separate different Python environments, but you can totally follow along without it. If you don't know how to use conda, DeepSeek will be perfect for this. So: "How do I set up conda on my MacBook?"
The reason why it would be perfect for this is because it's a reasoning model. Man, Vectal is becoming a real good product. I know it's my product, so I should be saying that, but with the addition of DeepSeek mode, guys, it's getting crazy good. So anyways, let's get back to focus.
So I'm gonna do conda activate test, which is my test environment. All right, so now we can see we're in the test environment. So let me hit clear again. And I'm actually gonna utilize Command+K, which is a feature inside of Cursor. I'm gonna say "install reqs." That should be enough for it to recognize the requirements.txt file and install it correctly. Yeah, that's right. Let's run it. Boom, it will install all of the required packages. We can hit clear. All right, that's step five complete. As you can see, guys, you can do this even if you aren't a programmer.
Let's launch it on localhost. So we need to run this command right here. I'm gonna put it in the description. This is actually not your IP. This is just localhost, and this is a port. So don't worry, nobody can hack you like this. Copy this and paste it into the same terminal. Hit Enter. It's gonna take a few seconds to run, and then you should get this link right here. If you Command-click on it, it will open up the browser, and boom, there it is. This is Browser Use, and this is where the magic begins. Let's mark step six as complete.
Step seven. Man, we're blazing through these steps. This is the agent settings. Here you can select which type of agent you wanna use, either org (organization) or custom. Let's use custom. There is a maximum number of steps the agent can take, from 1 to 200, right? I think a good place to put it is like 50. I think, for most tasks, that's safe. Then maximum number of actions per step. We can leave this as 10. Enable vision and enable tool calling. We wanna keep those turned on. But here is the super important setting, which is the LLM, right?
We need to choose which model to use, and there are a lot of options you can choose: Anthropic, OpenAI, DeepSeek. By the way, I'm gonna make a video on how to run local DeepSeek models, because DeepSeek also released six smaller distilled models that you can definitely run on your computer or even on your phone. In a few days, I'm gonna be making a video on that, how you can run DeepSeek R1 locally. So make sure to subscribe, that way you don't miss it.
But for this video, I'm gonna be using Anthropic, and we wanna make sure we're using the best model, which, in this case, is not 0620. In Composer, I'm using the new Sonnet, which is 1022. This one is the old one, right? It's easy to get fooled, but the new one is much better. So delete that and do 1022. For temperature, we definitely don't wanna use 1. We wanna use like 0.1, which is much less random. And then we have the base URL and API key. So of course, the base URL depends on which LLM provider you're using. For Anthropic, it's api.anthropic.com/v1.
Now we need to get the API key. So let's go into the Anthropic Console, but in here, you need to log in with the same account as you have in Claude. And then on the top, go to Settings, and then on the left, go to API Keys. In here, click on Create Key. Select the workspace, it can be Default, and I'm gonna name it Browser Use. Add, and let's copy this. Now, make sure to never share your API keys. I will delete mine before uploading the video. Treat them as passwords. Let's go back into our Browser Use web UI and paste it in here, boom.
Now let's click on Browser Settings. Here, we can set up a lot of stuff, right? For example, whether you wanna use your existing browser or whether you wanna keep it open between tasks. However, the most important tab right here is Run Agent. In here, we can give a clear description of the task which we want our AI agent to perform on the web. So for example, here, the default is "go to google.com and type OpenAI, click Results."
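The LLM settings described above (provider, model, temperature, base URL, API key) can be pictured as one small configuration object. The sketch below is only an illustration under stated assumptions: `LLMSettings`, `load_settings`, and the `ANTHROPIC_API_KEY` variable name are hypothetical names chosen for this example and are not the web UI's own code, and the model string assumes that "1022" refers to `claude-3-5-sonnet-20241022`. It reads the key from the environment so that it never has to be typed into a file or shown on screen.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical container mirroring the fields of the LLM settings tab."""
    provider: str
    model: str
    temperature: float
    base_url: str
    api_key: str

    def masked_key(self) -> str:
        """Return the key with everything except the last four characters hidden."""
        if len(self.api_key) <= 4:
            return "*" * len(self.api_key)
        return "*" * (len(self.api_key) - 4) + self.api_key[-4:]


def load_settings(env=None) -> LLMSettings:
    """Build the settings from environment variables instead of hardcoding the key."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "")  # assumed variable name
    if not api_key:
        raise ValueError("ANTHROPIC_API_KEY is not set")
    return LLMSettings(
        provider="anthropic",
        model="claude-3-5-sonnet-20241022",  # assumed meaning of "1022"
        temperature=0.1,  # low temperature, as recommended in the video
        base_url="https://api.anthropic.com/v1",
        api_key=api_key,
    )


if __name__ == "__main__":
    # A made-up key, used only to demonstrate masking.
    settings = load_settings({"ANTHROPIC_API_KEY": "sk-ant-example-1234"})
    print(settings.provider, settings.model, settings.temperature)
    print(settings.masked_key())
```

Printing only the masked key is one way to follow the "treat them as passwords" advice when recording or sharing a screen.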
Super simple, right? But I'm gonna give it a task that's much more complicated: "Go to amazon.com and purchase the book The Singularity Is Nearer by Ray Kurzweil. You have my full permission to buy the book." So if you remember the OpenAI video where they showed ChatGPT Operator, this is more advanced than any of the examples they showed. I'm going a step further. I want it to do everything to complete the purchase, and there's not gonna be that annoying, every two steps, like, "Oh, can I do this? Can I do this?" So yeah, let's do that. Additional information: "Make sure to choose the book with most reviews and English translation." Okay, that's good. Let's do Run Agent and let the magic begin.
Okay, let's see. Seems like we have some errors inside of our terminal. So what I'm gonna do is I'm actually gonna utilize Cursor to debug this. So let me open up Cursor Composer here and I'm just gonna put in the entire terminal output.
This is how I'm building my startup Vectal as well. Like, I've been building Vectal now for the past three and a half months, and I've built it entirely by speaking English and writing English to Cursor, Claude, v0 and all these AI tools, right? So this is a fully deployed startup with over 100 paying customers that has been built with AI tools. So if I'm saying you can do this even if you're not a programmer, I'm not just saying that, I did it. This is a fully deployed product that you can use right now. And to be honest, it's a pretty damn good product right now.
So this is how I'm debugging, using Cursor. So I have an error, okay, I put it here and say, "We have an error. Explain what it is and how we can fix it." And I'm gonna switch to Agent. Cursor Agent is much better than just normal Composer. So let's see: "executable doesn't exist at" Playwright. Oh, I guess I forgot to install Playwright. Seems like we missed this step. So yeah, this is a crucial step in the GitHub repository. So we need to install Playwright. Luckily, we can do it right now.
Actually, no. The issue, what happened there, is it did not activate my conda environment. So I'm just gonna copy this. So let me do a new terminal. I'm gonna conda activate test. That way we're in the same conda environment. I'm gonna clear and paste in playwright install. So now this is installing Playwright. So this is actually the engine powering Browser Use. This is how the AI agent is able to interact with the different elements in a website. It can see bounding boxes around all the buttons, images, and text, and then it can decide what to act on.
Okay, so that's installed. Let's do clear. And in here, we need to restart our port. So let me go back in here. Actually, we've done step seven already. So let me check that off. We just need to copy this command and rerun that again. I'm gonna save it in Vectal just in case. So let me do clear and then rerun that. Okay, let's open it up again. We lost our config, so that's my bad. I'm gonna set this up real quick.
Okay, so now that this is running, let's go again. So same prompt, same information. Let's do Run Agent. Okay, there it is. It opened up a new window, beautiful. "Stopped, five consecutive failures," okay. But now it's running, so that's good. "Not found, no info." What is going on? Let's debug it with Cursor. So, Add to Composer.
I mean, to be honest, I think I could cut this out, but I think it's more authentic if I actually show you that you don't always get it on the first try. In fact, almost always you run into unexpected errors, and you can see how I solve them, right? I don't get discouraged. I don't take them personally. I just utilize the AI tools to solve them.
So: "We have another error. Explain what this one is and how we can fix it." So in here, I'm actually using the new Sonnet 3.5, same model as the default inside of Vectal. So let's see, what actually is the issue? The app is trying to use OpenAI's API. Why? Okay, so I'm gonna manually restart it again. This time we're gonna try OpenAI.
Maybe Anthropic is having some issues. The Anthropic API is very unstable, to be honest, even compared to DeepSeek. DeepSeek has 99.9% uptime. Anthropic has like 99%. And it might seem like, oh, 99% is good, but no, 1% of the time it's either offline or, like, severely slower.
All right, so let's do OpenAI instead. Go to platform.openai.com/api-keys. Log in with the same account as you have for ChatGPT. Click on API keys on the left. Create new secret key. I'm gonna name it Browser Use. Create. Again, do not share these with anybody. I'm gonna delete mine before releasing the video. Put in your API key for sure. Browser settings, these look good. Run Agent.
So let's try again. Okay, so same prompt, this time obviously with a different configuration: instead of Anthropic, OpenAI. I don't know if that will fix the issue. Hopefully, fingers crossed. Let's run the agent. Okay, boom, it opens up the browser. Hopefully it doesn't get stuck. Come on. Okay, there it is. I'm not touching, guys. I'm not touching, here are my hands. It opened up Amazon. It's amazing. Come on. Okay, so the issue is I'm not logged in here. Okay, The Singularity Is Nearer by Kurzweil. Come on, search it up.
This is crazy, guys. We really are living in the future. This is an AI agent buying an item on Amazon, the one with most reviews. I didn't specify this this time, so let's see what it chooses. Okay, it chooses the first option. Make sure it's English. It is English, okay. Good ratings. Come on. Either Buy Now or Add to Cart. What are you gonna choose? Buy Now, nice.
All right, so here I had to stop it because I'm not logged in on this browser. However, maybe if I choose Browser Settings, Use Existing Browser, maybe this is good. Maybe this is better because I'm logged in, obviously, to my Amazon account here. Maybe we should do this, actually. So let's do that. Let's try again. This is already amazing, right? Don't get me wrong. Having an AI agent control your computer, this is the future, guys.
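To put the uptime comparison from this stretch of the video in concrete terms, here is a small arithmetic sketch. The 99.9% and 99% figures are the speaker's rough estimates rather than measured values, and the 30-day month and the helper name `downtime_hours` are assumptions made for illustration.

```python
HOURS_PER_MONTH = 30 * 24  # assumes a 30-day month


def downtime_hours(uptime_percent: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Hours of unavailability implied by an uptime percentage over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for label, uptime in [("99.9%", 99.9), ("99%", 99.0)]:
        hours = downtime_hours(uptime)
        print(f"{label} uptime: about {hours:.2f} hours down per 30 days")
```

Under these assumptions, 99.9% uptime allows roughly 43 minutes of downtime per month, while 99% allows about 7.2 hours, which is a tenfold difference.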
This is only January of 2025. This year is gonna be insane. If you really lock in, if you pay attention, if you utilize tools like Vectal, like Cursor, Browser Use, you're gonna be so ahead of everybody. Oh my God, it's not even funny.
All right, so I enabled Use Own Browser. Let's try again. Hopefully it opens up a new window here. I mean, I can just log in. Oh, is it signed in? No, it's not. No, okay, let's stop it.
All right, so I'm gonna do something quite risky. I'm gonna give it my login details. Username, boom. Password, boom. And to be honest, I'm doing it for you guys, because I wanna show you what this AI agent is capable of. Okay, amazon.com, come on. It now has my login, okay? So it should be able to complete it. Let's see if it's able to do that or not. Be careful when you give it login details, obviously. There is a certain risk with that. And if it purchases the wrong item, I'm gonna have to front the cost. But for the sake of the experiment, I'm willing to do that.
Come on, what's going on? Did I mess it up? I think I distracted it. Actually, I'm gonna put more context in here: "If you need to log in, use the login details below." Login details below. And I'll also say, "You have my full permission to buy the book." Let's do it again.
Come on, Browser Use, I believe in you. Okay, Amazon, let's go. I'm not touching, guys. This is AI. This is AI controlling my computer. We're living in the future, all right? Find the search box, come on. Number two, you got it. Boom, there it is, there it is, okay? See, it's beautiful how it puts everything into bounding boxes and then executes the next action. Okay, so click number 22, just click 22. First book is fine. That's right, that's right. All right, so Buy Now, click on Buy Now. Also, it's pretty fast. I would expect it to be a lot slower than this. Obviously, it's not as fast as me buying an item, but it's pretty good. All right, so let's see. Login, I gave it my login details. Come on, you got it.
Oh, there it is. I may need to do some 2FA authentication. Come on, password. Okay, please, come on. It's logging into my account. There it is. Okay, okay, so Pay Now. Click on Pay Now, okay? And there it is. It just purchased a book. It purchased a book autonomously on my account. Wow, it did everything, from going to Amazon, from going to the URL, to buying the book, and now it's finished. It took less than two minutes, and the task completed successfully. Let me see the SMS confirmation. There it is. "Order placed for The Singularity Is Nearer." Wow. Guys, the singularity really is nearer. This is crazy. We literally have AI agents that can do your, oh my, that can take actions on the web. Wow, this is the future.
That being said, make sure to go to vectal.ai and give it a shot. Really, with unlimited DeepSeek usage, this is one of the best deals you can get. Obviously, you can also add in your notes, you can add in your tasks, and all the AI agents in there see all of your active tasks, they see your user preferences, your goals, and they give you super relevant and actionable stuff. And you can actually use the AI agents to help you rewrite tasks, move them to tomorrow, complete tasks, and all that stuff. So go to vectal.ai and give it a shot. Every paying Vectal member gets unlimited DeepSeek R1 usage. It's on me. I don't care what the token costs are gonna be, it's unlimited. So go to vectal.ai and give it a shot.
With that being said, thank you guys for watching, and have a wonderful, productive rest of the week. See ya.
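The setup steps walked through in the video (clone, install requirements, install Playwright, launch on localhost) can be summarized as a short dry-run script. This is a sketch under stated assumptions, not the project's official installer: the repository URL, the `webui.py` entry point, and port 7788 are assumptions, since the video shows them on screen but never reads them out, so check the project's README for the current values. The function names `setup_steps` and `run_steps` are made up for this example.

```python
import shlex
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/browser-use/web-ui.git"  # assumed URL
PROJECT_DIR = Path("web-ui")  # folder created by git clone


def setup_steps():
    """Return (description, argv, working directory) for each step, in order."""
    return [
        ("clone the repository", ["git", "clone", REPO_URL], Path(".")),
        ("install Python requirements",
         ["pip", "install", "-r", "requirements.txt"], PROJECT_DIR),
        ("install Playwright browsers", ["playwright", "install"], PROJECT_DIR),
        # Assumed launch command: 127.0.0.1 is localhost, 7788 is just a port.
        ("launch the web UI",
         ["python", "webui.py", "--ip", "127.0.0.1", "--port", "7788"], PROJECT_DIR),
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    printed = []
    for description, argv, cwd in steps:
        line = f"[{cwd}] {shlex.join(argv)}  # {description}"
        print(line)
        printed.append(line)
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)
    return printed


if __name__ == "__main__":
    # Dry run by default: nothing is cloned, installed, or launched.
    run_steps(setup_steps(), dry_run=True)
```

If you do run the commands for real, run them from the same activated Python environment. The Playwright error in the video came from a terminal where the conda environment was not active.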