Speaker 1: Yesterday, on January 27th, about a trillion dollars was wiped off the US stock market. Nvidia alone lost more than $500 billion. And the reason was this Chinese AI model called DeepSeek R1. In this video, I'm going to explain what exactly DeepSeek is, why it's such a big deal, and what it means for the future of AI, as well as your own future in case you're planning a career in AI.

Let's first check what the big players have to say about DeepSeek. Nvidia, whose stock was tanking yesterday, calls it an excellent AI advancement. Microsoft's CEO calls it a big win for tech. So there is something real about DeepSeek. Let's see why it's such a big deal.

The number one reason is that it offers low-cost training and inference. It took them about 2,000 of Nvidia's H800 GPUs to train this model in two months, at a cost of roughly $5.5 million. That is way less than what OpenAI spends training its GPT models. Compared to what US companies are spending to train their LLMs, this cost is nothing; it's peanuts.

Not only that, it is fast and accurate if you look at the benchmark performance. It performs at almost the same level as GPT while offering very fast inference. Inference means the time the LLM takes to generate an answer when you ask a question, and that is very fast here. Accuracy is also very good, at par with OpenAI's flagship models, and the training cost is very low.

Jared Friedman, a partner at Y Combinator in Silicon Valley, says this is 45x more efficient. There are a variety of reasons for the efficiency: it uses 8-bit instead of 32-bit floating-point numbers, it has some amazing compression techniques, it can do multi-token prediction instead of single-token prediction, and there is distillation plus a mixture-of-experts design that decomposes a big model into smaller models. Basically, DeepSeek achieved algorithmic efficiency.
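To make the 8-bit-versus-32-bit point concrete, here is a toy sketch of symmetric 8-bit quantization: instead of storing every weight as a 32-bit float, you store small integers plus one shared scale factor. The weight values below are made up purely for illustration; this is the general idea, not DeepSeek's actual implementation.

```python
# Toy symmetric 8-bit quantization of a handful of made-up weights.
weights = [0.42, -1.37, 0.05, 2.91, -0.88]

# One shared step size so the largest weight maps to the int8 extreme (127).
scale = max(abs(w) for w in weights) / 127

# Each weight becomes a small integer in the range -127..127.
quantized = [round(w / scale) for w in weights]

# Multiplying back by the scale gives an approximate reconstruction.
dequantized = [q * scale for q in quantized]

max_err = max(abs(w - d) for w, d in zip(weights, dequantized))
assert all(-127 <= q <= 127 for q in quantized)
assert max_err <= scale / 2 + 1e-12  # rounding error is at most half a step
```

Each value now takes 1 byte instead of 4, a 4x memory saving, at the cost of a small bounded rounding error per weight.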
The belief so far was that you need compute supremacy to build the best AI model, but they broke that myth. DeepSeek's engineers came up with a very fast, optimized way of training so that you don't need so many GPUs, and that is why Nvidia's stock was crashing.

This next comment is from former Intel CEO Pat Gelsinger. He says he has already started using DeepSeek instead of ChatGPT, and that engineering is all about constraints: the Chinese engineers had limited resources, and they had to find creative solutions. If you follow the trade war going on between China and the US, the US imposed bans on exporting high-end GPUs to China because it wants to keep the edge in AI. But the Chinese have proven that wrong. Since their engineers did not have access to the fastest GPUs, they found innovative, creative solutions. And then, as he puts it, open wins: DeepSeek will reset the increasingly closed world of foundational AI model work. So it is an open-source model.

I also posted yesterday on LinkedIn about the Stargate project that the US government is going to kick off, where they are going to invest $500 billion in AI data centers to build a "country of geniuses in a data center." The thinking was: we can have all these GPUs, we can spend all this money, and that way we can keep AI supremacy. But DeepSeek proved that wrong. They achieved the efficiency at the code level, at the software level, not the hardware level. They optimized the algorithms and were able to produce a very high-performance model.

Now, the question that comes to people's minds whenever they think about Chinese AI companies: how do I know my data is safe? Well, if you are using the DeepSeek website, which is sort of similar to ChatGPT, that is a Chinese website, so your data is going to China. But guess what? DeepSeek is open source. So you can see their approach, their research paper, that PDF, everything, right?
If you just Google "DeepSeek research paper," you will find it, and you can look at their approach and everything, okay? See, these are the performance benchmarks. The blue dotted line is DeepSeek, and this gray one is OpenAI's o1, okay? Just look at this comparison. These are various tests, by the way: a math test, a coding test, a variety of tests. And it is performing at par.

So you can read about the approach they took. For example, distillation, which empowers small models, okay? Then there's also quantization. Quantization means using 8-bit numbers instead of 32-bit, like we discussed.

You can access this model from a platform such as Ollama, okay? You can actually download it locally and run the inference. If you have a fear that your data will go to China, do this: download this model, DeepSeek R1, locally on your computer, then turn off your internet. Now you have nothing to fear, right? Your data can't go anywhere, and it still works. So this is an open-source model which you can run locally.

You can also run it on Groq's cloud. Groq is a US company, okay? They're hosting DeepSeek R1 right now as we speak, and you can run your inference there. So if I'm calling Groq, my data is not going to China; my data is going to Groq's data center, Groq's cloud, which belongs to a US company. Aravind Srinivas, the founder of Perplexity, which is also hosting the DeepSeek R1 model, posted about this yesterday on LinkedIn, saying that it is open source and none of your data goes to China.

Now, what does this mean for the future of AI? The number one thing, which is very obvious and which everyone is accepting, is that this is democratizing AI. AI was kind of closed up in the OpenAI lab. I know there were the Llama models, etc., but Llama was not as great as OpenAI's models. DeepSeek, which has very good performance, is democratizing AI. What that means is faster adoption. Why?
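As a sketch of the "run it locally" setup: once Ollama is installed and a DeepSeek R1 model has been pulled, the local server exposes a plain HTTP API on port 11434. This is a minimal sketch using only the standard library; the model tag `deepseek-r1:7b` is an assumption, so check which tag you actually pulled.

```python
import json
import urllib.request

# Ollama's default local generate endpoint (no data leaves your machine).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("deepseek-r1:7b", "Why is the sky blue?")
# To actually run the inference (the Ollama server must be running):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

Because the endpoint is `localhost`, the prompt and the answer stay on your own computer, which is exactly the privacy argument being made here.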
Because now, if you are a small company and you can't afford OpenAI's API, since they charge per thousand tokens and the bill can go up, that's completely okay: you can use DeepSeek instead. You can host it in your own cloud, and in that case you're not paying anyone for inference. You will of course have your cloud bill, but that's not as high. So this is going to reduce the cost of building an AI solution significantly, which means there will be faster adoption worldwide. Say you are a company based in an African country, or pretty much anywhere in the world where you don't have much revenue and you're on a tight budget: you can now adopt AI faster.

In terms of geopolitical impact, China is gaining momentum in the AI race against the US. The US was trying to curb the export of Nvidia GPUs, etc., but China found an intelligent way to get around it, so they are gaining in this AI race. Now, based on your philosophy, your belief system, you might like it or you might not, but this is the fact.

The impact I'm most excited about is environmental. Training all these models was consuming so much energy; power companies were even exploring nuclear energy options because they couldn't meet the demand of these AI workloads. Now, all of a sudden, because of this optimization at the software layer, you don't need that much power. So CO2 emissions are going to go down, and the environmental impact of this innovation will be positive.

Now, what does it mean for your own individual career if you are trying to become an AI engineer or a data scientist? Well, as I said, this is going to result in faster adoption. Companies will build AI solutions faster, which means they will need gen AI developers; they will need AI engineers to build those solutions.
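A quick back-of-the-envelope comparison shows why per-token API billing versus a flat self-hosting bill matters for a small company. Every number below is a made-up assumption for illustration, not a real price quote from any provider.

```python
# Back-of-the-envelope monthly cost comparison (all numbers are assumptions).
tokens_per_month = 50_000_000   # hypothetical monthly token usage
api_price_per_1k = 0.01         # assumed hosted-API price per 1,000 tokens (USD)
self_host_monthly = 300.0       # assumed monthly cloud bill for a GPU instance

# Per-token billing scales with usage; self-hosting is roughly flat.
api_cost = tokens_per_month / 1_000 * api_price_per_1k

print(f"Hosted API: ${api_cost:,.2f}/month")
print(f"Self-host:  ${self_host_monthly:,.2f}/month")
```

The point is the shape of the curves, not the exact figures: the API bill grows linearly with tokens, while a self-hosted open model caps out at the cloud bill.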
OK, we will closely watch all these advancements as they happen. I will share my thoughts via YouTube, LinkedIn, etc. Stay tuned. Have a nice day.