Speaker 1: The release of DeepSeek AI from a Chinese company should be a wake-up call for our industries that we need to be laser focused on competing to win, because we have the greatest scientists in the world. This is very unusual, when you hear about DeepSeek, when you hear somebody come up with something. We always have the ideas, we're always first, so I would say that's a positive. That could be very much a positive development. Instead of spending billions and billions, you'll spend less and you'll come up with, hopefully, the same solution.
Speaker 2: DeepSeek is currently the top trending app on the Apple App Store, and it just surpassed ChatGPT, and this happened in a couple of days. NVIDIA lost almost 17% of its market cap thanks to the DeepSeek release, but you can run DeepSeek locally thanks to the folks at Unsloth AI, and there were a couple of new model releases, including the Qwen 2.5 vision language model, which is again open source and state-of-the-art. But I guess the biggest news today was the release of the Janus series from DeepSeek, which is a new series of models that have image understanding and image generation capabilities. Now we're going to look at all of these and a lot more, but first let's start off with this post from the CEO of Hugging Face: "If you had told me a few years ago that a model released on Hugging Face could tank Wall Street and get mentioned by the US president, I probably would not have believed you. What a world we live in." That's right, because today was a bloodbath for the tech sector on the US stock market. In fact, NVIDIA lost almost 17% of its market cap, and I had to ask DeepSeek to tell me what happened today and what the reason was. It said NVIDIA experienced a historic market capitalization loss of almost 600 billion dollars, marking the largest single-day value decline of any US company in history. Now, this is substantial, and it was supposedly triggered by concerns over the Chinese AI startup DeepSeek, which released a low-cost, open-source large language model comparable to US counterparts like ChatGPT. If you haven't been keeping track, the model was trained for almost six million dollars, which is supposedly almost 50 times less than what OpenAI and other model creators are spending. But I think the biggest news is that DeepSeek actually went mainstream today. If you look at Google Trends and compare DeepSeek with other model providers like ChatGPT, this is the first time that DeepSeek is actually surpassing them in search volume. 
Now this does bring some concerns, because DeepSeek is a Chinese company and the servers are running in China. So there are takes like this: "Americans sure love giving their data away to the CCP in exchange for free stuff." This is a tweet from Steven, who is an employee of OpenAI. The community notes pointed out that DeepSeek can be run locally without any internet connection, unlike OpenAI's models, which is absolutely correct. Now, you need enough hardware to run the full R1, but there are distilled versions available. For example, Groq is currently hosting the DeepSeek 70-billion distilled version, and this is probably the fastest inference speed that you can get from any model that is capable of reasoning. If you want to run the full model through an API and you don't want to use the DeepSeek API, there are options like Together AI, who are also serving DeepSeek R1, although the price of this API is, I think, almost five times higher than what DeepSeek is charging. There are other API options as well; for example, Hyperbolic is another good one running DeepSeek, but keep in mind they are serving it at a lower quantization, so you probably are not getting the same level of performance. However, if you have a couple of H100s lying around somewhere, or even a couple of Mac Ultras, you can now run this locally thanks to the folks at Unsloth. They are doing a really amazing job: they were able to quantize it down to 1.58 bits, and there is a GGUF available, so you can now run it at 1.58 bits while it remains fully functional. They shrank it by almost 80%, so instead of 720 gigabytes you can now run this in about 130 gigabytes. The way they approached it is that instead of naively quantizing all the layers, which breaks the model entirely and causes endless loops and gibberish output, they used dynamic quantization. So you can still run this on two H100s or, as I said, a couple of Mac Ultras. 
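For reference, the rough workflow for trying the 1.58-bit GGUF locally looks something like the sketch below. Treat it as a sketch, not a verified recipe: the exact repository and shard file names on Hugging Face are assumptions and should be checked against Unsloth's model page, and you still need on the order of 140 GB of combined RAM/VRAM for usable speeds.

```shell
# Sketch only: repo/file names are assumptions; confirm the exact 1.58-bit
# GGUF shard names on Unsloth's Hugging Face page before downloading.

# 1. Build llama.cpp, the runtime that loads GGUF files.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && cmake -B build && cmake --build build --config Release

# 2. Download the dynamic 1.58-bit quant (~130 GB, split into shards).
pip install huggingface_hub
huggingface-cli download unsloth/DeepSeek-R1-GGUF \
  --include "*UD-IQ1_S*" --local-dir models

# 3. Run it, offloading as many layers to the GPUs as they can hold.
./build/bin/llama-cli \
  --model models/DeepSeek-R1-UD-IQ1_S/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
  --n-gpu-layers 30 --ctx-size 4096 \
  --prompt "Why is the sky blue?"
```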
On H100s you are going to get about 140 tokens per second, which is pretty amazing. Here is an example output that you can expect: this is from the original 8-bit quantization, and this is the output for the same prompt from the 1.58-bit dynamic quantization. The results seem to be pretty close, and since it's a huge model, quantization usually has a lot less impact on bigger models compared to smaller ones. The effect of quantization is normally more prominent on MoEs, or mixture-of-experts models, but this dynamic quantization seems to solve that problem, which is pretty amazing news for developers. There are some other providers in the US hosting DeepSeek as well; for example, Perplexity started hosting DeepSeek R1, so you can access it there. Now, over the last few days I have seen some really terrible takes on DeepSeek R1, especially when it comes to data privacy. Every company that gives you a product for free is collecting your data, but the beauty of DeepSeek is that it's completely open source, so you can run it on your own hardware if you want to and have the resources. I think the only reason DeepSeek got so much attention, not only from the tech community but from the public in general, is that it's open source. Otherwise, there are other Chinese companies, like Kimi, who also released Kimi k1.5, which has almost the same level of performance and is free, but nobody is actually using it. Even Sam Altman acknowledged this. In a tweet he said: "DeepSeek's R1 is an impressive model, particularly around what they're able to deliver for the price. We will obviously deliver much better models, and also it's legit invigorating to have a new competitor. We will pull up some releases." So thanks to R1, we are probably going to see some new models from OpenAI pretty soon. 
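To make the "naive versus dynamic" distinction concrete, here is a minimal, illustrative sketch in plain Python. This is not Unsloth's actual code; the weights and bit-widths are made up. The point is only that uniform quantization error grows quickly as the bit-width shrinks, which is why a dynamic scheme keeps sensitive layers (for example attention and MoE router weights) at higher precision while pushing the bulk of the expert weights down toward 1.58 bits.

```python
# Illustrative sketch of why mixed-precision ("dynamic") quantization helps.
# All values here are made up; real schemes like Unsloth's operate per-layer
# on actual model tensors and pick bit-widths based on layer sensitivity.

def quantize(weights, bits):
    """Uniform symmetric quantization of a list of floats to `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # number of positive levels
    scale = max(abs(w) for w in weights) / levels or 1.0
    return [round(w / scale) * scale for w in weights]

def quant_error(weights, bits):
    """Mean squared reconstruction error after quantizing to `bits` bits."""
    q = quantize(weights, bits)
    return sum((w - x) ** 2 for w, x in zip(weights, q)) / len(weights)

weights = [0.9, -0.4, 0.05, 0.7, -0.85, 0.3]
# Error grows as bits shrink, so a dynamic scheme would keep a sensitive
# layer at, say, 4-6 bits and only push robust layers down to ~2 bits.
assert quant_error(weights, 8) < quant_error(weights, 4) < quant_error(weights, 2)
```

Quantizing every layer to the lowest bit-width maximizes the error everywhere at once, which matches the "endless loops and gibberish" failure mode described above; spending a few extra bits on the fragile layers avoids it at a small size cost.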
Now he goes on to say: "but mostly we are excited to continue to execute on our research roadmap and believe more compute is more important now than ever before to succeed at our mission." And this, I think, is directly trying to address the sell-off in NVIDIA. He also said: "the world is going to want to use a lot of AI, and really be quite amazed by the next generation models coming. Look forward to bringing you all AGI and beyond." But the funniest thing, I think, was that the CEO of DeepSeek just retweeted "good luck." NVIDIA also released a very interesting statement, and it goes beyond just the need for GPUs. It says: "DeepSeek is an excellent AI advancement and a perfect example of test-time scaling. DeepSeek's work illustrates how new models can be created using that technique, leveraging widely available models and compute that is fully export control compliant." The reason for this specific part seems to be the news that DeepSeek and some other Chinese companies supposedly have access to H100s, which they are not supposed to have; there is a lot of speculation about that. The statement goes on to say that inference requires significant numbers of NVIDIA GPUs and high-performance networking. Now, DeepSeek was able to optimize their H800 GPUs: they had to write optimized code for low-level control of the hardware, which was pretty interesting. "We now have three scaling laws: pre-training and post-training, which continue, and the new test-time scaling." The idea here is that you will need a lot of compute not only for training these models but also at inference, which I hope is the case, because I also bought some NVIDIA stock today. Not financial advice, but I do think we will need a lot more compute moving forward, and this is hopefully just a dip at this point. Hopefully the recovery of NVIDIA is going to be pretty strong and I'm not going to lose any money. 
I also want to look at a couple of other things in this video, and one of them is the Janus series of models from DeepSeek. These are another set of open-source models, and the biggest one is Janus Pro 7B, which again is state-of-the-art in its weight class. Even Marc Benioff, the CEO of Salesforce, tweeted about it: "DeepSeek now introducing Janus Pro 7B for image generation, and DeepSeek had already claimed the number one spot on the App Store, surpassing ChatGPT. No need for NVIDIA supercomputers or 100-million-dollar budgets." That's a statement I probably don't agree with in the long term. He goes on to say that the true treasure of AI is not the UI or the model; those are becoming commodities. The real value lies in data and metadata, the oxygen fueling AI's potential, the future's fortune. Data is the new oil; everybody is probably aware of that by now. It's all in our data and not in our models. Now, I don't personally agree with that part. Data is critical, but not everybody's data: it has to be very high-quality data or high-quality synthetic data. The model itself is available on Hugging Face, and it seems to be pretty great at image generation, especially Janus Pro 7B. The difference in results between the normal Janus and the 7B Pro is substantial. There are a number of demos available on Hugging Face; for example, here's a demo of the visual understanding capability of the model. When you pass in this image, here's the response that it generates. But most of them are pretty busy at the moment, because a lot of people are trying out not only DeepSeek R1 but these newly released Janus Pro models as well. There was also another model released from Qwen which is not getting enough traction because of everything that is going on with DeepSeek: the Qwen 2.5 VL vision language model, at 72 billion parameters. 
Like DeepSeek, Qwen also has their own chat platform where you can use this new vision language model along with some of their other models, including the 1-million-token context window model. There are a couple of features that are not available on some of the other platforms, including image generation. So you can ask it to generate an image if you enable this; for example, you can ask it to create an image of a llama wearing sunglasses, and it will be able to do that for you. I'm not sure which model it's using in the background for generating images, but they also have the capability to generate videos as well. Here is a video of a cat playing with a ball of yarn. It looks pretty great; I would say the quality is on par with Sora Turbo from OpenAI, but you can use this for free. Now, as with any free platform on the internet, use it at your own risk. If you don't want to share your data with Chinese companies, stay away from it, but if you're not doing anything critical and just want to have fun or explore different models, I'd highly recommend testing it out. Anyway, this was a very interesting day, and since everybody is trying to use DeepSeek now, it's extremely slow. I was really enjoying it for the last few days, but now, because of all the traffic that is coming in, they are not able to keep up with the demand, and I don't think they have enough GPU resources to support all the traffic. The best part that I personally liked about DeepSeek was the internal thought process: I was really interested in reading the internal chain of thought rather than the final output it was generating. Hopefully they will be able to bring it back online when they have enough resources. So this was a quick roundup of everything that happened today in the world of DeepSeek. Let me know if you like videos like these; I'll start creating these quick roundups, but if you are just interested in technical content or tutorials, more of those are coming pretty soon. 
Anyway, I hope you found this video useful. Thanks for watching, and as always, see you in the next one.