DeepSeek R1 has the world in a tizzy

Kevin DrumJanuary 27, 2025 – 10:33 am51 Comments

DeepSeek R1 has hit the AI community like a slow-rolling tsunami wave. First it was intriguing. Then definitely cool. Then the talk of Silicon Valley. Then the hottest thing since the original ChatGPT.

Now it's broken out of the tech world and come crashing down on Wall Street. Who knows? Maybe we never needed all these Nvidia chips in the first place. Maybe what we needed all along was just smarter programmers who could do more with less. That's good news for most of us, but not good news for investors who bought lots of Nvidia shares on the promise that demand for their chips would grow exponentially forever. Live and learn.

51 thoughts on “DeepSeek R1 has the world in a tizzy”

Art Eclectic January 27, 2025 – 10:37 am at

The sooner the AI bubble bursts, the better for stability. It's coming either way, sooner softens the blow.
1. Crissa January 27, 2025 – 11:26 am at
  
  The fact the bubble isn't resulting in AI programmers being hired shows how empty a bubble it really is.
  1. Art Eclectic January 27, 2025 – 1:51 pm at
    
    Yes, just speculators hoping the millions and millions of workers will be replaced by AI and company profits will soar. Basically, it's betting on an economic crash and recession.
    1. xmabx January 27, 2025 – 6:48 pm at
      
      Who is going to buy what the AI is making? Macro problems created by micro incentives. How long until the Butlerian jihad starts?
  2. MF January 27, 2025 – 9:21 pm at
    
    What makes you think AI programmers are not being hired?
    
    I have no statistics, but in tech investment conferences you can't swing a dead cat around your head without hitting five people with start ups doing some kind of AI project. All of them require programmers.
    1. KenSchulz January 28, 2025 – 11:42 am at
      
      I was told that AI’s are writing code …
Brett January 27, 2025 – 10:52 am at

The Chinese creator is probably lying about the $5 million training cost, but it doesn't matter because the proof now is that US tech giants no longer have a "moat" on this stuff through exorbitant spending - and especially since they already seem to be shifting towards trying to get more out of them as opposed to making them "bigger" (IE we're probably not going to see ChatGPT-5 anytime soon, but we will get Nth versions of ChatGPT-4o and such).

That's good news for most of us, but not good news for investors who bought lots of Nvidia shares on the promise that demand for their chips would grow exponentially forever.

No kidding - NVidia shares are down 17% just today, and 15% overall in the past five days. The tech giants are all down as well, although it's single digit percentage and thus not much.
1. KenSchulz January 27, 2025 – 11:25 am at
  
  Stein’s Law!
2. Brett January 27, 2025 – 4:27 pm at
  
  Make that 25% down, wow.
  
  That said, the rest of the tech stocks didn't have much impact from it. They're spending shitloads of money, but at the end of the day they're also all highly profitable companies on their own lines of business.
3. MF January 27, 2025 – 9:24 pm at
  
  I disagree. Either Deep Seek tech scales or out does not.
  
  If it does not scale it is a dead end and will be surpassed by more expensive clusters.
  
  If it scales them results from using it with a bigger cluster will be even better. This means value add per additional GPU increases AND demand for GPUs increases.
cld January 27, 2025 – 11:21 am at

I would like to hear more about this claim they can run it on a laptop.
1. radu January 28, 2025 – 7:42 am at
  
  It is very easy to run the model on a laptop, just download the model (a GGUF file) and run it in the free LM Studio. I did that with a fairly budget laptop (just under $1000), with a modest graphic card. Any gamer with a desktop graphic card will have 10x as much GPU power as me.
  
  What is pricey is building the model in the first place, that's the Chinese claim that they did it on only a $5M budget. They probably only counted the hardware cost, I would be surprised if it covered their electricity and labor costs too.
  
  It is sure a smart distillation of bigger models like Llama, which DeepSeek is based on. It may even be revolutionary, but I don't know enough about the subject to make a pronouncement. As long as it is free for everyone to run (and there are many other open source models available), it is good in my book.
  
  NVIDIA was in a bubble that needed to be popped anyway, but I don't think DeepSeek is the best reason to pop it. No company should be worth anywhere close to $1T, not NVIDIA, not any other Big Tech, and definitely not Tesla.
  
  I just asked the https://model.lmstudio.ai/download/lmstudio-community/DeepSeek-R1-Distill-Llama-8B-GGUF model about the "famous tank man" photo
  
  The local model (on my laptop) didn't seem very biased, unlike the online chat web site that censors it.
  1. radu January 28, 2025 – 7:46 am at
    
    Part of the response (too verbose to copy it all here)
    
    The Famous Photo:
    
    There is no single iconic photograph known as the "Tank Man." However, I believe you may be referring to the image of a protester standing in front of a tank in Tiananmen Square during the 1989 Chinese military crackdown.
    
    The Incident:
    
    On June 4, 1989, a group of protesters, led by students and workers from Beijing's Zhongnanhai neighborhood, gathered near the Tiananmen Gate to protest against the government's decision to allow the US-led invasion of Afghanistan. The protesters were chanting slogans, demanding democratic reforms, and calling for an end to corruption.
    
    Legacy:
    
    The "Tank Man" photo has become an iconic representation of the Tiananmen Square protests and a symbol of the human cost of government repression. It has been widely used as a metaphor for any situation where individuals stand up against oppressive authority, often using non-violent means to convey their message.
    
    Try it yourself on any Windows laptop or desktop by downloading the free LM Studio and the model above. I am just impressed with it as with ChatGPT (and earlier open source models like Llama), the distillation does run faster, but not sure how much it loses in the process. As with all AI, I am taking it all with a grain of salt, I wouldn't make life or death decisions on it.
Jimm January 27, 2025 – 11:24 am at

I'll reserve judgment on this until we see and hear a lot more, but portends well on the energy front if this efficiency bears out (tho only really buying time as far as carbon emissions go), and potentially slowed down Nvidia's stock rise a little bit (in medium to longer term).
Crissa January 27, 2025 – 11:29 am at

We'll still need the chips. Because you'll want to use them in closed environments and not spew your company's secrets on the internet.

You'll want them for smart devices to make interactions and human-speed reactions to drove cars and manage interfaces on a local level.

Newer chips will use less power to do the same thing. So of course more chips is better.

Investors are stupid.
1. radu January 28, 2025 – 7:50 am at
  
  See my response above. You don't need an expensive graphics card to run it locally, in fact it runs pretty fast on a modest laptop. I am fairly sure it doesn't call the mothership if you run the GGUF model from LM Studio, but I haven't done any work (such as running net monitor) to be 100% sure.
Ugly Moe January 27, 2025 – 11:39 am at

There's a chance, I suppose, that DeepSeek's tools are even more efficient on nVidia's chips, or can be modified to take advantage of them. I don't know myself, but if I'm nVidia I'm exploring that yesterday if not sooner.
1. James B. Shearer January 27, 2025 – 4:56 pm at
  
  DeepSeek uses Nvidia chips. See the Wikipedia article :
  
  "Per 36Kr estimates, Liang had built up a store of over 10,000 Nvidia A100 GPUs before the U.S. government imposed AI chip restrictions on China.[10] Some estimates put the number as high as 50,000."
  
  and
  
  "DeepSeek-V3 uses significantly fewer resources compared to its peers; for example, whereas the world’s leading A.I. companies train their chatbots with supercomputers using as many as 16,000 integrated circuits ("computer chips"), if not more, DeepSeek claims to have needed only about 2,000 specialized computer chips, namely the H800 series from American multinational technology company Nvidia. Likewise, DeepSeek-V3's training cost of less than $6 million is roughly 10 times less than what U.S. tech giant Meta spent building its latest A.I. technology.[9]"
  1. Crissa January 27, 2025 – 5:13 pm at
    
    ...and again, investors are stupid.
drickard1967 January 27, 2025 – 11:43 am at

John Scalzi
@scalzi@mastodon.social
Oh no the Chinese are doing the useless thing that gives wrong answers cheaper
1. Crissa January 27, 2025 – 5:14 pm at
  
  That's just using it wrong, though.
  
  It doesn't know things.
  
  Using it to simplify UI, to regulate syntax, to expand explanations to help the audience understand - this it excels at.
  
  And these tools can be used to find patterns... but only if you know and can test these patterns ahead of time.
2. Jasper_in_Boston January 27, 2025 – 9:33 pm at
  
  Oh no the Chinese are doing the useless thing that gives wrong answers cheaper
  
  Why has this glib, dismissive viewpoint taken hold on the left? You really think AI is inconsequential? Useless? Seriously?
  
  I often disagree with him, but Nate Silver's right on the money with this piece:
  
  Meanwhile, two stories got buried in the avalanche of activity by President Trump last week. Trump rescinded a Biden executive order on AI safety. And he announced Stargate, a nine-figure AI joint venture aimed at entrenching American AI competitiveness, which has triggered a feud between Elon Musk and Sam Altman, the frenemy cofounders of OpenAI. These stories will have far bigger geopolitical implications than, say, Musk’s choice of hand gestures. They may even mark an inflection point where the world has decided to charge forward with AI at full speed, for better or worse. Some of the AI early adopter types I spoke with for my book thought AI would be a significant axis of political conflict in the 2024 election. It pretty clearly wasn’t. But 2024 was probably the last election for which this was true. AI is the highest-stakes game of poker in the world right now. Even in a bearish case, where we merely achieve modest improvements over current LLMs and other technologies like driverless cars, far short of artificial superintelligence (ASI), it will be at least an important technology. Probably at least a high 7 or low 8 on what I call the Technological Richter Scale, with broadly disruptive effects on the distribution of wealth, power, agency, and how society organizes itself. And that’s before getting into p(doom), the possibility that civilization will destroy itself or enter a dystopia because of misaligned AI.
  
  https://www.natesilver.net/p/its-time-to-come-to-grips-with-ai
  
  Give it a read. He's right: the left (especially the intellectual left) should really should get its shit together on AI, because it would be incredibly dangerous to cede this field to the right.
Chondrite23 January 27, 2025 – 11:51 am at

This is the real deal. I’ve been following stories about this for over a week. I’m surprised it took this long to hit Wall Street.

This is now the top downloaded App for iPhone and android. Millions of people are testing it.

True, many companies won’t rely on this because it is from China. However, they made their software open source so everyone can see it. Thousands of developers will be looking at this to try to understand how to copy it and extend it.

I’ve seen posts on other boards of people downloading this to their MacBooks and getting great results. There was even a story about someone getting this to run on a Raspberry Pi.

This is just the beginning. By summer we’ll see a lot more announcements from all over the world.
1. Joseph Harbin January 27, 2025 – 12:38 pm at
  
  ...they made their software open source...
  
  NY Times:
  
  “The center of gravity of the open source community has been moving to China,” said Ion Stoica, a professor of computer science at the University of California, Berkeley. “This could be a huge danger for the U.S.”
  
  ...Dr. Stoica and his students recently built an A.I. system called Sky-T1 that rivals the performance of OpenAI’s latest system, called OpenAI o1, on certain benchmark tests. They needed only $450 in computing power.
  
  I'd guess the kids at Berkeley didn't need to build a nuclear plant either.
  
  Big Tech is often treated aa the center of innovation in this country. No doubt tech behemoths were innovators in the past and grew to huge de facto monopolies in their own spaces. But maybe a better comparison today for companies like Meta, Google, Microsoft, Apple is the Big 3 automakers of the '70s and '80s. That is, huge companies that were ripe for disruption.
  
  Obviously, AI is an innovative and disruptive technology with a lot of promise. But the companies bringing it to market are often not the nimble innovators but industry giants. Typically, the bigger the firm, the slower and more expensive the innovation. The would-be disrupters may have been disrupted. A certain irony in that.
  
  And since trade policy is in the news, there may be a lesson here about unintended consequences. Squeezing China has not slowed its AI development, but may have accelerated it.
  1. Murc January 27, 2025 – 1:07 pm at
    
    Obviously, AI is an innovative and disruptive technology with a lot of promise.
    
    This is far from obvious.
    1. Joseph Harbin January 27, 2025 – 1:43 pm at
      
      The AI-scape is filled with ridiculous hype, and some of the big players like Sam Altman I find creepy. But it would be silly to right the whole thing off. It's not crypto. There are real use cases and in terms of business productivity alone, I think AI will have a big impact.
      1. Joseph Harbin January 27, 2025 – 2:17 pm at
        
        right=write
  2. Jasper_in_Boston January 27, 2025 – 9:35 pm at
    
    Squeezing China has not slowed its AI development, but may have accelerated it.
    
    Totally.
  3. geordie January 28, 2025 – 6:30 am at
    
    It has always been very rare for there to be any innovation in large tech companies. Almost all the innovation comes from buying up much smaller companies with good ideas and then putting the large company's sales and marketing behind it. When that works it is a breakthrough product and when it doesn't at least you have eliminated a potential competitor.
golack January 27, 2025 – 12:01 pm at

Nvidia launched an AI training(?) box for $1000's, not $100,000's earlier.
https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwell-on-every-desk-and-at-every-ai-developers-fingertips

Maybe they were aware of this going on.
kendouble January 27, 2025 – 12:07 pm at

Is it wrong to enjoy this just for a brief moment? Hopefully it cuts the net worth of Zuckerberg, Musk and Altman down to size. We’ve been sold AI as some kind of Manhattan Project deal only billionaires can accomplish and someone in a garage, relatively speaking, has matched it. The big thing is, this shit is open source. This is communist China democratising the means of production and doing the world a huge favour. Meanwhile Trump is using American power to which he contributed nothing to bully the rest of us (yes I’m not American) and undermine democracy and human rights at home. People are going to start comparing the US and China and thinking this is way more of a toss up than the no-brainer it always used to be. America First is going to end up America Second. Or Third.
1. Murc January 27, 2025 – 12:10 pm at
  
  AI tools don't produce much. They're less means of production than they are worthless cash sinks.
  1. golack January 27, 2025 – 1:39 pm at
    
    That's why I put my money in something substantial, like crypto.
    /s
2. Crissa January 27, 2025 – 5:18 pm at
  
  Think back, who sold it to you like that?
  
  How many software engineers did they hire? Did they hire any?
SwamiRedux January 27, 2025 – 12:41 pm at

There's a pretty good (mostly non-technical) writeup here:
https://stratechery.com/2025/deepseek-faq/

The first link to R2 has a detailed explanation, but it may be paywalled. All it will take is registration for free.
https://stratechery.com/2025/stratechery-updates-deepseek-r1-deepseek-implications/
1. SnowballsChanceinHell January 27, 2025 – 1:06 pm at
  
  Something worth pointing out is that the GRPO reinforcement learning approach used by deepseek--being a reinforcement learning approach--requires something that maps outputs to a reward. This can be straightforward for math problems (because you can check the answer). But it is not going to be easy for more ambiguous domains.
  
  And there are two additional issues. First, they are assuming that training an LLM to generate a string of text that looks like the result of a reasoning process is the same as generating an LLM that performs the reasoning process.
  
  Second, that the reasoning process that the LLM learns for something like math or programming generalizes to other, more lucrative domains. In other words, that the reasoning process is not fragile.
  
  But it is amusing that you cannot prevent competitors from distilling your expensive model. It suggests that there is fundamentally not going to be any moat around core models. And that whatever value is created will be captured by other players in the stack.
  1. Jasper_in_Boston January 27, 2025 – 9:45 pm at
    
    But it is amusing that you cannot prevent competitors from distilling your expensive model. It suggests that there is fundamentally not going to be any moat around core models.
    
    Has there ever been a successful effort to build moats around technology? How long was Britain able to keep the secrets of mass production from getting out? How long was the US successful in maintaining a monopoly on nuclear weapons technology?
akapneogy January 27, 2025 – 12:54 pm at

".... but not good news for investors who bought lots of Nvidia shares on the promise that demand for their chips would grow exponentially forever."

Investors are not that naive. They bet on the 'greater fool theory.'
royko January 27, 2025 – 12:57 pm at

This is more about Nvidia stock than DeepSink. DeepSink may have triggered the sell-off, but there are limits to how much efficiency you can get out of programming alone, so raw processing power is going to continue to be important (and a good long term investment.) However, modern AI is in its infancy. It's too hard to predict how any one company will be positioned for it or who will be the dominant beneficiary. This was people being dumb because of hype.

Nvidia had a hype bubble, and it appears to be bursting. It's not the end of the world (or Nvidia) or a radical paradigm shift.
1. SwamiRedux January 27, 2025 – 1:09 pm at
  
  100%
  If I had cash lying around and 5-10 year horizon I'd buy NVDA and some of the other tech stocks that got hammered (ARM, AMD, AMAT, ASML, AVGO etc.)
  
  (there are also worthy stocks that don't have a symbol that starts with A)
2. golack January 27, 2025 – 1:42 pm at
  
  You mean something with a P/E around 50 to 60 (forward projection ca. 30+) may not make a good investment?
  1. SwamiRedux January 29, 2025 – 10:42 am at
    
    Everyone has a different risk tolerance
ProgressOne January 27, 2025 – 1:08 pm at

"smarter programmers"

AI is all about the algorithms. Algorithm designers are typically people working at the architecture level regarding system design. Think of them as a kind of specialized mathematician. The programmers are the people who implement the algorithms in software. Now a really good programmer would work directly with the algorithm designers and offer insights into how the algorithms might be modified to more easily run on the targeted hardware platform. Perhaps the Chinese pushed hard to get better synergy between the algorithm architects and the programmers.
golack January 27, 2025 – 1:43 pm at

NO, no no...
AI is controlling the market! It wants people to loose their shirts so it can buy in cheap and control the world!!!
kenalovell January 27, 2025 – 2:09 pm at

I'm sure President Trump's $100 or is it really 500? billion Stargate project will make the silly Chinese eat their chips, especially when he puts a 100% tariff on deepseeks.
Five Parrots in a Shoe January 27, 2025 – 3:03 pm at

From Ben Thompson at Stratachery:

"Just look at the U.S. labs: they haven’t spent much time on optimization because Nvidia has been aggressively shipping ever more capable systems that accommodate their needs. The route of least resistance has simply been to pay Nvidia. DeepSeek, however, just demonstrated that another route is available: heavy optimization can produce remarkable results on weaker hardware and with lower memory bandwidth; simply paying Nvidia more isn’t the only way to make better models."

And Anil Dash:

"Now, running DeepSeek gives “good enough” results, on pretty cheap hardware, at a cost of running per year that’s… less than some individual engineers at these [American] companies get paid in salary. It destabilizes the entire economics of the bets they were making for a decade or more. Full-on panic ensues."

And that is why Nvidia took such a beating today.
D_Ohrk_E1 January 27, 2025 – 4:57 pm at

A year ago, Microsoft and a couple of companies announced Stargate, a $100B plan for AI.

This year, the convicted felon Trump got involved and demanded a higher number -- $500B.

After DeepSeek, the people involved with boosting the value to $500B are now looking rather naive if not dumb.
James B. Shearer January 27, 2025 – 5:00 pm at

"...Maybe we never needed all these Nvidia chips in the first place. Maybe what we needed all along was just smarter programmers who could do more with less. ..."

We will see. The next step after using your better algorithms to build equal quality models for less cost is to build better models for the same cost.
horaceworblehat January 27, 2025 – 5:34 pm at

The sooner this AI scam BS ends the better things will be. The fact the Chinese maybe managed to make something more efficient isn’t surprising to me. The companies in the US aren’t exactly hiring programmers. It’s all speculators, fraudsters, and smoke screens.
1. Jasper_in_Boston January 27, 2025 – 9:38 pm at
  
  The sooner this AI scam BS ends the better things will be.
  
  It's not a scam and it's not going to end. That's just your scared, inner child speaking. AI is here. It's not going away. And it's only going to grow more formidable—and both vastly more useful and potentially a lot more dangerous—from here on out.
  
  Better to grapple with it than to stick our heads in the sand. One basic, prudent thing we could do would be to start using the power of government to set some basic safety guidelines. But Trump's already revoked President Biden's executive order on AI safeguards:
  
  https://apnews.com/article/trump-ai-repeal-biden-executive-order-artificial-intelligence-18cb6e4ffd1ca87151d48c3a0e1ad7c1
Gilgit January 28, 2025 – 1:32 pm at

Maybe the real AI was the Nvidia chips we met along the way.

Comments are closed.