We have a new AI champion: 深度求索.
Yes, it's a Chinese AI with the English name DeepSeek. And it's sort of the new champion, but it gets complicated. Overall, their R1 model, released yesterday, appears to be about as good as the latest GPT-4 model, or maybe a hair worse. But it's got a lot of other things going for it:
- It's open source. Completely open source, model weights and all.
- It's trained via reinforcement learning.
- It's far cheaper to run than GPT-4.
- Smaller "distilled" versions can run on a laptop and perform pretty well.
I've used DeepSeek a little bit and it seems to work pretty well—though I haven't pushed it very hard. Here are the benchmarks they've run:
Is this really the next great advance, or is it typical AI hype? I don't know, but Sam Altman seems to be afraid of it, and that must mean something. On the more noxious side, you have to be a little careful about your subject matter:
I suppose there's always Wikipedia for stuff the CCP is sensitive about. In any case, stay tuned. There are a ton of good AI models available these days, and in the end DeepSeek might just be part of the pack. But maybe not.