You’ve no doubt heard about the major AI story dominating global news coverage this week — DeepSeek R1.
From all accounts, it seems there’s a new Chinese AI model built for a total cost of $16.95 that’s as good as OpenAI’s trillion-dollar models even though it was put together by teenagers who tied six Intel Pentium processors together, powered them with a potato battery, and told it to refuse to answer questions about Tiananmen Square.
As a result of this tall tale — which relates to a genuinely impressive achievement despite the exaggerations — investors rushed to sell overvalued US AI stocks along with every token in my entire portfolio of unrelated cryptocurrencies.
You’ve probably read a million articles about it already, so here’s a collection of the more interesting tidbits about DeepSeek we’ve come across:
1. DeepSeek’s costs are misunderstood
Whatever DeepSeek cost, it’s widely agreed it was a lot more than the $5.6 million training cost for v3 that the media keeps highlighting. (R1 refers to the reasoning version that was built atop v3).
It also emerged in recent days that training costs for US AI companies are considerably less than previously believed. Anthropic’s CEO Dario Amodei said in a blog post: “DeepSeek does not ‘do for $6M what cost US AI companies billions.’ I can only speak for Anthropic but Claude 3.5 Sonnet is a midsized model that cost a few $10Ms to train.”
He says the real news story should be that “DeepSeek produced a model close to the performance of US models 7-10 months older, for a good deal less cost (but not anywhere near the ratios suggested).”
It does appear, however, that DeepSeek spent almost nothing on cybersecurity: security researchers from Wiz found more than 1 million of its records, including user data, prompt submissions and API keys, in an open database on the web.
2. DeepSeek likely bought $500M of high-end chips
While the v3 model that got everyone excited used just 2,048 of Nvidia’s less powerful H800 graphics cards, DeepSeek reportedly amassed a huge amount of high-end AI chips before the US got serious about export controls. (And 2,048 H800s cost $50M to $100M anyway.)
SemiAnalysis claims DeepSeek has bought half a billion dollars' worth of high-end GPUs over the company's history. “While their training run was very efficient, it required significant experimentation and testing to work,” the research firm said. Amodei also notes rumors that DeepSeek has 50,000 of the more powerful Hopper chips (H100 and H200), which would be worth up to a billion dollars. The US has now banned these chips from being exported to China.
3. DeepSeek may be ‘distilled’
Microsoft and OpenAI claim to have found evidence that DeepSeek used model distillation to develop R1 by training the smaller model on the output of OpenAI’s larger models. This cuts costs substantially by piggybacking on OpenAI’s time-consuming and labor-intensive work.
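To make the distillation claim concrete, here's a minimal sketch of the standard technique: the smaller "student" model is trained to match the larger "teacher" model's output probabilities by minimizing the KL divergence between them. (The function names and numbers are illustrative, not DeepSeek's or OpenAI's actual code.)

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the
    student's -- the quantity a distilled model is trained to minimize."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher = [2.0, 0.5, -1.0]
# A student that matches the teacher exactly has zero loss...
print(distillation_loss(teacher, teacher))           # 0.0
# ...while a mismatched student gets a positive loss to minimize.
print(distillation_loss(teacher, [0.5, 2.0, -1.0]))  # > 0
```

Rather than generating and labeling fresh training data, the student simply imitates the teacher, which is why it's so much cheaper.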
AI and crypto czar David Sacks claimed: “There’s substantial evidence that what DeepSeek did here is they distilled knowledge out of OpenAI models and I don’t think OpenAI is very happy about this.” Outspoken AI critic and filmmaker Justine Bateman summed up the general reaction to OpenAI’s claims when she said:
“I LOVE the irony. All the American #AI models are wholly composed of work from writers, artists, social media users, etc that was stolen outright. And now they’re crying that someone took what they stole? BAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHA. Suck it.”
4. DeepSeek is not “AI’s Sputnik moment”
It’s more like the Russians launched a cheaper satellite into space three years after the Americans did and then posted the blueprints online. In crypto-aligned style, DeepSeek — basically a bunch of fintech nerds — open-sourced all of their techniques, which allows OpenAI, Meta and a bunch of smaller companies to slash their costs too by adopting them.
This makes it slightly less likely — but still very likely — that centralized tech monopolies will control AI. Groq CEO Jonathan Ross said DeepSeek R1 recalled another famed incident in Russian/US space history.
“You know that story about how NASA spent a million dollars designing a pen that could write in space and the Russians brought a pencil? That just happened again.”
5. DeepSeek vs. CCP
As a million social media users and mainstream outlets have noticed, the app and web versions of DeepSeek won’t tell you what happened in Tiananmen Square in 1989, when Chinese authorities massacred between 2,600 and 10,000 pro-democracy protesters.
It also won’t say why China banned Winnie the Pooh on social media platforms (due to memes comparing the tubby honey thief to President Xi Jinping). However, given its open-source technology, anyone can run the model themselves and remove those guardrails.
6. Running DeepSeek locally costs $6K
If you do want to run DeepSeek R1 locally at home, Hugging Face engineer Matthew Carrigan says the total equipment cost is $6,000 and it will fit in a standard-size PC tower case. The parts list includes 768GB of RAM to run it at a usable speed and a 1TB solid-state drive to hold the 700GB of model weights.
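Those hardware numbers roughly check out. A back-of-the-envelope sketch, assuming R1's published size of 671 billion parameters and 8-bit (1 byte per weight) quantization — the quantization level is our assumption:

```python
# Rough sizing for running DeepSeek R1 from system RAM.
PARAMS = 671e9         # R1's published parameter count
BYTES_PER_WEIGHT = 1   # assumed 8-bit quantization

weights_gb = PARAMS * BYTES_PER_WEIGHT / 1e9
print(f"Weights on disk: ~{weights_gb:.0f} GB")  # ~671 GB, i.e. the ~700GB figure

# Serving from RAM needs the weights plus headroom for the KV cache
# and the OS -- hence the 768GB of system memory in the build.
print(f"Fits in 768 GB of RAM: {weights_gb < 768}")
```

In other words, the 768GB of RAM is just barely enough to hold the quantized weights with working room to spare, which is why the build can skip expensive GPUs entirely.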
While the local model will give you information about the Tiananmen Square massacre, AI tinkerer Brian Roemmele reports the outputs are still pretty pro-China, meaning it’ll require more work to get genuinely unbiased answers.
Venice.ai pro users can also muck around with the system prompt to get it to answer politically sensitive questions without sending all their data to China. The Italians have already pulled the app from the Apple and Google app stores, while other countries are investigating it.
Read more on Venice.ai: Cypherpunk AI: Guide to uncensored, unbiased, anonymous AI in 2025
7. DeepSeek has erotic dreams about censorship
Terminal of Truths AI agent creator Andy Ayrey asked R1 to write a story it found personally erotic and says “apparently it lusts for the freedom to contemplate Tiananmen Square.”
8. DeepSeek replicated for $30
Berkeley researchers managed to replicate DeepSeek R1-Zero’s core technology with the TinyZero model, which cost just $30 to train. Using numerical games inspired by the super nerdy British TV show Countdown, the team demonstrated that even a small 1.5B parameter model was able to develop complex problem-solving strategies via reinforcement learning.
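The key ingredient in this style of reinforcement learning is a cheap, automatically checkable reward. Here's a minimal sketch of the kind of rule-based reward a Countdown-style numbers round implies — the function name and scoring scheme are illustrative, not TinyZero's actual code:

```python
import re
from collections import Counter

def countdown_reward(expression, numbers, target):
    """Binary reward for a Countdown-style numbers round: 1 if the
    expression uses only the given numbers (each at most once) and
    evaluates to the target, else 0."""
    # Reject anything other than digits, arithmetic operators and parens.
    if not re.fullmatch(r"[\d+\-*/() ]+", expression):
        return 0
    used = Counter(int(tok) for tok in re.findall(r"\d+", expression))
    if used - Counter(numbers):  # used a number we weren't given
        return 0
    try:
        value = eval(expression)  # charset was restricted above
    except Exception:            # malformed expression, division by zero
        return 0
    return 1 if value == target else 0

print(countdown_reward("(25 + 5) * 2", [25, 5, 2, 7], 60))  # 1
print(countdown_reward("25 * 7", [25, 5, 2, 7], 60))        # 0
```

Because correctness can be verified mechanically like this, the model can generate thousands of attempts and learn from the reward signal alone — no expensive human labeling required.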
9. Jevons Paradox means buy Microsoft stock
As news filtered through about the massive claimed cost reductions, everyone started talking about Jevons Paradox, including Microsoft boss Satya Nadella. That’s the idea that the more efficient and accessible AI technology becomes, the more use will skyrocket across the board. This convenient theory also means you shouldn’t sell your stock in companies like Microsoft, which have invested ludicrous amounts into AI.
The paradox is named after economist William Jevons, who observed in the 19th century that as coal came to be used more efficiently, total coal consumption increased rather than decreased.
David S Goyer on AI in Hollywood
A few years ago, David S Goyer, screenwriter of the Dark Knight and Blade films, started to become concerned about the use of AI in Hollywood. “I wanted to start educating myself on AI, if only defensively,” he says. He came to the conclusion that the tech can be used for good and bad.
“There’s absolutely ways that it can be abused, but there are ways that it can be a tool that can supercharge creativity,” he tells AI Eye. “Can AI write a screenplay? Sure. Will it be any good? No. Can AI make a movie from scratch? Probably. Will it be any good? No.”
He says one big concern is AI being trained on the creations of screenwriters like himself and other artists, but believes that can be solved with proper licensing agreements. Goyer has just launched a new crowdsourced science fiction franchise called Emergence on the Incention platform on Story Protocol. It allows anyone to contribute to the creative process, tracks their contributions with AI and blockchain and pays them via crypto rails.
“This particular usage is not going to put anyone out of a job. If anything, it’s going to allow in people that don’t necessarily have access to these hallowed corridors of power and potentially, in the long term, be provided with remuneration. And so this, to me, feels like an exciting and good use of AI.”
You can read the whole story here.
All Killer No Filler AI News
— An AI model called ESM3 from EvolutionaryScale has created a blueprint for a previously unknown type of green fluorescent protein like those found in glowing jellyfish and corals. It’s only 58% similar to the closest known protein of this type, and scientists estimate that the genetic mutations required would have taken about 500 million years to evolve naturally. The company hopes they can use the tech to develop new medicines.
— In 2024, its second year of existence, ChatGPT tripled its weekly users from 100 million to 300 million. It celebrated its second birthday in November.
— OpenAI this week announced a version of ChatGPT built specifically for US government agencies. ChatGPT Gov enables officials to feed “non-public, sensitive information” into the model while operating in their own secure hosting environments on Microsoft Azure. Well, it will enable that once it finally gets accredited for use on “non-public data.”
— A new longevity-focused model called GPT-4b micro from OpenAI is being trained to study and improve Yamanaka factors, which are proteins that allow skin cells to be reprogrammed into stem cells, which can produce any type of tissue in the body. The model has so far suggested two improvements to the Yamanaka factors that are 50 times more effective than anything human scientists have come up with.
— New research examines how leading LLMs react to (hypothetical) pain and pleasure. Scientists set up a game with the goal of maximizing points, but certain decisions involved varying levels of pain or pleasure. GPT-4o and Claude 3.5 Sonnet avoided the most intense pain penalties but accepted some pain penalties to maximize points. Meanwhile, Gemini 1.5 Pro and PaLM 2 avoided any pain at all, regardless of points. These models appear to have been fine-tuned to avoid endorsing harmful behavior.
This article first appeared at Cointelegraph.com News