Paul Dempsey Tue 4 Feb 2025


The furore that erupted in January around the China-developed DeepSeek R1 AI model saw leading technology and energy stocks tumble sharply. But what might all this mean for future AI research and adoption? A much wider reassessment of the global shape of the market, particularly the potential and cost of generative AI (GenAI), is likely to rumble throughout 2025.

The money men have been seriously spooked. That is not the whole story, but you must still start there. Because one of the questions set to come out of the debate concerns where their investments may now go.

At the tumble’s January low-point, Nvidia, the largest producer of AI GPUs, saw its share price fall by 17%, wiping nearly $600bn off its valuation. The other members of tech’s ‘Magnificent Seven’ – Apple, Microsoft, Amazon, Alphabet, Meta, and Tesla – also fell, but not as sharply.

In the energy sector, two leading generators supplying the hyperscale data centre market, Vistra and Constellation Energy, fell 28% and 21% respectively. According to the investment bank Jefferies, data centres will fuel 75% of additional US energy demand between now and 2035. Constellation, for its part, is Microsoft’s partner in a plan to bring Unit 1 at the Three Mile Island nuclear power station in Pennsylvania back online after five years in mothballs.

These numbers reflect the degree to which investors initially saw DeepSeek as having directly challenged the impetus behind the current AI investment wave – put bluntly, the notion that big is not only beautiful but essential.

Beyond those previously rocketing stock prices, AI received just over 50% of all venture capital funding in Q4 2024, according to PitchBook data, double its share in Q4 2023. A notable aspect of many of the deals struck throughout last year is that they were far bigger than normal, with technology giants joining investment firms to make billion-dollar bets on the companies they expect to become AI giants.

OpenAI, the poster child for Big GenAI, raised $6bn in new funds late last year, including ongoing backing from Microsoft. Its main US rival, Anthropic, secured $4bn in its most recent round, mostly from Amazon though Alphabet too is a major investor. Anthropic has been tipped to soon go back to the well for another $2bn, at a pro rata valuation that will rise from 2024’s $18bn to $60bn.

Neither company is yet making a profit; both currently have losses in the billions. But what they had convinced investors of was that GenAI is an expensive game. Anthropic CEO Dario Amodei has estimated that model training costs could soon reach $10bn, and OpenAI CEO Sam Altman has said that training its existing GPT-4 model cost more than $100m.

More recently, Altman joined Oracle co-founder Larry Ellison and SoftBank CEO Masayoshi Son at the White House to laud newly installed US President Donald Trump and unveil the Stargate project (which, incidentally, had already been more than a year in the planning). This promises an initial $100bn investment in AI infrastructure (data centres and energy), potentially rising to $500bn by the end of Trump’s second term.

Upping the ante from AGI (artificial general intelligence) to an even more hyperbolic ASI (artificial superintelligence), Son presaged a golden age. ASI will, he said, “come to solve the issues that mankind never ever thought that we could solve”.

At which point, enter Chinese hedge fund manager Liang Wenfeng and his spin-off DeepSeek.

Its latest R1 model has been favourably benchmarked alongside GPT-4 and Anthropic’s Claude but is open-source (like Meta’s Llama) and is estimated to have cost only $6m to train (although questions have been raised over its compute spending and basic research costs).

Where US projects have used tens of thousands of the most advanced Nvidia AI chips, R1 is estimated to have been put together using roughly 2,000 ‘hobbled’ H800 devices, designed to offer performance just below the threshold set by the 2022 round of US semiconductor export restrictions. H800s have since been barred from sale to China in more recent rounds, but had previously been stockpiled by Liang’s company, and analysts say Nvidia’s latest China-friendly H20 device is still plenty powerful for inference and other AI workloads.

By working within the constraints of the time, DeepSeek innovated around the concept of ‘mixture of experts’, in which models within models are designed for targeted use cases. This broadly comprises tuning the sub-models very precisely, restricting access to them until they are genuinely needed, and then applying reinforcement learning (essentially getting the system to check and recheck potential outputs) against both real and synthetic training data.
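To illustrate the routing idea (and no more than that), the toy sketch below shows a mixture-of-experts layer in PyTorch: a gating network scores a set of small expert networks for each token and only the top-scoring few actually run, so most of the layer’s parameters stay idle for any given token. It is a generic textbook construction, not DeepSeek’s code, and every name and dimension in it is invented for the example.

```python
# Toy mixture-of-experts layer: a router picks the top-k experts per token,
# so only a fraction of the parameters are exercised for each input.
# Purely illustrative; not DeepSeek's implementation.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        # Each "expert" is a small feed-forward sub-network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.gate = nn.Linear(d_model, n_experts)   # router: scores experts per token
        self.top_k = top_k

    def forward(self, x):                            # x: (tokens, d_model)
        scores = self.gate(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)            # normalise over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):                  # only the selected experts do any work
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(10, 64)).shape)   # torch.Size([10, 64])
```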

DeepSeek also reduced cross-device communication and memory usage by adding what it calls multi-head latent attention (MLA). This involved optimisations that could only be achieved by working below Nvidia’s near-ubiquitous CUDA programming environment, in the PTX assembly-level language used for the company’s devices.
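The memory saving behind latent attention can be gestured at in a few lines too. The rough sketch below, again purely illustrative and far simpler than DeepSeek’s published MLA design, caches one small shared latent vector per token instead of full per-head keys and values, and re-expands it when attention is computed; all of the dimensions are made up.

```python
# Rough sketch of the idea behind latent attention: cache a compressed latent
# per token (64 floats here) rather than full per-head keys and values
# (2 x 512 floats), and re-expand on demand. Illustrative only.
import torch
import torch.nn as nn

d_model, d_latent, n_heads, d_head = 512, 64, 8, 64

to_latent = nn.Linear(d_model, d_latent)              # down-projection (its output is cached)
latent_to_k = nn.Linear(d_latent, n_heads * d_head)   # up-projections (recomputed as needed)
latent_to_v = nn.Linear(d_latent, n_heads * d_head)

tokens = torch.randn(32, d_model)                     # 32 tokens in the context so far
kv_cache = to_latent(tokens)                          # cached: 64 floats per token, not 1,024

k = latent_to_k(kv_cache).view(32, n_heads, d_head)   # keys/values rebuilt from the latent
v = latent_to_v(kv_cache).view(32, n_heads, d_head)
print(kv_cache.shape, k.shape, v.shape)
```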

The coup de grâce, though, is arguably R1’s cost to a potential user. Here, two options raise further questions around the spend-spend-spend status quo.

You can choose to be DeepSeek-hosted, much as you can subscribe to the OpenAI and Anthropic model families. The US companies were already in a price war; DeepSeek has intensified it. For comparison, here are just the OpenAI vs DeepSeek numbers at the time of writing.

OpenAI’s API for GPT-4o is $2.50 per million input tokens and $10 per million output tokens; DeepSeek comes in at, respectively, 7¢ and $1.10 (tokens are the words or fragments of words that a large language model (LLM) computes). OpenAI’s reasoning model is $15 per million input tokens and $60 per million output; DeepSeek R1, 55¢ and $2.19.
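Plugging those published rates into a hypothetical monthly workload makes the gap concrete. The calculation below simply reuses the per-million-token prices quoted above; the usage volumes are invented for illustration.

```python
# Back-of-the-envelope cost comparison using the API prices quoted in the text
# (at time of writing). The 50M-input / 10M-output workload is hypothetical.
prices = {                              # (input $/1M tokens, output $/1M tokens)
    "OpenAI GPT-4o":          (2.50, 10.00),
    "DeepSeek chat":          (0.07, 1.10),
    "OpenAI reasoning model": (15.00, 60.00),
    "DeepSeek R1":            (0.55, 2.19),
}

in_tokens, out_tokens = 50_000_000, 10_000_000     # example monthly usage

for name, (p_in, p_out) in prices.items():
    cost = (in_tokens / 1e6) * p_in + (out_tokens / 1e6) * p_out
    print(f"{name}: ${cost:,.2f}")
# OpenAI GPT-4o: $225.00, DeepSeek chat: $14.50,
# OpenAI reasoning model: $1,350.00, DeepSeek R1: $49.40
```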

The second option – and one that potential western users are likely to prefer, given that the hosted model is subject to Chinese censorship and data collection – is simply to download DeepSeek R1 and run it natively. Already, it has been reported that a well-specified MacBook is up to the task.
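For those who want to experiment, a minimal local-inference sketch might look like the snippet below, using the open-source Hugging Face transformers library. It is the smaller distilled or quantised checkpoints, not the full R1 model, that a laptop can realistically run; the model name and settings here are assumptions for illustration.

```python
# Minimal local-inference sketch with Hugging Face transformers.
# The checkpoint name is an assumed example of DeepSeek's distilled releases;
# a quantised build would be a better fit for laptop-class memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain, step by step, why the sum of two even numbers is even."
inputs = tok(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=200)
print(tok.decode(output[0], skip_special_tokens=True))
```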

Many potential AI adopters tell surveys from the likes of McKinsey & Co, PwC and Deloitte that they still see serious questions around the return on AI investments – what is the point of replacing staff and tasks if the technology costs more to develop, maintain and operate? Cheaper models could therefore bring more enterprises to the start-ups looking to serve them on more traditional levels of VC funding. Numbers like those hitherto bandied around by the likes of Sam Altman have probably had some deterrent effect.

DeepSeek does threaten to park current GenAI strategies in a category of, perhaps, Not-So-Good-Old-Fashioned AI. The economic status quo has taken a serious knock. The notion that the big players had built a ‘moat’ of cost and complexity that kept at bay any competitor without the deepest pockets looks untenable. OpenAI, Anthropic and other adherents to megabillion-parameter LLMs are probably rethinking those gestating IPOs.

Moreover, there were already doubts as to how far LLM scaling would take GenAI as the pathfinder for AGI or even Son’s ASI. Bill Gates is on record as saying that there may be only two more ‘cranks’ left in that approach, and that more work is now needed on combining LLMs with other AI strategies (see E&T Jan/Feb 2025). Others, such as GenAI critic Gary Marcus, argue that the technique is already starving those other approaches of investment.

But there is also a case that says there is still value in ‘big is beautiful’, albeit mainly where it is also open-source and there is some kind of patron. Not surprisingly, Yann LeCun, Meta’s chief AI scientist and the leading advocate for its huge but open Llama model, takes that view.

“To people who see the performance of DeepSeek and think: ‘China is surpassing the US in AI.’ You are reading this wrong. The correct reading is: ‘Open-source models are surpassing proprietary ones,’” he has written.

“DeepSeek has profited from open research and open source (e.g., [the machine learning library] PyTorch and Llama from Meta). They came up with new ideas and built them on top of other people’s work. Because their work is published and open source, everyone can profit from it. That is the power of open research and open source.”

LeCun has a point. Not only has DeepSeek acknowledged its leverage of open-source resources, but the suite’s reliance on distillation into open models has been spelled out by AI infrastructure specialist Groq (no relation to the X AI model Grok).

“DeepSeek R1 – the bigger and smarter model of the DeepSeek suite – was distilled into the Llama 70B architecture, making it smarter, based on benchmarks and human evaluation, than the original Llama 70B, and particularly exceptional at tasks requiring mathematical and factual precision.

“DeepSeek-R1-Distill-Llama-70b delivers top-tier performance on [the interactive problem suite] MATH-500 (94.5%), the best among all distilled models, and achieves a strong score of 86.7% on AIME 2024 (an exam designed to challenge the brightest high school math students in America) – this makes it a top choice for advanced mathematical reasoning. It is also more competent in coding tasks than most other models, performing better than OpenAI’s o1 mini and gpt-4o at GPQA Diamond (65.2%) and LiveCode Bench (57.5%).”
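The distillation Groq describes can be sketched generically: a large ‘teacher’ model writes worked answers and a smaller ‘student’ model is fine-tuned on them. The snippet below shows that pattern with tiny stand-in models; it is not DeepSeek’s pipeline, and the model names and prompt are placeholders.

```python
# Generic distillation-by-example: the teacher generates a worked answer and the
# student is trained on it with an ordinary language-modelling loss.
# Both models here are small stand-ins, not R1 or Llama 70B.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_id = "distilgpt2"   # stand-in for the big reasoning "teacher"
student_id = "distilgpt2"   # stand-in for the smaller open "student"
tok = AutoTokenizer.from_pretrained(student_id)
teacher = AutoModelForCausalLM.from_pretrained(teacher_id).eval()
student = AutoModelForCausalLM.from_pretrained(student_id)

prompt = "Prove that the sum of two even numbers is even."
with torch.no_grad():                               # 1) teacher writes a worked answer
    ids = tok(prompt, return_tensors="pt").input_ids
    sample = teacher.generate(ids, max_new_tokens=64, do_sample=True,
                              pad_token_id=tok.eos_token_id)

loss = student(input_ids=sample, labels=sample.clone()).loss   # 2) student learns from it
loss.backward()                                     # one optimiser step would follow here
print(float(loss))
```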

On one level, this shows that Chinese engineering is now good enough to compete in AI at the highest level despite attempts to restrict the country’s access to the latest technology. In many ways, the culture of conquering constraints can be traced back to limits on access to the latest chip manufacturing nodes and, more recently, the latest processor IP from companies like Arm. There is a wake-up call in that – and in the fact that China’s national industry took heed of Beijing’s AI plan to be a world-class competitor by, you guessed it, 2025.

But the breakthroughs from Liang’s team also point to the potential for anyone, just about anywhere, to innovate around very large foundational LLMs to derive less compute-demanding, more task-attuned models of their own. DeepSeek has not shared its training data, but its technical paper discloses the guts of how it has succeeded.

Former Intel CEO Pat Gelsinger is another of the keen observers of the AI market who sees the work seeding broader and necessary research.

“Open wins every time it is given a proper shot. AI is much too important for our future to allow a closed ecosystem to ever emerge as the one and only in this space,” he has noted.

“DeepSeek is an incredible piece of engineering that will usher in greater adoption of AI. It will help reset the industry in its view of open innovation. It took a highly constrained team from China to remind us all of these fundamental lessons of computing history.”

Meanwhile, the need for a continuing and massive boost in global compute resources is not necessarily blocked either. Another addition to the debate is the Jevons paradox, which states that when a resource is used more efficiently, total consumption of that resource tends to rise rather than fall.
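A toy calculation shows how the paradox could play out for compute, using entirely made-up numbers: if the cost per query falls tenfold and demand responds strongly enough to the lower price, total spending on compute rises rather than falls.

```python
# Toy Jevons-paradox illustration with invented numbers: a 10x efficiency gain
# plus price-sensitive demand yields more total compute spend, not less.
old_cost_per_query = 0.010          # $ per query (illustrative)
new_cost_per_query = 0.001          # ten-times efficiency improvement
old_queries = 1_000_000             # demand at the old price

elasticity = -1.5                   # assumed constant price elasticity of demand
new_queries = old_queries * (new_cost_per_query / old_cost_per_query) ** elasticity

print(f"old compute spend: ${old_queries * old_cost_per_query:,.0f}")    # $10,000
print(f"new compute spend: ${new_queries * new_cost_per_query:,.0f}")    # ~$31,623
```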

Perhaps, though, the AI business will more practically be driven back to developing something the hype around LLMs has yet to produce: something that probably does require their integration with other techniques, and a lowered barrier to entry for innovation around them.

Whether in itself or by incentivising others, can DeepSeek finally get us closer to a killer application? That’s what the money really, really wants – as, indeed, do we all.
