Nvidia called DeepSeek’s R1 model “an excellent AI advancement,” despite the Chinese startup’s emergence causing the chip maker’s stock price to plunge 17% on Monday.
“DeepSeek is an excellent AI advancement and a perfect example of Test Time Scaling,” an Nvidia spokesperson told CNBC on Monday. “DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely-available models and compute that is fully export control compliant.”
The comments come after DeepSeek last week released R1, an open-source reasoning model that reportedly outperformed the best models from U.S. companies such as OpenAI. R1's self-reported training cost was less than $6 million, a fraction of the billions that Silicon Valley companies are spending to build their artificial-intelligence models.
Nvidia’s statement indicates that it sees DeepSeek’s breakthrough as creating more work for the American chip maker’s graphics processing units, or GPUs.
“Inference requires significant numbers of NVIDIA GPUs and high-performance networking,” the spokesperson added. “We now have three scaling laws: pre-training and post-training, which continue, and new test-time scaling.”
Nvidia also said that the GPUs that DeepSeek used were fully export compliant. That counters Scale AI CEO Alexandr Wang's comments on CNBC last week that he believed DeepSeek used Nvidia GPU models that are banned in mainland China. DeepSeek says it used special versions of Nvidia's GPUs intended for the Chinese market.
Analysts are now asking whether multibillion-dollar capital investments from companies like Microsoft, Google and Meta in Nvidia-based AI infrastructure are being wasted when the same results can be achieved more cheaply.
Earlier this month, Microsoft said it is spending $80 billion on AI infrastructure in 2025 alone, while Meta CEO Mark Zuckerberg last week said the social media company planned to invest between $60 billion and $65 billion in capital expenditures in 2025 as part of its AI strategy.
“If model training costs prove to be significantly lower, we would expect a near-term cost benefit for advertising, travel, and other consumer app companies that use cloud AI services, while long-term hyperscaler AI-related revenues and costs would likely be lower,” wrote BofA Securities analyst Justin Post in a note on Monday.
Nvidia’s comment also reflects a new theme that Nvidia CEO Jensen Huang, OpenAI CEO Sam Altman and Microsoft CEO Satya Nadella have discussed in recent months.
Much of the AI boom and the demand for Nvidia GPUs was driven by the “scaling law,” a concept in AI development proposed by OpenAI researchers in 2020. That concept suggested that better AI systems could be developed by greatly expanding the amount of computation and data that went into building a new model, requiring more and more chips.
Since November, Huang and Altman have been focusing on a new wrinkle to the scaling law, which Huang calls “test-time scaling.”
This concept says that if a fully trained AI model spends more time and extra computing power "reasoning" when making predictions or generating text or images, it will provide better answers than it would have if it ran for less time.
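One common form of test-time scaling is sampling many candidate answers from a model and taking a majority vote, so that spending more compute at inference time improves reliability. The sketch below illustrates the idea with a hypothetical noisy solver standing in for a real model; the function names, the 70% accuracy figure, and the toy answer are all assumptions for illustration, not details from DeepSeek's or OpenAI's systems.

```python
import random
from collections import Counter

def answer_once(question: str, rng: random.Random) -> int:
    """Stand-in for one inference pass of a model.

    Hypothetical noisy solver: returns the correct answer (42)
    about 70% of the time, and a random wrong answer otherwise."""
    return 42 if rng.random() < 0.7 else rng.randint(0, 41)

def answer_with_test_time_scaling(question: str, samples: int, seed: int = 0) -> int:
    """Spend more compute at inference time: draw many candidate
    answers and return the majority vote across them."""
    rng = random.Random(seed)
    votes = Counter(answer_once(question, rng) for _ in range(samples))
    return votes.most_common(1)[0][0]

# More samples means more inference-time compute and a more
# reliable final answer, even though each single pass is noisy.
print(answer_with_test_time_scaling("toy question", samples=25))
```

With 25 samples, the majority vote almost always lands on the correct answer even though any single pass fails roughly a third of the time, which captures why more inference-time compute, and hence more GPU work, can yield better results from the same trained model.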
Forms of the test-time scaling law are used in some of OpenAI’s models such as o1 as well as DeepSeek’s breakthrough R1 model.