Nvidia may face further delays with its Blackwell AI chips, which were introduced earlier this year and are reportedly overheating in server configurations, raising concerns.
According to The Information, there are fears that the chips might create challenges for large cloud service customers such as Meta Platforms, Google, and Microsoft. This comes as these tech giants prepare to build data centers that depend on this technology.
Nvidia has asked for a redesign of racks to address the problem
The Information further reports that the overheating arises when the chips are placed in server racks designed to hold up to 72 chips. To address the problem, Nvidia has repeatedly asked its vendors to redesign the racks.
The repeated rack redesigns have sparked concerns among Nvidia's customers about delays in installing new AI data center technology.
Some customers and employees have confirmed the overheating problem with the Blackwell AI chips.
“Nvidia is working with leading cloud service providers as an integral part of our engineering team and process,” an Nvidia spokesperson told Reuters.
“The engineering iterations are normal and expected.”
The Blackwell AI chips were first unveiled in March and were supposed to reach the market in the second quarter, but deliveries were delayed, which in turn affected customers' deployment schedules.
The chips reportedly combine two silicon components into one unit meant to run up to 30 times faster than previous versions on tasks such as generating chatbot responses.
According to GuruFocus, because Nvidia's products are essential to key technology platforms, the issue bears on the tech giant's attempts to establish its leadership in AI and cloud computing.
Additionally, the overheating problem might raise questions about whether the growing demands of data-intensive AI projects can be met.
The chip-making giant has not yet disclosed which vendors are helping to fix the design problems or when the overheating concerns will be resolved. Further delays could affect the broader AI infrastructure plans of key customers.
Nvidia's share price fell in response to the overheating news
A Tom's Hardware report indicated that the release of the Blackwell AI chips had already been delayed by several months while Nvidia worked on a design flaw that affected production yields.
Now, Investors reported that Nvidia's share price retreated following reports that its Blackwell AI chips overheat when installed in high-capacity server racks.
The stock dipped 2.9% in pre-market trading. Year to date, however, it is up 187%.
“We had heard that server designs were still being finalized as of last month but would be surprised if NVL72 shipments are meaningfully delayed by issues with heat (and cooling),” Wedbush Securities analyst Matt Bryson said in a client note Monday.
“Having said this, it will be a topic we will be asking about at SC24 this week.”
SC24 is a conference on high-performance computing, networking, and analysis taking place in Atlanta. Nvidia is expected to release its third-quarter earnings on Wednesday this week.
The news of overheating has raised further concerns about the impact of AI on energy and water consumption. The advanced GPUs can reportedly be 30 times faster than their predecessors, and the more powerful a GPU is, the more heat it produces.
According to PCMag, running generative AI models requires a lot of energy, as well as water to cool the servers. This has led to some predictions that data centers will experience water shortages as soon as 2025, as tech firms are not adding electricity to the power grid as quickly as they are adding data centers.