Nvidia's data center Blackwell GPUs reportedly overheat, require rack redesigns and cause delays for customers

Nvidia’s next-generation Blackwell processors are facing significant challenges with overheating when installed in high-capacity server racks, reports The Information. These issues have reportedly led to design changes and delays and raised concerns among customers like Google, Meta, and Microsoft about whether they can deploy Blackwell servers on time.

According to people familiar with the situation who spoke with The Information, Nvidia’s Blackwell GPUs for AI and HPC overheat when used in servers with 72 processors inside, machines that are expected to consume up to 120kW per rack. The problems have caused Nvidia to reevaluate the design of its server racks multiple times, as overheating limits GPU performance and risks damaging components. Customers reportedly worry that these setbacks could delay the deployment of the new processors in their data centers.

Nvidia has reportedly instructed its suppliers to make several design changes to the racks to counteract overheating issues. The company has worked closely with its suppliers and partners to develop engineering revisions to improve server cooling. While such adjustments are standard for large-scale tech releases, they have nonetheless pushed expected shipping dates back further.

In response to the reports of delays and overheating, an Nvidia spokesperson told Reuters that the company is working with cloud providers and described the design changes as part of the normal development process. This partnership with cloud providers and suppliers aims to ensure the final product meets performance and reliability expectations as Nvidia continues to work on resolving these technical challenges.

Previously, Nvidia had to delay the Blackwell production ramp because of a yield-killing design flaw in the processor. Nvidia’s Blackwell B100 and B200 GPUs use TSMC’s CoWoS-L packaging technology to connect their two chiplets. This design includes an RDL interposer with local…

Read full post on Tom’s Hardware
