Designing, simulating and bringing up modern data centers is highly complex, requiring trade-offs among considerations such as performance, energy efficiency and scalability.
It also requires bringing together a team of highly skilled engineers across compute and network design, computer-aided design (CAD) modeling, and mechanical, electrical and thermal design.
NVIDIA builds the world’s most advanced AI supercomputers and at GTC unveiled its latest — a large cluster based on the NVIDIA GB200 NVL72 liquid-cooled system. It consists of two racks, each containing 18 NVIDIA Grace CPUs and 36 NVIDIA Blackwell GPUs, connected by fourth-generation NVIDIA NVLink switches.
On the show floor, NVIDIA demoed this fully operational data center as a digital twin in NVIDIA Omniverse, a platform for connecting and building generative AI-enabled 3D pipelines, tools, applications and services.
To bring up new data centers as fast as possible, NVIDIA first built its digital twin with software tools connected by Omniverse. Engineers unified and visualized multiple CAD datasets with full physical accuracy and photorealism in Universal Scene Description (OpenUSD), using the Cadence Reality digital twin platform, powered by NVIDIA Omniverse APIs.
Design, Simulate and Optimize With Enhanced Efficiency and Accuracy
The new GB200 cluster is replacing an existing cluster in one of NVIDIA’s legacy data centers. To start the digital build-out, technology company Kinetic Vision scanned the facility using the NavVis VLX wearable lidar scanner to produce highly accurate point cloud data and panorama photos.
Then, Prevu3D software was used to remove the existing clusters and convert the point cloud to a 3D mesh. This provided a physically accurate 3D model of the facility, in which the new digital data center could be simulated.
Engineers combined and visualized multiple CAD datasets with enhanced precision and realism by using the Cadence Reality platform. The platform’s integration with Omniverse gave teams a computing foundation on which to develop OpenUSD-based 3D tools, workflows and applications.
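To make the idea of unifying datasets in OpenUSD concrete, here is a minimal sketch of one standard OpenUSD mechanism, sublayer composition, in which a root layer stacks several files into a single scene. It is not a description of how the Cadence Reality platform or the NVIDIA team assembled the twin. The file names are hypothetical, and the helper simply writes the text of a .usda root layer using the Python standard library.

```python
from typing import Iterable


def build_root_layer(sublayers: Iterable[str]) -> str:
    """Return the text of a .usda root layer that sublayers the given files.

    In OpenUSD, sublayers are listed strongest first: opinions in an
    earlier entry override those in a later one.
    """
    paths = list(sublayers)
    if not paths:
        raise ValueError("at least one sublayer is required")
    # Asset paths in .usda text are delimited by '@' characters.
    entries = ",\n".join(f"        @{path}@" for path in paths)
    return (
        "#usda 1.0\n"
        "(\n"
        "    subLayers = [\n"
        f"{entries}\n"
        "    ]\n"
        ")\n"
    )


# Hypothetical file names, one per engineering discipline.
ROOT_LAYER = build_root_layer([
    "cooling_design.usd",
    "network_cabling.usd",
    "rack_layout.usd",
    "facility_scan_mesh.usd",
])

if __name__ == "__main__":
    print(ROOT_LAYER)
```

Because each discipline keeps its own layer, a change to, say, the rack layout does not require re-exporting the facility scan, which is one reason layered composition suits multi-team workflows.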
Omniverse Cloud APIs also added interoperability with more tools, including PATCH MANAGER and NVIDIA Air. With PATCH MANAGER, the team designed the physical layout of their cluster and networking infrastructure, ensuring that cabling lengths were accurate and routing was properly configured.
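As a rough illustration of why accurate cabling lengths depend on the physical layout, the sketch below estimates the length of a single cable run. It does not use PATCH MANAGER or any of its APIs. The routing model (a vertical rise to an overhead tray, axis-aligned tray segments and a fixed slack allowance), the Port type and all numbers are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Port:
    """A cable endpoint; all coordinates are in meters (hypothetical model)."""
    x: float
    y: float
    height: float  # height of the port above the floor


def cable_length_m(a: Port, b: Port, tray_height: float = 3.0,
                   slack_fraction: float = 0.10) -> float:
    """Estimate the cable length for a run routed through an overhead tray.

    The cable rises from each port to the tray, then follows the tray along
    axis-aligned segments (Manhattan distance). A fractional slack allowance
    is added on top of the routed length.
    """
    if tray_height < max(a.height, b.height):
        raise ValueError("tray must be at or above both ports")
    vertical = (tray_height - a.height) + (tray_height - b.height)
    horizontal = abs(a.x - b.x) + abs(a.y - b.y)
    return (vertical + horizontal) * (1.0 + slack_fraction)


def shortest_stock_cable(required_m: float,
                         stock_lengths_m: Sequence[float]) -> float:
    """Pick the shortest stocked cable that covers the required length."""
    candidates = [length for length in stock_lengths_m if length >= required_m]
    if not candidates:
        raise ValueError(f"no stocked cable is at least {required_m:.2f} m long")
    return min(candidates)


if __name__ == "__main__":
    gpu_port = Port(x=0.0, y=0.0, height=2.0)
    switch_port = Port(x=4.0, y=1.2, height=1.5)
    needed = cable_length_m(gpu_port, switch_port)
    print(f"required: {needed:.2f} m, order: {shortest_stock_cable(needed, [5, 10, 15])} m")
```

With these made-up coordinates the run needs about 8.47 m, so a 5 m cable would fall short and a 10 m cable would be ordered. Catching that kind of mismatch in a layout model is cheaper than discovering it during installation.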
The team used Cadence’s Reality Digital Twin solvers, accelerated by NVIDIA Modulus APIs and NVIDIA Grace Hopper, to simulate airflows, as well as the performance of the new liquid-cooling systems from partners like Vertiv and Schneider Electric. The integrated cooling systems in the GB200 trays were simulated and optimized using solutions from Ansys, which brought simulation data into the digital twin.
The demo showed how digital twins can allow users to fully test, optimize and validate data center designs before ever producing a physical system. By visualizing the performance of the data center in the digital twin, teams can better optimize their designs and plan for what-if scenarios.
Users can also improve data center and cluster designs by balancing disparate boundary conditions, such as cabling lengths, power, cooling and space, in an integrated manner. This enables engineers and design teams to bring clusters online much faster and more efficiently than before.
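Balancing boundary conditions in an integrated manner can be pictured as checking every candidate design against all limits at once, rather than one discipline at a time. The following is a minimal sketch under stated assumptions: the Limits and Design types, the four constraints chosen and every number are hypothetical, and a real digital twin would derive these values from simulation rather than fixed inputs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Limits:
    """Facility limits (hypothetical values supplied by the caller)."""
    power_kw: float
    cooling_kw: float
    floor_area_m2: float
    max_cable_m: float


@dataclass(frozen=True)
class Design:
    """A candidate cluster layout summarized by its demands on the facility."""
    name: str
    power_kw: float
    heat_load_kw: float
    floor_area_m2: float
    longest_cable_m: float


def violations(design: Design, limits: Limits) -> List[str]:
    """Return the names of every limit the design exceeds (empty if none)."""
    checks = [
        ("power", design.power_kw, limits.power_kw),
        ("cooling", design.heat_load_kw, limits.cooling_kw),
        ("space", design.floor_area_m2, limits.floor_area_m2),
        ("cabling", design.longest_cable_m, limits.max_cable_m),
    ]
    return [name for name, value, limit in checks if value > limit]


def feasible(designs: List[Design], limits: Limits) -> List[Design]:
    """Keep only the designs that satisfy all limits simultaneously."""
    return [d for d in designs if not violations(d, limits)]


if __name__ == "__main__":
    limits = Limits(power_kw=1000, cooling_kw=900, floor_area_m2=200, max_cable_m=30)
    dense = Design("dense", power_kw=950, heat_load_kw=940, floor_area_m2=120, longest_cable_m=12)
    spread = Design("spread", power_kw=950, heat_load_kw=880, floor_area_m2=190, longest_cable_m=28)
    for d in (dense, spread):
        print(d.name, violations(d, limits) or "ok")
```

In this toy example the dense layout saves space and cable but exceeds the cooling limit, while the spread layout passes every check. Evaluating all constraints together exposes that trade-off immediately.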
Learn more about NVIDIA Omniverse and NVIDIA Blackwell.