country_code

Staying in Sync: NVIDIA Combines Digital Twins With Real-Time AI for Industrial Automation

NVIDIA software — Omniverse, Metropolis, Isaac and cuOpt — combine to create an AI gym where robots, AI agents can work out and be evaluated in complex industrial spaces.
by

Real-time AI is helping with the heavy lifting in manufacturing, factory logistics and robotics.

In such industries — often involving bulky products, expensive equipment, cobot environments and logistically complex facilities — a simulation-first approach is ushering in the next phase of automation.

NVIDIA founder and CEO Jensen Huang today demonstrated in his GTC keynote how developers can use digital twins to develop, test and refine their large-scale, real-time AIs entirely in simulation before rolling them out in industrial infrastructure, saving significant time and cost.

NVIDIA Omniverse, Metropolis, Isaac and cuOpt interact in AI gyms where developers can train AI agents to help robots and humans navigate unpredictable or complex events.

In the demo from the Mega virtual factory reference workflow above, a digital twin of a 100,000-square-foot warehouse — built using the NVIDIA Omniverse platform for developing and connecting OpenUSD applications — operates as a simulation environment for dozens of digital workers and multiple autonomous mobile robots (AMRs), vision AI agents and sensors.

Each AMR, running the NVIDIA Isaac Perceptor multi-sensor stack, processes visual information from six sensors, all simulated in the digital twin.

At the same time, the NVIDIA Metropolis platform for vision AI creates a single centralized map of worker activity across the entire warehouse, fusing together data from 100 simulated ceiling-mounted camera streams with multi-camera tracking. This centralized occupancy map helps inform optimal AMR routes calculated by the NVIDIA cuOpt engine for solving complex routing problems.

cuOpt, a record-breaking optimization AI microservice, solves complex routing problems with multiple constraints using GPU-accelerated evolutionary algorithms.

All of this happens in real time, while Isaac Mission Control coordinates the entire fleet using map data and route graphs from cuOpt to send and execute AMR commands.

An AI Gym for Industrial Digitalization

AI agents can assist in large-scale industrial environments by, for example, managing fleets of robots in a factory or identifying streamlined configurations for human-robot collaboration in supply chain distribution centers. To build these complex agents, developers need digital twins that function as AI gyms — physically accurate environments for AI evaluation, simulation and training.

Such software-in-the-loop AI testing enables AI agents and AMRs to adapt to real-world unpredictability.

In the above demo, an incident occurs along an AMR’s planned route, blocking the path and preventing it from picking up a pallet. NVIDIA Metropolis updates an occupancy grid, mapping all humans, robots and objects in a single view. cuOpt then plans an optimal route, and the AMR responds accordingly to minimize downtime.

With Metropolis vision foundation models powering the NVIDIA Visual Insight Agent (VIA) framework, AI agents can be built to help operations teams answer questions like, “What situation occurred in aisle three of the factory?” And the generative AI-powered agent offers immediate insights such as, “Boxes fell from the shelves at 3:30 p.m., blocking the aisle.”

Developers can use the VIA framework to build AI agents capable of processing large amounts of live or archived videos and images with vision-language models — whether deployed at the edge or in the cloud. This new generation of visual AI agents will help nearly every industry summarize, search and extract actionable insights from video using natural language.

All of these AI functions can be enhanced through continuous, simulation-based training and are deployed as modular NVIDIA NIM inference microservices.

Learn more about the latest advancements in generative AI and industrial digitalization at NVIDIA GTC, a global AI conference running through Thursday, March 21, at the San Jose Convention Center and online.

At ISC, JUPITER Shows What Exascale Science Looks Like

Europe’s first exascale supercomputer — running on NVIDIA Grace Hopper Superchips — is mapping the brain, modeling climate, advancing 6G AI and breaking records in quantum computing simulation.
by

JUPITER, Europe’s first exascale supercomputer at Germany’s Forschungszentrum Jülich, runs on NVIDIA Grace Hopper Superchips and NVIDIA Quantum-X800 InfiniBand networking — and it’s had a busy year.

As the international supercomputing community gathers at ISC in Hamburg this week, four projects running on JUPITER point to what exascale computing can actually do: map the human brain at cellular scale, simulate the entire Earth’s climate at 1-kilometer resolution, build AI systems for the next generation of wireless networks and simulate a universal 50-qubit quantum computer.

“With JUPITER, Europe doesn’t just join the exascale era — it leads it, across the widest range of science and AI of any system worldwide,” said Thomas Lippert, director of the Jülich Supercomputing Centre and professor at Goethe University Frankfurt. 

Four projects, detailed below, share a throughline: scientific problems that were out of reach on previous hardware are now tractable at exascale.

A Foundation Model for Mapping the Brain

The Jülich Brain Atlas project — anchored at Jülich’s Institute of Neuroscience and Medicine with partner Helmholtz AI, partner hospital and other Helmholtz institutions — has produced CytoNet, a foundation model for brain microarchitecture analysis.

The complexity of the human brain is astonishing. With 86 billion neurons and about 100 trillion connections between them, understanding brain function at single neuron resolution has been out of reach, until now.

The research is led by neuroscientist Katrin Amunts and computer scientist Christian Schiffer at INM-1, Jülich’s Institute of Neuroscience and Medicine. The model learns from brain imaging data at cellular scale, building a map that links individual cell structures to broader patterns of brain organization and function.

Training ran on JUPITER in under five days, using 6.5 petabytes of data from 21 post-mortem brains on 4,096 NVIDIA Grace Hopper Superchips. A paper describing the work is available on arXiv.

“For the first time, we’re not just using AI to analyze the brain — we’re building an agent that can think through the experiment itself,” said Katrin Amunts, director of INM-1 at Forschungszentrum Jülich and professor of brain research at Heinrich Heine University Düsseldorf. “That changes what neuroscience will be, and JUPITER is what makes that sentence possible to say today.”

That agent is the team’s next step: building an AI agent for brain researchers — integrating multimodal reasoning, language interfaces and Q&A capabilities using open models, including NVIDIA Nemotron 3 120B, working toward AI assistants that can help scientists interrogate brain data directly.

Climate at Kilometer Resolution

A novel ICON configuration — developed by researchers at the ETH Zurich, German Climate Computing Centre (DKRZ), Jülich Supercomputing Centre (JSC), Max Planck Institute for Meteorology, NVIDIA, Swiss National Supercomputing Centre (CSCS) and the University of Hamburg — won the Gordon Bell Prize for Climate Modelling at SC25 last November.

The breakthrough isn’t resolution alone. ICON is the first model to simulate a coupled Earth system at 1-kilometer resolution, with ocean, atmosphere and land, biogeochemistry and the full carbon cycle, with carbon exchanged, between all components. It can simulate and visualize complete ecosystems, such as phytoplankton blooms and zooplankton grazing. Previous systems could model pieces of this; ICON runs it all. This allows a much more precise and complete simulation of the Earth — observable at that level of detail for the first time.

Running on 20,480 NVIDIA Grace Hopper Superchips on JUPITER, the model simulated roughly 146 days of real climate into 24 hours of compute, setting a world record in global climate simulation. NVIDIA’s involvement in the ICON community spans more than a decade.

“Our simulations resolve the fine-scale winds, ocean eddies and upper-ocean mixing that shape marine ecosystems and regulate the ocean’s uptake of carbon,” said Daniel Klocke, computational infrastructure and model development group leader at the Max Planck Institute for Meteorology. “At a global resolution of just 1 kilometer, many of these interactions emerge directly from the laws of physics rather than being approximated. This gives us an unprecedented view of how the atmosphere, ocean and biosphere work together, helping us understand the processes driving our changing climate.”

6G Gets an Exascale Partner

In March, Ericsson and Forschungszentrum Jülich announced a collaboration to develop AI for the continued evolution of 5G and for 6G networks — with JUPITER as the compute engine for large-scale AI model training and testing.

The collaboration targets brain-inspired architectures designed to handle complex network operations at far lower energy costs. 

Research priorities include AI models for Ericsson’s radio and core networks, energy-efficient AI inference at the radio edge using neuromorphic approaches, and modular supercomputing architecture concepts drawn from JSC’s exascale work. 

Breaking Quantum Records

Researchers at the Jülich Supercomputing Centre (JSC), working with the jointly run NVIDIA Application Lab, also set a world first by fully simulating a universal 50-qubit quantum computer, surpassing the previous 48-qubit record. 

The simulation was made possible by drawing on the coherent, tightly coupled CPU-GPU memory architecture of JUPITER’s NVIDIA GH200 Grace Hopper Superchips, which lets data exceeding GPU limits spill seamlessly into CPU memory with minimal performance loss — allowing JUPITER to hold a far greater quantum state than GPU memory alone, which is what pushed the simulation past the previous 48-qubit record.

For now, that kind of simulation is the most powerful tool quantum research has: today’s quantum hardware can’t yet outperform classical computers on useful problems, so simulating quantum machines at the largest possible scale is how researchers design and stress-test the algorithms that future hardware will run.

This powerful quantum simulator, JUQCS-50, will be accessible to explore the performance of quantum algorithm designs within JUNIQ, the quantum computer user facility at JSC, led by Kristel Michielsen, director of JSC and professor at the University of Cologne. JUQCS-50 turns Europe’s first exascale system into a powerful testbed for tomorrow’s quantum-GPU supercomputers.

Exascale’s Impact

The range of science running on JUPITER — from neurons to atmosphere to wireless infrastructure to quantum — makes a case that exascale computing has moved from a research category into production. 

The results are a proof point for the Grace Hopper platform at the frontier of science.

Learn more about NVIDIA AI for science.

NVIDIA Vera CPU Opens the Way for Agentic Scientific AI at Los Alamos National Laboratory

Mission, Vision and Veritas supercomputers with Vera CPUs to advance materials simulation, scientific AI agents and molecular design.
by

Mission, Vision and Veritas — new Los Alamos National Laboratory (LANL) supercomputers to be built with HPE and NVIDIA — are tapping NVIDIA Vera CPUs to accelerate scientific discovery, unlocking agentic AI for science.

The supercomputers will use the HPE Cray Supercomputing GX5000 architecture with the NVIDIA Vera Rubin platform, combining NVIDIA Vera CPUs, NVIDIA Rubin GPUs and NVIDIA Quantum-X800 InfiniBand networking.

Under the planned configuration, Mission will include NVIDIA Vera Rubin GPU nodes and 2,300 standalone NVIDIA Vera CPUs using the HPE Cray Supercomputing GX240 blade. Veritas will feature approximately 1,150 standalone NVIDIA Vera CPUs to complement NVIDIA Vera Rubin nodes. 

Veritas will arrive alongside Mission and Vision and serve the Laboratory Directed Research and Development program, helping accelerate agentic AI for science. The system will test these technologies for use in larger systems being built out at LANL. 

Researchers are adding a new tool for science with AI agents that can form hypotheses, choose tools, launch simulations, analyze outputs and refine the next step. LANL’s public work on URSA, the Universal Research and Scientific Agent — running on Venado and soon Mission and Vision — points in this direction: a modular, feedback-driven AI framework designed to help scientists brainstorm hypotheses, plan experiments, run simulations and analyze results. 

LANL demonstrated that the Vera CPU delivered 7x higher performance on URSA workloads than the CPUs in the Crossroads x86 supercomputer.

Vera CPU for Agents and Simulation

In LANL’s early testing of NVIDIA Vera CPUs on Branson — an open source Monte Carlo heat transfer simulation tool — Vera outperforms the CPUs used in the Crossroads x86 supercomputer by over 3x. 

These results were made possible by Vera, including its custom Olympus core, LPDDR5 memory and fast on-chip fabric. 

A single Vera CPU outperforms a single socket x86-based CPU by more than 3x while providing more than 4x the memory per core and 6x the memory per node. Ultimately, this means faster  scientific results for LANL.

All of the lab’s supercomputers were codesigned by hardware architects, system software developers, domain scientists, computer scientists and applied mathematicians — helping ensure systems are shaped by real scientific workloads, not abstract benchmarks alone. 

Building on Generations of LANL Systems

Mission, expected to be operational in 2027, will be the fifth Advanced Technology System in the National Nuclear Security Administration’s Advanced Simulation and Computing program and will replace Crossroads for classified national security workloads. 

Vision, also expected to be operational in 2027, will serve as a resource for fundamental science, including materials and nuclear science, energy modeling, biomedical research and AI — letting more scientists test methods, train models and explore ideas before moving into higher-consequence work.

The work extends more than a decade of LANL and NVIDIA’s deep collaboration on CPUs, from Grace to Vera, using extreme codesign for LANL simulation workloads.

The three new supercomputers build on Venado, the HPE Cray EX supercomputer installed at Los Alamos in 2024 with NVIDIA GH200 Grace Hopper Superchips and NVIDIA Grace CPU Superchips. 

Learn more about the NVIDIA Vera CPU.

Hotter Than a Hot Tub: The 45°C Breakthrough to Cool AI’s Biggest Machines

NVIDIA’s latest AI servers can run on coolant warmer than a hot tub — and that counterintuitive choice is one of the biggest efficiency leaps in data center history.
by

Hot tubs sit at about 38 to 40 degrees Celsius, warm enough that most people can only soak for about 15 minutes. NVIDIA’s newest AI servers can run their cooling liquid even hotter — up to 45 degrees Celsius, or 113 degrees Fahrenheit. That higher temperature limit is precisely what makes them more energy efficient.

The Rubin generation of NVIDIA AI infrastructure is the world’s first to achieve 100% liquid cooling — every chip, every networking component, cooled entirely by liquid in a closed loop with no fans anywhere in the system. This liquid cooling methodology is outlined in the NVIDIA DSX AI factory reference design, a guide that outlines best practices to design, build and operate the entire AI factory infrastructure stack.

Although each generation offers significantly more computing power for each watt, full liquid-cooled AI compute infrastructure enables data centers to dramatically reduce cooling energy consumption — making a meaningful difference to overall data center energy use at hyperscale.

“The NVIDIA DSX reference design for AI factories has zero water consumption — we have eliminated massive amounts of power usage and pretty much all water usage,” said Ali Heydari, director of data center cooling and infrastructure at NVIDIA. “With dry-cooler-based designs, it’s a closed-loop system with no evaporative water cooling — outside of maybe 1% of the year when we might need chillers in some climates.”

Historically, cooling alone has accounted for up to 40% of a data center’s electricity consumption, making it one of the most significant areas where efficiency improvements can drive down both operational expenses and energy demands.

Industry estimates suggest that raising chiller plant temperatures by just one degree can cut cooling energy costs by about 4%. At scale, those savings add up quickly. A 50-megawatt hyperscale facility can save over $4 million annually in cooling-related energy and water costs by moving to liquid-cooled infrastructure. 

In favorable climates, NVIDIA’s 45-degree liquid-cooling architecture can enable chiller-less operation with dry coolers, reducing facility cooling water consumption from roughly 2.6 million gallons per megawatt per year for conventional cooling-tower-based systems to near zero — up to a 100% reduction in water use. 

The reason: traditional air-cooled data centers depend on large volumes of cooled air to remove heat from IT equipment, often requiring energy-intensive cooling infrastructure during hot weather. With NVIDIA’s 45-degree liquid cooling, heat is captured directly at the chip and transported through liquid loops operating at much higher temperatures, allowing outdoor dry coolers to reject heat efficiently for much of the year while significantly reducing mechanical cooling requirements and facility water consumption. 

The data center ambient temperature is flexible — warm summer air is fine — because nothing in the server depends on cool air. The liquid does all the work — and the same liquid can be recirculated in a closed loop so no new water is consumed to cool the chips.

 

A New Standard for the Industry

Because the NVIDIA Rubin platform integrates 100% liquid-cooled infrastructure, every cloud provider and data center operator building for it is making the transition. 

The ecosystem is keeping pace. Motivair, the advanced cooling division of Schneider Electric, has worked alongside NVIDIA’s product roadmap for nearly a decade — and Richard Whitmore, its president and CEO, says the relationship only intensified as power densities crossed the threshold where air cooling was no longer a viable option.

“Once the watts per chip crossed a certain level, liquid cooling became mandatory,” said Whitmore.

Too Hot to Cool AI Infrastructure Is Hotter Than You’d Think

There’s a long-standing misconception in the industry that a cold data center is an efficient one. Decades ago, if a data center didn’t feel like a walk-in freezer, people would assume something was wrong. 

In reality, chips can sustain far warmer environments than that instinct suggests. Silicon processors generate enormous internal heat — the coolant entering a fully liquid-cooled chip at 45 degrees Celsius exits at roughly 55 degrees, having absorbed that heat load across the chip surface. Yet performance doesn’t degrade. 

The processors continue to operate at full performance because liquid-cooled cold plates keep device temperatures within validated operating limits, even with coolant entering the rack at 45 degrees Celsius. 

No Fans, No Cold Aisles — A Fundamentally Different Machine

Walk into a traditional data center and notice two things: the noise — cooling fans contribute to total noise levels at or above 85 decibels, loud enough to require ear protection — and the physical choreography of hot aisles and cold aisles, carefully managed to push cooled air across components. 

The Rubin architecture changes the picture.

Coolant — 75% water and 25% propylene glycol — flows through cold plates that sit directly on processors, pulling heat out at the source. Running that coolant at up to 45 degrees Celsius means that in many climates, the facility loop can reject heat without turning on mechanical chillers and noisy fans. 

In an AI factory, coolant flows from a coolant distribution unit to the servers in a closed-loop cyle.

That unlocks something beyond energy savings: the possibility of eliminating water consumption entirely. 

In the right geography — somewhere with reliably cool outdoor air — a liquid-cooled data center can reject its heat through coolant distribution units that capture heat directly at the source and transport it to outdoor dry coolers, essentially large radiator coils positioned outside the building. 

The loop is filled once and runs closed for the life of the facility. And it takes dramatically less space in the AI factory compared to traditional air-cooling infrastructure.

“In the right geographic location, with the right system design, you don’t need any refrigeration equipment,” Whitmore said. “You can just put big radiator coils outside and use the air temperature for all your cooling. It’s incredibly efficient.”

The geography caveat matters. A data center in the Scottish Highlands and one in Phoenix, Arizona, face very different realities. But even in warmer climates, the shift toward 45-degrees-Celsius coolant moves operators significantly closer to that chiller-less ideal — where chillers may turn on just a few days a year when the outside air temperature demands it.

Another key benefit of this new model for AI factories is the potential for waste heat recovery, where residual heat from AI factory operations can be repurposed to heat commercial or residential buildings nearby. 

The Engineering Problem Nobody Had Solved

Previous liquid-cooled servers were hybrid: GPUs and CPUs got cold plates, but the rest of the system stayed air-cooled, with finned heat sinks designed to shed heat into moving air. In a fully liquid-cooled server, the cooling for these components needed to be completely redesigned to use liquid.

NVIDIA’s thermal engineering team reworked how those components handle heat, designing cooling loops that simplify how liquid is routed to multiple high-power chips on the board using a single inlet and outlet, resulting in a cleaner tray-level cooling architecture.

One visible outcome: Rubin servers have clean, sealed front panels where air-cooled servers have perforated bezels. Another: fully liquid cooled servers enable higher rack density than air-cooled servers, so a system that previously occupied six rack units now fits in two — more compute, less space, less noise.

Liquid cooling infrastructure overhead pipes routes into powerful AI servers.

AI workloads are not getting lighter. The compute demand driving data center construction is growing faster than almost any other category of infrastructure investment. 

Without efficiency improvements in how that compute is cooled, the energy cost of running AI at scale would grow in lockstep with the hardware. Liquid cooling at up to 45 degrees Celsius — hotter than a hot tub, cooler for the planet — is one of the most important tools the industry has to close that gap.

Learn more about liquid cooling, the NVIDIA DSX platform for AI factories and NVIDIA’s approach to energy-efficient AI infrastructure.