NVIDIA Research Shapes Physical AI

AI and graphics research breakthroughs in neural rendering, 3D generation and world simulation power robotics, autonomous vehicles and content creation.

Physical AI — the engine behind modern robotics, self-driving cars and smart spaces — relies on a mix of neural graphics, synthetic data generation, physics-based simulation, reinforcement learning and AI reasoning. It’s a combination well-suited to the collective expertise of NVIDIA Research, a global team that for nearly 20 years has advanced the now-converging fields of AI and graphics.

That’s why at SIGGRAPH, the premier computer graphics conference taking place in Vancouver through Thursday, Aug. 14, NVIDIA Research leaders will deliver a special address highlighting the graphics and simulation innovations enabling physical and spatial AI.

“AI is advancing our simulation capabilities, and our simulation capabilities are advancing AI systems,” said Sanja Fidler, vice president of AI research at NVIDIA. “There’s an authentic and powerful coupling between the two fields, and it’s a combination that few have.”

At SIGGRAPH, NVIDIA is unveiling new software libraries for physical AI — including NVIDIA Omniverse NuRec 3D Gaussian splatting libraries for large-scale world reconstruction, updates to the NVIDIA Metropolis platform for vision AI as well as NVIDIA Cosmos and NVIDIA Nemotron reasoning models. Cosmos Reason is a new reasoning vision language model for physical AI that enables robots and vision AI agents to reason like humans using prior knowledge, physics understanding and common sense.

Many of these innovations are rooted in breakthroughs by the company’s global research team, which is presenting over a dozen papers at the show on advancements in neural rendering, real-time path tracing, synthetic data generation and reinforcement learning — capabilities that will feed the next generation of physical AI tools.

How Physical AI Unites Graphics, AI and Robotics

Physical AI development starts with the construction of high-fidelity, physically accurate 3D environments. Without these lifelike virtual environments, developers can’t train advanced physical AI systems such as humanoid robots in simulation, because the skills the robots would learn in virtual training wouldn’t translate well enough to the real world.

Picture an agricultural robot applying just the right amount of pressure to pick peaches off trees without bruising them, or a manufacturing robot assembling tiny electronic components where every millimeter matters.

“Physical AI needs a virtual environment that feels real, a parallel universe where the robots can safely learn through trial and error,” said Ming-Yu Liu, vice president of research at NVIDIA. “To build this virtual world, we need real-time rendering, computer vision, physical motion simulation, 2D and 3D generative AI, as well as AI reasoning. These are the things NVIDIA Research has spent nearly two decades getting good at.”

NVIDIA’s legacy of breakthrough research in ray tracing and real-time computer graphics, dating back to the research organization’s inception in 2006, plays a critical role in enabling the realism that physical AI simulations demand. Much of that rendering work, too, is powered by AI models — a field known as neural rendering.

“Our core rendering research fuels the creation of true-to-reality virtual worlds used to train advanced physical AI systems, while AI is in turn helping us create those 3D worlds from images,” said Aaron Lefohn, vice president of graphics research and head of the Real-Time Graphics Research group at NVIDIA. “We’re now at a point where we can take pictures and videos — an accessible form of media that anyone can capture — and rapidly reconstruct them into virtual 3D environments.”

Neural reconstruction and rendering applies AI to data captured from real-world cameras or other sensors to generate realistic 3D representations.
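
To make the idea concrete, here’s a minimal, hypothetical sketch of the inverse-rendering pattern behind such techniques — not NVIDIA’s pipeline: a handful of 2D Gaussian “splats” is optimized by gradient descent so that the rendered image matches a captured one. Real systems such as 3D Gaussian splatting work in 3D with full camera models and many more parameters.

```python
# Toy sketch of neural reconstruction: fit a few 2D Gaussian "splats" to a
# target image by gradient descent. Illustrative only; all sizes and values
# are assumptions, not part of any NVIDIA library.
import torch

H, W, N = 64, 64, 32                      # image size, number of splats
target = torch.rand(H, W)                 # stand-in for a captured photo

# Learnable splat parameters: center (x, y), log-scale, intensity.
centers = torch.rand(N, 2, requires_grad=True)
log_scale = torch.full((N,), -2.0, requires_grad=True)
weight = torch.rand(N, requires_grad=True)

ys, xs = torch.meshgrid(
    torch.linspace(0, 1, H), torch.linspace(0, 1, W), indexing="ij")
pixels = torch.stack([xs, ys], dim=-1)    # (H, W, 2) pixel coordinates

def render():
    # Accumulate isotropic Gaussians over the image: a crude "forward renderer."
    d2 = ((pixels[None] - centers[:, None, None]) ** 2).sum(-1)          # (N, H, W)
    splats = weight[:, None, None] * torch.exp(
        -d2 / torch.exp(log_scale)[:, None, None])
    return splats.sum(0)

opt = torch.optim.Adam([centers, log_scale, weight], lr=1e-2)
for step in range(500):
    loss = ((render() - target) ** 2).mean()   # photometric loss vs. the capture
    opt.zero_grad(); loss.backward(); opt.step()
```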

This foundational work in forward rendering (transforming 3D into 2D) and inverse rendering (turning 2D into 3D) is complemented by years of research and technology innovation in simulating physical motion, including the work of Fidler’s Spatial Intelligence Lab. The lab today unveiled ViPE (Video Pose Engine), a 3D geometric annotation pipeline for videos developed in collaboration with the Dynamic Vision Lab and NVIDIA Isaac team that estimates camera motion and generates detailed depth maps based on footage from amateur recordings, dashcams or cinematic shots.
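
The full ViPE system is far more involved, but one of its building blocks — recovering camera motion by minimizing reprojection error against tracked image points — can be sketched as a toy example. Everything below (the fixed pinhole camera, known 3D points, and values) is an assumption for illustration, not the released pipeline.

```python
# Hedged sketch of camera-pose recovery via reprojection error.
# Real video pose engines also estimate rotation, intrinsics and dense depth.
import torch

f = 500.0                                   # assumed focal length in pixels
pts3d = torch.randn(100, 3) + torch.tensor([0.0, 0.0, 5.0])   # points in front of the camera
true_t = torch.tensor([0.3, -0.1, 0.2])     # ground-truth camera translation

def project(points, t):
    # Pinhole projection after translating into the camera frame.
    p = points + t
    return f * p[:, :2] / p[:, 2:3]

observed = project(pts3d, true_t)           # stand-in for tracked keypoints

t = torch.zeros(3, requires_grad=True)      # unknown camera translation
opt = torch.optim.Adam([t], lr=1e-2)
for _ in range(400):
    loss = ((project(pts3d, t) - observed) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
# t now approximates true_t; real pipelines solve this jointly across frames.
```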

In generative AI, Liu’s Deep Imagination Research group is among the NVIDIA Research teams pioneering computer vision, transformer and visual generative AI models that enable physical AI systems to understand and predict future states of the world — such as possible outcomes when a car runs a red light or a glass is too close to the edge of a table.

These initiatives laid the groundwork for NVIDIA Cosmos, a platform introduced earlier this year to accelerate physical AI development with world foundation models, post-training libraries, and an accelerated data processing and curation pipeline.

NVIDIA Research at SIGGRAPH

At SIGGRAPH, NVIDIA researchers are presenting advancements in simulation, AI-powered rendering and 3D content generation, with potential applications in virtual world creation, robotics development and autonomous vehicle training.

One paper addresses the challenge of reconstructing physics-aware 3D geometry from 2D images or video. While many models can estimate a 3D object based on video footage, the generated 3D shape often lacks structural stability. Even if it’s a close visual match to a real object, a generated shape may have slightly uneven proportions or missing details that impact its physical realism.

For example, a 3D simulation of a chair built from 2D footage might collapse when dropped into a physically accurate simulation because the AI model is making visual estimations of 3D structure, not ground-truth measurements. The method introduced in this paper helps ensure that the generated 3D shapes replicate real-world physics to avoid this pitfall — supporting the creation of virtual worlds for physical AI training.

Video caption: right, the rest state; left, the simulated geometry. Colors show the stress distribution in the simulated geometry; the jittering seen in the video is due to random perturbations applied at each iteration of the optimization process.
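
As a rough illustration of why physical plausibility matters — and not the method from the paper — the toy check below tests whether a reconstructed shape’s center of mass projects inside the convex hull of its ground contacts. Shapes that fail such a static-stability test tip over as soon as they enter a physics simulator.

```python
# Hypothetical static-stability check for a reconstructed mesh resting on z = 0.
# A crude proxy, for illustration only.
import numpy as np
from scipy.spatial import Delaunay

def statically_stable(vertices: np.ndarray, contact_eps: float = 1e-3) -> bool:
    """vertices: (N, 3) array of mesh vertices resting on the plane z = 0."""
    com_xy = vertices.mean(axis=0)[:2]        # crude center-of-mass proxy
    ground = vertices[vertices[:, 2] < vertices[:, 2].min() + contact_eps][:, :2]
    if len(ground) < 3:                        # too few contacts to form a support polygon
        return False
    hull = Delaunay(ground)                    # support polygon as a triangulation
    return hull.find_simplex(com_xy[None])[0] >= 0   # does the COM project inside it?

# A cube resting on its base passes; the same cube balanced on one corner would not.
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
print(statically_stable(cube))                 # True
```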

Another paper introduces a technique for bringing simulated characters to life with physically accurate motion. The researchers combined a motion generator with a physics-based tracking controller to generate realistic synthetic data for complex movements, such as the stunts of parkour practitioners.

This data can help develop virtual characters or train real-world humanoid robots with agile motor skills that are rarely captured in real-world training data, expanding the physical feats robots could accomplish to include tasks like traversing difficult terrain to support emergency response.
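
The paper’s controller isn’t reproduced here, but the coupling it describes can be sketched with the kind of tracking reward commonly used in physics-based character control (in the spirit of DeepMimic-style training): the simulated character is rewarded for matching the reference joints produced by the motion generator while keeping its root upright. The weights and scales below are assumptions.

```python
# Minimal pose-tracking reward sketch for a physics-based controller.
import numpy as np

def tracking_reward(sim_joints, ref_joints, sim_root_h, ref_root_h,
                    w_pose=0.7, w_root=0.3):
    """Exponentiated errors, so the reward stays in (0, 1]."""
    pose_err = np.sum((sim_joints - ref_joints) ** 2)   # joint-angle mismatch
    root_err = (sim_root_h - ref_root_h) ** 2           # root-height mismatch (falling is penalized)
    return w_pose * np.exp(-2.0 * pose_err) + w_root * np.exp(-10.0 * root_err)

# At every control step, the simulated character is scored against the
# generated reference motion; the controller is trained to maximize this.
print(tracking_reward(np.zeros(30), np.full(30, 0.05), 0.9, 0.95))
```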

Additional papers tackle the complexity of simulating light and materials.

One project showcases how artists can create AI assistants to enhance the detail of materials. It taps diffusion models and a differentiable physically based renderer to give creators an easy way to modify material texture maps on top of a 3D object representation, enabling them to create richer, more realistic virtual worlds using simple text prompts.

The team demonstrated how the model can be used to quickly add realistic object details, such as signs of weathering or aging, which are time-consuming to create using traditional rendering methods. These objects can populate virtual environments used for creative applications like gaming or for physical applications like training robots and autonomous vehicles in simulation.
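
As a hedged sketch of the optimization pattern — not the paper’s implementation — the loop below updates a texture map by gradient descent through a toy differentiable shading step, with a placeholder standing in for the text-conditioned diffusion guidance a real system would provide.

```python
# Toy text-driven texture refinement. Both the renderer and the "guidance"
# are stand-ins; a real system would backpropagate a diffusion-based loss
# through a physically based differentiable renderer.
import torch

texture = torch.rand(3, 256, 256, requires_grad=True)     # learnable albedo map

def toy_render(tex):
    # Stand-in for a differentiable PBR renderer: modulate the texture with a
    # fixed "lighting" gradient to produce an image.
    light = torch.linspace(0.3, 1.0, tex.shape[-1]).view(1, 1, -1)
    return tex * light

def toy_text_guidance(image, prompt: str):
    # Placeholder for diffusion-model guidance (e.g. score distillation) that
    # scores how well the render matches the prompt. Here it just pushes the
    # image toward a darker look so the sketch stays runnable.
    return ((image - 0.25) ** 2).mean()

opt = torch.optim.Adam([texture], lr=5e-3)
for _ in range(200):
    loss = toy_text_guidance(toy_render(texture), "weathered rusty metal")
    opt.zero_grad(); loss.backward(); opt.step()
```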

In the field of light simulation, another SIGGRAPH paper addresses a challenge in differentiable rendering, introducing a robust differentiable visibility query that enables faster, more accurate reconstruction of 3D geometry from images and videos.

Differentiable renderers reconstruct 3D scenes from images and videos. NVIDIA researchers are combining differentiable renderers with generative foundation models to create 3D content-creation AI assistants.

This paper is an example of NVIDIA Research that ties together forward rendering and inverse rendering, quickly extracting parameters from virtual worlds that are essential for the accurate training of physical AI models on synthetic datasets.
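
The difficulty the visibility work tackles can be illustrated with a deliberately simplified example, not the paper’s query: hard pixel coverage is a step function of geometry, so its gradient is zero almost everywhere, while a smooth silhouette falloff restores gradients that inverse rendering can follow. Here a 1D “edge position” stands in for real triangle visibility.

```python
# Why hard visibility breaks gradients, and how a soft relaxation helps.
import torch

pixel_x = torch.linspace(0.0, 1.0, 64)

def hard_coverage(edge):
    return (pixel_x < edge).float()                       # 1 where the surface covers the pixel

def soft_coverage(edge, sharpness=80.0):
    return torch.sigmoid(sharpness * (edge - pixel_x))    # smooth silhouette falloff

target = hard_coverage(torch.tensor(0.7))                 # "observed" image

edge = torch.tensor(0.4, requires_grad=True)              # initial geometry guess
opt = torch.optim.Adam([edge], lr=1e-2)
for _ in range(300):
    loss = ((soft_coverage(edge) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
# edge converges toward 0.7; with hard_coverage the gradient would be zero.
```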

Watch the special address by Fidler, Lefohn and Liu at SIGGRAPH and learn more about how graphics and simulation innovations come together to drive industrial digitalization by joining NVIDIA at the conference, running through Thursday, Aug. 14.
