They say there’s no such thing as a dumb question. As someone who asks dumb questions for a living, I can tell you that’s a really stupid thing to say.
But the best questions are often the ones where someone smart explains something from the ground up to a total novice (read: me). The beauty of NVIDIA: there are a lot of smart people upon whom I can inflict my very dumbest questions.
Turns out I’m not alone. Month in and month out, tens of thousands of readers ask search engines these very questions. And they get connected to the answers through our blog.
What are they? Smart question. Here are five of our most popular in 2019.
This post is over a decade old, but the answer, thanks to the emergence of deep learning-driven AI, supercomputing and self-driving cars, is more relevant than ever. That’s why we updated our original post earlier this year, and why more readers are seeing it than ever.
Visualize the fields of AI, machine learning and deep learning as concentric circles. AI — the idea that came first — is the largest circle. Then comes machine learning, which blossomed later. And finally deep learning — which is driving today’s AI explosion — fitting inside both. Click on the link, above, for more.
This is one of the key questions in AI right now, which is why this post has become one of our most popular. Click on the link for a plain English answer to these questions, and a walk through the kinds of datasets and problems that lend themselves to each kind of learning.
This is another question that’s drawn more readers over time. Training, in short, is the process of running data through a neural network to teach it a task. That’s taught computers to do things that, just a decade ago, most believed could only be done by humans. Inference, by contrast, is the process of putting that trained network to work, in everything from hyperscale data centers to autonomous machines.
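The split between the two phases can be shown with a toy sketch. Here a one-weight linear model stands in for a neural network (an assumption made purely for brevity), and the function names `train` and `infer` are illustrative, not taken from any NVIDIA library.

```python
def train(samples, epochs=500, lr=0.01):
    """Training: run data through the model repeatedly, nudging its parameters."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            error = (w * x + b) - y   # how wrong the current guess is
            w -= lr * error * x       # adjust the weight to shrink the error
            b -= lr * error           # adjust the bias the same way
    return w, b


def infer(params, x):
    """Inference: apply the already-trained parameters to a new input."""
    w, b = params
    return w * x + b


# Toy task (hypothetical): learn y = 2x + 1 from seven examples.
data = [(float(x), 2.0 * x + 1.0) for x in range(-3, 4)]
params = train(data)          # the expensive part, done once
prediction = infer(params, 10.0)  # the cheap part, done many times
```

Training loops over the data many times, while inference is a single pass through the finished model. That difference is why the two phases tend to run on different kinds of systems.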
Used to be if you wanted to see ray tracing, you went to the movies. If you wanted to see rasterization, you fired up a video game. Ray tracing beautifully models the way light moves around the real world, but it’s computationally intensive. Rasterization, by contrast, can be done in a hurry. NVIDIA’s latest Turing architecture GPUs blur these lines, with hardware acceleration for real-time ray tracing, making truly cinematic games possible.
Have a question you want answered? Send us your idea.
NVIDIA Research Advances Robotics From Simulation to the Real World
Featured at the International Conference on Robotics and Automation, eight new NVIDIA Research papers show how robots trained in simulation are moving into the real world.
Robotics is entering a new phase: moving from controlled demos and scripted automation toward generalizable, reliable embodied autonomy in the real world.
At the International Conference on Robotics and Automation (ICRA), eight of NVIDIA Research’s 28 accepted papers show how simulation-to-real transfer is becoming a foundation for that shift, helping robots perceive, reason, plan and act across dynamic, unpredictable environments.
Together, the papers span the full stack of challenges robot developers face: coordinating multiple arms in parallel, building policies that generalize across robot bodies, grasping novel objects in clutter, performing precise assembly and developing vision-language-action models that reason before they move.
The throughline is clear: sim-to-real is becoming a foundation for robots that can adapt, generalize, and operate with greater reliability outside the lab.
Picture a pharmaceutical lab run by robotic arms: picking up tubes, transferring liquids, mixing reagents — each step taking different amounts of time, all requiring careful coordination.
Traditional robot scheduling software handles those steps sequentially, one arm at a time.
ScheduleStream changes that by running computations on GPUs, letting multiple arms plan movements and operate in parallel. The result — a 3x speedup across multi-arm planning scenarios, on hardware like the NVIDIA Jetson edge AI platform. Code for the framework is available on GitHub.
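A toy calculation shows why running steps in parallel shortens the total time. This is only a sketch under strong assumptions: the step durations are made up, the steps are treated as independent, and a simple greedy rule stands in for ScheduleStream's GPU-based planner. It is not a reproduction of the reported speedup.

```python
import heapq


def sequential_makespan(step_durations):
    """One arm at a time: total time is the sum of every step."""
    return sum(step_durations)


def parallel_makespan(step_durations, num_arms):
    """Greedy sketch: hand the longest remaining step to whichever arm frees up first."""
    free_at = [0.0] * num_arms          # time at which each arm becomes idle
    heapq.heapify(free_at)
    for duration in sorted(step_durations, reverse=True):
        start = heapq.heappop(free_at)  # earliest available arm
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# Hypothetical lab steps, in seconds.
step_durations = [4, 3, 3, 2, 2, 1]
one_arm = sequential_makespan(step_durations)
two_arms = parallel_makespan(step_durations, num_arms=2)
```

In this made-up case, two arms finish in 8 seconds instead of 15. A real scheduler also has to respect step ordering and keep the arms from colliding, which is where the heavy computation comes in.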
A robot that learns to navigate through a space — avoiding obstacles and finding its destination — usually learns to do it in one body. Put the same navigation software into a differently shaped robot and it often falls apart, because its parts all move differently.
The COMPASS policy framework solves this by first building the baseline navigation functionality using imitation learning and then using residual reinforcement learning in NVIDIA Isaac Lab to build specialists for diverse robot embodiments. Crucially, no real-world robot data is involved at any stage: everything is trained in Isaac Lab simulation.
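The base-plus-residual idea can be sketched in a few lines. Everything here is hypothetical: the proportional base policy, the embodiment names and the hand-written residual gains are placeholders for what COMPASS learns with imitation learning and reinforcement learning.

```python
def base_policy(goal_offset):
    """Shared navigation behavior: head toward the goal at half the offset."""
    dx, dy = goal_offset
    return (0.5 * dx, 0.5 * dy)


# Hypothetical per-embodiment corrections. In a real system these would be
# learned with reinforcement learning, not written by hand as constants.
RESIDUAL_GAINS = {
    "wheeled_robot": (0.0, 0.0),
    "humanoid": (-0.2, 0.1),
}


def specialist_action(embodiment, goal_offset):
    """Final command = shared base action + small embodiment-specific residual."""
    base = base_policy(goal_offset)
    gain_x, gain_y = RESIDUAL_GAINS[embodiment]
    residual = (gain_x * goal_offset[0], gain_y * goal_offset[1])
    return (base[0] + residual[0], base[1] + residual[1])


wheeled_cmd = specialist_action("wheeled_robot", (2.0, 4.0))
humanoid_cmd = specialist_action("humanoid", (2.0, 4.0))
```

The design choice this illustrates is that each specialist only has to learn a small correction on top of a shared behavior, rather than relearning navigation from scratch for every body.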
Compared with an imitation learning baseline, COMPASS achieved a 4.5x improvement in average success rate. It also seamlessly transfers to real-world environments, demonstrating around 80% success across 20 real-world navigation trials on autonomous mobile robots and humanoids.
COMPASS is agent-friendly, with dedicated skills — and developers can connect the pipeline with NVIDIA Omniverse NuRec to post-train and validate robots in a digital twin of a novel environment before deployment.
Most grasping systems identify the object, predict a grasp, plan a path, then execute. But the last few centimeters are where small errors matter most.
Grasp-MPC adaptively computes robotic grasps, continuously correcting the robot’s motion as it closes in on the object, rather than carrying out a fixed plan — the way a person grabs something by feeling rather than calculating every joint angle in advance.
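The contrast between a fixed plan and continuous correction can be sketched in one dimension. This is a simplification under stated assumptions: a plain proportional correction stands in for Grasp-MPC's learned controller, and the scene (an object bumped 5 cm mid-reach) is invented for illustration.

```python
def open_loop(start, planned_target, steps):
    """Fixed plan: commit to the target seen before moving and never look again."""
    pos = start
    step = (planned_target - start) / steps
    for _ in range(steps):
        pos += step
    return pos


def closed_loop(start, observe_target, steps, gain=0.5):
    """Continuous correction: re-observe the target every step and move toward it."""
    pos = start
    for t in range(steps):
        pos += gain * (observe_target(t) - pos)
    return pos


def observe_target(t):
    # Hypothetical scene: the object is bumped 5 cm sideways at step 5.
    return 1.00 if t < 5 else 1.05


fixed_plan_end = open_loop(0.0, observe_target(0), steps=20)
corrected_end = closed_loop(0.0, observe_target, steps=20)
final_target = observe_target(19)
```

The fixed plan ends where the object used to be, about 5 cm off, while the corrected motion ends on the object's new position.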
To build the policy, the researchers generated 2 million simulated trajectories across 8,000 objects using annotations from the GraspGen dataset and motion planning data from cuRobo, a CUDA-accelerated library for robot motion generation.
After training on both successful and failed trajectories, Grasp-MPC learned to grasp novel objects in cluttered tabletops and shelves — achieving around 75% overall success on real robots, compared with a baseline of 41%.
Deformable Cluster Manipulation introduces a framework that tackles a parallel challenge: enabling systems to grasp not just one object, but a whole bundle of flexible, tangled material at once.
The framework was motivated by a real-world task: clearing a mass of tree branches that have grown over a power line, where there’s no single clean object to grab. The system uses its entire arm, not just the gripper: wrapping it around the branch cluster and sweeping it aside, the way someone might gather an armful of cables or push a tangle of brush out of the way.
The researchers built a tree generator using biological growth equations to create synthetic trees of many different shapes and sizes — then trained the system across thousands of them in NVIDIA Isaac open simulation frameworks.
The policy deploys to real branches zero shot. Beyond power lines, the researchers see potential in cable management, agricultural inspection and anywhere robots need to handle a tangle rather than a single graspable item.
Clearing tree branches in zero-shot sim-to-real deployment.
Assembling With Precision
Precise assembly — threading a nut onto a bolt, inserting a gear onto a gearshaft, pressing a peg into a hole — is notoriously hard to get right with simulation alone.
The real world is complex. Real surfaces aren’t perfectly smooth. Sensors don’t behave as specified. Tiny discrepancies that a simulator ignores can stop a robot in its tracks.
The SPARR method addresses this by splitting the job in two. A policy trained in Isaac Lab learns the general strategy for the assembly task in simulation. Then, on the actual hardware, a second layer learns to correct for whatever the simulator got wrong — using the robot’s own camera and without any human demonstrations or guidance.
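The two-part split can be sketched with a toy insertion task. All names and numbers are hypothetical: the 3 mm hardware bias is invented, and a simple running update stands in for whatever learning method SPARR actually uses on the real robot.

```python
def sim_policy(hole_position):
    """Strategy learned in simulation: aim straight at the nominal hole position."""
    return hole_position


def learn_correction(execute_on_hardware, hole_position, trials=20, rate=0.5):
    """On hardware: observe where the tool ends up and learn an offset that cancels the miss."""
    correction = 0.0
    for _ in range(trials):
        command = sim_policy(hole_position) + correction
        reached = execute_on_hardware(command)   # e.g. as seen by the robot's camera
        correction += rate * (hole_position - reached)
    return correction


def hardware(command):
    # Hypothetical real robot with a 3 mm bias the simulator never modeled.
    return command + 0.003


correction = learn_correction(hardware, hole_position=0.250)
final_reach = hardware(sim_policy(0.250) + correction)
```

The simulation-trained strategy is left untouched. Only the small correction is learned on the hardware, and no human demonstration is needed because the robot's own observations supply the error signal.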
SPARR improves success rates by 38% and reduces cycle time by around 30% compared with zero-shot sim-to-real baselines.
On National Institute of Standards and Technology (NIST) assembly tasks not seen during training, success improves by nearly 75% — approaching the results of methods that require a human in the loop.
The Refinery framework takes on the next layer of difficulty in assembly: tasks with multiple sequential steps, where how step one is finished determines whether step two is even possible. It’s like assembling furniture — leave a panel at the wrong angle, and the next fastener won’t go in.
By understanding how success varies across initial conditions and training across hundreds of simulated assembly scenarios, Refinery learns how to complete each step and leave each component in a position that sets up the next. It achieves 91% simulation success and a nearly 11% mean improvement over baselines with comparable real-world results — and its policies can be chained to handle long, multi-part sequences.
Action Models That Keep Their Word
The PEEK pipeline helps robots see past the clutter. In a typical manipulation task, the robot’s camera picks up everything in the scene — but most of it is irrelevant noise.
One task demonstrated on the PEEK project page is “give the banana to NVIDIA founder and CEO Jensen Huang”: a photo of Huang sits on a table alongside a photo of Michael Jordan, a collection of unrelated objects and other distractors.
A human doing the task instantly focuses on the banana and the right photo; a standard robot policy has to process everything and often gets confused. PEEK solves this by having a vision language model read the task instruction and focus the robot’s line of vision accordingly — showing a movement path, and highlighting around the objects that matter, while fading out everything else.
The policy then acts on that annotated view rather than the raw scene. For a policy trained purely in simulation, adding PEEK produced a 41x real-world improvement in accuracy. For large VLA models and smaller policies, gains range from 2x to 3.5x. Because it works at the image level, PEEK integrates with any camera-based policy without modification.
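The image-level step can be sketched as a simple fade. This is a minimal illustration under assumptions: the image is a small grid of brightness values, the boxes are typed in by hand rather than produced by a vision language model, and the drawn movement path is left out. The function name `fade_outside` is hypothetical.

```python
def fade_outside(image, keep_boxes, fade=0.25):
    """Return a copy of the image with every pixel outside the boxes dimmed.

    Boxes are (top, left, bottom, right) with bottom and right exclusive.
    """
    annotated = []
    for r, row in enumerate(image):
        new_row = []
        for c, value in enumerate(row):
            inside = any(
                top <= r < bottom and left <= c < right
                for top, left, bottom, right in keep_boxes
            )
            new_row.append(value if inside else value * fade)
        annotated.append(new_row)
    return annotated


# Hypothetical 3x4 camera frame; in practice the box would come from the VLM.
image = [[100.0] * 4 for _ in range(3)]
annotated = fade_outside(image, keep_boxes=[(0, 1, 2, 3)])
```

Because the output is still an ordinary image, a downstream policy can consume it the same way it would consume the raw camera frame.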
Do What You Say — a collaboration with researchers at Carnegie Mellon University, University of Utah and University of Sydney — addresses a specific failure mode that matters more as robots tackle longer, more complex tasks.
Give a robot an instruction like “store everything on this table inside the cabinet” or “prepare a Manhattan,” and it has to break that down into individual steps and execute them in sequence.
The problem is that the AI model can correctly reason through what it needs to do — and then execute something different.
The method, called SEAL, fixes this at runtime without any retraining: the robot generates several candidate action sequences, thinks through where each one would actually lead and picks the outcome that matches what it said it would do. SEAL delivers up to 15% accuracy gains over prior work, with robustness against rephrased instructions, changed objects, scene clutter and shifted camera angles.
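The generate, predict and pick loop can be sketched with a toy example. Everything here is an assumption for illustration: the dictionary-based scene, the hand-written candidates and the counting-based score are stand-ins, not SEAL's actual models.

```python
def predict_outcome(start_state, actions):
    """Toy forward model: each action moves one item to a named location."""
    state = dict(start_state)
    for item, place in actions:
        state[item] = place
    return state


def match_score(outcome, stated_goal):
    """Count how many items end up where the robot said they would."""
    return sum(outcome.get(item) == place for item, place in stated_goal.items())


def select_plan(start_state, candidates, stated_goal):
    """Pick the candidate whose predicted outcome best matches the stated plan."""
    return max(
        candidates,
        key=lambda actions: match_score(predict_outcome(start_state, actions), stated_goal),
    )


# Hypothetical scene for "store everything on this table inside the cabinet".
start_state = {"cup": "table", "plate": "table"}
stated_goal = {"cup": "cabinet", "plate": "cabinet"}
candidates = [
    [("cup", "cabinet")],                        # forgets the plate
    [("cup", "cabinet"), ("plate", "cabinet")],  # does what was said
    [("cup", "sink"), ("plate", "cabinet")],     # drifts from the plan
]
chosen = select_plan(start_state, candidates, stated_goal)
```

The selection happens entirely at runtime, which is why an approach like this needs no retraining of the underlying model.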
In addition to papers, NVIDIA is expanding robotics research infrastructure with large-scale open datasets for robotics. The NVIDIA Physical AI Dataset is the world’s largest open dataset for physical AI development, surpassing 15 million downloads, while NVIDIA Isaac GR00T X Embodiment Sim has become one of the most-downloaded robotics datasets.
Universities Accelerate Physical AI Research With NVIDIA Technologies
Robotics teams from universities such as Carnegie Mellon University (CMU), ETH Zurich, MIT and University of Texas at Austin are tapping NVIDIA technologies to move physical AI research from simulation to real-world systems — with nearly 50 accepted papers referencing NVIDIA-accelerated simulation, robot learning and compute.
The next universal technology since the smartphone is on the horizon — and it may be a little less pocket friendly.
The Moonshot research program, funded by the Japan Science and Technology Agency and accelerated by NVIDIA AI and robotics technologies, is working to create a world by 2050 where AI-powered, autonomously learning robots are integrated into Japanese citizens’ everyday lives.
That’s just goal No. 3 of the broader Moonshot initiative, which includes researchers from across Japan’s universities and comprises 10 ambitious technology goals — from ultra-early disease prediction to sustainable resource circulation.
In light of Japan’s rising elderly population, many of the research projects underway center on how robots can aid in senior care. This includes designing a robot that’s capable of caregiving tasks like cooking, cleaning and hygiene care.
NVIDIA Architecture Powers On Moonshot Robots
NVIDIA technologies are integrated into every level of the Moonshot project’s senior care robots known as AI-Driven Robot for Embrace and Care, or AIREC.
The Dry-AIREC robot, the larger and more mobile member of the Moonshot family, has two NVIDIA GPUs onboard. For AIREC-Basic, primarily used for data collection for the motion foundation model, three NVIDIA Jetson Orin NX modules power AI processing at the edge.
Pictured is AIREC-Basic (left) and AIREC-Basic (right).
Plus, NVIDIA Isaac Sim, an open-source robotic simulation framework, was used to train the AIREC robots to perform specific tasks, such as estimating the forces between objects.
The integration of NVIDIA technologies and AI into the robot development process has allowed this project to go from a far-fetched dream to reality faster than imagined.
“Five years ago, before generative AI, few people believed that this application was possible,” said Tetsuya Ogata, professor and director of the Institute for AI and Robotics at Waseda University. “Now, the atmosphere surrounding this technology has changed, so we can seriously think about this kind of application.”
Building a Full Set of Caregiving Capabilities
Additional research projects are underway to develop the Moonshot robot’s elderly-care capabilities.
“We’re focusing on things like changing diapers, helping patients take baths and providing meal assistance, so those actions can be supported by the robots, and caregivers can focus on improving the patients’ lives,” said Misa Matsumura, a bioengineering master’s student at the University of Tokyo.
A recent paper by Matsumura — presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems — focused on repositioning, an essential action in elderly care to prevent bed sores and enable diaper changing.
Automating repositioning with a humanoid robot — while considering the elderly care patients’ personal states and bodily needs — is no easy feat.
To train the Dry-AIREC robots for this research endeavor, Matsumura’s team used laptops powered by NVIDIA RTX GPUs.
Matsumura used 3D posture estimation, trajectory calculations and force estimation to further develop the robots’ capabilities.
Dry-AIREC’s fisheye and depth cameras helped assess the movements required to reposition patients. The exact repositioning method needed for a patient is found through trajectory calculations based on movement data from skilled caregivers.
The robot must also use the right amount of force in repositioning to complete the action without causing the patient pain. By predicting the pressure required to press the shoulders and knees, it determines the appropriate timing for movement — enabling actions with the ideal applied force.
Preliminary experiments were done using mannequins, and Matsumura’s research has now advanced to testing the robots with human participants. Matsumura is conducting ongoing research to further improve this action for Dry-AIREC.
Milestone images for goal No. 3 for the Japan Science and Technology Agency.
Among the many projects within the Moonshot program, developing robots for elderly care has particular significance for some of the researchers due to the project’s social and personal implications.
“Although my study focus is on medical robotics, I decided to join this project because my mother is growing older, and that experience has given me an appreciation for the importance of personal care,” said Etsuko Kobayashi, professor of bioengineering at the University of Tokyo and Matsumura’s graduate advisor. “I found that my experience in medical robotics can be meaningfully extended to care robotics, contributing to the development of safe and reliable robotic systems for human-centered applications.”
Marine Biological Laboratory Explores Human Memory With AI and Virtual Reality
The lab in Massachusetts is studying molecular mechanisms of human memory function powered by NVIDIA RTX GPUs, HP Z Workstations and virtual-reality technology.
The works of Plato state that when humans have an experience, some level of change occurs in their brain, which is powered by memory — specifically long-term memory.
This change is what Andre Fenton, professor of neural science at New York University, and Abhishek Kumar, assistant professor of cell and regenerative biology at the University of Wisconsin–Madison, are studying at the Marine Biological Laboratory (MBL) in Woods Hole, Massachusetts.
“My life’s work is to understand how minds operate, and especially to understand memory — not merely as a trace of the past in the brain but as an estimate of the future that the brain is afforded,” Fenton said.
The researchers have upleveled their project by harnessing NVIDIA RTX GPUs and HP Z Workstations to visualize massive datasets and by integrating custom AI tools and syGlass, a virtual-reality (VR) platform for scientific exploration.
This project is additionally supported by grants from the National Institute of Mental Health and the Chan Zuckerberg Initiative.
A Neural Forest Uncovered
Memory is the job of the brain’s hippocampus. This C-shaped structure, resembling a seahorse, is the main focus of the MBL research group.
Fenton describes the cells within the hippocampus as a forest, where billions of neurons look like tiny tree trunks and the lines coming off the trunks look like leaves.
Projection image of neuronal cell nuclei (left) and dendrites (right) or branched extensions of a nerve cell. Images were acquired by Matthew Parent and Daryl Watkins.
The team is studying a small portion of these “leaves,” which represent protein markers. It’s an incredibly tedious task because each one is only about a micrometer long. A researcher must search through the forest of brain cells to find the correct protein markers, which make up only about 1% of all protein markers in the hippocampus.
The researchers were looking to ease the process of studying these proteins and what their varying structures may reveal about memory encoding.
Collecting and analyzing enough 3D volumetric data on protein markers was a bottleneck within the project until NVIDIA and HP technologies were introduced into the workflow.
“This is a massive computational challenge, and the HP and NVIDIA technologies have enabled us to do the first step: capture, check and store the 3D image data,” Fenton said.
Using these technologies, the MBL researchers captured 10 terabytes of volumetric data and then performed human visual-quality inspections.
Understanding Memory Could Prevent Neurological Diseases
The team’s ultimate goal of discovering the function of memory at a molecular level can boost research into the root causes of brain diseases tied to neurocognition, such as Alzheimer’s and dementia.
“People don’t normally think of memory as part of their mental health, but almost all mental dysfunction depends on what your brain stores — the beliefs, the anticipations, the anxieties that you have and the things that you expect,” said Fenton. “These are all different aspects of what happens when you have a memory, so almost all neuropsychiatric illnesses and manipulation depend on this understanding.”
As a step toward solving these large-scale problems, the researchers are looking at how memory is affected when proteins go to incorrect locations in the hippocampus.
The team is also examining the correlation between the structure and function of brain cells through high-resolution 3D images curated and stored using syGlass on an HP Z high-performance workstation powered by multiple NVIDIA RTX GPUs.
“If we can understand how something is built, then if there’s a problem, we can dissect that and get to the bottom of it,” said Kumar. “That’s what we’re trying to do: understand how we retain memory, so if a problem arises, we know how to fix it.”
Enabling Virtual Reality and Student Exploration
The use of syGlass on the HP Z6 desktop workstation, running on NVIDIA RTX GPUs, turned the researchers’ endeavor from a time-consuming operation into an interactive scientific exploration — ideal for high-school-student engagement.
“The HP-NVIDIA-syGlass system lets us innovate by engaging three high-school interns,” said Kumar. “They had an abstract interest in our science, and we recognized that the syGlass virtual experience might enthrall them. We were right.”
The researchers brought these three curious students into their lab this summer to analyze the memory proteins using the VR headsets, which allowed for 3D visuals of the data.
Their task was to find the specific proteins that were memory-related and label them as such. While this may sound like a simple task, the interns had to sift through a sea of billions of neurons to find only a few thousand protein markers that were relevant to the research.
High school intern using the syGlass VR headset to identify protein markers. Image taken by Andre Fenton.
Due to the success of this pilot program, the team is now looking to expand high-school research opportunities for the project.
“Why leave it at three students?” said Fenton. “Next year, it could be 10 at multiple locations helping us learn about brains while they learn about brains.”