From Simulation to Data Generation: NVIDIA Research to Be Featured at the Conference on Robot Learning

Simulation-to-real findings feature among 15 papers to be presented by NVIDIA robotics researchers at CoRL in Atlanta.
by Jason Black

With the field of robotics burgeoning, there’s no better place to be than Atlanta next month, when the NVIDIA Robotics Research Lab will present its latest findings at the Conference on Robot Learning, an annual event focused on the intersection of robotics and machine learning.

CoRL, taking place Nov. 6-9, will include 15 papers in all from NVIDIA, many of which highlight simulation-to-real findings that enable imitation learning, perception, segmentation and tracking in robotics.

For contact-rich manipulation tasks using imitation learning, NVIDIA researchers have developed novel methods to scale data generation in simulation. They’ve also developed a human-in-the-loop imitation learning method to accelerate data collection and improve accuracy.

NVIDIA researchers also applied large language models to tasks as diverse as traffic simulation and predicting grasping poses from language-guided task descriptions.

Some of the key papers featured at CoRL include:

  • “MimicGen: A Data Generation System for Scalable Robot Learning Using Human Demonstrations”: MimicGen automatically generates large-scale robot training datasets by adapting a small number of human demonstrations to new contexts. From around 200 human demonstrations, it created over 50,000 diverse task scenarios, enabling effective training of robot agents for high-precision and long-horizon tasks.
  • “M2T2: Multi-Task Masked Transformer for Object-Centric Pick-and-Place”: This paper covers how a transformer model can pick up and place arbitrary objects in cluttered scenes with zero-shot simulation-to-real transfer, outperforming task-specific systems by up to 37.5% in challenging scenarios and bridging the gap between high- and low-level robotic tasks. For more information, check out this website.
  • “STOW: Discrete-Frame Segmentation and Tracking of Unseen Objects for Warehouse Picking Robots”: In dynamic industrial robotic contexts, segmenting and tracking unseen object instances with substantial temporal frame gaps is challenging. The method featured in this paper introduces synthetic and real-world datasets and a novel approach for joint segmentation and tracking using a transformer module. For more information, check out this website.
  • “Composable Part-Based Manipulation”: This paper covers composable part-based manipulation (CPM), a novel approach that utilizes object-part decomposition and part-part correspondences to enhance robotic manipulation skill learning and generalization, demonstrating its effectiveness in simulated and real-world scenarios. For more information, check out the website.
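The core idea behind demonstration-adaptation data generation, as in MimicGen, can be illustrated with a short sketch: record a demonstrated end-effector path relative to an object's pose, then replay it against many randomly sampled object poses to synthesize new training episodes. The code below is a simplified, hypothetical 2D illustration of that concept only — the function names and setup are invented here and are not NVIDIA's implementation.

```python
# Hypothetical, simplified sketch of demonstration-adaptation data
# generation (NOT the actual MimicGen code): a demonstrated 2D
# end-effector path, expressed relative to an object's pose, is
# replayed against new object poses to produce new episodes.

import math
import random

def to_object_frame(waypoints, obj_pose):
    """Express world-frame (x, y) waypoints relative to an object pose (x, y, theta)."""
    ox, oy, ot = obj_pose
    cos_t, sin_t = math.cos(-ot), math.sin(-ot)
    return [((x - ox) * cos_t - (y - oy) * sin_t,
             (x - ox) * sin_t + (y - oy) * cos_t) for x, y in waypoints]

def to_world_frame(rel_waypoints, obj_pose):
    """Map object-relative waypoints back into the world frame at a new object pose."""
    ox, oy, ot = obj_pose
    cos_t, sin_t = math.cos(ot), math.sin(ot)
    return [(ox + x * cos_t - y * sin_t,
             oy + x * sin_t + y * cos_t) for x, y in rel_waypoints]

def generate_dataset(demo_waypoints, demo_obj_pose, n_episodes, rng):
    """Adapt one demonstration to n_episodes randomly sampled object poses."""
    rel = to_object_frame(demo_waypoints, demo_obj_pose)
    episodes = []
    for _ in range(n_episodes):
        new_pose = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0),
                    rng.uniform(-math.pi, math.pi))
        episodes.append((new_pose, to_world_frame(rel, new_pose)))
    return episodes
```

In this toy setting, a single demonstration that ends at the object's position yields arbitrarily many episodes whose trajectories end at each newly sampled object position — the same leverage, in miniature, that lets a few hundred demonstrations seed tens of thousands of task scenarios.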

NVIDIA researchers have both the second- and third-most cited paper from CoRL 2021 and the most cited paper from CoRL 2022.

One of the papers to be presented as an oral presentation at this year’s conference is a follow-up project to those prior findings, “RVT: Robotic View Transformer for 3D Object Manipulation.” Additional details can be seen in this summary presentation:

Another notable paper written by NVIDIA Robotics Research Lab members this year is “Language-Guided Traffic Simulation via Scene-Level Diffusion.”

For more information, follow CoRL.

Learn more about the NVIDIA Robotics Research Lab, a Seattle-based center of excellence focused on robot manipulation, perception and physics-based simulation. It’s part of NVIDIA Research, which comprises more than 300 leading researchers around the globe, focused on topics spanning AI, computer graphics, computer vision and self-driving cars.