With the field of robotics burgeoning, there’s no better place to be than Atlanta next month, when the NVIDIA Robotics Research Lab will present its latest findings at the Conference on Robot Learning, an annual event focused on the intersection of robotics and machine learning.
CoRL, taking place Nov. 6-9, will include 15 papers in all from NVIDIA, many of which highlight simulation-to-real findings that enable imitation learning, perception, segmentation and tracking in robotics.
For contact-rich manipulation tasks using imitation learning, NVIDIA researchers have developed novel methods to scale data generation in simulation. They’ve also developed a human-in-the-loop imitation learning method to accelerate data collection and improve accuracy.
NVIDIA researchers also applied large language models to tasks as diverse as traffic simulation and predicting grasping poses from language-guided task descriptions.
Some of the key papers featured at CoRL include:
- “Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-Modal Rearrangement”: This paper covers a versatile system for repositioning objects in scenes trained on 3D point cloud data. The approach excels in various rearrangement tasks, demonstrating its adaptability and precision in both simulation and the real world.
- “Imitating Task and Motion Planning With Visuomotor Transformers”: This paper covers a novel system that uses large-scale datasets generated by task and motion planning (TAMP) supervisors and flexible transformer models to excel in diverse robot manipulation tasks.
- “Human-in-the-Loop Task and Motion Planning for Imitation Learning”: This paper covers the employment of a TAMP-gated control mechanism to optimize data collection efficiency, demonstrating superior results compared to conventional teleoperation systems.
- “MimicGen: A Data Generation System for Scalable Robot Learning Using Human Demonstrations”: MimicGen is a data generation system that automatically generates large-scale robot training datasets by adapting a small number of human demonstrations to various contexts. Using around 200 human demonstrations, it created over 50,000 diverse task scenarios, enabling effective training of robot agents for high-precision and long-horizon tasks.
- “M2T2: Multi-Task Masked Transformer for Object-Centric Pick-and-Place”: This paper covers how a transformer model can pick up and place arbitrary objects in cluttered scenes with zero-shot sim2real transfer, outperforming task-specific systems by up to 37.5% in challenging scenarios and bridging the gap between high- and low-level robotic tasks.
- “STOW: Discrete-Frame Segmentation and Tracking of Unseen Objects for Warehouse Picking Robots”: In dynamic industrial robotic contexts, segmenting and tracking unseen object instances with substantial temporal frame gaps is challenging. The method featured in this paper introduces synthetic and real-world datasets and a novel approach for joint segmentation and tracking using a transformer module.
- “Composable Part-Based Manipulation”: This paper covers composable part-based manipulation (CPM), a novel approach that utilizes object-part decomposition and part-part correspondences to enhance robotic manipulation skill learning and generalization, demonstrating its effectiveness in simulated and real-world scenarios.
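The demonstration-adaptation idea behind systems like MimicGen can be illustrated with a small sketch: a recorded end-effector trajectory, expressed relative to an object's pose in the demonstration, can be mapped into the frame of that object's pose in a new scene, yielding a fresh training trajectory without additional human effort. MimicGen's actual pipeline is more involved (it segments demonstrations into object-centric subtasks, among other steps), and the function names below are hypothetical, not part of any released API.

```python
import numpy as np

def pose_matrix(xyz, yaw):
    """Build a 4x4 homogeneous pose from a position and a yaw angle
    (simplified to rotation about the z-axis for illustration)."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0],
                          [s,  c, 0.0],
                          [0.0, 0.0, 1.0]])
    T[:3, 3] = xyz
    return T

def adapt_trajectory(demo_poses, demo_obj_pose, new_obj_pose):
    """Map each end-effector pose from a demonstration into a new scene:
    the relative transform between the object's new and old poses is
    applied to every waypoint, preserving motion relative to the object."""
    rel = new_obj_pose @ np.linalg.inv(demo_obj_pose)
    return [rel @ T for T in demo_poses]
```

A pose that coincided with the object in the demonstration maps exactly onto the object's new pose, so contact-relative motions carry over to the new context.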
One paper selected for an oral presentation at this year’s conference, “RVT: Robotic View Transformer for 3D Object Manipulation,” follows up on those prior findings.
Another notable paper written by NVIDIA Robotics Research Lab members this year is “Language-Guided Traffic Simulation via Scene-Level Diffusion.”
For more information, follow CoRL.
Learn more about the NVIDIA Robotics Research Lab, a Seattle-based center of excellence focused on robot manipulation, perception and physics-based simulation. It’s part of NVIDIA Research, which comprises more than 300 leading researchers around the globe, focused on topics spanning AI, computer graphics, computer vision and self-driving cars.