Pickup Artist: GPU-Powered Robot Wins Amazon Warehouse Challenge

by Barrett Williams

Powerhouse teams from the U.S., Germany, Japan and elsewhere flew in to train, compete and prove themselves the best in the world.

They’re not competing in the quadrennial games in Rio de Janeiro, but in the Amazon Picking Challenge, held this year at RoboCup 2016 in Leipzig, Germany. Now in its second year, the event challenges autonomous robots to figure out how to pick and stow objects.

The competition marked a bellwether moment for GPU-powered deep learning in warehouse and factory automation: the technology played an essential role for five of the top 10 teams, including the top two finishers.

Deep learning “neural nets played a big part in many teams’ approaches to object recognition this year, a significant change from last year,” said Joey Durham, manager of Research and Advanced Development at Amazon Robotics.

The Future of Warehouse Automation

Amazon has used logistics robots in its vast warehouses across the globe to move around shelving units and minimize the hunt for items it will deliver to its millions of customers. But identifying and retrieving objects for packing, shipment and stowing still requires a great deal of work by hand.

To automate these tasks and to build links between the industrial and academic robotics communities, Amazon created the Amazon Picking Challenge. Participating teams are challenged to deploy robots that can autonomously recognize objects, then pick and stow the desired targets from a range of unsorted items.

Netherlands’ TU Delft Takes First Place for Stowing and Picking

The winning team for the separate picking and stowing challenges was Delft University of Technology (TU Delft) in the Netherlands, working in collaboration with Delft Robotics, an affiliated startup. The team was able to detect objects in a mere 150 milliseconds.

TU Delft used an NVIDIA TITAN X GPU and about 20,000 images to train a “base” model, from which they built models for both a “bin” and a “tote” arrangement. They trained their deep learning system using PyFaster-RCNN, which uses the Caffe framework and is hardware-accelerated by NVIDIA cuDNN.

“After these results, we at Delft Robotics are currently working on implementing the knowledge of GPU computing that we acquired during the challenge and deploying these algorithms in industrial systems,” said Hans Gaiser, computer vision programmer at Delft Robotics.

Germany’s UBonn Snatches Second Place for Stowing

Team NimbRo from the University of Bonn took second place in the stowing challenge. The team integrated depth estimates from infrared light projection and two Intel RealSense SR300 depth cameras. It also measured dense stereo depth from two integrated full-HD color cameras.

Team NimbRo’s approach incorporated training on the ImageNet challenge, a popular image recognition benchmark, and annotation on a relatively small number of images for the Amazon Picking Challenge scenes. For training and recall, it used a Supermicro GPU workstation with four NVIDIA GeForce GTX TITAN X GPUs.

Japan’s PFN Grabs Second Place for Picking

Japan’s Preferred Networks, collaborating with FANUC, an industrial automation giant, came in second place in the picking challenge. The team used convolutional neural networks for two different tasks, including image segmentation on RGBD data produced by an Intel RealSense SR300 camera. An NVIDIA GeForce GTX 870M notebook GPU performed the segmentation at 0.1 seconds per image, compared with 2 seconds per image on a CPU.
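To put those two timings in perspective, here is a minimal Python sketch of the arithmetic. Only the 0.1-second and 2-second figures come from the team's report; the function names `speedup` and `images_per_second` are hypothetical helpers written for this illustration, not part of any team's code.

```python
def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if baseline_seconds <= 0 or accelerated_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / accelerated_seconds


def images_per_second(seconds_per_image: float) -> float:
    """Convert a per-image latency into a throughput figure."""
    if seconds_per_image <= 0:
        raise ValueError("latency must be positive")
    return 1.0 / seconds_per_image


# Timings quoted in the article (seconds per image).
CPU_SECONDS_PER_IMAGE = 2.0
GPU_SECONDS_PER_IMAGE = 0.1

print(f"GPU speedup over CPU: {speedup(CPU_SECONDS_PER_IMAGE, GPU_SECONDS_PER_IMAGE):.0f}x")
print(f"GPU throughput: {images_per_second(GPU_SECONDS_PER_IMAGE):.1f} images/s")
print(f"CPU throughput: {images_per_second(CPU_SECONDS_PER_IMAGE):.1f} images/s")
```

By these figures, the notebook GPU segments images roughly 20 times faster than the CPU, or about 10 images per second instead of one image every two seconds.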

Preferred Networks deployed Chainer, a deep learning framework built on CUDA and cuDNN. It also used 100,000 images rendered in 3D using Blender, also powered by GPUs, as well as 1,500 human-annotated photos. The training process took two days.
