What Is Active Learning?

Finding the right self-driving training data doesn’t have to take a swarm of human labelers.
by Danny Shapiro

Reading one book on a particular subject won’t make you an expert. Nor will reading multiple books containing similar material. Truly mastering a skill or area of knowledge requires lots of information coming from a diversity of sources.

The same is true for autonomous driving and other AI-powered technologies.

The deep neural networks responsible for self-driving functions require exhaustive training, both in situations they’re likely to encounter during daily trips and in unusual ones they’ll hopefully never come across. The key to success is making sure they’re trained on the right data.

What’s the right data? Situations that are new or uncertain, rather than the same scenarios repeated over and over.

Active learning is a training data selection method for machine learning that automatically finds this diverse data. It builds better datasets in a fraction of the time it would take humans to curate them.

It works by employing a trained model to go through collected data, flagging frames it’s having trouble recognizing. These frames are then labeled by humans and added to the training data. This increases the model’s accuracy for situations like perceiving objects in tough conditions.
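To make the flagging step concrete, here is a minimal sketch. The article doesn't say which uncertainty measure is used, so this example assumes predictive entropy, one common choice. The frame ids, probabilities and function names are all made up for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def flag_uncertain_frames(predictions, budget):
    """Return the ids of the `budget` frames with the highest entropy.

    `predictions` maps a hypothetical frame id to the model's class
    probabilities for that frame.
    """
    ranked = sorted(predictions, key=lambda fid: entropy(predictions[fid]), reverse=True)
    return ranked[:budget]

# Made-up softmax outputs for three frames (illustrative values only).
predictions = {
    "frame_001": [0.98, 0.01, 0.01],  # confident: probably a familiar scene
    "frame_002": [0.40, 0.35, 0.25],  # uncertain: a candidate for human labeling
    "frame_003": [0.70, 0.20, 0.10],
}
to_label = flag_uncertain_frames(predictions, budget=1)
```

With these made-up numbers, frame_002 is the frame that would be sent to human labelers, because its probabilities are spread most evenly across the classes.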

Finding the Needle in the Data Haystack

The amount of data needed to train an autonomous vehicle is enormous. Experts at RAND estimate that vehicles need 11 billion miles of driving to perform just 20 percent better than a human. This translates to more than 500 years of nonstop driving in the real world with a fleet of 100 cars.
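The 500-year figure can be sanity-checked with some quick arithmetic. The article doesn't state a driving speed, so the sketch below assumes an average of 25 miles per hour, around the clock, every day of the year.

```python
TARGET_MILES = 11e9          # miles of driving cited above
FLEET_SIZE = 100             # cars in the fleet
ASSUMED_AVG_SPEED_MPH = 25   # assumption: not stated in the article
HOURS_PER_YEAR = 24 * 365    # nonstop driving

# Miles the whole fleet covers in one year of nonstop driving.
fleet_miles_per_year = FLEET_SIZE * ASSUMED_AVG_SPEED_MPH * HOURS_PER_YEAR
years_needed = TARGET_MILES / fleet_miles_per_year

print(f"{years_needed:.0f} years")  # prints "502 years"
```

Under that assumed speed the fleet covers about 21.9 million miles a year, which puts the total just over the 500-year mark.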

And not just any driving data will do. Effective training data must contain diverse and challenging conditions to ensure the car can drive safely.

If humans were to annotate this data to find these scenarios, the 100-car fleet driving just eight hours a day would require more than 1 million labelers to manage frames from all the cameras on the vehicle, a gargantuan effort. In addition to the labor cost, the compute and storage resources needed to train DNNs on this data would be prohibitive.

The combination of data annotation and curation poses a major challenge to autonomous vehicle development. By applying AI to this process, it’s possible to cut down on the time and cost spent on training, while also increasing the accuracy of the networks.

Why Active Learning

There are three common methods for selecting autonomous driving DNN training data. Random sampling extracts frames from a pool of data at uniform intervals, capturing the most common scenarios but likely leaving out rare patterns.

Metadata-based sampling uses basic tags (for example, rain, night) to select data, making it easy to find commonly encountered difficult situations, but missing unique frames that aren’t easily classified, like a tractor trailer or man on stilts crossing the road.

Caption: Not all data is created equal. Example of a common highway scene (top left) vs. some unusual driving scenarios (top right: cyclist doing a wheelie at night, bottom left: truck towing trailer towing quad, bottom right: pedestrian on jumping stilts).

Finally, manual curation uses metadata tags combined with visual browsing by human annotators — a time-consuming task that can be error-prone and difficult to scale.
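The first two selection methods above are simple enough to sketch in a few lines. This is a toy illustration, not a production pipeline. The frame records, tag names and function names are hypothetical.

```python
def sample_uniform(frames, step):
    """Keep every `step`-th frame from an ordered list of frames."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return frames[::step]

def select_by_tags(frames, wanted_tags):
    """Keep frames whose tag set contains every tag in `wanted_tags`."""
    wanted = set(wanted_tags)
    return [f for f in frames if wanted <= set(f["tags"])]

# Hypothetical drive log: mostly ordinary frames plus one rare scene.
frames = [
    {"id": 0, "tags": ["day", "highway"]},
    {"id": 1, "tags": ["day", "highway"]},
    {"id": 2, "tags": ["night", "rain"]},
    {"id": 3, "tags": ["day", "city"], "note": "pedestrian on jumping stilts"},
    {"id": 4, "tags": ["day", "highway"]},
    {"id": 5, "tags": ["night", "rain"]},
]

uniform_ids = [f["id"] for f in sample_uniform(frames, step=2)]
tagged_ids = [f["id"] for f in select_by_tags(frames, ["night", "rain"])]
```

In this made-up log, neither method picks up frame 3. Uniform sampling steps over it, and its tags say nothing about what makes the scene unusual.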

Active learning makes it possible to automate the selection process while choosing valuable data points. It starts by training a dedicated DNN on already-labeled data. The network then sorts through unlabeled data, selecting frames that it doesn’t recognize, thereby finding data that would be challenging to the autonomous vehicle algorithm.

That data is then reviewed and labeled by human annotators, and added to the training data pool.
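The loop described above can be sketched end to end with a toy stand-in for the DNN. Everything here is an assumption made for illustration: the one-dimensional "frames", the midpoint-of-means model, the distance-to-threshold uncertainty score and the `human_label` function standing in for annotators.

```python
def train(labeled):
    """Fit a toy 1-D model: the threshold is the midpoint of the two class means."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def uncertainty(threshold, x):
    """Samples closest to the threshold are the ones the toy model is least sure about."""
    return -abs(x - threshold)

def human_label(x):
    """Stand-in for human annotators (hypothetical ground truth: positive above 0.6)."""
    return 1 if x > 0.6 else 0

def active_learning_loop(labeled, unlabeled, rounds, budget):
    """Repeat: train, rank unlabeled samples, label the hardest ones, grow the pool."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        threshold = train(labeled)
        # Most uncertain samples first.
        unlabeled.sort(key=lambda x: uncertainty(threshold, x), reverse=True)
        picked, unlabeled = unlabeled[:budget], unlabeled[budget:]
        labeled += [(x, human_label(x)) for x in picked]
    return train(labeled), labeled

seed = [(0.0, 0), (1.0, 1)]                          # already-labeled data
pool = [0.1, 0.2, 0.3, 0.45, 0.55, 0.65, 0.7, 0.9]   # unlabeled data
threshold, labeled = active_learning_loop(seed, pool, rounds=2, budget=2)
```

Only four of the eight pool samples get labeled in this run, all of them near the decision boundary, and the toy model's threshold still moves from 0.5 toward the true value of 0.6.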

Figure: The active learning loop.

Active learning has already shown it can improve the detection accuracy of self-driving DNNs over manual curation. In our own research, we’ve found that the increase in precision when training with active learning data can be 3x for pedestrian detection and 4.4x for bicycle detection relative to the increase for data selected manually.
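To be clear about what a figure like "3x" is measuring: it compares two improvements over the same starting point, not two absolute precision scores. The sketch below shows that calculation with made-up precision values chosen only to illustrate the arithmetic. They are not results from the research described above.

```python
def relative_gain(baseline, manual, active):
    """Ratio of the precision gain from active learning data to the gain from manual data."""
    return (active - baseline) / (manual - baseline)

# Hypothetical precision values, for illustration only.
baseline_precision = 0.50   # before adding any new data
manual_precision = 0.52     # after adding manually curated data
active_precision = 0.56     # after adding active learning data

ratio = relative_gain(baseline_precision, manual_precision, active_precision)
```

Here the manual data adds 0.02 of precision and the active learning data adds 0.06, so the ratio works out to roughly 3.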

Advanced training methods like active learning, as well as transfer learning and federated learning, are most effective when run on a robust, scalable AI infrastructure. This makes it possible to manage massive amounts of data in parallel, shortening the development cycle.

NVIDIA will be providing developers access to these training tools as well as our rich library of autonomous driving deep neural networks on the NVIDIA GPU Cloud container registry.

Learn more about autonomous driving on the NVIDIA Technical Blog.