NVAIL Partners Show Groundbreaking AI Research at NIPS Event

by Anushree Saxena

More people have probably heard of the annual NIPS conference in the past week than over its previous 30 years as the premier event focused on neural networks.

The once obscure gathering — held last week in Long Beach, Calif. — drew coverage from the New York Times, Bloomberg, The Economist and other major outlets, focused on its astounding growth as AI has become a hot field and on the mad dash companies are making to recruit gifted developers.

But recall what put NIPS on the map in the first place: the sharing of world-class research that advances artificial intelligence.

Two of our NVIDIA AI Labs (NVAIL) partners were among those presenting groundbreaking work. Researchers at New York University have advanced how computers can classify objects within complex images, and taken a step toward what might loosely be thought of as peripheral vision for machines.

And researchers at the University of California, Berkeley, are using the parallel processing power of GPUs to give machines more “curiosity” to explore their environments when trying to complete a task.

Giving Image Recognition a Second Look

At NIPS, lead NYU researcher Sean Welleck described how his team is using reinforcement learning to help computers better classify the objects within images.

Instead of looking at an image in a rote pattern — say from left to right, starting at the top and working its way to the bottom — the team’s multi-object recognition model takes a high-level look at all the objects in the image. If it identifies something it can likely classify correctly, then it takes a closer look and gets a reward if it’s right. It then proceeds to another object it has a good bead on.
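
To make the reward idea concrete, here is a minimal Python sketch. It is not the NYU team's model: the function name order_free_rewards and the reward values (+1 for a correct label, -1 otherwise) are assumptions chosen for illustration. It only shows that a correct label can earn credit no matter when in the sequence it is predicted.

```python
def order_free_rewards(predictions, true_labels):
    """Score a sequence of predicted labels without fixing their order.

    Each prediction earns +1 if it matches a label that has not been
    claimed yet, and -1 otherwise (reward values are assumed here).
    """
    remaining = list(true_labels)
    rewards = []
    for label in predictions:
        if label in remaining:
            remaining.remove(label)  # each object is rewarded only once
            rewards.append(1)
        else:
            rewards.append(-1)
    return rewards


# Same two objects, predicted in a different order than they are listed.
print(order_free_rewards(["dog", "cat"], ["cat", "dog"]))
# Repeating a label that was already claimed earns no further credit.
print(order_free_rewards(["dog", "dog"], ["cat", "dog"]))
```

In this toy version the first call rewards both predictions even though the order differs from the label list, while the second call penalizes the repeated guess.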

This ability to classify objects in any order is a major advancement, and could lead to faster and more accurate image classification. It also minimizes the need for annotating objects, the drudgework often required to get good, labeled datasets to work with. Welleck’s work makes the most of the labels already present.

The research is also a step toward giving computers peripheral vision, where two levels of attention — one scanning the image for objects and the other deciding to take a closer look at potentially interesting items — are in play.

Rewarding Complex Tasks

At Berkeley, Justin Fu is lead researcher on work addressing a core problem in reinforcement learning: how to incentivize a machine to complete complex tasks. A classic reward test is the game Pong, where an AI, or even a sophisticated toddler, can learn to manipulate a paddle to keep a ricocheting ball in play.

But games with many more choices, like Doom, pose a much greater challenge for an AI, and plenty of high-IQ humans, because the reward only comes after a much longer sequence of successful steps. If the task is complex enough, the chance of randomly completing it — and ever getting the reward — is slim.
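
A rough back-of-the-envelope calculation shows how quickly those odds collapse. The sketch below assumes an agent that picks uniformly at random among a fixed number of actions, with exactly one correct action at every step. The function name and the example numbers are made up for illustration and are not measurements from Pong or Doom.

```python
def random_success_probability(num_actions, steps_required):
    """Chance that uniformly random play picks the one correct action
    at every one of steps_required consecutive steps."""
    return (1.0 / num_actions) ** steps_required


# Hypothetical short task: 3 possible actions, 5 correct moves in a row.
short_task = random_success_probability(3, 5)
# Hypothetical long task: 8 possible actions, 50 correct moves in a row.
long_task = random_success_probability(8, 50)

print(f"short task: {short_task:.2e}")
print(f"long task:  {long_task:.2e}")
```

Under these assumptions the probability shrinks exponentially with the number of required steps, which is why an agent relying on luck alone may never see the reward at all.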

The research team’s proposed solution uses what’s known as an exemplar model. In it, the model is incentivized to take actions that result in unexpected outcomes — so your robot isn’t trying the same left turn every time it sees a T in a maze. Instead, it’s incentivized to explore the options in its environment.

It does this by determining the differences between new images and all the previous ones it’s seen. Instead of comparing raw pixels between images, the model trains a classifier to distinguish what’s new in an image compared to earlier ones it’s examined.
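
The idea of training a classifier to spot what is new can be sketched in a few lines of Python. This is a toy illustration, not the Berkeley team's implementation: it assumes each observation is already a short list of numbers rather than an image, uses a plain logistic classifier, and the names novelty_bonus and sigmoid, the weighting scheme, and the training settings are all assumptions. The classifier's confidence on the new observation serves as the novelty score.

```python
import math


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def novelty_bonus(new_obs, past_obs, steps=500, lr=0.1):
    """Train a tiny logistic classifier to tell new_obs (label 1) apart
    from past_obs (label 0) and return its confidence on new_obs.

    A score near 1 means the new observation is easy to tell apart from
    everything seen so far, so it is treated as novel.
    past_obs must not be empty.
    """
    dim = len(new_obs)
    weights = [0.0] * dim
    bias = 0.0
    # Weight the past observations so that together they count as much
    # as the single new one (an assumption made for this sketch).
    data = [(new_obs, 1.0, 1.0)]
    data += [(obs, 0.0, 1.0 / len(past_obs)) for obs in past_obs]

    for _ in range(steps):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for obs, label, weight in data:
            pred = sigmoid(sum(w * x for w, x in zip(weights, obs)) + bias)
            err = weight * (pred - label)
            for i in range(dim):
                grad_w[i] += err * obs[i]
            grad_b += err
        # Plain gradient descent step on the weighted logistic loss.
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b

    return sigmoid(sum(w * x for w, x in zip(weights, new_obs)) + bias)


past = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
familiar = novelty_bonus([0.05, 0.05], past)  # sits among past observations
novel = novelty_bonus([3.0, 3.0], past)       # far from anything seen so far
print(f"familiar: {familiar:.2f}, novel: {novel:.2f}")
```

An observation that sits among earlier ones is hard to separate and gets a lower score, while one far from anything seen before gets a score close to 1. An exploring agent could add that score to its reward to favor unfamiliar situations.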

Thanks to GPU computing, the model can crank through tons of images quickly, classifying everything as it goes. Noting these subtle changes to the environment helps the model try new options and better figure out how to successfully complete its task.

NYU and Berkeley are two of the 20 universities across the globe our NVAIL program supports. NVAIL helps researchers from schools like these advance their work through assistance from NVIDIA researchers and engineers, support for the universities’ students, and access to the industry’s most advanced GPU computing power, like the DGX-1 AI supercomputer.

Bringing Structure to Conditional Modeling

Led by researcher Zhijie Deng, a team from Beijing’s Tsinghua University employed structured generative adversarial networks (or SGANs) to achieve state-of-the-art results in conditional generative modeling, which is used, for example, to create images of objects based on labels.

A key advancement is that the approach does this using only a small set of labeled data. The work involves two pairs of adversarial and collaborative games, which together estimate the real conditional distribution and disentangle the semantics of interest from the other variables.

Generating labeled images in this way can supplement the results of label-rich supervised learning and advance common AI tasks such as image classification.

Want to learn more about AI, machine learning and deep learning? Check out our cheat sheet to the top courses in AI.