Groundbreaking Deep Learning Research Takes the Stage at GTC

by Jamie Beckett

NVIDIA researchers aren’t magicians, but you might think so after seeing the work featured in today’s keynote address from NVIDIA founder and CEO Jensen Huang at the GPU Technology Conference in San Jose.

Huang spotlighted two deep learning discoveries that hold the potential to upend traditional computer graphics. Both could help game developers create richer experiences in less time and at lower cost. And one could accelerate autonomous vehicle development by easily creating data to train cars for a wider variety of road conditions, landscapes and locations.

The two research projects are the latest examples of how we’re combining our expertise in deep learning with our long history in computer graphics to advance industries. Our 200-strong NVIDIA Research team, spread across 11 locations worldwide, is focused on pushing the boundaries of technology in machine learning, computer vision, self-driving cars, robotics, graphics, computer architecture, programming systems, and other areas.

“The productivity of this organization is absolutely incredible,” Jensen said. “They’re doing fundamental and basic research across the entire stack of computing.”

The two images here are clean versions of the same noisy picture. The image on the left was de-noised the traditional way, by training a neural network on corresponding clean and noisy images. Researchers de-noised the picture on the right using a model trained solely on noisy images.

Cleaning up Noisy Images

You may not know what a noisy image is, but you’ve probably taken one. You aim the camera at a dimly lit scene, and your picture turns out grainy, mottled with odd splotches of color, or white spots known as fireflies.

Removing noise from images is difficult because the process itself can add artifacts or blurriness. Deep learning experiments have offered solutions, but they have a major shortcoming: they require matched pairs of clean and noisy images to train the neural network.

Ordinary AI denoising requires matched pairs of clean and dirty images. But it’s often impossible to get clean images for MRIs and some other medical images. With Noise2Noise, no clean images are necessary.

Training on clean images works as long as you can get them, but clean pictures can be hard, or even impossible, to capture. NVIDIA researchers in Finland and Sweden have created a solution they call Noise2Noise to get around this issue.

Garbage in, Garbage out? Not Anymore

Producing clean images is a common problem for medical imaging tests like MRIs and for astronomical photos of distant stars or planets — situations in which there’s too little time and light to capture a clean image.

Time also poses a problem in computer graphics. Just the task of generating clean image data to train a denoiser can take days or weeks.

Noise2Noise seems impossible when you first hear about it. Instead of training the network on matched pairs of clean and noisy images, it trains the network on matched pairs of noisy images — and only noisy images. Yet Noise2Noise produces results equal to or nearly equal to what a network trained the old-fashioned way can achieve.
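To make the trick concrete, here’s a minimal sketch of a Noise2Noise-style training step written in PyTorch. It is not the researchers’ actual code: the tiny network, the optimizer settings, and the synthetic Gaussian noise are all placeholder assumptions, chosen only to show the key point that the regression target is a second noisy copy of the scene, not a clean image.

```python
import torch
import torch.nn as nn

# Placeholder denoiser: a tiny convolutional net standing in for a real
# denoising architecture (an assumption for this sketch, not the paper's model).
denoiser = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, kernel_size=3, padding=1),
)
optimizer = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def training_step(noisy_a, noisy_b):
    """One Noise2Noise-style update.

    noisy_a and noisy_b are two independently corrupted observations of the
    same underlying scene, shaped [batch, 3, H, W]. A conventional denoiser
    would regress noisy_a against a clean image; here the target is just
    another noisy copy.
    """
    optimizer.zero_grad()
    prediction = denoiser(noisy_a)
    loss = loss_fn(prediction, noisy_b)  # noisy target; no clean image needed
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage: fabricate two noisy views of the same (random) clean batch.
clean = torch.rand(4, 3, 64, 64)
loss = training_step(clean + 0.1 * torch.randn_like(clean),
                     clean + 0.1 * torch.randn_like(clean))
print(loss)
```

Because the corruption is zero-mean and independent between the two copies, the network can’t predict the noise in the target, so its lowest-loss strategy over many examples is to output the underlying clean signal.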

“What we’ve discovered is by setting up a network correctly, you can ask it to do something impossible,” said David Luebke, our vice president of research. “It’s a really surprising result until you understand the whole thing.”

Not Child’s Play

The second project Huang featured represents a whole new way of building virtual worlds. It uses deep learning to take much of the effort out of the cumbersome, costly tasks of 3D modeling for games and capturing training data for self-driving cars.

The technique, called semantic manipulation, can be compared to Lego bricks, which kids can put together to build anything from jet planes to dragons.

In semantic manipulation, users start with a label map — what amounts to a blueprint with labels for each pixel in a scene. Switching out the labels on the map changes the image. It’s also possible to edit the style of objects, like choosing a different kind of car, tree or road.

The NVIDIA researchers’ deep learning-powered image synthesis technique makes it possible to change the look of a street simply by changing the semantic label.
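As a rough illustration of what that editing looks like in practice, a label map is just a per-pixel grid of class IDs, and changing the scene amounts to rewriting those IDs before handing the map to the trained synthesis network. The class IDs and the generator call below are hypothetical placeholders for this sketch, not NVIDIA’s actual interface:

```python
import numpy as np

# Hypothetical class IDs for this sketch; any labeling convention would do.
ROAD, TREE, CAR = 0, 1, 2

# A label map holds one class ID per pixel. Here, a toy 256x512 scene that
# is mostly road, with a band of trees along the top edge.
label_map = np.full((256, 512), ROAD, dtype=np.uint8)
label_map[:64, :] = TREE

# Semantic manipulation: editing the image means editing the labels.
# Turn the tree-lined strip into one lined with parked cars.
label_map[label_map == TREE] = CAR

# A trained, label-conditioned generator would then render the edited map
# into a photorealistic image. That call is pseudocode here, since the
# generator itself is the deep learning model the researchers trained:
# image = generator(label_map)
```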

Tough Game

The research team’s method relies on generative adversarial networks (GANs), a deep learning technique often used to create training data when it’s scarce.

Although GANs typically struggle to generate photorealistic, high-resolution images, NVIDIA researchers were able to alter the GAN architecture in a way that made such images possible.
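For readers new to GANs, the core mechanism is two networks trained against each other: a generator that tries to produce convincing images, and a discriminator that tries to tell them apart from real ones. Here is a minimal, generic adversarial training step in PyTorch; it illustrates only the idea, using toy fully connected networks that are far simpler than the high-resolution architecture the researchers built:

```python
import torch
import torch.nn as nn

latent_dim = 64  # size of the random input vector (an arbitrary choice here)

# Toy stand-in networks; real image GANs use deep convolutional models.
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
discriminator = nn.Sequential(
    nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def gan_step(real_images):
    batch = real_images.size(0)
    z = torch.randn(batch, latent_dim)

    # 1) Discriminator update: score real images high, generated ones low.
    opt_d.zero_grad()
    fake = generator(z).detach()  # detach so this step doesn't train G
    loss_d = (bce(discriminator(real_images), torch.ones(batch, 1))
              + bce(discriminator(fake), torch.zeros(batch, 1)))
    loss_d.backward()
    opt_d.step()

    # 2) Generator update: fool the discriminator into scoring fakes as real.
    opt_g.zero_grad()
    loss_g = bce(discriminator(generator(z)), torch.ones(batch, 1))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()

# Toy usage with random "real" data: 8 images flattened to 784 values each.
print(gan_step(torch.rand(8, 784) * 2 - 1))
```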

Today, creating virtual environments for computer games requires thousands of hours of artists’ time to create and change models and can cost as much as $100 million per game. Rendering turns those models into the games you see on the screen.

Reducing the amount of labor involved would let game artists and studios create more complex games with more characters and story lines.

San Francisco to Barcelona: No Flight Required  

Obtaining data to train self-driving cars is equally cumbersome. It’s typically done by putting a fleet of cars equipped with sensors and cameras on the road. The data captured by the cars must then be labeled manually, and that’s used to train autonomous vehicles.

The team’s method could make it possible to take data from, say, San Francisco and apply it to another hilly city like Barcelona. Or turn a cobblestone street into a paved one, or convert a tree-lined street into one lined with parked cars.

That could make it possible to train cars more effectively to handle many different situations. It could also lead to a graphics rendering engine that’s trained on real-world data and renders scenes with generative models.

“I’m so proud of our NVIDIA research team,” Jensen said. “We’re growing. Reach out to us. We’d love to work with you.”

For more information about how our researchers are revolutionizing graphics, see the papers (listed below) or read our related articles, “NVIDIA Research Brings AI to Graphics” and “NVIDIA Researchers Showcase Major Advances in Deep Learning at NIPS.”