Punch Buggy. Slug Bug. The names differ, but we’ve all played this game. See a Volkswagen Beetle, punch your sibling. Deep learning works the same way. Just with more math—and fewer bruises.
Deep learning refers to algorithms—step-by-step data-crunching recipes—for teaching machines to understand “unstructured data”: information that lives outside of spreadsheets and databases, such as images, speech and video.
Because of the GPU’s key role in all these technologies, our GPU Technology Conference, which runs March 17-20 in Silicon Valley, is a great place to learn more. GTC will feature more than two dozen talks focused on how GPUs are changing the auto industry.
So, how does deep learning work? A great way to understand it is to look at NVIDIA DRIVE, our new auto-pilot car computer. When paired with computer vision technology—powered by our NVIDIA Tegra processors—DRIVE gives vehicles an uncanny level of self-awareness.
To show what DRIVE can do, NVIDIA engineers mounted video cameras on their cars to capture 40 hours of video. They then used Amazon Mechanical Turk, where people manually tagged the frames, to categorize about 68,000 objects in the footage.
Training Artificial Brains
Our engineers then fed these images to servers equipped with powerful GPUs that form an artificial neural network. It’s a process computer scientists call “training.” It lets a neural network learn to see patterns and recognize objects.
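In code, “training” boils down to a loop: show the network labeled examples, measure its error, and nudge its weights to shrink that error. Here is a minimal sketch with a toy two-layer network on made-up data—the inputs, labels, sizes and learning rate are all illustrative stand-ins, not NVIDIA’s actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "labeled frames": 4-pixel inputs, label 1 if mean brightness > 0.5
X = rng.random((200, 4))
y = (X.mean(axis=1) > 0.5).astype(float)

# One hidden layer of 8 units, sigmoid activations
W1 = rng.normal(scale=0.5, size=(4, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = sigmoid(X @ W1)          # hidden features
    p = sigmoid(h @ W2).ravel()  # predicted probability per "frame"
    return h, p

losses = []
for step in range(500):
    h, p = forward(X)
    loss = np.mean((p - y) ** 2)  # how wrong the network is right now
    losses.append(loss)
    # Backpropagation: gradient of the loss w.r.t. each weight matrix
    dp = 2 * (p - y) / len(y) * p * (1 - p)
    dW2 = h.T @ dp[:, None]
    dh = dp[:, None] @ W2.T * h * (1 - h)
    dW1 = X.T @ dh
    W1 -= 1.0 * dW1              # gradient-descent update
    W2 -= 1.0 * dW2

print(f"loss before: {losses[0]:.3f}, after: {losses[-1]:.3f}")
```

Run it and the loss shrinks step by step—the same pattern-finding process, at vastly larger scale, that GPU servers carry out on real video frames.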
It’s a little like how children learn. Parents, friends and punch-happy older siblings identify objects in the world for the child. Then the child’s brain learns how to identify these objects in a broad array of situations.
This is where the “deep” in deep learning comes in. With deep learning, a neural network learns many levels of abstraction. They range from simple concepts to complex ones. Each layer categorizes some kind of information. It then refines it and passes it along to the next.
Deep learning stacks these layers. This lets a machine learn what computer scientists call a “hierarchical representation.” So, the first layer might look for edges. The next looks for collections of edges that form angles. The next might look for shapes built from those angles. After many layers, the neural network learns the concept of, say, a pedestrian crossing the street.
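That stacking can be shown in miniature. In this hedged sketch, each layer slides a small filter over its input, and the second layer’s “bar” detector is built out of the first layer’s edge detectors—simple features composing into a more complex one. The 1-D “image” and hand-picked filters are illustrative only; real networks learn their filters during training:

```python
import numpy as np

image = np.array([0, 0, 0, 1, 1, 1, 0, 0], dtype=float)  # a bright bar

def layer(signal, kernel):
    """One layer: slide a small filter across the input (valid convolution)."""
    return np.convolve(signal, kernel, mode="valid")

# Layer 1: an edge detector -- responds wherever brightness changes
edges = layer(image, np.array([1.0, -1.0]))

# Layer 2: built on layer 1's output -- a rising edge followed three
# steps later by a falling edge means a complete bar of width 3
bars = layer(edges, np.array([1.0, 0.0, 0.0, -1.0]))

print(edges)  # fires at the bar's two boundaries
print(bars)   # peaks only where a whole bar sits
```

The second layer never sees raw pixels—only the first layer’s edge map—which is exactly the hierarchy described above, two levels deep instead of dozens.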
GPUs are ideal for this. They can cut the time it takes to train these neural networks from a year or more to just days. GPUs perform many calculations at once. That makes them a great fit for neural nets, which churn through unstructured data like images. Once a system is “trained,” that learning can be used in applications like self-driving cars.
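Why do “many calculations at once” matter so much? Because a neural network layer is, at bottom, one big matrix multiply—thousands of independent multiply-adds with no need to wait on each other. In this sketch, numpy stands in for the GPU, and the sizes are arbitrary examples:

```python
import numpy as np

rng = np.random.default_rng(1)
batch = rng.random((64, 128))    # 64 "images", 128 features each
weights = rng.random((128, 32))  # one layer's 128x32 weight matrix

# Sequential view: each output element is its own independent dot product
slow = np.empty((64, 32))
for i in range(64):
    for j in range(32):
        slow[i, j] = sum(batch[i, k] * weights[k, j] for k in range(128))

# Parallel view: one matrix multiply covers all 64 * 32 = 2,048 dot
# products; a GPU runs them concurrently instead of one by one
fast = batch @ weights

print(np.allclose(slow, fast))  # → True: same numbers, far less waiting
```

Nothing in the inner loops depends on any other iteration, which is why spreading them across thousands of GPU cores turns a year of training into days.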
It’s technology that’s arriving just in time. Low-cost cameras and sensors are giving cars the ability to suck in huge amounts of information. NVIDIA’s advanced computer vision technology turns that data into 3D maps that vehicles can use to navigate the world around them.
Deep learning takes those capabilities to another level. Our Tegra X1-powered NVIDIA DRIVE system takes advantage of the models that neural networks create. That lets NVIDIA DRIVE understand the world the way human drivers do.
As a result, NVIDIA DRIVE can tease out information fast. DRIVE can pick out different kinds of vehicles. It can discern a police car from a taxi, an ambulance from a delivery truck, or a parked car from one that’s about to pull out into traffic. That capability isn’t limited to vehicles. NVIDIA DRIVE can identify everything from cyclists on the sidewalk to absent-minded pedestrians.
Deep learning can even categorize images that challenge human eyes. Even in bad weather, DRIVE can read flickering electronic signs. Or spot brake lights.
Or recognize a Volkswagen Beetle. Trust us: never play Punch Buggy with a GPU.