AI Feast at CVPR: NVIDIA Brings New Tensor Core GPU AI Tools, Super SloMo, Cutting-Edge Research

by Ian Buck

As thousands of the world’s top artificial intelligence researchers gathered this week in Salt Lake City for the annual Computer Vision and Pattern Recognition conference, NVIDIA unveiled a series of tools and deep learning research to fuel the next big wave of AI discovery.

Topping the agenda: Our launch of NVIDIA Apex, an open source extension that helps developers accelerate their AI research by tapping the multi-precision capabilities of NVIDIA’s Volta Tensor Core GPUs.

Amid rows of research posters in a teeming exhibition hall, we also published more than a dozen research papers, including an AI that creates super slow-motion from ordinary video. And Jensen Huang, our founder and CEO, surprised attendees Wednesday night by giving away limited-edition TITAN V GPUs he had specially made for AI researchers.

Hello, DALI: New Tools Revving Engine of Modern AI

The star of our show has been the engine of modern AI, the NVIDIA Volta Tensor Core GPU.

With Volta, we reinvented the GPU. Its revolutionary Tensor Core architecture enables multi-precision computing — cranking through deep learning matrix operations at 125 teraflops at FP16 precision, and using FP64 and FP32 when there’s a need for greater range, precision or numerical stability.

Optimizing massively long-running code for both precision and performance is hard. With Apex, we’ve made it easier than ever to harness the multi-precision computing feature built into Tensor Core GPUs. Developers can use the newly available PyTorch extension to split their AI training workloads based on the level of precision required. Apex lets them train AI models with incredible speed by automatically applying higher precision approaches where needed and efficient lower precision methods when possible.
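To make the precision trade-off concrete, here is a minimal sketch of two reasons mixed-precision training keeps some values in higher precision. It is not Apex's API or implementation: it emulates FP16 rounding with Python's standard `struct` module, and the helper `to_fp16`, the constant `LOSS_SCALE` and the sample values are assumptions chosen for illustration. Python's native float (FP64) stands in for the higher-precision copy.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


# 1) A tiny gradient underflows to zero when stored directly in FP16.
grad = 1e-8
naive = to_fp16(grad)

# Loss scaling (illustrative): scale up before the FP16 cast,
# then divide the scale back out in higher precision.
LOSS_SCALE = 65536.0  # hypothetical scale factor, not an Apex default
scaled = to_fp16(grad * LOSS_SCALE)
recovered = scaled / LOSS_SCALE

# 2) A small update vanishes if the weight itself lives in FP16 ...
weight, update = 1.0, 1e-4
fp16_weight = to_fp16(to_fp16(weight) + to_fp16(update))
# ... but survives in a higher-precision master copy of the weight.
master_weight = weight + update

print("FP16 gradient without scaling:", naive)
print("gradient recovered via scaling:", recovered)
print("FP16 weight after update:", fp16_weight)
print("master weight after update:", master_weight)
```

The sketch only shows why a tool would want to choose precision per operation; how Apex actually makes those choices is not described here.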

NVIDIA GPUs have made compute nodes super-fast, so fast that I/O — input/output — can be a bottleneck. But, because GPUs can also accelerate data movement, we addressed that this week by unveiling DALI — an open source library designed as a plug-in that works with all major frameworks.

DALI allows users to define configurable data processing graphs that offload the image decode and augmentation steps typical in AI training to the GPU. It’s an essential technology that recently allowed our own data scientists to feed data to our new NVIDIA DGX-2, powered by 16 Volta GPUs, and achieve a record-breaking 15,000 images per second in training.
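The idea of a configurable data processing graph can be sketched in a few lines of plain Python. This is not DALI's API and it runs on the CPU: the operators `decode` and `flip`, the helper `build_pipeline` and the toy one-row "images" are all hypothetical stand-ins, meant only to show how a chain of operators can be configured once and then applied to batches of samples.

```python
from typing import Callable, Iterable, Iterator, List

Image = List[int]
Op = Callable[[Image], Image]  # an operator maps one image to another


def decode(raw: Image) -> Image:
    # Stand-in for image decoding: clamp values to the 0-255 pixel range.
    return [min(255, max(0, v)) for v in raw]


def flip(img: Image) -> Image:
    # Stand-in augmentation: horizontal flip of a one-row image.
    return img[::-1]


def build_pipeline(
    ops: List[Op], batch_size: int
) -> Callable[[Iterable[Image]], Iterator[List[Image]]]:
    """Return a function that applies ops in order and yields batches."""

    def run(samples: Iterable[Image]) -> Iterator[List[Image]]:
        batch: List[Image] = []
        for sample in samples:
            for op in ops:
                sample = op(sample)
            batch.append(sample)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:  # emit the final, possibly smaller, batch
            yield batch

    return run


pipeline = build_pipeline([decode, flip], batch_size=2)
batches = list(pipeline([[300, 10, -5], [1, 2, 3], [7, 8, 9]]))
print(batches)
```

Reordering or swapping entries in the operator list changes the graph without touching the batching logic, which is the property the word "configurable" refers to in this sketch.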

DALI on Volta accelerates data loading to the framework, which is Tensor Core-optimized with Apex. Then, Volta GPUs train the model at lightning speed.

We also released the much-anticipated Kubernetes on NVIDIA GPUs and TensorRT 4 — tools that help developers scale up AI across thousands of GPUs. The release of Kubernetes on NVIDIA GPUs allows developers to seamlessly scale up training and inference deployments to multi-cloud GPU clusters. TensorRT 4 is now generally available with accelerated support for such layers as Top-k, LSTMs and batch GEMMs for speeding up neural machine translation, recommenders and speech applications.

NVIDIA Research at the Cutting Edge

CVPR is where researchers present cutting-edge computer vision research. So, it was a natural place for NVIDIA Research — which is now 200+ scientists strong — to show papers that answer questions at the intersection of art, science and practical challenges. Among them:

  • Is it possible to imagine the next frame in a video? We built a new AI network called Super SloMo, which can convert 30-frames-per-second video into 240 frames per second by intelligently filling in the missing frames. The technique described in our paper, “Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation,” creates beautiful slow-motion video from standard footage. This is a killer app for the movie industry, which can film with standard equipment and use Super SloMo to deliver incredible slow-motion scenes on a whim.
  • How great would it be to have an AI that automatically labels data? NVIDIA Research, along with France’s Ecole Polytechnique and the University of Montreal, built a landmark localization network for accurate and reliable gesture recognition, facial expression recognition, facial identity verification and eye gaze tracking. Our research team’s paper, “Improving Landmark Localization with Semi-Supervised Learning,” presents a framework of sequential multitasking and an unsupervised learning technique for landmark localization — finding the precise location of specific parts of an image, a key step in many complex vision problems — that only needed 5 percent of data to be labeled to outperform previous methods.
  • Wouldn’t it be cool if we could create new virtual worlds from just pictures? Using conditional generative adversarial networks, or GANs, we can take a photograph of a city block and recreate a new virtual world. Our own self-driving car teams use this to design simulation tests for our NVIDIA DRIVE platform using the method we described in “High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs.”
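As a worked example of the frame arithmetic behind the Super SloMo item above: going from 30 to 240 frames per second means inserting 7 new frames between each pair of original frames. The sketch below computes those in-between time positions and fills them with a plain linear cross-fade. The cross-fade is a deliberately naive stand-in, not the network described in the paper, and the function names and toy two-pixel frames are assumptions for illustration.

```python
from fractions import Fraction
from typing import List


def intermediate_times(src_fps: int, dst_fps: int) -> List[Fraction]:
    """Relative positions (between 0 and 1) of the frames to insert."""
    if dst_fps % src_fps != 0:
        raise ValueError("dst_fps must be an integer multiple of src_fps")
    factor = dst_fps // src_fps
    return [Fraction(k, factor) for k in range(1, factor)]


def blend(frame_a: List[float], frame_b: List[float], t: Fraction) -> List[float]:
    """Naive linear cross-fade between two frames at position t."""
    w = float(t)
    return [(1 - w) * a + w * b for a, b in zip(frame_a, frame_b)]


times = intermediate_times(30, 240)
frames = [blend([0.0, 80.0], [8.0, 0.0], t) for t in times]

print(len(times), "frames inserted between each original pair")
print([str(t) for t in times])
```

A cross-fade produces ghosting wherever objects move, which is the gap a learned interpolation method is meant to close.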

Our work on NVIDIA SPLATNet — which describes a new deep learning based architecture that processes point clouds without preprocessing — received one of the six coveted CVPR research awards.

Join Us to Tackle World’s Great Challenges

NVIDIA’s GPU architecture is the leading platform for graphics, accelerated computing and AI. Combining these approaches can spark innovation in incredible ways.

Take NVIDIA RTX ray-tracing technology and combine it with Volta Tensor Cores and AI denoising, and for the first time we can realize real-time, cinematic-quality rendering. This is the holy grail for movie animation, gaming, product design and architecture. Architects can design interactively with their clients; movie animation studios can experiment with light and materials in real time.

Or take the challenge of training autonomous vehicles to drive safely by training them on data from billions of miles of driving. That could take years of driving on public roads to collect. So, we invented DRIVE Constellation and DRIVE SIM, a VR autonomous vehicle simulator that generates a wide range of testing scenarios, simulates sensors and applies neural networks to test and react to real-world conditions. Photorealistic simulation enables a safer, more scalable and more cost-effective way to bring self-driving cars to our roads.

Or take the challenge of early disease detection, the inspiration behind Clara, an AI medical imaging supercomputer. Over the past 10 years, medical imaging has become computational, from reconstruction to image processing to rendering and an explosion in AI applications. Clara is a virtual, scalable AI supercomputer that can upgrade the installed base of instruments and usher in a new breed of miniaturized, intelligent devices.

AI has inspired us to dream about a better world, now and for the future of our family and friends. We love coming to CVPR to honor and celebrate the AI researchers who got us here and to join in the discovery of the future of AI.