NVIDIA Sets Six Records in AI Performance

Tensor Core GPUs achieve best performance across every MLPerf benchmark submitted; acceleration stack available to all on NGC.
by Paresh Kharya

NVIDIA has set six AI performance records with today’s release of the industry’s first broad set of AI benchmarks.

Backed by Google, Intel, Baidu, NVIDIA and dozens more technology leaders, the new MLPerf benchmark suite measures a wide range of deep learning workloads. Aiming to serve as the industry’s first objective AI benchmark suite, it covers such areas as computer vision, language translation, personalized recommendations and reinforcement learning tasks.

NVIDIA achieved the best performance in the six MLPerf benchmark results it submitted for. These cover a variety of workloads and infrastructure scale – ranging from 16 GPUs on one node to up to 640 GPUs across 80 nodes.

The six categories include image classification, object instance segmentation, object detection, non-recurrent translation, recurrent translation and recommendation systems. NVIDIA did not submit results for the seventh category for reinforcement learning, which does not yet take advantage of GPU acceleration.

A key benchmark on which NVIDIA technology performed particularly well was language translation, training the Transformer neural network in just 6.2 minutes. More details on all six submissions are available on the NVIDIA Developer news center.

NVIDIA engineers achieved their results on NVIDIA DGX systems, including NVIDIA DGX-2, the world’s most powerful AI system, featuring 16 fully connected V100 Tensor Core GPUs.

NVIDIA is the only company to have entered as many as six benchmarks, demonstrating the versatility of V100 Tensor Core GPUs for the wide variety of AI workloads deployed today.

“The new MLPerf benchmarks demonstrate the unmatched performance and versatility of NVIDIA’s Tensor Core GPUs,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “Exceptionally affordable and available in every geography from every cloud service provider and every computer maker, our Tensor Core GPUs are helping developers around the world advance AI at every stage of development.”

State-of-the-Art AI Computing Requires Full Stack Innovation

Performance on complex and diverse computing workloads takes more than great chips. Accelerated computing is about more than an accelerator. It takes the full stack.

NVIDIA’s stack includes NVIDIA Tensor Cores, NVLink, NVSwitch, DGX systems, CUDA, cuDNN, NCCL, optimized deep learning framework containers and NVIDIA software development kits.

NVIDIA’s AI platform is also the most accessible and affordable. Tensor Core GPUs are available on every cloud and from every computer maker and in every geography.

The same power of Tensor Core GPUs is also available on the desktop, with the most powerful desktop GPU, NVIDIA TITAN RTX costing only $2,500. When amortized over three years, this translates to just a few cents per hour.

And the software acceleration stacks are always updated on the NVIDIA GPU Cloud (NGC) cloud registry.

NVIDIA’s Record-Setting Platform Available Now on NGC

The software innovations and optimizations used to achieve NVIDIA’s industry-leading MLPerf performance are available free of charge in our latest NGC deep learning containers. Download them from the NGC container registry.

The containers include the complete software stack and the top AI frameworks, optimized by NVIDIA. Our 18.11 release of the NGC deep learning containers includes the exact software used to achieve our MLPerf results.

Developers can use them everywhere, at every stage of development:

  • For data scientists on desktops, the containers enable cutting-edge research with NVIDIA TITAN RTX GPUs.
  • For workgroups, the same containers run on NVIDIA DGX Station.
  • For enterprises, the containers accelerate the application of AI to their data in the cloud with NVIDIA GPU-accelerated instances from Alibaba Cloud, AWS, Baidu Cloud, Google Cloud Platform, IBM Cloud, Microsoft Azure, Oracle Cloud Infrastructure and Tencent Cloud.
  • For organizations building on-premise AI infrastructure, NVIDIA DGX systems and NGC-Ready systems from Atos, Cisco, Cray, Dell EMC, HP, HPE, Inspur, Lenovo, Sugon and Supermicro put AI to work.

To get started on your AI project, or to run your own MLPerf benchmark, download containers from the NGC container registry.