Expanding Universe for HPC, NVIDIA CEO Brings GPU Acceleration to Arm

Mars, Microsoft, massive data take SC19’s center stage as Jensen Huang details vision for intersection of graphics, simulation and AI.
by Brian Caulfield

Broadening support for GPU-accelerated supercomputing to a fast-growing new platform, NVIDIA founder and CEO Jensen Huang Monday introduced a reference design for building GPU-accelerated Arm servers, with wide industry backing.

Huang — speaking Monday at the SC19 supercomputing show in Denver — also announced that Microsoft has built NDv2, a “supersized instance” that’s the world’s largest GPU-accelerated cloud-based supercomputer — a supercomputer in the cloud — on its Azure cloud-computing platform.

He additionally unveiled NVIDIA Magnum IO, a suite of GPU-accelerated I/O and storage software to eliminate data transfer bottlenecks for AI, data science and HPC workloads.

In a two-hour talk, Huang wove together these announcements with an update on developments from around the industry, setting out a sweeping vision of how high performance computing is expanding out in all directions.

HPC Universe Expanding in All Directions

“The HPC universe is expanding in every single direction at the same time,” Huang told a standing-room only crowd of some 1,400 researchers and technologists at the start of the world’s biggest supercomputing event. “HPC is literally everywhere today. It’s in supercomputing centers, in the cloud, at the edge.”

Driving that expansion are factors such as streaming HPC from massive sensor arrays; using edge computing to do more sophisticated filtering; running HPC in the cloud; and using AI to accelerate HPC.

“All of these are undergoing tremendous change,” Huang said.

Putting an exclamation mark on his talk, Huang debuted the world’s largest interactive volume visualization: An effort with NASA to simulate a Mars landing in which a craft the size of a two-story condominium traveling at 12,000 miles an hour screeches safely to a halt in just seven minutes. And it sticks the landing.

Huang said the simulation enables 150 terabytes of data, equivalent to 125,000 DVDs, to be flown through at random access. “To do that, we’ll have a supercomputing analytics instrument that sits next to a supercomputer.”

Expanding the Universe for HPC

Kicking off his talk, Huang detailed how accelerated computing powers the work of today’s computational scientists, whom he calls the da Vincis of our time.

The first AI supercomputers already power scientific research into phenomena as diverse as fusion energy and gravitational waves, Huang explained.

Accelerated computing, meanwhile, powers exascale systems tackling some of the world’s most challenging problems.

They include efforts to identify extreme weather patterns at Lawrence Berkeley National Lab … Research into the genomics of opioid addiction at Oak Ridge National Laboratory … Nuclear waste remediation efforts led by LBNL, the Pacific Northwest National Lab and Brown University at the Hanford site … And cancer-detection research led by Oak Ridge National Laboratory and the State University of New York at Stony Brook.

At the same time, AI is being put to work across an ever-broader array of industries. Earlier this month, the U.S. Post Office, the world’s largest delivery service — which processes nearly 500 million pieces of mail a day — announced it’s adopting end-to-end AI technology from NVIDIA.

“It’s the perfect application for a streaming AI computer,” Huang said.

And last month, in partnership with Ericsson, Microsoft, Red Hat and others, Huang revealed that NVIDIA is powering AI at the edge of enterprise and 5G telco networks with the NVIDIA EGX Edge Supercomputing platform.

Next up for HPC: harnessing vast numbers of software-defined sensors to relay data to programmable edge computers, which in turn pass on the most interesting data to supercomputers able to wring insights out of oceans of real-time data.

Arm in Arm: GPU-Acceleration Speeds Emerging HPC Architecture

Monday’s news marks a milestone for the Arm community. The processor architecture — ubiquitous in smartphones and IoT devices — has long been the world’s most popular. Arm has more than 100 billion computing devices and will cross the trillion mark in the coming years, Huang predicted.

NVIDIA’s moving fast to bring HPC tools of all kinds to this thriving ecosystem.

“We’ve been working with the industry, all of you, and the industry has really been fantastic, everybody is jumping on,” Huang said, adding that 30 applications are already up and running. “This is going to be a great ecosystem — basically everything that runs in HPC should run on any CPU as well.”

World-leading supercomputing centers have already begun testing GPU-accelerated Arm-based computing systems, Huang said. This includes Oak Ridge and Sandia National Laboratories, in the United States; the University of Bristol, in the United Kingdom; and Riken, in Japan.

NVIDIA’s reference design for GPU-accelerated Arm servers — comprising both hardware and software building blocks — has already won support from key players in HPC and Arm ecosystems, Huang said.

In the Arm ecosystem, NVIDIA is teaming with Arm, Ampere, Fujitsu and Marvell. NVIDIA is also working with Cray, a Hewlett Packard Enterprise company, and HPE. A wide range of HPC software companies are already using NVIDIA CUDA-X libraries to bring their GPU-enabled management and monitoring tools to the Arm ecosystem.

The reference platform’s debut follows NVIDIA’s announcement earlier this year that it will bring its CUDA-X software platform to Arm. Fulfilling this promise, NVIDIA is previewing its Arm-compatible software developer kit — available for download now — consisting of NVIDIA CUDA-X libraries and development tools for accelerated computing.

Microsoft Brings GPU-Powered Supercomputer to Azure

“This puts a supercomputer in the hands of every scientist in the world,” Huang said he announced NDv2, a GPU-powered supercomputer now available on Microsoft Azure.

Giving HPC researchers and others instant access to unprecedented amounts of GPU computing power, Huang announced NDv2, a GPU-powered supercomputer now available on Microsoft Azure that ranks among the world’s fastest.

“Now you can open up an instance, you grab one of the stacks … in the container, you launch it, on Azure, and you’re doing science,” Huang said. “It’s really quite fantastic.”

Built to handle the most demanding AI and HPC applications, the Azure NDv2 instance can scale up to 800 NVIDIA V100 Tensor Core GPUs interconnected with Mellanox InfiniBand.

For the first time, researchers and others can rent an entire AI supercomputer on demand, matching the capabilities of large-scale, on-premise supercomputers that can take months to deploy.

AI researchers needing fast solutions can quickly spin up multiple Azure NDv2 instances and train complex conversational AI models in just hours, Huang explained.

For example, Microsoft and NVIDIA engineers used 64 NDv2 instances on a pre-release version of the cluster to train BERT, a popular conversational AI model, in roughly three hours.

Magnum IO Software

Helping AI researchers and data scientists move data in minutes, rather than hours, Huang introduced the NVIDIA Magnum IO software suite.

A standing-room only crowd of some 1,400 researchers and technologists came to hear NVIDIA’s keynote at the start of SC19, the world’s top supercomputing event.

Delivering up to 20x faster data processing for multi-server, multi-GPU computing nodes, Mangum IO eliminates a key bottleneck faced by those carrying out complex financial analysis, climate modeling and other high-performance workloads.

“This is an area that is going to be rich with innovation, and we are going to be putting a lot of energy into helping you move information in and out of the system,” Huang said.

A key feature of Magnum IO is NVIDIA GPUDirect Storage, which provides a direct data path between GPU memory and storage, enabling data to bypass CPUs and travel unencumbered on “open highways” offered by GPUs, storage and networking devices.

NVIDIA developed Magnum in close collaboration with industry leaders in networking and storage, including DataDirect Networks, Excelero, IBM, Mellanox and WekaIO.