What Is NVLink? And How Will It Make the World’s Fastest Computers Possible?

by Sumit Gupta

The numbers are big and so is the news. The U.S. Department of Energy today unveiled plans to build two GPU-powered supercomputers. Each will deliver at least 100 petaflops of compute performance.

And one – the Summit system at Oak Ridge National Laboratory, designed for open science – is expected to be 150 petaflops. That’s more than three times the peak speed of today’s fastest supercomputer.

These big machines will have big goals. Summit and Sierra, the latter at Lawrence Livermore National Laboratory, promise breakthroughs in energy efficiency, climate modeling, natural disaster prediction, safe nuclear material storage, and more. (For more, see our overview of the project.)

NVLink: Big New Possibilities

The story behind the headlines: a key new technology we’re developing, called NVLink, will be the backbone of these supercomputers. NVLink will connect the machines’ processors – CPUs and GPUs – so they can exchange data five to 12 times faster than they can today.

It’s no secret that GPU accelerators now power many of the world’s fastest supercomputers. With thousands of computing cores in a single GPU chip, compared to tens of cores in a CPU, a GPU can process massive amounts of scientific data in a hurry: roughly 10 times faster than a CPU.

While GPU performance has been increasing fast, the pipe that feeds data to GPUs has not kept up. Supercomputers today rely on a technology called PCI Express to connect GPUs to CPUs. That’s a technology you can find in the desktop and notebook computers you use right now. It’s fast, but not fast enough.

So, What Is NVLink?

NVLink, the world’s first high-speed GPU interconnect, offers a faster alternative. NVLink will let data move between GPUs and CPUs five to 12 times faster than it can today. Imagine what would happen to highway congestion in Los Angeles if the roads expanded from 4 lanes to 20.

That’s fast enough to let the GPU pull data from the CPU as quickly as the CPU can get it from its own memory (see “How NVLink Will Enable Faster, Easier Multi-GPU Computing”).
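To get a feel for what a five- to 12-fold faster link means in practice, here is a rough back-of-the-envelope sketch. The baseline bandwidth and the data size are assumed, illustrative values, not figures from this article, and the helper `transfer_seconds` is hypothetical. It ignores latency and protocol overhead, so treat the output as a ballpark only.

```python
# Illustrative only: both constants below are assumptions, not measured values.
BASELINE_GB_PER_S = 16.0  # assumed bandwidth of today's CPU-to-GPU link
DATA_GB = 64.0            # assumed amount of data to move to the GPU


def transfer_seconds(data_gb, bandwidth_gb_per_s):
    """Time to move data over a link, ignoring latency and overhead."""
    return data_gb / bandwidth_gb_per_s


baseline = transfer_seconds(DATA_GB, BASELINE_GB_PER_S)

# The "five to 12 times faster" range quoted for NVLink.
for speedup in (5, 12):
    faster = transfer_seconds(DATA_GB, BASELINE_GB_PER_S * speedup)
    print(f"{speedup}x link: {faster:.2f} s instead of {baseline:.2f} s")
```

Under these assumptions, a transfer that takes 4 seconds today would take between roughly a third of a second and just under one second.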

There are other benefits, too. NVLink lets CPUs and GPUs connect in new ways to enable more flexibility in server design. NVLink is also much more energy efficient than PCI Express.

This flexibility – and efficiency – will play a key role in Summit and Sierra. NVIDIA GPUs and IBM POWER CPUs, connected with the NVLink interconnect technology, will power both machines.

Summit and Sierra, in turn, are a step towards much larger, exascale computers. And NVLink is one of the technologies the U.S. Department of Energy believes will get us there. An exascale machine could crunch 1 quintillion floating-point operations per second. That’s a one followed by 18 zeros.
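For a sense of scale, the short sketch below compares the petaflops figures quoted in this article with the exascale target. The variable names are ours, and "peta" and "exa" are taken at their standard meanings of 10^15 and 10^18.

```python
PETA = 10**15
EXA = 10**18  # 1 quintillion: a one followed by 18 zeros

minimum_flops = 100 * PETA  # the floor quoted for each new system
summit_flops = 150 * PETA   # Summit's expected performance

# How many times faster an exascale machine would be than each figure.
print(f"vs. 100 petaflops: {EXA / minimum_flops:.1f}x")  # 10.0x
print(f"vs. 150 petaflops: {EXA / summit_flops:.1f}x")   # 6.7x
```

In other words, even at 150 petaflops, Summit would still be nearly seven times short of exascale.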

That’s a big number. And that means while today’s news is big, the biggest news is yet to come.