NVIDIA Wins $18 Million DOE Grant for Exascale Computing Research

by Bill Dally

Developing exascale computing capabilities is the key to next-generation scientific research, national security and economic competitiveness.

That’s why I’m thrilled that NVIDIA has received an $18 million grant from the U.S. Department of Energy, under its “FastForward 2” program, to accelerate the development of next-generation supercomputers.

With our expertise in high performance computing, we’ll build on our work in the original FastForward program, bringing a strong focus on energy efficiency, programmability and resilience.

NVIDIA has a long record of winning research awards that reflect our pioneering work. These awards ensure that we can help address some of the world’s greatest scientific challenges through the development of powerful computing systems.

The latest award, part of $100 million in funding given to a handful of technology companies, is tied to the DOE’s plan for driving innovative research and development essential to delivering next-generation capabilities that are affordable and energy efficient.

DOE’s Exascale Goal

The DOE’s goal is to have exascale systems that operate at quintillions of floating point calculations per second. Such systems would be 30-60 times faster than today’s leading petaflop (quadrillion floating point operations per second) supercomputers. The world’s fastest computer today is in China with about 55 petaflops peak performance.

Exascale computing is seen as the next big challenge in supercomputing. The DOE believes that a highly parallel, heterogeneous computing model, the kind developed by NVIDIA, offers the best approach to get there.

As part of our effort to investigate node architectures for future exascale computer systems, we’ll work with scientists across seven DOE laboratories. Research will focus on processor architecture, circuits, memory architecture, high-speed signaling and programming models to enable an exascale computer at a reasonable power level.

Key challenges will include energy efficiency, performance, data movement, concurrency, reliability and programmability—all of which are interconnected. Areas of focus in the node architecture program include application co-design, memory system architecture, resilience, circuits and integrated circuit design techniques, among others.

NVIDIA researchers will collaborate with DOE application developers on parallel algorithm development and optimization for DOE applications. We’ll also work on providing extreme energy efficiency in both the processor core and memory system.

For the memory system architecture, researchers will focus on delivering better memory system performance and efficiency to end applications. Programming models and tools will also be developed to help a broad spectrum of programmers reach their goals on any exascale architecture.

Energy Efficiency

Power consumption is considered a leading design constraint for future systems, so the FastForward 2 program will focus on making those systems energy efficient.

The DOE’s target is to facilitate the development of an exascale system that consumes less than 20 megawatts by 2020. Our FastForward 2 research aims to meet this goal through research on energy-efficient processors, efficient communication technology, better programming systems and improved memory technology.

The most efficient GPU computing system built on today’s Kepler GPUs, the Tsubame KFC (No. 1 on the Green500 list), delivers 4.5 gigaflops per watt. At that efficiency, an exaflops system would consume over 200 MW. The research we will be doing under FastForward 2 aims to reduce this by an order of magnitude so we can achieve an exaflops system at 20 MW.
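The arithmetic behind those figures can be checked with a short Python sketch. It is a back-of-the-envelope estimate, not a system model: it assumes the 4.5 gigaflops per watt Green500 figure scales linearly to a full exaflops machine, and the function names are illustrative only.

```python
# Back-of-the-envelope power estimate for an exaflops system.
# Assumption: efficiency measured on a small system scales linearly
# to exascale (real systems add interconnect and cooling overheads).

EXAFLOPS = 1e18  # floating point operations per second


def power_megawatts(flops, gflops_per_watt):
    """Power in megawatts needed to sustain `flops` at a given efficiency."""
    watts = flops / (gflops_per_watt * 1e9)
    return watts / 1e6


def required_gflops_per_watt(flops, megawatts):
    """Efficiency in gigaflops per watt needed to fit a power budget."""
    flops_per_watt = flops / (megawatts * 1e6)
    return flops_per_watt / 1e9


today_mw = power_megawatts(EXAFLOPS, 4.5)            # Tsubame KFC efficiency
target_eff = required_gflops_per_watt(EXAFLOPS, 20)  # DOE 20 MW budget

print(f"Exaflops at 4.5 gigaflops/watt: {today_mw:.0f} MW")
print(f"Efficiency needed for 20 MW: {target_eff:.0f} gigaflops/watt")
print(f"Improvement required: {target_eff / 4.5:.1f}x")
```

Under these assumptions, an exaflops system at today’s efficiency would draw roughly 222 MW, and fitting it into 20 MW would take about 50 gigaflops per watt, an improvement of around 11 times.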

NVIDIA is a pioneer in the use of massively parallel accelerators for supercomputing, and is well placed to lead new development. Our GPUs have thousands of cores that process parallel workloads efficiently by handling many tasks simultaneously.

This offers unprecedented application performance by offloading compute-intensive portions of an application to the GPU. And as we’ve demonstrated at NVIDIA, GPUs do their work very efficiently. The top 15 systems on the latest Green500 list of the world’s most efficient supercomputers are all GPU powered.