NVIDIA CEO Jensen Huang announced HGX-2, a “building block” cloud-server platform that will let server manufacturers create more powerful systems around NVIDIA GPUs for high performance computing and AI.
It’s the latest addition to a computing platform that has grown 500x more powerful in five years and is supported by an ecosystem that includes every computer maker and ISV, Huang said at the GPU Technology Conference Taiwan Wednesday.
GPUs are at the center of a computing ecosystem poised to transform multi-trillion-dollar industries around the world, Huang said. He described a string of breakthroughs in materials science, energy, and medicine that are just within reach with the addition of more computing power.
“Computing demand is greater than ever, so more than ever, we need this computing performance to continue to extend, we need to extend Moore’s law,” Huang told a packed house of more than 2,000 technologists, developers, researchers, government officials and media in Taipei. GTC Taiwan is the second of seven AI conferences NVIDIA will be holding in key tech centers this year.
Huang’s two-hour keynote kicked off a day of breakout sessions on AI topics led by specialists from around the region and multinational corporations.
Huang detailed a “Cambrian explosion” of technologies driven by GPU-powered deep learning. In less than a decade, the computing power of GPUs has grown 20x — representing growth of 1.7x per year, far outstripping Moore’s law, Huang said.
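The two growth figures above can be related with a little compound-growth arithmetic. The sketch below is illustrative only: the 20x and 1.7x values come from the keynote, while the Moore's law baseline (a doubling every two years) is an assumption made here for comparison and was not stated by Huang.

```python
import math

# Figures quoted in the keynote.
total_growth = 20.0    # GPU computing power "has grown 20x"
annual_growth = 1.7    # "growth of 1.7x per year"

# Solve annual_growth ** years == total_growth for years.
years_needed = math.log(total_growth) / math.log(annual_growth)

# Assumption (not from the keynote): model Moore's law as a doubling
# every two years, i.e. about 1.41x per year.
moore_annual = 2 ** (1 / 2)
moore_growth_same_period = moore_annual ** years_needed

print(f"Years of 1.7x compounding to reach 20x: {years_needed:.1f}")
print(f"Growth over the same period at the assumed Moore's law rate: "
      f"{moore_growth_same_period:.1f}x")
```

Under these assumptions, 1.7x annual compounding reaches 20x in well under a decade, and the assumed Moore's law rate delivers considerably less growth over the same span.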
But demand for that power is “growing, not slowing,” thanks to AI, Huang said. “Before this time, software was written by humans and software engineers can only write so much software, but machines don’t get tired,” he quipped.
“As long as there is data, so long as there is knowledge in how to create the architecture, we can create absolute enormous software,” Huang said. “And every single company in the world that develops software will need an AI supercomputer.”
Server-ing It Up
The latest step in that direction: the second generation of NVIDIA HGX, a single cloud-server platform.
HGX-2 incorporates such breakthrough features as NVIDIA’s NVSwitch interconnect fabric, linking 16 NVIDIA Tensor Core GPUs to work as a single, giant GPU. Partners will deliver the first HGX-2-based systems later this year.
“With your partnership, anybody who wants to use this future way of fused computing, with HPC, high-performance computing, and AI, can,” Huang said, thanking NVIDIA’s partners throughout the computer industry for their support. “We have servers of every single kind.”
From Big to Bigger
At the HGX-2’s heart: the NVIDIA Tesla V100 GPU, equipped with 32GB of high-bandwidth memory, which delivers 125 teraflops of deep learning performance.
Weave together as many as 16 Tesla V100 GPUs with NVSwitch and the result is what Huang calls “the world’s largest GPU.”
“Every one of the GPUs can talk to every one of the GPUs simultaneously at a bandwidth of 300 GB/s, 10 times PCI Express,” Huang said. “So everyone can talk to each other all at the same time.”
Huang also detailed NVIDIA’s new DGX-2, the first system built using the HGX-2 server platform. The 350-pound machine offers 2 petaflops of computing power and 512GB of HBM2 memory.
“This is the fastest single computer humanity has ever created: one operating system, one programming model, you can program this as one computer,” Huang said. “This is like a PC, except it’s incredibly fast.”
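The DGX-2 totals line up with the per-GPU figures quoted earlier in this section. The short check below assumes the DGX-2 is fully populated with 16 Tesla V100 GPUs, which the text implies but does not state outright; the variable names are illustrative.

```python
# Per-GPU figures quoted for the Tesla V100.
gpus = 16                  # assumed fully populated HGX-2 baseboard
tflops_per_gpu = 125       # deep learning teraflops per GPU
hbm_gb_per_gpu = 32        # high-bandwidth memory per GPU, in GB

# Aggregate compute and memory across all GPUs.
total_pflops = gpus * tflops_per_gpu / 1000   # teraflops -> petaflops
total_hbm_gb = gpus * hbm_gb_per_gpu

# NVSwitch bandwidth is quoted as 300 GB/s, "10 times PCI Express",
# which implies a PCI Express baseline of about 30 GB/s.
nvswitch_gb_s = 300
implied_pcie_gb_s = nvswitch_gb_s / 10

print(f"Aggregate compute: {total_pflops} petaflops")
print(f"Aggregate HBM2:    {total_hbm_gb} GB")
print(f"Implied PCIe baseline: {implied_pcie_gb_s} GB/s")
```

The results match the 2 petaflops and 512GB quoted for the DGX-2.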
A New Law in Town
Compared to its predecessor, DGX-2 represents a 10x leap in computing power in just six months, Huang said. It also represents a 500x performance leap on the AlexNet image-recognition benchmark over what could be done five years ago on a pair of NVIDIA GPUs.
“There’s a new law in town,” Huang said. “This new law of computing says ‘If you are able, and if you are willing to optimize across the entire stack, the performance improvement you can achieve is incredibly fast.’”
The result: a series of deep learning speed records up and down the technology stack, from single chip performance on up to sprawling data center systems.
The NVIDIA GPU Cloud makes this power accessible on an even larger scale. It lets researchers burst from desktop and server systems to cloud systems offered by providers such as Amazon, Google, Alibaba and Oracle.
“Every layer of this software has been tuned, it’s been tested,” Huang said, adding that 20,000 companies have now downloaded the NGC Cloud.
The real-world results are stunning. On stage, Huang worked with an NVIDIA AI researcher to remove details such as trees or street lights from photographs in real time as the audience looked on.
Moving beyond demos to deploy such deep learning applications on a massive scale involves mastering seven challenges: programmability, latency, accuracy, size, throughput, energy efficiency and rate of learning, Huang said. Together, they form the acronym PLASTER.
The ability to scale up will be key to putting a new generation of AI services to work, a process known as inferencing, for everything from speech synthesis and recognition to image and video processing and recommender services.
To show what this looks like, Huang offered a stunning demo of a flower recognition system scaling up from four images a second on a CPU to 2,500 images per second on a single GPU, and then to four times that rate. The final jump came from the ability to quickly add support for more GPUs living in desktops, data centers and cloud service providers, via Kubernetes.
“We call this Kubernetes on NVIDIA GPUs, so KONG,” Huang said.
“Kubernetes is the hyperscale, if you will, operating system,” Huang said. “If you look at this entire stack from the GPU, to all these APIs and libraries, which you can put into a docker, into a container, which runs on top of Kubernetes, that software stack is so complicated, it has been the work of hundreds of our engineers for several years.”
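To put the flower-recognition demo in perspective, the throughput figures can be compared directly. The sketch below uses the rates quoted from the demo; the one-million-image batch is a hypothetical workload chosen here purely for illustration.

```python
# Throughput figures quoted from the demo, in images per second.
cpu_rate = 4
single_gpu_rate = 2500
scaled_rate = single_gpu_rate * 4   # "four times that rate" with more GPUs

# Speedups relative to the CPU baseline.
gpu_speedup = single_gpu_rate / cpu_rate
scaled_speedup = scaled_rate / cpu_rate

# Hypothetical workload (an assumption, not from the keynote).
batch_size = 1_000_000

for label, rate in [("CPU", cpu_rate),
                    ("Single GPU", single_gpu_rate),
                    ("Scaled out", scaled_rate)]:
    seconds = batch_size / rate
    print(f"{label:>10}: {rate:>6} images/s, "
          f"{seconds:>9.0f} s for {batch_size:,} images")

print(f"Single-GPU speedup over CPU: {gpu_speedup:.0f}x")
print(f"Scaled-out speedup over CPU: {scaled_speedup:.0f}x")
```

On the quoted rates, a single GPU is 625 times faster than the CPU baseline, and the scaled-out configuration reaches 10,000 images per second.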
Big New GPUs, Big New Markets
Such servers will supply multi-trillion-dollar industries with computing power they can get nowhere else.
- In the $2 trillion entertainment industry, NVIDIA’s new RTX technology for accelerating ray tracing — the standard for cinema quality — coupled with real-time graphics technology and artificial intelligence, promises to accelerate today’s traditional, CPU-driven render farms.
- In the $7 trillion healthcare industry, efforts such as our Project Clara promise to put GPU computing to work to redefine medical imaging, wringing more fidelity out of today’s medical instruments, or allowing them to generate images of the same quality using less power. That can reduce the energy dose from scanners by a factor of six, making them safe to use on children.
- Safe cities represent another $2 trillion opportunity.
- Transportation, which NVIDIA is addressing with its end-to-end DRIVE platform, represents another $10 trillion. “Everything in the future, that moves, will be autonomous,” Huang said, detailing NVIDIA’s ability to help collect and process data, train models, simulate billions of miles of driving, and drive.
Shrink to Fit: Teleporting into Tight Spaces
Ending his keynote with a flourish, Huang used VR to shrink one of our colleagues and teleport him into a miniature car, which he then drove around a miniature city.
Humans, in other words, will be able to use VR to become backups for AI machines, Huang explained. Wherever these machines are. Whatever their size.
“In the future you will be able to merge with the robot,” Huang said. “You can have telepresence, you can go anywhere you want.”