Live: Jensen Huang Keynotes NVIDIA’s 2018 GPU Technology Conference

We're live at the GTC 2018 keynote in Silicon Valley. Refresh your browser for the latest news and updates.
by Bob Sherbin

Check our blog for highlights from our 2018 GPU Technology Conference all this week.

11:45 – That’s it folks, Jensen thanks the audience for coming and invites them to enjoy the rest of GTC. If you’re not here in Silicon Valley, tune into our blog for updates throughout the conference.

11:34 – In a final gesture, Jensen invites the audience into a lab, our Holodeck. If there’s a person in a self-driving car, the human is the backup system. But what about an autonomous machine with no operator? It could be a tractor with no driver. So how do we create a backup system? We know virtual reality has the ability to teleport us into a new world. And so onto the demo.

Imagine there’s a car somewhere in the world that we need to help.

So in the Holodeck VR environment, we’ve created a virtual-reality car – it’s rendered like in the Matrix. And it’s shown live. The virtual reality driver is sitting on the stage. He’s inside the Holodeck. Around the Holodeck is video that’s being piped in. Tim in virtual form takes over the car in virtual reality, but it’s translated into a real car that’s on the street. “So, it’s like three layers of Inception,” Jensen says. Tim is in reality, projected into a VR environment which lets him control a car that’s in the real world.

The car gets stuck behind an actual truck and Tim, remotely, is able to extricate it. He then, remotely, pulls into a parking space. So, imagine, through VR we can teleport ourselves into the mind of an autonomous car. “Teleportation, the future has arrived,” Jensen says.

11:26 – Now it’s on to the last section, robotics.

Autonomous machines will revolutionize every industry. Today we’re releasing the NVIDIA Isaac Robotics Platform. In the Isaac simulated lab, we’ll develop the various capabilities robots need to navigate. It runs on the Isaac SDK.

It includes lots of software and a simulator, together with a small computer called Jetson.

11:22 – We now have an open DRIVE platform, and we’re working with 370 partners that are developing on NVIDIA DRIVE – sensor companies, startups, suppliers, mobility service companies, cars, trucks. At CES just a few months ago we had 320 partners. That’s now grown by 50.

11:21 – Civilization drives 10 trillion miles each year. And in the U.S., there are 770 accidents for every billion miles driven. A fleet of 30 test cars covers 1 million miles per year. We’re trying to build a system that’s better than humans, and proving that through real-world driving alone isn’t possible.

So we’re recreating reality in all conditions, with fidelity and performance that simulates reality. We can use this to create extreme, corner-case scenarios.

We call this DRIVE Sim – imagine how much technology is brought to bear, the image generator which generates the world. We’ve replaced the person with an AV computer, a self-driving car computer. Inside it is DRIVE Xavier and Pegasus.

They’re connected so it perceives this virtual world around it – we call this DRIVE CONSTELLATION.
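The arithmetic behind the 11:21 numbers is worth spelling out. Here is a rough sketch in Python that uses only the figures quoted above. The variable and function names are ours, and it assumes the quoted U.S. accident rate applies uniformly to every mile driven, which is a simplification.

```python
# Figures quoted in the 11:21 entry.
ACCIDENTS_PER_BILLION_MILES = 770   # U.S. rate
FLEET_MILES_PER_YEAR = 1_000_000    # the 30-car test fleet
ONE_BILLION = 1_000_000_000


def years_to_drive(miles, fleet_miles_per_year=FLEET_MILES_PER_YEAR):
    """Years the test fleet would need to cover the given number of miles."""
    return miles / fleet_miles_per_year


# Accidents a human-level driver would be expected to have in one fleet-year
# (assumption: the quoted rate holds for every mile).
expected_accidents_per_fleet_year = (
    ACCIDENTS_PER_BILLION_MILES * FLEET_MILES_PER_YEAR / ONE_BILLION
)

# Average distance between accidents at the quoted rate.
miles_per_accident = ONE_BILLION / ACCIDENTS_PER_BILLION_MILES

print(f"Expected accidents per fleet-year: {expected_accidents_per_fleet_year:.2f}")
print(f"Average miles between accidents:   {miles_per_accident:,.0f}")
print(f"Years for the fleet to log 1B miles: {years_to_drive(ONE_BILLION):,.0f}")
```

At these rates a test fleet would see less than one human-level accident per year, which is far too few events to show statistically that a system is safer than people. That gap is the case for simulation.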

11:13 – Why don’t we just put a car and demo it? Well, we’re trying to create an autonomous vehicle driving flow and infrastructure so the entire industry can take advantage of this and create the future of autonomous vehicles. But every car should have the benefits of AI to monitor us, make sure our gaze is active and not sleepy.

We’re now trying to build an autonomous vehicle system based on computers which are software defined. If you’re a software engineer, you care about the architecture. We’re a software company, we care about architecture and want to make the software better and better over time.

We created an architecture roadmap. It started with DRIVE PX Parker with one chip, which evolved to DRIVE PX 2 with four chips. We then created DRIVE Xavier, which contains the computational capabilities of those four chips, shrunk into one. It’s the largest single chip we’ve ever made short of Volta, with 9B transistors, and it’s now sampling.

From Xavier we created a four-chip system with two Xaviers and two Voltas. This 300-watt computer is being used in robotic cars, which will be in production later this year. It’s auto grade, super energy efficient, and ASIL-D, meaning it stands up to the highest standard of functional safety.

It’s a rigor like you’ve never seen before. 100 percent of IP was created here at NVIDIA. It’s a gigantic investment and we take it seriously, but we’re not stopping here.

As powerful as DRIVE Pegasus is, multiple units are being used in self-driving cars. Our next step is called Orin – we’ll take eight chips, two Pegasuses, and put them into two Orins. This is our DRIVE roadmap.

11:02 – Jensen now details NVIDIA’s Perception infrastructure.

  • Every car is collecting petabytes of data, which we label in a data factory – 1,500 people labeling 1 million items a month
  • We train on NVIDIA DGX systems, which we then validate and verify
  • In the end it creates networks. We now have 10 networks in the car, with 10 DGXs assigned to each network

These 10 networks cover perception, free space perception, distance perception, weather, lidar perception, camera-based mapping, camera localization to HD maps, lidar localization to HD maps, path perception, and scene perception.

Jensen then shows what he calls a home-made video that highlights this wide range of neural nets running simultaneously. It shows how a self-driving car can operate and handle stop lights, left-hand turns, and other complexities.

There are several thousand engineers working on this and they’ll work for two or three more years before we begin shipping in volume. This is one of the most complex computing problems we’ve ever encountered, and it’s multiplied exponentially by safety concerns.

10:55 – We now shift to transportation. Everything that moves, Jensen says, will become autonomous.

The reasons: people are moving farther from cities because of overcrowding; online shopping requires atoms to be driven to us; another 1 billion vehicles will come into society in the next 12 years; and parking lots are required and are being built in city centers.

Jensen says, “Safety is the single most important thing. It’s the hardest computing problem. With the fatal accident, we’re reminded that this work is vitally important. We need to solve this problem step by step by step because so much is at stake. We have the opportunity to save so many lives if we do it right.”

This is the ultimate deep learning, AI problem. We have to manage faults even when we detect them. The bar for functional safety is really, really high. We’ve dedicated our last five to seven years to understanding this system. We are trying to understand this from end to end. There are four pillars: collecting data, training models, simulation, driving.

10:48 – We now have up to 40K downloads of TensorRT (TRT) and it’s picking up speed. Companies want to deploy AI software into devices, supercomputers, and companies with SAP. With our new battery of inferencing acceleration, we can now literally blanket the world with this new approach.

To recap, for the NVIDIA AI Platform we announced:

  • We’re doubling the memory of Volta V100 to 32GB
  • The new DGX-2 with V100 32GB
  • NGC is now on AWS, Google Cloud, AliCloud, and Oracle
  • The NVIDIA GPU cloud with 30 optimized containers
  • NVIDIA AI inference
  • TITAN V is still out of stock

NVIDIA Research gives me great joy. It’s now 200 people strong, led by Bill Dally, the former chairman of Stanford’s computer science department. Incredible productivity. What’s special is that it’s a hub-and-spoke program; they all work together. Its inventions include NVSwitch and cuDNN. They’ve done enormous work in progressing GANs. They’ve also done new work in noise-to-noise denoising – it’s basically super ray tracing, using artificial intelligence to predict where pixels should be.

10:40 – How do we deploy all this into the data center?

There’s something called Kubernetes. Kubernetes on NVIDIA GPUs will bring you so much joy. It allows us to take massive workloads servicing billions of people and orchestrate them as they come in across the sea of servers in a datacenter – and it’s now GPU aware.

Jensen now shows flower identification. It’s a demo he’s given us before. It starts by recognizing four flowers a second while running on CPUs.

Running on NVIDIA GPUs with Volta it identifies 873 per second on just one GPU, same network.

Running in Kubernetes, which makes replicas of the job, it can now recognize almost 7K images a second.

“This is like magic,” Jensen says.

Kubernetes assigns a pod onto one GPU, or many GPUs on one server, or many GPUs on many servers, even across many datacenters. It all happens invisibly.
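To put the flower demo numbers side by side, here is a small back-of-the-envelope sketch in Python. The three throughput figures come from the entry above. The replica count was not stated on stage, so the value computed here is only what the numbers imply if each replica runs on one GPU and throughput scales linearly, both of which are our assumptions.

```python
# Throughput figures quoted in the 10:40 entry (images per second).
CPU_IMAGES_PER_SEC = 4
ONE_GPU_IMAGES_PER_SEC = 873
CLUSTER_IMAGES_PER_SEC = 7_000  # "almost 7K", so treat this as an upper bound

# Speedup of a single Volta GPU over the CPU run, same network.
gpu_speedup = ONE_GPU_IMAGES_PER_SEC / CPU_IMAGES_PER_SEC  # 218.25

# Assumption: one replica per GPU and linear scaling across replicas.
implied_replicas = CLUSTER_IMAGES_PER_SEC / ONE_GPU_IMAGES_PER_SEC

print(f"One GPU vs. CPU: {gpu_speedup:.0f}x")
print(f"Replicas implied by ~7K images/sec: about {implied_replicas:.0f}")
```

Under those assumptions the Kubernetes run works out to roughly eight single-GPU replicas, which would match a single eight-GPU server, though the blog does not confirm the hardware used.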

10:34 – There are 30M hyperscale servers in the world.

We started working on TensorRT, which takes computational graphs and optimizes them according to the principle of PLASTER.

TensorRT was introduced in September 2016, TensorRT 2 in April 2017, and TensorRT 3 in September 2017. Now there’s TensorRT 4 – it can handle recurrent neural networks and has deep integration into TensorFlow.

When you’re done training the network you can run it directly on the device.

We now have full optimization across the entire software stack – TensorFlow, Kaldi optimization, ONNX, WinML.

We can now accelerate voice, speech, NLP, natural language understanding.

We’ve pulled together the results: images are accelerated 190X, NLP 50X, recommender engines 45X, speech 36X, and speech recognition 60X.

In aggregate we speed up hyperscale datacenters by 100X. We’ll save a lot of money.

10:29 – There’s a single slide, white font against a black background, that reads PLASTER. It’s an acronym for an important principle in machine learning. He spells it out:

  • Programmability
  • Latency
  • Accuracy
  • Size
  • Throughput
  • Energy efficiency
  • Rate of learning

This is something I want you to remember, Jensen says. Inference is complicated, it’s not easy. Hyperscale datacenters are the most complicated computers ever made.

Building one computer for one user is hard. Building a supercomputer? It can’t possibly be easy. Inference is hard but PLASTER underscores each important part of it.

10:26 – “We are all in on AI,” Jensen says. The computation is growing exponentially. Deep learning models are growing in effectiveness at a double exponential – more data and more computing together are creating a double exponential for AI.

The system is complicated. The software is complicated. So, we’ve created containers to hold optimized software. Think of them as tupperware. We call it the NVIDIA GPU Cloud.

These NGC containers are fabulous. Irrespective of what cloud, you can use the same stack. 20K registered users, 30 containers, up from just a handful last year.

NGC has been certified on the datacenters we run on – AWS, Google Cloud, Oracle Cloud, and AliCloud have all been certified. It’s the only architecture that runs on any cloud.

10:23 – AlexNet five years ago took six days to train with two GTX 580s. That can now be done in 18 minutes on DGX-2.
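For a sense of scale, the two training times quoted above can be turned into a speedup factor. This is a minimal sketch using only those two figures. It compares wall-clock time and says nothing about how many GPUs or what software each run used.

```python
from datetime import timedelta

# Training times quoted in the 10:23 entry.
alexnet_on_two_gtx_580s = timedelta(days=6)
alexnet_on_dgx2 = timedelta(minutes=18)

# Dividing one timedelta by another yields a plain float ratio.
speedup = alexnet_on_two_gtx_580s / alexnet_on_dgx2

print(f"Speedup: {speedup:.0f}x")  # prints "Speedup: 480x"
```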

10:20 – This stop-motion animation, a slow-mo exploded view of DGX-2, was made from 112 shots. “Tim Burton, you’ve got nothing on NVIDIA.”

DGX-2 provides 10X the processing power of DGX-1 of six months ago, unveiled in September 2017.

The exploration space of AI keeps growing: the number of layers, the training rates, sweeping through different frameworks. With bigger networks and more experimentation, DGX-2 couldn’t come at a better time.

How much should we charge? That’s the question. It took hundreds of millions of dollars of engineering to create this.

It’s $399K for the world’s most powerful computer. This replaces $3M of 300 dual-CPU servers consuming 180 kilowatts. This is 1/8th the cost, 1/60th the space, 1/18th the power. “The more you buy, the more you save,” Jensen says, repeating a phrase he’s used at a number of these keynotes.
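The cost and power fractions can be checked against the raw figures. The sketch below uses the price and CPU-cluster numbers from this entry, plus the 10-kilowatt DGX-2 power figure from the 10:15 entry. The space ratio is left out because the blog gives no rack-space figures for either system.

```python
# Figures quoted in the 10:20 and 10:15 entries.
DGX2_PRICE_USD = 399_000
DGX2_POWER_KW = 10            # from the 10:15 entry
CPU_CLUSTER_PRICE_USD = 3_000_000
CPU_CLUSTER_POWER_KW = 180    # 300 dual-CPU servers

cost_ratio = CPU_CLUSTER_PRICE_USD / DGX2_PRICE_USD
power_ratio = CPU_CLUSTER_POWER_KW / DGX2_POWER_KW  # 18.0

print(f"DGX-2 costs about 1/{cost_ratio:.1f} as much")
print(f"DGX-2 draws 1/{power_ratio:.0f} the power")
```

The power ratio comes out to exactly 18, and the cost ratio to about 7.5, which the keynote rounds to 1/8th.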

10:15 – Jensen unveils the world’s largest GPU. It’s a box the size of a dazzling steamer trunk from the future. Each switch you’re looking at has 2 billion transistors, and there are 12 switches in this system.

Every GPU can speak to every GPU in a non-blocking fabric switch. All the reads, writes work perfectly. Incredibly low latency because it’s not a network, it’s a switch.

Altogether, NVIDIA’s largest graphics card is the DGX-2. It’s 2 petaflops, 512 gigabytes, 10 kilowatts. The amount of airflow needed to cool it is amazing. It weighs 350 pounds.

“No humans could lift it,” Jensen says.

It has 200 times the bandwidth of the highest speed NICs on the planet. It has eight NICs from Mellanox. It has two CPUs and 30 terabytes of storage on the system because we’re going to crunch through data very fast.

10:08 – AlexNet, a pioneering network that won the ImageNet competition five years ago, has spawned thousands of AI networks. What started out with eight layers and millions of parameters is now hundreds of layers with billions of parameters. The growth is 500x in five years. Moore’s law would only have suggested 10X.

Jensen calls it a Cambrian Explosion of networks – convolutional, recurrent, generative adversarial, and reinforcement networks.

So, the world wants a “gigantic GPU.”

Today, I announce the world’s largest GPU – it’s 16 Volta equivalents connected by 12 new NVSwitches. It creates the equivalent of a 512-gigabyte memory. The way you address the memory uses the very same software. In total, 14 terabytes per second of aggregate bandwidth. 1,440 movies could be transferred over this switch in one second.

Altogether, 81,900 CUDA cores, 2 petaflops. I told you before the fastest supercomputer in the world is 125 petaflops, the fastest in the U.S. is 100 petaflops. And this is 2 petaflops.
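The headline DGX-2 totals follow from multiplying per-GPU figures by 16. The sketch below shows that arithmetic. The 32GB memory size and the 2-petaflop total come from the keynote. The per-GPU CUDA core count of 5,120 is the published V100 specification rather than something stated in this blog, so treat it as an outside assumption.

```python
GPU_COUNT = 16
MEMORY_GB_PER_GPU = 32        # V100 32GB, announced in the 10:03 entry
CUDA_CORES_PER_GPU = 5_120    # assumption: published V100 spec, not from this blog
TOTAL_PETAFLOPS = 2           # quoted for the whole system

total_memory_gb = GPU_COUNT * MEMORY_GB_PER_GPU            # 512
total_cuda_cores = GPU_COUNT * CUDA_CORES_PER_GPU          # 81920
teraflops_per_gpu = TOTAL_PETAFLOPS * 1_000 / GPU_COUNT    # 125.0

print(f"Memory:     {total_memory_gb} GB")
print(f"CUDA cores: {total_cuda_cores:,}")
print(f"Per GPU:    {teraflops_per_gpu:.0f} teraflops")
```

The core count of 81,920 is what the keynote rounds to 81,900.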

10:03 – Deep learning has completely turbocharged AI. It detects important data and creates knowledge representations. If you give it enough data, it will become more robust and recognize a larger diversity of that space. It becomes more intelligent. The thing deep learning needs is a ton of data. And a ton of compute.

Our strategy at NVIDIA is to advance GPU computing at the speed of light, from processor development to interconnect to software layers, making it available to CSPs everywhere. Wherever you are, we’ll support it, from end to end. So, frameworks will be deployable at large scale, and we can even make it possible to access our platform however you want: if you want to build a personal supercomputer or rent time in the cloud, or build your own supercomputer cluster. We’re making it possible to work at the speed of light.

First we’re doubling our GPU. V100 is now V100 32 gig – it’s now twice as big as the previous 16 gig version. It’s available as of now and in the cloud shortly. It’s in volume production.

9:59 – You can scan using your existing scanners and we’re able to put images through a supercomputer and infer images. This is a medical imaging supercomputer. Working with us are dozens of healthcare companies, research institutions, and startups.

9:57 – Jensen shows a 15-year-old traditional medical scan. Stream the ultrasound information into the data center, and you can now get a volumetric, modern view of it. Using additional AI technology, it can infer what details like the left ventricle of a heart would look like. It segments it out in 3D. “That’s pretty incredible,” Jensen says.

9:54 – Work we do in modern medical imaging is one of the things I’m proudest of, Jensen says.

Each mode of medical imaging in recent years has been revolutionized with deep learning.

He shows an ultrasound that’s 15 years old and compares it to a contemporary one. You can see the difference between gray pixels and a beautifully rendered fetus, with accurate flesh tones.

This technology is used everywhere, CT, MRI, PET scans. We can now reconstruct images better than ever before. We can visualize images in a way that releases more insight with cinematic rendering.

The entire software stack, the solvers and libraries, integrated into this is identical to what we’ve talked about. The unfortunate thing is there are around 3 million medical instruments installed, but only 100K sold each year. It would take 30 years to update everything.

To avoid this we have an initiative, Project Clara – a virtualized data center that’s remote, multi-modality, multi-user. It’s a medical imaging supercomputer. It’s possible now for us to virtually update every system that’s out there.

9:49 – The benefit to science of this is incredible. A supercomputer rack would have 600 dual-CPU servers, consuming 360 kilowatts. A GPU-accelerated supercomputer with 40 quad-GPU servers consumes just 48 kilowatts. Same work, at 1/5 the cost, 1/7 the space, and 1/7 the power.

9:45 – One of the best decisions we made was to make our computing general purpose.

We started GTC because we wanted to enable industry to create applications on our GPU. We’ve advanced architecture to accelerate applications through work on deep learning.

AI software is writing software. Timing couldn’t be more perfect. We’re up to a million GPU developers. GTC attendance, at 8,500, is up 4x in five years. The number of CUDA downloads is up 5x in five years to 8M, with half of those in the last year. Total GPU flops in the world’s 50 fastest supercomputers is up 15x in five years to 370 petaflops.

We need larger computers, Jensen says. The world needs larger computers for all kinds of reasons – to make energy more efficient, predict weather, understand HIV, map the Earth’s core.

Jensen shows a chart that tracks GPU-accelerated computing’s progress from the introduction of Fermi in 2013 to Volta in 2018. That’s 25X growth in five years, whereas Moore’s Law would have suggested 10x growth.
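Multiples like "25X in five years" are easier to compare as yearly growth rates. Here is a short sketch that converts the five-year multiples quoted in this entry into compound annual rates. The helper function is ours, and it assumes growth was smooth and compounding over the period, which the keynote does not claim.

```python
def annual_growth(total_factor, years):
    """Compound annual growth rate implied by a total multiple over a period."""
    return total_factor ** (1 / years) - 1


# Five-year multiples quoted in the 9:45 entry.
five_year_multiples = {
    "GPU-accelerated computing": 25,
    "Moore's Law (as quoted)": 10,
    "GPU flops in top 50 supercomputers": 15,
    "CUDA downloads": 5,
    "GTC attendees": 4,
}

for name, factor in five_year_multiples.items():
    rate = annual_growth(factor, years=5)
    print(f"{name}: {factor}x in 5 years is about {rate:.0%} per year")
```

On that basis, 25X in five years means roughly doubling every year, while the quoted 10x for Moore’s Law works out to a bit under 60 percent a year.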

9:37 – Jensen says there are 400 games made every year; it’s one of the largest industries in the world. They use ray tracing to render the entire game in advance. As a result you get wonderful shadows and details and the whole world comes to life. The film industry uses this in 500 movies each year – every frame is rendered multiple times. Imagine, if one CPU takes hours for a frame, just how long this would take. Additionally there are 12M designers and 150,000 architects, many of whom aren’t making square buildings but round ones.

Now there are 1 billion images rendered every year, and that could go up 10x as rendering is reduced by Quadro to one-fifth the cost, one-seventh the space, and one-17th the power.

Virtually everyone is adopting RTX Technology – he presents a slide with three dozen key partners, gaming, design, film, architecture. This technology is the single most important advance in computer graphics in 15 years.

9:32 – Quadro GV100 is the world’s first workstation GPU based on Volta architecture. It also has a new interconnect called NVLink 2 that extends the programming and memory model out of our GPU to a second one. They essentially function as one GPU. These two combined have 10,000 CUDA cores, 236 teraflops of Tensor Cores, all used to revolutionize modern computer graphics, with 64GB of memory.

9:30 – Steve Parker, an NVIDIA ray-tracing expert, talks through how ray tracing is able to show light striking a surface, bouncing off the surface, and then striking additional surfaces. Recreating this means following billions of rays.

Doing so, Jensen says, takes supercomputers that can calculate these rays. The more reflections, the more refractions, the harder it is. Everything that’s being seen visually is in real time.

This complete demo is running on just one DGX Station – not a supercomputer running one frame in hours, but one DGX-Station that costs $68K, with four Voltas, doing this in real time.

So, Jensen announces NVIDIA RTX technology, which runs on a Quadro GV100 processor. It’s a big deal, he says, because now we can bring real-time ray tracing to the market. The technology has been encapsulated into multiple layers. You’re also seeing deep learning in action; without that we couldn’t trace all the rays. It predicts rays.

9:24 – Then, Jensen shows a scene that looks like it was from The Last Jedi with chatting stormtroopers, but Jensen says it was actually rendered in real time with ray tracing.

9:21 – Jensen moves right into fourth gear. He’s talking about graphics being the driving force of GPUs.

Computer scientists have been recreating photo-realistic images for decades, he notes. Ray tracing follows every photon as it bounces through a scene, based on the materials that it strikes. This is how the film industry does it, and it requires thousands of CPUs. One CPU would take hours to compute one frame and there are hundreds of thousands of frames in a movie. For four decades we’ve been trying to close the gap to create a full movie.

He says we use all kinds of different tricks to improve visuals – like ambient occlusion, light baking – so long as things don’t move very much, the light conditions are well produced. Global illumination – where light moves from every possible source and bounces accurately – helps bring a scene to life with extra-sharp detail.

But ray tracing is the most beautiful way of doing it. This is tough, especially with curved transparent surfaces like a drinking cup (regardless of what you’re drinking).

Some objects absorb light and then release some, things like jade or gummy bears, or car paint, or our skin – light goes through, picks up shades and then re-emits.

9:14 – And with that, Jensen takes the stage. He breaks out a new leather jacket every year, and in this new one he looks a bit like a superhero.

GTC is the GPU developers conference, which we do for all of you, whose work is impossible without a supercharged computer, he says.

We’re going to talk about amazing graphics, amazing science, amazing AI and amazing robots. So let’s get going, he says.

9:11 – And now an updated version of our I AM AI video swells up on the screen.

It depicts how AI can support humans in a broad field of endeavors, with a voice over that professes to be AI. And it comes with a sound track itself that’s composed by AI and played by a true-life orchestra.

“I am a protector,” says the narrator. And there’s a scene of sweeping work by the NASA FDL/Seti Institute’s project to better predict comets that might have Earth’s number. Also, some amazing work by Wildbook, which is able to identify individual members of a species that might otherwise look the same – like the zebras here. It has lots of use for tracking and helping to sustain the diversity of species.

“I am a healer,” is next. It shows how Image biopsy uses deep learning to quickly analyze the shape of the knee, with some snazzy color-coded keys for measurements.

“I am a guardian,” and then “I am a helper,” which has some cool scenes. There’s an IAmRobotics robot autonomously navigating through warehouses, identifying objects and plucking them off shelves – at a rate of 200 picks an hour. That’s faster than a water pick. And back on the keyboard is jazz pianist Jason Barnes, who lost an arm in a work accident, shown using Georgia Tech’s prosthetic tech that detects individual finger movement. And there’s a fully autonomous selfie Skydio drone with 13 onboard cameras tracking a jogger.

It’s all brought to us, the video says, by NVIDIA and brilliant minds everywhere.

9:08 – A list of sponsors, topped by IBM and Facebook, comes up. The soundtrack pumps up. You could just about bop your head to it. The NVIDIA logo comes up again, and then the sponsors, followed by the ever-connecting synapses of the visualized neural network.

9:05 – Okay, the music is beginning to shift. The blasphemously named Voice of God comes up and tells everyone in the nicest possible way to sit up and behave. The event is about to begin.

9:01 – NVIDIA starts planning for GTC virtually before the previous one wraps up. It’s an all-on challenge that Jensen leads. A ton of the work falls on the Creative department who have created a mini-Netflix full of videos for the show. The first one, the opening one, is always a highlight of the show and helps set the tone for the next couple hours. I can tell you in full confidence that this year’s is terrific.

8:56 – The stage itself is pretty cool. It’s broad, running about two-thirds the length of the wall, with the screen a deep, inky black. It’s flanked by large NVIDIA-green airplane-wing-like triangles that are interconnected. Triangles, of course, are the building block of computer graphics and, by the way, they’re the theme of our iconic new headquarters building in Santa Clara, comprised of two five-acre-sized triangular floorplates, covered with an undulating roof of interlocking triangles, with the occasional triangular skylight peeking through.

8:55 – More time than you’d think goes into picking out the walk-in track for GTC, which is reliably warm, soulful and upbeat.

Among the tunes we’re hearing, there’s “Crazy,” by the Lost Frequencies and “Glorious,” by Macklemore. Also we’ll wrap up with a couple tracks from The Greatest Showman soundtrack – “This is Me” and the eponymous “The Greatest Showman.”

There are some fabulous graphics coming up against the deep black screen. They’re green synapses, like those in the brain, that connect to other points of light. They suggest the workings of a neural network, which underpins how deep learning works, with connections being made that might otherwise not be apparent, ultimately sorting out data in an image that enables the network to recognize an apple from an appliance, a banana from a bandana, a carrot from a carob. Well, you get the point.

8:50 – Folks are beginning to come into the hall at the Convention Center. There’s a rub. It seats 4,000 or so, but there are twice as many attendees. Good thing there are spillover rooms. The biggest fans among GTC attendees got in line early. First guy showed up at 6:45. By 8:30, there was a line that ran the full length of the center and ultimately out the door, onto the sidewalk.

You may have a more comfortable seat watching the webcast, which will have tens of thousands tuning in.

8:40 – Well, we’re back at GTC, 10 months after last year’s. But, yikes, a lot’s changed, with huge progress in AI, which is changing just about every service that’s delivered and the way just about any company with data operates.

The first one of these was held down the street here in San Jose at the Fairmont Hotel, where we had about 800 attendees. This year there are 8,500. That’s growth of about 30 percent a year. And that’s just in Silicon Valley. Our seven GTCs around the world last year drew in a total of 22,000 – from Beijing and Tokyo to Tel Aviv and Washington, DC.

Monday 11 am – Less than 24 hours until NVIDIA CEO Jensen Huang delivers the keynote at our ninth annual GPU Technology Conference in Silicon Valley, and the action’s already begun.

The crowd of more than 8,000 surging into the McEnery Convention Center – which includes researchers, press, technologists, analysts and partners from all over the globe – is our largest yet.

The 600+ talks on the docket may be the best testament to the spread of GPUs into every aspect of human endeavor.

Attendees are already crowding into conference rooms to hear about how GPUs can be used to model the formation of galaxies, generate dazzling special effects for blockbuster movies, and even analyze scans of the human heart.

Their mood: happy. At least, that’s what the Emotions Demo, set up on the convention’s main concourse, tells us. The demo uses deep learning to read the facial expressions of people nearby in real time – whether they’re happy, neutral, afraid, or disgusted.

Also on the show floor: a pop-up store selling NVIDIA Gear. The best sellers? The NVIDIA “I Am AI” t-shirt, and our much sought-after NVIDIA Ruler, according to the store’s staff.

We’ll be buttonholing speakers from a broad cross-section of these talks and interviewing them for AI Podcast, where we’re recording in a sleek glass booth positioned on the show floor.

If all this makes your heart beat a little faster, check back for live updates from our keynote Tuesday. And keep an eye on our blog throughout the week for the latest news from the show.

And if you’re feeling nostalgic, check out last year’s live GTC keynote blog.