Bring Your A(I) Game to the Data Center at GTC Digital

Learn about NVIDIA DGX systems from the world’s leading AI infrastructure experts.
by Chris Kawalek

Editor’s note: GTC San Jose has changed to GTC Digital, a fully online, interactive event. The sessions below will be released over a four-week period, with new sessions publishing each Thursday.

AI is the next great technology transforming virtually everything we do — health and wellness, business and technology, transportation and entertainment, and much more.

At the heart of AI initiatives around the globe are data centers supercharged with NVIDIA DGX systems. To learn how you can leverage our AI expertise to advance your projects in 2020, there’s no better place than GTC Digital. Register for free today.

For those just getting started, several sessions focused on NVIDIA DGX systems and the data center can help get you up to speed quickly. And if you’re already an expert, you can learn new and innovative ways to approach your AI infrastructure.

Whatever part of the journey you’re on, these sessions and many more at GTC Digital can help you establish an AI center of excellence, opening the door to more AI-driven breakthroughs for your organization.

Validating DGX-POD AI Reference Architectures with NVIDIA DGX Storage Partners (S21738)

  • Jacci Cenci-McGrody, senior technical marketing engineer, NVIDIA

NVIDIA DGX SuperPOD is a first-of-its-kind AI supercomputing infrastructure that delivers groundbreaking performance and deploys in two weeks as a fully integrated system. This session will provide insights on planning, design considerations, and best practices for validating reference architectures for a flexible and scalable AI infrastructure.

Multiphysics Software Development in the Age of AI (P21687)

  • Christopher Lamb, vice president, compute software, NVIDIA
  • Sukirt S., engineer, system software, NVIDIA
  • Maziar Raissi, senior software engineer, NVIDIA
  • Oliver Hennigh, software engineer, NVIDIA
  • Sanjay Choudhry, senior director, Compute Software, NVIDIA
  • Susheela Narasimhan, senior thermal engineer, NVIDIA

Learn about the main ideas of physics-informed neural networks in the context of an industrial multiphysics problem — namely, designing a heat sink for NVIDIA DGX systems. The speakers address several major drawbacks encountered with the traditional methods of solving partial differential equations (PDEs) in terms of usability, speed, scalability, and expertise.

Scalable Speech Recognition with GPUs: From Cloud to Edge (S21263)

  • Vitaly Lavrukhin, senior applied research scientist, NVIDIA
  • Jocelyn Huang, deep learning software engineer, NVIDIA

The speakers will present their latest automatic speech recognition models that reach state-of-the-art accuracy while having almost 10x fewer parameters than Jasper, the previous flagship model. The small model size enables deployment on a broad spectrum of GPU-accelerated devices, from NVIDIA DGX systems to the tiny Jetson Nano.

Performance and Model Fidelity of BERT Training from a Single DGX Through DGX SuperPOD (S21385)

  • Chris Forster, senior CUDA algorithms software engineer, NVIDIA
  • Thor Johnsen, senior deep learning software engineer, NVIDIA

BERT is a natural language processing model that performs well on a wide variety of tasks, including question answering, natural language inference, and classification. The speakers will cover how you can use their open-source code to train BERT models, along with some of the challenges of delivering both computational performance and model fidelity on large distributed machines, such as NVIDIA DGX SuperPOD, and the solutions they found.

Optimization Strategies for Large-Scale DL Training Workloads: Case Study with RN50 on DGX Clusters (S21733)

  • Arslan Zulfiqar, senior architect, NVIDIA
  • Joshua Mora Acosta, principal architect, NVIDIA

This tutorial will present a list of optimizations for large-scale deep learning training workloads. Speakers will showcase those optimization strategies by training RN50 (ResNet-50) on large clusters of NVIDIA DGX-1 and DGX-2 systems with up to 1,500 GPUs, where the optimizations delivered a 2x performance improvement on the same amount of hardware.

Scaling the Transformer Model Implementation in PyTorch Across Multiple Nodes (S21351)

  • Arslan Zulfiqar, senior architect, NVIDIA
  • Robert Knight, software engineer, NVIDIA

This session takes a behind-the-scenes deep dive into the Transformer model implementation in PyTorch to identify its performance weaknesses and make it scale across multiple nodes. Speakers will present an analysis of system-level profiling data from an example Transformer workload spanning multiple NVIDIA DGX-2 systems.

Overcoming Latency Barriers: Strong Scaling HPC Applications with NVSHMEM (S21673)

  • Mathias Wagner, senior developer technology engineer, NVIDIA

For scientific advancement through HPC, ever-increasing simulation capabilities are not the only key to success. Obtaining timely results is often even more important. Reducing the time-to-solution generally requires the application to scale strongly, meaning a fixed-size problem runs faster as more GPUs are added. However, scaling up improved single-GPU performance faces many obstacles. Learn how to improve strong scaling on systems equipped with NVIDIA GPUs. The speakers will show results obtained on NVIDIA DGX-1 and DGX-2 systems, as well as scaling to 1,000 GPUs in InfiniBand-connected systems, including Summit.
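For readers new to the term, strong scaling can be illustrated with a few lines of Python. This is only a rough sketch and is not taken from the session: the function names are hypothetical, and the 1 percent serial fraction is an assumed value chosen to show how quickly per-GPU efficiency can fall off under Amdahl's law when the problem size stays fixed.

```python
def strong_scaling(t_one_gpu, t_n_gpus, n_gpus):
    """Return (speedup, parallel efficiency) for a fixed-size problem.

    t_one_gpu and t_n_gpus are measured wall-clock times in the same unit.
    """
    speedup = t_one_gpu / t_n_gpus
    efficiency = speedup / n_gpus
    return speedup, efficiency


def amdahl_speedup(serial_fraction, n_gpus):
    """Upper bound on speedup (Amdahl's law) when a fraction of the
    runtime cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_gpus)


if __name__ == "__main__":
    # Assumed value for illustration only: 1% of the runtime is serial
    # (for example, latency-bound communication or launch overhead).
    serial_fraction = 0.01
    for n in (1, 8, 64, 1000):
        bound = amdahl_speedup(serial_fraction, n)
        print(f"{n:5d} GPUs: speedup <= {bound:7.2f}, "
              f"efficiency <= {bound / n:.2%}")
```

Under these assumptions, even a small non-parallel share of the runtime caps the achievable speedup at large GPU counts, which is why latency becomes the dominant obstacle when strong scaling.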

To learn more about the conference and register for free, visit the GTC Digital website.