NVIDIA Eos Revealed: Peek Into Operations of a Top 10 Supercomputer

A blueprint for enterprises worldwide, NVIDIA’s groundbreaking DGX AI supercomputer is designed to power the next frontier in AI innovation.
by Charlie Boyle

Providing a peek at the architecture powering advanced AI factories, NVIDIA Thursday released a video that offers the first public look at Eos, its latest data-center-scale supercomputer.

An extremely large-scale NVIDIA DGX SuperPOD, Eos is where NVIDIA developers create their AI breakthroughs using accelerated computing infrastructure and fully optimized software.

Eos is built with 576 NVIDIA DGX H100 systems, NVIDIA Quantum-2 InfiniBand networking and software, providing a total of 18.4 exaflops of FP8 AI performance. This system is a sister to a separate Eos DGX SuperPOD with 10,752 NVIDIA H100 GPUs, used for MLPerf training in November.

Revealed in November at the Supercomputing 2023 trade show, Eos — named for the Greek goddess said to open the gates of dawn each day — reflects NVIDIA’s commitment to advancing AI technology.

Eos Supercomputer Fuels Innovation

Each DGX H100 system is equipped with eight NVIDIA H100 Tensor Core GPUs. Eos features a total of 4,608 H100 GPUs.
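
The system counts above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 576 systems and eight GPUs per system come from this post, while the figure of 32 petaflops of peak FP8 per DGX H100 is an assumption taken from NVIDIA's published system specifications and is not stated here. The function names are hypothetical.

```python
# Figures stated in the article.
DGX_SYSTEMS = 576
GPUS_PER_DGX = 8

# Assumption (not in the article): peak FP8 throughput quoted for one
# DGX H100 system, in petaflops.
FP8_PFLOPS_PER_DGX = 32


def total_gpus(systems: int, gpus_per_system: int) -> int:
    """Total GPU count for a cluster of identical systems."""
    return systems * gpus_per_system


def total_fp8_exaflops(systems: int, pflops_per_system: float) -> float:
    """Aggregate peak FP8 throughput in exaflops (1 exaflop = 1,000 petaflops)."""
    return systems * pflops_per_system / 1000


if __name__ == "__main__":
    print(total_gpus(DGX_SYSTEMS, GPUS_PER_DGX))
    print(round(total_fp8_exaflops(DGX_SYSTEMS, FP8_PFLOPS_PER_DGX), 1))
```

Under that assumption, the totals line up with the 4,608 GPUs and 18.4 exaflops of FP8 performance cited for Eos. These are peak figures, not measured application throughput.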

As a result, Eos can handle the largest AI workloads, including training large language models, recommender systems, quantum simulations and more.

It’s a showcase of what NVIDIA’s technologies can do when working at scale.

Eos is arriving at the perfect time. People are changing the world with generative AI, from drug discovery to chatbots to autonomous machines and beyond.

To achieve these breakthroughs, they need more than AI expertise and development skills. They need an AI factory: a purpose-built AI engine that’s always available and can help ramp their capacity to build AI models at scale.

Eos delivers. Ranked No. 9 in the TOP500 list of the world’s fastest supercomputers, Eos pushes the boundaries of AI technology and infrastructure.

It includes NVIDIA’s advanced accelerated computing and networking alongside sophisticated software offerings such as NVIDIA Base Command and NVIDIA AI Enterprise.

Eos’s architecture is optimized for AI workloads demanding ultra-low-latency and high-throughput interconnectivity across a large cluster of accelerated computing nodes, making it an ideal solution for enterprises looking to scale their AI capabilities.

Based on NVIDIA Quantum-2 InfiniBand with In-Network Computing technology, its network architecture supports data transfer speeds of up to 400 Gb/s, facilitating the rapid movement of large datasets essential for training complex AI models.
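
To give a sense of what 400 Gb/s means in practice, here is a rough back-of-the-envelope sketch. It assumes a single link running at the full quoted rate with no protocol overhead or contention, which real transfers will not achieve, and the 1 TB dataset size is a hypothetical example rather than a figure from this post.

```python
# Line rate quoted in the article, in gigabits per second.
LINK_GBPS = 400


def transfer_seconds(dataset_bytes: float, link_gbps: float = LINK_GBPS) -> float:
    """Idealized time to move a dataset over one link at full line rate.

    Ignores protocol overhead, congestion and storage limits, so the
    result is a lower bound rather than an expected transfer time.
    """
    bits = dataset_bytes * 8
    return bits / (link_gbps * 1e9)


if __name__ == "__main__":
    one_terabyte = 1e12  # hypothetical dataset size, in bytes
    print(transfer_seconds(one_terabyte))
```

At that idealized rate, a 1 TB dataset would cross one link in about 20 seconds.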

At the heart of Eos lies the groundbreaking DGX SuperPOD architecture powered by NVIDIA’s DGX H100 systems.

The architecture is built to provide the AI and computing fields with tightly integrated full-stack systems capable of computing at an enormous scale.

As enterprises and developers worldwide seek to harness the power of AI, Eos stands as a pivotal resource, promising to accelerate the journey towards AI-infused applications that fuel every organization.

Editor’s note: This post was updated on Feb. 19, 2024, to clarify that there are two Eos systems.