New NVIDIA Storage Partner Validation Program Streamlines Enterprise AI Deployments

Global enterprises now have multiple NVIDIA-validated storage options for NVIDIA OVX computing systems.
by Jason Schroedl

A sharp increase in generative AI deployments is driving business innovation for enterprises across industries. But it also poses significant challenges for IT teams, as long and complex infrastructure deployment cycles prevent them from quickly spinning up AI workloads using their own data.

To help overcome these barriers, NVIDIA has introduced a storage partner validation program for NVIDIA OVX computing systems. The high-performance storage partners leading the way in completing the NVIDIA OVX storage validation are DDN, Dell PowerScale, NetApp, Pure Storage and WEKA.

NVIDIA OVX servers combine high-performance, GPU-accelerated compute with high-speed storage access and low-latency networking to address a range of complex AI and graphics-intensive workloads. Chatbots, summarization and search tools, for example, require large amounts of data, and high-performance storage is critical to maximize system throughput.

To help enterprises pair the right storage with NVIDIA-Certified OVX servers, the new program provides a standardized process for partners to validate their storage appliances. They can use the same framework and testing that’s needed to validate storage for the NVIDIA DGX BasePOD reference architecture.

To achieve validation, partners must complete a suite of NVIDIA tests measuring storage performance and input/output scaling across multiple parameters that represent the demanding requirements of various enterprise AI workloads. This includes combinations of different I/O sizes, varying numbers of threads, buffered I/O vs. direct I/O, random reads, re-reads and more.
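To give a concrete sense of what such a test matrix can look like, below is a minimal, hypothetical sketch in Python that sweeps I/O size, thread count, buffered vs. direct I/O and read pattern using the open-source fio tool. NVIDIA's actual validation suite is not public, so the parameter values, mount point, job names and tool choice here are illustrative assumptions only.

```python
# Hypothetical parameter sweep, loosely modeled on the kinds of I/O
# combinations described above. Not NVIDIA's validation suite.
import itertools
import subprocess

BLOCK_SIZES = ["4k", "128k", "1m"]     # assumed I/O sizes
THREAD_COUNTS = [1, 8, 32]             # assumed thread counts
DIRECT_IO = [0, 1]                     # buffered (0) vs. direct (1) I/O
PATTERNS = ["randread", "read"]        # random vs. sequential reads (illustrative)

def run_case(bs, threads, direct, pattern, target="/mnt/ovx-storage/testfile"):
    """Run one fio case and return its JSON report as a string."""
    cmd = [
        "fio",
        f"--name={pattern}-{bs}-{threads}t-direct{direct}",
        f"--filename={target}",          # hypothetical mount point
        f"--rw={pattern}",
        f"--bs={bs}",
        f"--numjobs={threads}",
        f"--direct={direct}",
        "--size=10g",
        "--time_based",
        "--runtime=60",
        "--group_reporting",
        "--output-format=json",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

if __name__ == "__main__":
    for bs, threads, direct, pattern in itertools.product(
        BLOCK_SIZES, THREAD_COUNTS, DIRECT_IO, PATTERNS
    ):
        print(run_case(bs, threads, direct, pattern))
```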

Each test is run multiple times to verify the results and gather the required data, which is then audited by NVIDIA engineering teams to determine whether the storage system has passed.
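The exact pass criteria and auditing process belong to NVIDIA's engineering teams, but a hypothetical sketch of how repeated runs of a single test case might be summarized for that kind of review could look like the following; the repeat count and consistency tolerance are assumptions.

```python
# Illustrative summary of repeated measurements for one test case.
# NVIDIA's actual pass/fail criteria are not public; the 5% tolerance
# below is an assumed example, not a program requirement.
import statistics

def summarize_runs(throughputs_mbps, max_rel_stddev=0.05):
    """Summarize repeated throughput measurements and flag inconsistent results."""
    mean = statistics.mean(throughputs_mbps)
    stdev = statistics.stdev(throughputs_mbps)
    rel = stdev / mean
    return {
        "mean_mbps": mean,
        "rel_stddev": rel,
        "consistent": rel <= max_rel_stddev,  # assumed tolerance
    }

# Example: three repeated runs of the same test case
print(summarize_runs([5120.0, 5080.5, 5155.2]))
```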

The program offers prescriptive guidance to ensure optimal storage performance and scalability for enterprise AI workloads with NVIDIA OVX systems. But the overall design remains flexible, so customers can tailor their system and storage choices to fit their existing data center environments and bring accelerated computing to wherever their data resides.

Generative AI use cases have fundamentally different requirements than traditional enterprise applications, so IT teams must carefully consider their compute, networking, storage and software choices to ensure high performance and scalability.

NVIDIA-Certified Systems are tested and validated to provide enterprise-grade performance, manageability, security and scalability for AI workloads. Their flexible reference architectures help deliver faster, more efficient and more cost-effective deployments than building infrastructure independently from the ground up.

Powered by NVIDIA L40S GPUs, OVX servers include NVIDIA AI Enterprise software with NVIDIA Quantum-2 InfiniBand or NVIDIA Spectrum-X Ethernet networking, as well as NVIDIA BlueField-3 DPUs. They’re optimized for generative AI workloads, including training for smaller LLMs (for example, Llama 2 7B or 70B), fine-tuning existing models and inference with high throughput and low latency.

NVIDIA-Certified OVX servers are now available and shipping from global system vendors, including GIGABYTE, Hewlett Packard Enterprise, Lenovo and Supermicro. Comprehensive, enterprise-grade support for these servers is provided by each system builder, in collaboration with NVIDIA.

Availability 

Validated storage solutions for NVIDIA-Certified OVX servers are now available, and reference architectures will be published over the coming weeks by each of the storage and system vendors. Learn more about NVIDIA OVX Systems.