As GPU-accelerated applications in areas such as AI, data analytics, computer-aided design and computer-generated imagery have become more mission-critical, companies are faced with the question of how to support these workloads at scale. They can no longer afford to spend time designing and building custom hardware every time they need to deploy another accelerated application.
The NVIDIA-Certified Systems program answers this need by bringing together NVIDIA GPUs and NVIDIA networking in systems from leading vendors. The systems conform to NVIDIA’s design best practices and pass a set of certification tests that validate the best configurations for performance, manageability, scalability and security.
By choosing an NVIDIA-Certified System, enterprises can confidently deploy preconfigured, performance-optimized servers to power accelerated computing workloads at any scale.
New Partners, GPU Choices
The NVIDIA-Certified Systems program has seen tremendous uptake since it was first announced in January. New systems have been certified from partners such as ASUS, Atos, BOXX Technologies, Fujitsu, H3C, Lenovo, Nettrix and QCT.
Certifications also now include the NVIDIA A40, for the best graphics capabilities, and the NVIDIA T4 Tensor Core GPU, for more economical, lower-powered systems. These are in addition to the NVIDIA A100, for customers seeking the best compute performance. For a complete list of supported GPU and networking components, see the NVIDIA-Certified Systems page.
To date, we have almost 40 NVIDIA-Certified Systems from nearly a dozen partners, with more being added every month. Consult the Qualified Server Catalog page to see which servers and GPUs have been certified, or ask your preferred vendor which certified servers they have available. We’ll be adding new NVIDIA GPUs to the certification program in the future, including the recently announced A30 and A10.
Expanded Workload Coverage
One of the main benefits of using an NVIDIA-Certified System is a configuration that supports a wide variety of accelerated workloads. The certification test suite checks the performance and functionality of each server design by running a set of software that represents real-world applications of many types.
Since the introduction of this program, we’ve expanded the certification test suite to encompass a greater range of representative workloads and exercise the servers in even more ways. Some of the applications that are part of the testing include:
- Deep learning training with TensorFlow and PyTorch, including multi-node training
- AI inference with TensorRT and Triton Inference Server
- Data science with RAPIDS and Apache Spark
- Core acceleration algorithms with CUDA and the NVIDIA HPC SDK
- Batch rendering with Blender, Octane, Redshift and V-Ray
The tests also include end-to-end AI application workflows, which can exercise a system in multiple ways and validate good configurations for real-life use. These tests are performed using NVIDIA AI frameworks from the NVIDIA NGC catalog, such as NVIDIA DeepStream for intelligent video analytics, NVIDIA Clara for healthcare applications and NVIDIA Riva for conversational AI.
In addition, there are a number of functional tests designed to ensure that servers are configured for best manageability, security and scalability. These tests include:
- Remote management with Redfish
- Host security with TPM, ChipSec and UEFI
- Network performance
- Accelerated data transfer using GPUDirect RDMA and GPUDirect Storage
In summary, the test suite simulates the applications and use cases that enterprise customers will encounter in data centers. The systems must pass the performance threshold for all these different tests and only then are they certified.
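As a rough illustration of that all-or-nothing rule, here is a minimal Python sketch. The test names, scores and thresholds are invented for the example and do not reflect the actual certification suite or its metrics; the only point is that a single result below its threshold blocks certification.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class CheckResult:
    """One test outcome. All field values used below are hypothetical."""
    name: str         # illustrative test label, not an official test name
    score: float      # measured value, where higher is better
    threshold: float  # minimum value required to pass

    @property
    def passed(self) -> bool:
        return self.score >= self.threshold


def is_certified(results: Sequence[CheckResult]) -> bool:
    """True only if at least one test ran and every test met its threshold."""
    return bool(results) and all(r.passed for r in results)


def failed_tests(results: Sequence[CheckResult]) -> List[str]:
    """Names of the tests that fell below their threshold."""
    return [r.name for r in results if not r.passed]


# Illustrative numbers only, not real certification data.
results = [
    CheckResult("dl_training", score=1.05, threshold=1.0),
    CheckResult("inference", score=0.98, threshold=1.0),
    CheckResult("network", score=1.20, threshold=1.0),
]

print(is_certified(results))  # False
print(failed_tests(results))  # ['inference']
```

In this made-up run, two of the three tests clear their thresholds, yet the system would not qualify because the inference result falls short.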
Fully Supported Enterprise Software
For companies looking to bring modern accelerated computing from the realm of data science and developers into their mainstream applications, NVIDIA is now offering fully supported, enterprise-grade software packages.
NVIDIA-Certified Systems provide the server platform for running these applications. The systems’ optimized design, predictable performance and ability to scale out make them the best choice for businesses that want to implement an enterprise-grade accelerated computing solution.
Announced last month at NVIDIA GTC:
- NVIDIA AI Enterprise — an end-to-end, cloud-native suite of AI and data analytics software that is optimized, certified and supported by NVIDIA to run on VMware vSphere with NVIDIA-Certified Systems. It includes key enabling technologies and software from NVIDIA for rapid deployment, management and scaling of AI workloads in the modern hybrid cloud. NVIDIA runs tests of the NVIDIA AI Enterprise software to validate optimal performance and functionality on vSphere with NVIDIA-Certified Systems.
- NVIDIA Omniverse Enterprise — a revolutionary platform for virtual collaboration and true-to-reality simulation. Globally dispersed teams can accelerate their workflows with one-click interoperability between leading software tools and seamlessly collaborate in a shared virtual world. The platform includes licensable software and full enterprise support. NVIDIA Omniverse Enterprise is optimized to run on NVIDIA-Certified Systems in the data center, as well as NVIDIA RTX laptops and workstations.
NVIDIA-Certified Systems Configuration Guide
NVIDIA-Certified Systems are validated to have the best base configuration for running general accelerated computing workloads. Customers can adjust this configuration to better match the primary workload they intend to run on these systems. For example, if they plan to do deep learning training on large models, then they might want to add more GPUs to their servers.
To assist customers with this process, the NVIDIA-Certified Systems Configuration Guide provides the server topology and system configuration recommendations for inference and deep learning training, with other workloads to be added in the future. The guide details component sizing and balancing, PCIe topology, storage and more. When a customer wishes to adjust the base configuration of an NVIDIA-Certified System, they can use this new guide to ensure they maintain an optimized design.
Learn more about NVIDIA-Certified Systems with these resources: