NVIDIA to Acquire GPU Orchestration Software Provider Run:ai

Israeli startup promotes efficient cluster resource utilization for AI workloads across shared accelerated computing infrastructure.
by Alexis Bjorlin

To help customers make more efficient use of their AI computing resources, NVIDIA today announced it has entered into a definitive agreement to acquire Run:ai, a Kubernetes-based workload management and orchestration software provider.

Customer AI deployments are becoming increasingly complex, with workloads distributed across cloud, edge and on-premises data center infrastructure.

Managing and orchestrating generative AI, recommender systems, search engines and other workloads requires sophisticated scheduling to optimize performance at the system level and on the underlying infrastructure.

Run:ai enables enterprise customers to manage and optimize their compute infrastructure, whether on premises, in the cloud or in hybrid environments.

The company has built an open platform on Kubernetes, the orchestration layer for modern AI and cloud infrastructure. It supports all popular Kubernetes variants and integrates with third-party AI tools and frameworks.

Run:ai customers include some of the world’s largest enterprises across multiple industries, which use the Run:ai platform to manage data-center-scale GPU clusters.

“Run:ai has been a close collaborator with NVIDIA since 2020 and we share a passion for helping our customers make the most of their infrastructure,” said Omri Geller, Run:ai cofounder and CEO. “We’re thrilled to join NVIDIA and look forward to continuing our journey together.”

The Run:ai platform provides AI developers and their teams:

  • A centralized interface to manage shared compute infrastructure, enabling easier and faster access for complex AI workloads.
  • Functionality to add users, curate them under teams, provide access to cluster resources, control over quotas, priorities and pools, and monitor and report on resource use.
  • The ability to pool GPUs and share computing power — from fractions of GPUs to multiple GPUs or multiple nodes of GPUs running on different clusters — for separate tasks.
  • Efficient GPU cluster resource utilization, enabling customers to gain more from their compute investments.

NVIDIA will continue to offer Run:ai’s products under the same business model for the immediate future. And NVIDIA will continue to invest in the Run:ai product roadmap, including enabling on NVIDIA DGX Cloud, an AI platform co-engineered with leading clouds for enterprise developers, offering an integrated, full-stack service optimized for generative AI.

NVIDIA HGX, DGX and DGX Cloud customers will gain access to Run:ai’s capabilities for their AI workloads, particularly for large language model deployments. Run:ai’s solutions are already integrated with NVIDIA DGX, NVIDIA DGX SuperPOD, NVIDIA Base Command, NGC containers, and NVIDIA AI Enterprise software, among other products.

NVIDIA’s accelerated computing platform and Run:ai’s platform will continue to support a broad ecosystem of third-party solutions, giving customers choice and flexibility.

Together with Run:ai, NVIDIA will enable customers to have a single fabric that accesses GPU solutions anywhere. Customers can expect to benefit from better GPU utilization, improved management of GPU infrastructure and greater flexibility from the open architecture.