NVIDIA and Mellanox introduced new software yesterday that will increase cluster application performance by as much as 30% by reducing the latency that occurs when communicating over Mellanox InfiniBand to servers equipped with NVIDIA Tesla GPUs.
The system architecture of a GPU-CPU server requires the CPU to initiate and manage memory transfers between the GPU and the InfiniBand network. The new software will enable Tesla GPUs to transfer data to pinned system memory that a Mellanox InfiniBand solution can then read and transmit over the network. The result is increased overall system performance and efficiency.
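The difference between the two data paths can be illustrated with a toy model. The sketch below makes no real GPU or InfiniBand calls; it only counts buffer copies. It assumes, beyond what the announcement states, that the CPU-managed path stages data through two separate pinned host buffers (one for the GPU, one for the network adapter), while the new path uses one pinned buffer that both sides can access. The names `Buffer`, `send_staged`, and `send_shared` are hypothetical and exist only for this illustration.

```python
"""Toy model of two GPU-to-network data paths (no real GPU or InfiniBand calls)."""


class Buffer:
    """A named block of memory, standing in for GPU or pinned host memory."""

    def __init__(self, name, size):
        self.name = name
        self.data = bytearray(size)


def copy(src, dst, log):
    """Copy src into dst and record the hop in the log."""
    dst.data[:] = src.data
    log.append(f"{src.name} -> {dst.name}")


def send_staged(payload):
    """Assumed CPU-managed path: separate pinned buffers for the GPU and the NIC."""
    log = []
    gpu = Buffer("gpu_memory", len(payload))
    gpu.data[:] = payload
    gpu_pinned = Buffer("gpu_pinned_host", len(payload))
    nic_pinned = Buffer("nic_pinned_host", len(payload))
    copy(gpu, gpu_pinned, log)         # GPU writes to its own pinned buffer
    copy(gpu_pinned, nic_pinned, log)  # CPU copies between the two host buffers
    return bytes(nic_pinned.data), log  # NIC reads its buffer and transmits


def send_shared(payload):
    """Assumed new path: one pinned buffer readable by both the GPU and the NIC."""
    log = []
    gpu = Buffer("gpu_memory", len(payload))
    gpu.data[:] = payload
    shared = Buffer("shared_pinned_host", len(payload))
    copy(gpu, shared, log)              # GPU writes to the shared pinned buffer
    return bytes(shared.data), log      # NIC reads the same buffer and transmits


if __name__ == "__main__":
    message = b"halo exchange data"
    for path in (send_staged, send_shared):
        wire, hops = path(message)
        print(path.__name__, len(hops), "copies:", hops)
```

In this model the shared-buffer path delivers the same bytes with one fewer copy, which is the kind of saving the announcement attributes to the new software. Real latency gains depend on hardware and drivers that this sketch does not capture.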
Prof. Satoshi Matsuoka from the Tokyo Institute of Technology spoke about the impact this technology will have on the institute's next-generation supercomputer, TSUBAME 2.0:
“NVIDIA Tesla GPUs deliver large increases in performance across each node in a cluster, but in our production runs on TSUBAME 1 we have found that network communication becomes a bottleneck when using multiple GPUs,” said Matsuoka. “Reducing the dependency on the CPU by using InfiniBand will deliver a major boost in performance in high performance GPU clusters, thanks to the work of NVIDIA and Mellanox, and will further enhance the architectural advances we will make in TSUBAME2.0.”