Bank on It: Tesla Platform Shatters Record on Risk-Management Benchmark

The newest addition to the Tesla Accelerated Computing Platform – the Tesla K80 GPU – has been on the market for just a few weeks and is already breaking records.

STAC, an industry benchmarking organization that includes major financial institutions, just released results showing that the Tesla platform and K80 achieved a rare milestone.

Tesla swept performance records across the board in one of the industry’s most closely watched benchmark suites, STAC-A2.

Developed with leading banks, STAC-A2 measures the performance of various platforms for pricing and market risk analytics. Financial institutions can use these benchmarks to help decide which computing solutions will best manage market risk – a task that has taken on greater importance since the 2008 financial crisis.

For more on how Tesla achieved this record performance, see “How We Achieved Record Finance Benchmark Performance on Tesla K80” on Parallel Forall. 

Better risk management ultimately means lower risk. That’s a huge benefit for any investor, particularly those holding or managing retirement portfolios.

With the Tesla platform, financial institutions have a powerful collection of technologies, including Tesla GPUs, system software and developer tools, to accelerate their data center workloads dedicated to minimizing risk.

The Tesla Hat Trick

STAC-A2 measures performance in many categories, three of which are key. Two of them capture the full calculation time of the “Greeks” – measures of how changes in various parameters, such as the price of one particular asset, affect the price of an overall derivative – in “warm” and “cold” runs. The third is energy efficiency.
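As a rough illustration of what a “Greek” is, the sketch below bumps one input of a pricing function up and down and measures how much the price moves. It is not the STAC-A2 workload: the Black-Scholes European call is used here only because it is short, and the function names, parameter values and bump sizes are all assumptions made for the example.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price(spot, strike, rate, vol, maturity):
    """Black-Scholes price of a European call (illustrative pricer only)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def finite_difference_greek(params, name, bump):
    """Central-difference sensitivity of the price to the parameter `name`."""
    up = dict(params)
    down = dict(params)
    up[name] += bump
    down[name] -= bump
    return (call_price(**up) - call_price(**down)) / (2.0 * bump)


# Hypothetical inputs, chosen only for the example.
params = dict(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)

delta = finite_difference_greek(params, "spot", 0.01)  # sensitivity to the asset price
vega = finite_difference_greek(params, "vol", 1e-4)    # sensitivity to volatility
print(f"price={call_price(**params):.4f} delta={delta:.4f} vega={vega:.4f}")
```

Each Greek costs at least two extra pricings in this bump-and-reprice style, which gives a sense of why computing a full set of them for a realistic derivative is a demanding workload.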

Traditionally, no single computing system was best suited for all three categories. Different configurations were optimized for different categories.

The Tesla K80 accelerator overturns that tradition.

A system equipped with a single Tesla K80 took home the performance crown in all three categories by outperforming three different systems.

Details below:

| Benchmark | Previous Fastest System | Tesla K80 System | Tesla K80 vs Previous Fastest |
| --- | --- | --- | --- |
| “Warm” Greeks | 2x Xeon E5-2699v3 (36 CPU cores), 1x Xeon Phi 7120A | 1x Tesla K80 (+ 2 CPU cores), 128GB RAM | 1.85x |
| “Cold” Greeks | 4x Xeon E7-4890v2 (60 CPU cores), 1TB RAM | 1x Tesla K80 (+ 2 CPU cores), 128GB RAM | 1.65x |
| Energy Efficiency | 2x Xeon E5-2697v2 (24 CPU cores) | 1x Tesla K80 (+ 2 CPU cores), 128GB RAM | 1.4x |

In STAC-A2, a “cold” run represents performance of the entire application, including initialization and memory allocation. A “warm” run represents just the computationally intensive portion of the application.
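The distinction between the two kinds of run can be sketched with a simple timing harness. This is a minimal illustration and not the STAC-A2 harness: `allocate_and_initialize` and `compute` are hypothetical stand-ins for an application's setup phase and its computationally intensive phase, and the problem size is arbitrary.

```python
import time


def allocate_and_initialize(n):
    """Stand-in for setup work: allocate memory and fill in input data."""
    return [float(i % 100) / 100.0 for i in range(n)]


def compute(data):
    """Stand-in for the computationally intensive portion."""
    return sum(x * x for x in data)


def timed_runs(n=200_000):
    # "Cold" run: time everything, including initialization and allocation.
    cold_start = time.perf_counter()
    data = allocate_and_initialize(n)
    cold_result = compute(data)
    cold_seconds = time.perf_counter() - cold_start

    # "Warm" run: the data is already in place, so only the compute is timed.
    warm_start = time.perf_counter()
    warm_result = compute(data)
    warm_seconds = time.perf_counter() - warm_start

    return cold_seconds, warm_seconds, cold_result, warm_result


if __name__ == "__main__":
    cold, warm, _, _ = timed_runs()
    print(f"cold: {cold:.4f}s  warm: {warm:.4f}s")
```

Both runs produce the same result. Only the measured interval differs, which is why a system can rank differently on the two metrics.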

Compared with the system with dual-socket Haswell CPUs (2x E5-2699v3 CPUs with 36 cores total) and a Xeon Phi 7120A, the Tesla K80 system delivered 85 percent faster performance in the “warm” run.

In the “cold” run, the Tesla K80 GPU-accelerated system was 4x faster than the same Haswell system with a Xeon Phi co-processor. Without the Xeon Phi co-processor, the Haswell system sped up, but the Tesla K80 GPU-based system was still 2.2x faster.

In testing for energy efficiency, the Tesla K80 GPU-based system was 2.53x more efficient than the same dual-socket Haswell system.

In each test, the K80 did all the heavy lifting, only requiring two CPU cores to manage jobs for the GPU while other systems required all CPU cores to be fully utilized. Additionally, the code that we developed for STAC was designed to be read and maintained effortlessly, without complex x86 intrinsics or manual vectorization.

So, take a test drive of the Tesla K80, the most powerful accelerator we’ve ever built. If you have a GPU-accelerated application that can run on multiple GPUs, we encourage you to try the Tesla K80 for free by visiting our Test Drive website.

For more information on NVIDIA GPUs in computational finance, visit our website.

  • jipe4153

    Another benchmark underlining the utter failure of the Phi architecture 🙂 Latency and problems with vectorization render it utterly useless; all Intel can do is buy their customers by selling at a loss.

    However, Intel is very serious about retaining the HPC space and I suspect Nvidia will need to lay a smackdown for at least 2-3 generations before chipzilla decides to cut its losses.