by Steve Wildstrom

Financial derivatives and quantitative analysis of securities took something of a reputational hit in the 2008 crash. But they are still very much a part of the mathematically sophisticated Wall Street scene. And it turns out that the intense computations needed for trading financial options are well suited to GPU processing, and a number of CUDA-based tools have been developed to make the work easier.

The idea of an option is relatively simple. Basically, an option gives you the right to buy (a call option) or sell (a put option) some good or security at a fixed price by some date in the future. A call is a bet that the price will go up. If you have an option to buy 100 shares of stock X at $100 per share and the stock hits $110 before the option expires, you score a profit of $1,000 (100 shares × $10 per share) less the initial cost of the option. A put works the same but in reverse; you profit if the price of the stock falls below the option’s “striking price.”
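The arithmetic in that example can be written out as a small Python sketch. The function names and the premium argument are illustrative assumptions; the article does not give the cost of the option, so the usage line passes a premium of zero to show the gross gain.

```python
def call_profit(spot, strike, shares, premium):
    """Profit from a call: gain per share above the strike, times shares, minus the premium."""
    return max(spot - strike, 0.0) * shares - premium


def put_profit(spot, strike, shares, premium):
    """Profit from a put: gain per share below the strike, times shares, minus the premium."""
    return max(strike - spot, 0.0) * shares - premium


# The example from the text: 100 shares, $100 strike, stock at $110.
# The premium is not stated in the article; 0.0 is a placeholder.
print(call_profit(110.0, 100.0, 100, 0.0))  # 1000.0
```

If the stock never moves past the strike, the option simply goes unexercised and the loss is limited to the premium paid.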

While the concept is straightforward, the pricing of options and the decisions on when to buy or exercise them are anything but. Quantitative analysts, or quants, use statistical models to consider large numbers of possibilities over time and to find the likeliest outcome.

One of the oldest methods for assessing prices over time is the Monte Carlo simulation, a technique that requires generating large numbers of random data points following some distribution. The big French bank BNP Paribas reportedly ran up against capacity constraints running its Monte Carlo pricing models on a cluster of CPUs and moved to a CUDA-based GPU system to get higher performance.
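To give a feel for what such a simulation involves, here is a minimal CPU-only Python sketch that prices a European call by Monte Carlo. It is not BNP Paribas's model. It assumes the final stock price follows geometric Brownian motion, and the function name and parameters are illustrative. Each simulated path is independent of the others, which is the property that makes the method a natural fit for thousands of GPU threads.

```python
import math
import random


def mc_european_call(spot, strike, rate, vol, years, n_paths, seed=0):
    """Estimate a European call price by averaging discounted simulated payoffs.

    Assumption: the terminal price is lognormal (geometric Brownian motion).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    drift = (rate - 0.5 * vol * vol) * years
    diffusion = vol * math.sqrt(years)
    total_payoff = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)  # one standard normal draw per path
        terminal = spot * math.exp(drift + diffusion * z)
        total_payoff += max(terminal - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * years) * total_payoff / n_paths


# Hypothetical inputs: $100 stock, $100 strike, 5% rate, 20% volatility, 1 year.
print(mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, 100_000))
```

The estimate gets more accurate only as the square root of the number of paths, so halving the error takes four times the work. That is why capacity runs out quickly on CPUs.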

Probably the best-known approach to financial modeling is the Black-Scholes equation. Two of its developers, Myron Scholes and Robert Merton, won the 1997 Nobel Prize in economics for it (the third, Fischer Black, had died in 1995). Black-Scholes is a differential equation that assumes security prices follow a random walk, that is, that day-to-day price fluctuations are random.
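For a European call, the Black-Scholes equation has a well-known closed-form solution, sketched below in Python. The function names are illustrative, and the inputs in the usage line are hypothetical. Every option is priced independently from a handful of inputs, so a GPU can assign one option to each thread.

```python
import math


def norm_cdf(x):
    """Cumulative distribution function of the standard normal, via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, vol, years):
    """Closed-form Black-Scholes price of a European call (no dividends assumed)."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


# Hypothetical inputs: $100 stock, $100 strike, 5% rate, 20% volatility, 1 year.
print(round(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0), 2))  # about 10.45
```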

OnEye, an Australian firm specializing in high-performance financial computing, found that running a CUDA-based Black-Scholes model on an NVIDIA GPU produced results almost 700 times faster than running the same analysis on a 2.21 GHz AMD Athlon 64 X2 CPU alone. The GPU system was able to price over 4 billion options per second.


Another way to analyze options is to look at prices over time. In the binomial model, a price goes up or down by a fixed amount at each time interval, with each direction of movement having its own probability. The result is a tree-like structure in which each level represents a time period and each leaf a possible price. Narayan Ganesan, Roger D. Chamberlain, and Jeremy Buhler at Washington University in St. Louis ran a CUDA binomial model on an NVIDIA GPU. “We theorize an optimal speedup of 15× over a comparable parallel implementation for a problem size of 1000 time steps,” they wrote. “In general the expected speed-up is proportional to the square root of the problem size.”
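The tree can be sketched in a few lines of Python. This is not the Washington University code. It assumes the common Cox-Ross-Rubinstein setup, in which the price moves up or down by a fixed factor at each step, and the function name and parameters are illustrative. The sketch prices a European call by filling in the leaves and then rolling the values back toward the root one level at a time.

```python
import math


def binomial_call(spot, strike, rate, vol, years, steps):
    """Price a European call on a Cox-Ross-Rubinstein binomial tree (an assumed setup)."""
    dt = years / steps
    up = math.exp(vol * math.sqrt(dt))  # fixed up factor per step
    down = 1.0 / up                     # fixed down factor per step
    p_up = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability
    discount = math.exp(-rate * dt)

    # Leaves: payoff at expiry after j up moves and (steps - j) down moves.
    values = [max(spot * up ** j * down ** (steps - j) - strike, 0.0)
              for j in range(steps + 1)]

    # Roll back: each node is the discounted expected value of its two children.
    for level in range(steps, 0, -1):
        values = [discount * (p_up * values[j + 1] + (1.0 - p_up) * values[j])
                  for j in range(level)]
    return values[0]


# Hypothetical inputs: $100 stock, $100 strike, 5% rate, 20% volatility, 1 year.
print(binomial_call(100.0, 100.0, 0.05, 0.2, 1.0, 500))
```

All nodes on one level can be computed at the same time, but each level has to wait for the one after it. That dependency is the kind of thing a GPU implementation has to plan around.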

The trinomial model is a little more complicated in that it lets a price go up, go down, or stay the same at each time interval. Two French researchers, Gregoire Jauvion and Tuan Nguyen, built a trinomial option pricing model for a system with an Intel quad-core 3.2 GHz Core 2 Duo CPU and an NVIDIA GPU. The CUDA code on the GPU ran nearly 32 times faster for a model pricing 64 options over 1,024 time periods.
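A trinomial tree can be sketched the same way, with three children per node instead of two. The Python below is not Jauvion and Nguyen's model. It assumes one common textbook choice of step size and probabilities, and the function name and parameters are illustrative. It prices a European call.

```python
import math


def trinomial_call(spot, strike, rate, vol, years, steps):
    """Price a European call on a trinomial tree (one common parameterization, assumed)."""
    dt = years / steps
    up = math.exp(vol * math.sqrt(2.0 * dt))  # up factor; down is 1/up, middle is 1
    a = math.exp(rate * dt / 2.0)
    b = math.exp(vol * math.sqrt(dt / 2.0))
    p_up = ((a - 1.0 / b) / (b - 1.0 / b)) ** 2
    p_down = ((b - a) / (b - 1.0 / b)) ** 2
    p_mid = 1.0 - p_up - p_down
    discount = math.exp(-rate * dt)

    # Leaves: 2 * steps + 1 possible prices, from all-down (j = 0) to all-up.
    values = [max(spot * up ** (j - steps) - strike, 0.0)
              for j in range(2 * steps + 1)]

    # Roll back: each node sees its down, middle, and up children.
    for level in range(steps, 0, -1):
        values = [discount * (p_down * values[j]
                              + p_mid * values[j + 1]
                              + p_up * values[j + 2])
                  for j in range(2 * level - 1)]
    return values[0]


# Hypothetical inputs: $100 stock, $100 strike, 5% rate, 20% volatility, 1 year.
print(trinomial_call(100.0, 100.0, 0.05, 0.2, 1.0, 200))
```

The extra branch means more work per node, but it also means more independent arithmetic per level for a parallel processor to chew on.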

One theme that runs through all the research reports on the use of CUDA for options pricing models is that getting the maximum performance gain requires careful planning to take advantage of the strengths and minimize the weaknesses of GPU processing. But beyond that, it doesn’t seem to have posed any particular challenges. As John Nelson, who writes the Path Dependent blog on programming, complex systems, and trading, wrote:

“I started learning CUDA yesterday; I wrote my first simple CUDA program today. The library does have a non-negligible learning curve, but it is not steep. It largely is a matter of learning the most efficient ways to work with CUDA (e.g. shared, local, or constant memory). Happily, this is an incremental process; You can learn to write bad yet working CUDA applications while slowly learning to write them better; And, as a bonus, even your bad code is likely to run laps around your CPU (for finance apps anyway.)”

This post is an entry in The World Isn’t Flat, It’s Parallel series running on nTersect, focused on the GPU’s importance and the future of parallel processing. Today, GPUs can operate faster and more cost-efficiently than CPUs in a range of increasingly important sectors, such as medicine, national security, natural resources and emergency services. For more information on GPUs and their applications, keep your eyes on The World Isn’t Flat, It’s Parallel.