IBM, Orange Use GPUs for Next Generation Enterprise Big Data Analytics at GTC

by Sumit Gupta

IBM is taking a big step in applying GPU technology to help solve some of the biggest enterprise IT challenges.

At next week’s GPU Technology Conference (GTC) in San Jose, Calif. (March 24-27), IBM will demo GPU-accelerated machine learning for data clustering using the Hadoop and Mahout frameworks (GTC Booth #921).

Such technology opens the door for all types of enterprise companies to optimize their customers’ experiences, allowing them to mine, and make better use of, the vast amounts of data they collect on a daily basis.

For example, it will allow retailers, entertainment sites, and internet browsing companies to make far more accurate, user-specific recommendations for new products and services.

But they need to get over the big data hurdle first, which is where IBM – and GPUs – come into play.

Mountains of Data

The digital age has increased the velocity and volume of data collected: about 2.5 exabytes is generated every day, or 250,000 times the size of the printed collection at the U.S. Library of Congress.

And every second, more data crosses the internet than was on the entire internet 20 years ago.

This presents a huge opportunity for enterprise companies to mine the data – enabling them to find customers and consumer interests, and deliver better products and services.

Retailers, for example, will be able to group their customers into segments with similar behavior, so that they can create customized products and target marketing programs more effectively.

A computational technique called segmentation or clustering, which identifies non-obvious patterns in data by analyzing hundreds of different dimensions, can be used for this.
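To make the idea concrete, here is a minimal sketch of clustering in plain Python. It uses k-means, one common clustering algorithm. It is not the code behind IBM's demo, which runs on Hadoop and Mahout with GPUs. The customer records and the two features (visits per month, average basket size) are made-up illustrations, and a real data set would have hundreds of dimensions rather than two.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical customers: (visits per month, average basket in dollars).
customers = [(2, 15.0), (3, 18.0), (2, 20.0), (20, 150.0), (22, 160.0), (19, 155.0)]
centroids, segments = kmeans(customers, k=2)
```

Here `segments` holds one segment label per customer, so occasional small-basket shoppers and frequent big spenders land in separate groups. The assignment step is the expensive part on large data sets, since every point is compared against every centroid, and those comparisons are independent of one another, which is the kind of work that parallel hardware such as GPUs handles well.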

IBM is demonstrating the use of GPU accelerators on a distributed computing system (required for such an enormous data set) for clustering using Hadoop. With GPU accelerators working alongside IBM Power CPUs, the demo runs eight times faster than on a Power system without GPUs.

GPU acceleration dramatically shortens the time to insights, and enables running many more scenarios and more sophisticated analytics that otherwise would be prohibitively expensive.

Orange Makes Big Data Analytics SQream

Another example of big data analytics using GPUs comes from Orange Silicon Valley, the US innovation center for the international telecom giant, Orange. Together with SQream Technologies, Orange Silicon Valley is conducting tests to benchmark SQL database queries.

Orange is using a single NVIDIA GPU to process over 4 billion anonymous call data records – the equivalent of four months of multimedia internet traffic.

In fact, SQream’s GPU-accelerated database analytics solution is far faster, and 40 times less expensive in infrastructure costs, than the popular Teradata solution widely used in telecom applications today.

Data Analytics at GPU Technology Conference

IBM (recently recognized for its leadership in The Forrester Wave: Big Data Hadoop Solutions report), SQream, and several other companies are presenting their work at GTC next week.

If you haven’t already registered for GTC, you can still do so here.