Ending Analysis Paralysis: NVIDIA and MapD Solve Massive Big Data Woes Across Industries

by Renee Yao

More data has been created in the past two years than in all of prior human history. To visualize and gain insights from this galaxy of information, MapD offers a new approach, accelerated by NVIDIA GPUs.

MapD’s database intelligently partitions, compresses and caches data across all GPUs, providing users with up to 100x faster database queries — no indexing or optimizations required. When paired with the MapD Immerse analytics front-end, the system delivers instant visual insights into datasets with billions of records.

Mark Litwintschik, a consultant, blogger and database fanatic from the UK, recently tested more than a dozen different databases and configurations using a massive dataset first released late last year. This dataset comprises staggering detail on 1.2 billion individual taxi, limo and Uber trips in New York City over more than five years — including full GPS, transaction type, passenger counts and timestamps.
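
Benchmarks over a dataset like this typically come down to aggregate queries that scan every trip record. As a minimal sketch of that kind of query, the example below groups a handful of made-up trips by cab type using Python's built-in sqlite3 module. The table name `trips`, the column names and the sample rows are all hypothetical, chosen for illustration rather than taken from the benchmark, and SQLite here is only a CPU-based stand-in, not MapD.

```python
import sqlite3

# Hypothetical toy rows: (cab_type, passenger_count, total_amount).
# These values are invented for illustration, not drawn from the NYC dataset.
TRIPS = [
    ("yellow", 1, 12.5),
    ("yellow", 2, 8.0),
    ("green", 1, 20.0),
    ("green", 1, 10.0),
    ("yellow", 1, 9.5),
]


def trips_summary(rows):
    """Return (cab_type, trip_count, avg_total_amount) per cab type, sorted by cab type."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE trips ("
            "cab_type TEXT, passenger_count INTEGER, total_amount REAL)"
        )
        conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", rows)
        # A full-table aggregation: every row is scanned, no index is used.
        cur = conn.execute(
            "SELECT cab_type, COUNT(*), AVG(total_amount) "
            "FROM trips GROUP BY cab_type ORDER BY cab_type"
        )
        return cur.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in trips_summary(TRIPS):
        print(row)
```

On five rows the query is instant on any hardware. The point of the comparison is what happens when the same scan-and-aggregate pattern runs over 1.2 billion rows.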

Visualization of transit activity at JFK airport.

Most of Litwintschik’s prior work had been on CPU-based systems. When he tested MapD powered by NVIDIA GPUs, some queries ran up to 55x faster than on any system he had benchmarked before.

“For me personally, the future of business intelligence reporting is GPU-based,” said Litwintschik. “The cards these benchmarks ran on are based on an architecture that’s two generations old, yet the query times are 55x faster in some cases than I’ve seen anywhere else — including large, clustered CPU solutions.”

According to Litwintschik, “the future looks extremely bright” for the world of BI. And in fact, NVIDIA and MapD have been working together to help companies across a variety of industries to filter and visualize their massive datasets without lag.

MapD uses NVIDIA GPUs to provide real-time data analytics across massive, complex datasets like the NYC traffic data visualized here.

Verizon Tunes Database to Handle Unrivaled Volume, Velocity of Data

Across every line of business — and from marketing and sales to network and content operations — few industries can rival telecommunications when it comes to the volume and velocity of data. Whether forensics on dropped calls, sensor data, log files, customer churn, device stats or data center performance, the data arrives constantly and relentlessly. Complicating matters is the need to see that data in real time — making pattern recognition and root cause analysis difficult.

Verizon has applied MapD’s GPU-tuned database to the challenge of monitoring the smartphones in its network to assess a variety of metrics. Before MapD, the queries involved would take hours to run, so the company ran them only periodically. With MapD, these same queries complete in milliseconds and render instantaneously. This allows Verizon to quickly identify the root causes of problems, helping customers and the company’s operations and logistics teams.

“Because the database is using the true power of the GPUs, the data is available almost immediately to the processors,” said Abdul Subhan, senior solutions architect at Verizon.

When Billions Are at Stake

To create competitive advantage, financial firms have invested billions of dollars in core technology, such as high-speed networks, immense data stores and algorithmic trading models. But the speed of hypothesis generation and testing has lagged, pulled down by CPU technologies that are ill suited to querying and visualizing billions of records with mere millisecond delays.

NVIDIA and MapD have worked with a hedge fund client that developed a rich proprietary dataset that has grown exponentially over time. The firm’s ability to efficiently ask questions of that dataset had not kept pace. For a fund of its size, the opportunity cost of seemingly small delays can run to the millions of dollars — on a single trade.

Using MapD’s offering, the company can query and render results graphically in milliseconds, enabling it to build an informational advantage over competitors. With MapD’s GPU-powered data exploration platform, new investment ideas can be tested immediately, leading to a more fluid and creative process for portfolio managers, traders and analysts alike.

Managing the World of JavaScript

While the millions of images and videos hitting the likes of Twitter, Facebook and Snapchat each day garner the headlines, the explosion in data is driven as much by machines as by humans. Less glamorous information on customer activity, users, transactions, applications, servers, mobile devices and networks is piling up as machine data.

This high-dimensional data, along with its stunning volume and velocity, confounds CPU-bound approaches. Superior performance is what led npm, Inc., maker of the most widely used package manager for JavaScript, to choose NVIDIA and MapD for its database challenges.

Npm hosts over a quarter million packages of reusable code and is used daily by over 4 million developers worldwide. Collectively, they make more than 20 billion requests a month. With the parallel processing power of GPUs and MapD’s GPU-tuned database, npm was able to query the data in milliseconds rather than minutes. It could grasp exactly what was occurring within the JavaScript community at any given moment, at a fraction of the cost of less performant solutions.

“With 20 billion requests a month, we wanted a lightning-fast, industrial-grade database that could handle our need for ad-hoc data analysis,” said Laurie Voss, chief technology officer at npm. “Our requirements demanded exceptional performance and scalability to power through large, complex queries and we found the answer in MapD.”

The NVIDIA DGX-1 deep learning supercomputer.

DGX-1: A Quantum Leap in Performance

Customers in telecoms, finance and tech are just the start. MapD is working closely with NVIDIA on its latest innovation: the NVIDIA DGX-1. This supercomputer-in-a-box can deliver throughput equal to 250 conventional servers in a single system with eight Tesla P100 GPUs and 128GB of GPU memory.

Companies that can benefit from faster, better performance with GPU-enabled solutions span retail, insurance, manufacturing, healthcare and many more industries.