In the Wake of GTC 2018, It’s Full Stream Ahead for GPU-Accelerated Analytics

by Renee Yao

For businesses trying to stay competitive, it’s not easy to learn from increasingly vast volumes of data, cope with the complexity of analysis or keep up with siloed analytics solutions while on legacy infrastructure.

Waiting for slow queries to load, manually figuring out correlations among data or tediously copying and converting data back and forth is a losing proposition.

By turning to GPU-accelerated analytics, companies can overcome slow queries, correlate data dynamically and benefit from zero-copy data access. To do so, they’re leveraging a variety of GPU-accelerated solutions, including:

  • Databases for processing queries in-memory or from disks — and avoiding wasteful time spent on serialization, deserialization and copying of data.
  • Deep learning-enabled data-prep solutions to extract, transform, load, label or augment data.
  • Immersive visualization platforms to clearly and quickly view and understand correlations among data to train machine learning and deep learning models for accurate business insights.
  • Machine learning algorithms, recipes and platforms for automatic feature engineering — and faster and more scalable inferencing.
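The zero-copy idea behind the database bullet above can be illustrated on the CPU with nothing but the Python standard library. This is only a sketch under stated assumptions: the `column` values are made up, and `memoryview` stands in for the shared in-memory buffers that accelerated databases rely on. It is not any vendor's API.

```python
import array

# Hypothetical column of readings, stored in one contiguous buffer.
column = array.array("d", [1.5, 2.5, 3.5, 4.5, 5.5, 6.5])

# Copying path: slicing the array allocates a new buffer and duplicates the values.
copied = column[2:5]

# Zero-copy path: a memoryview slice is a window onto the original buffer.
view = memoryview(column)[2:5]

# A write to the source is visible through the view but not through the copy.
column[2] = 99.0

copy_sees_update = copied[0] == 99.0
view_sees_update = view[0] == 99.0
print(copy_sees_update, view_sees_update)
```

Because the view shares memory with the source, handing it to another consumer costs no serialization, deserialization or duplication, which is the saving the bullet describes.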

From SQL and streaming, to graph computation, to machine learning and deep learning, companies are taking advantage of GPU computing platforms to address the challenges in each category.

Many of NVIDIA’s GPU-accelerated partners made announcements at the GPU Technology Conference last week. They unveiled game-changing results, and made their solutions more accessible, seamless, connected and faster for customers across industries. Here are a few:

Deep Learning Partner

  • Deepgram is the first end-to-end deep learning speech recognition system in production that uses NVIDIA GPUs for inferencing and training. (They’re also the newest member of NVIDIA’s GPU Ventures portfolio.) Serving 5,000 clients — from financial institutions to journalists — around the world, their inferencing is 100 times more efficient than when using CPUs.

Machine Learning Partner

  • H2O.ai, the GPU-accelerated open source leader in machine learning, announced that its H2O4GPU (GPU-accelerated algorithms) and DriverlessAI (GPU-accelerated ML platform) are available on NVIDIA Volta and CUDA 9. The latest updates include:
    • Truncated SVD (singular value decomposition) and PCA (principal component analysis) for dimensionality reduction and feature engineering
    • A new R API that brings the benefits of GPU-accelerated machine learning to the R user community
    • Improved K-means and XGBoost performance on multi-GPU workloads, along with the ability to handle larger datasets

Accelerated Analytics Partners

  • BlazingDB, makers of a GPU-accelerated SQL data warehouse solution, announced their version 2. Now it’s fundamentally integrated Apache Parquet into their SQL analytics engine, and can run SQL queries directly on Parquet files. No ingest. No duplication of terabyte datasets. No consistency management.
  • FASTDATA.io announced the launch of its high performance computing software, FDIO Engine, the first GPU-native stream processing engine that leverages NVIDIA GPUs and Apache Arrow to provide real-time performance up to 1,000 times faster than any CPU-based software for processing big data in motion. They also recently secured $5 million in seed funding.
  • Graphistry announced the industry’s first GPU visual investigation platform for security and fraud teams. Frontline federal and enterprise teams are leveraging its new enterprise tier to gain 360-degree views across all their data sources and visually streamline workflows around data gathering, correlation and analysis. Graphistry has seen analysts dig through 3x more data sources per investigation, go 4x deeper in their queries and see 100x more correlated data points.
  • Kinetica, the instant insight engine powered by GPUs for the extreme data economy, announced its availability on NVIDIA GPU Cloud and DGX systems. Also, enterprise product veteran Irina Farooq joined Kinetica last week as their new vice president of product management.
  • MapD, the extreme analytics platform provider, provides a GPU-accelerated database and visualization platform called Immerse. They announced MapD Cloud, the first software as a service analytics platform, available at several subscription tiers. Users, who can start with a free, two-week trial for up to 100 million rows, can access the platform anywhere with an email and a few clicks.
  • SQream, which recently announced a strategic partnership with Alibaba Cloud, is another GPU database designed to enable unparalleled business intelligence from massive data stores. Combined with the X-IO Axellio Edge Micro-Datacenter, it can ingest and analyze petabytes of raw data and run queries up to 2.5x faster than other flash-based hardware solutions. DataDirect Networks also announced its strategic partnership with SQream to deliver real-time big data analytics solutions. DDN’s SFA and IME high-performance NVMe SSD storage platforms provide business acceleration of more than 1,000 percent in AI and deep learning applications.

Many of these partners participate in the GPU Open Analytics Initiative (GoAi), which brings the open source community together and allows data scientists to explore data, train machine learning algorithms and build applications that benefit from processing the majority of workloads on GPUs.

To learn more about GoAi, listen in on Josh Patterson’s talk on GoAi One Year Later, which is available now to GTC attendees and will be made available on demand to the general public next month.

Check out our GPU-accelerated analytics website for the latest updates.