Hey Developers: GPU at Heart of World’s Fastest Supercomputer Hits Laptops

by Mark Ebersole

I do a fair amount of traveling, so most of my work life is spent on my laptop. From writing e-mails to hacking away at CUDA code, it’s my computer of choice. 

And it isn’t just me. Laptops are becoming the de facto work platform for many students, programmers and research scientists because they combine desktop-level performance with the benefit of portability.

So, when I learned that our new GeForce 700M-series GPUs (GT 730M, GT 735M, and GT 740M) – based on the Kepler architecture and supporting the latest CUDA 5 programming features – are shipping in a new crop of laptops from Acer, Dell, HP, Lenovo, Toshiba and others, I couldn’t wait to get my hands on one.

These are the same CUDA features available on the latest supercomputers, like the Titan system at Oak Ridge National Laboratory and Blue Waters at NCSA, which makes these laptops ideal for parallel programming – whether you’re at the office, at home, in a café, or on a plane.

You can do serious science on the super-thin new Razer Blade, but we won't tell if you play a game now and then, too.

CUDA Programming on the Go

One of the great things about the CUDA programming model and GPU computing is that you can develop, test and debug your code on any GPU-based computer. You don’t need a large workstation-class GPU or access to a large supercomputing cluster.

But until now, most conveniently sized notebooks were a generation (or more) behind the latest programming features.

The new 700M GPUs – based on the GK208 chip – combine the latest NVIDIA notebook features. Among them are Boost 2.0 technology, which dynamically raises clock speeds to increase performance, and Optimus, which powers off the GPU when it’s not needed to extend battery life. These complement such key compute features as:

  • Dynamic Parallelism – lets the GPU operate more autonomously from the CPU by generating new work for itself at run-time
  • Hyper-Q for CUDA Streams – allows for more efficient use of the GPU by making it easy to launch concurrent work
  • Warp shuffle instructions – quickly swap data between threads within a warp
  • Up to 255 registers per thread – increased from 63 registers in previous generations, allowing each thread to store more data in the fastest possible memory inside the GPU
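
To give a feel for one of these features, here is a minimal sketch of a warp-level sum built on warp shuffle instructions. It is only an illustration, not code from any NVIDIA sample: the names warpReduceSum, sumLanes and runWarpSum are made up for this post, and it assumes a recent CUDA toolkit, where the shuffle intrinsic is spelled __shfl_down_sync (toolkits from the CUDA 5 era used __shfl_down without the mask argument).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sum the values held by the 32 threads of one warp using shuffle
// instructions. No shared memory is involved.
__inline__ __device__ int warpReduceSum(int val) {
    // Halve the distance on each step: 16, 8, 4, 2, 1.
    for (int offset = 16; offset > 0; offset /= 2) {
        val += __shfl_down_sync(0xffffffffu, val, offset);
    }
    return val;  // only lane 0 is guaranteed to hold the full sum
}

__global__ void sumLanes(int *out) {
    int lane = threadIdx.x;           // 0..31 for a single-warp launch
    int total = warpReduceSum(lane);  // 0 + 1 + ... + 31
    if (lane == 0) {
        *out = total;
    }
}

// Hypothetical host helper: runs the kernel on one warp and returns
// the result, or -1 if any CUDA call fails.
int runWarpSum() {
    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    sumLanes<<<1, 32>>>(d_out);
    cudaError_t err = cudaGetLastError();  // catches launch failures
    if (err == cudaSuccess) {
        err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_out);
    return err == cudaSuccess ? h_out : -1;
}

int main() {
    int sum = runWarpSum();
    printf("warp sum = %d\n", sum);
    return sum == 496 ? 0 : 1;  // 0 + 1 + ... + 31 = 496
}
```

On a Kepler-class or newer GPU, lane 0 should end up with 496, the sum of the lane indices 0 through 31. The same pattern is a common building block for larger block-wide reductions.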

Develop on GeForce, Deploy on Tesla

Once you’ve got your CUDA code ready, you’ll want to deploy your applications on a higher-end system with a heavyweight number crunching GPU, like a Tesla K20 or K20X accelerator.

Tuning for the Kepler architecture in the GeForce 700M-series GPUs will deliver scalable performance on larger Tesla accelerators. Tesla GPUs also provide crucial data center reliability and manageability features not available on GeForce GPUs, including:

  • NVIDIA GPUDirect RDMA for InfiniBand performance
  • Hyper-Q for MPI
  • ECC protection for all internal and external registers and memories
  • Support for advanced GPU cluster management tools, including Bright Cluster Manager, Ganglia, Moab Cluster Suite, PBS Works, Platform HPC, Rocks+HPC, and the Tesla Deployment Kit
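
If you move the same binary between a laptop and a Tesla system, it can help to ask the CUDA runtime what it is actually running on. The sketch below is one possible way to do that, not an official recipe: it prints each device's compute capability and whether ECC is enabled, and uses a made-up helper, supportsDynamicParallelism, that assumes Dynamic Parallelism needs compute capability 3.5 or higher.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: Dynamic Parallelism is assumed to require
// compute capability 3.5 or higher.
bool supportsDynamicParallelism(int major, int minor) {
    return major > 3 || (major == 3 && minor >= 5);
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA-capable GPU found\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
            continue;  // skip devices we cannot query
        }
        printf("Device %d: %s\n", dev, prop.name);
        printf("  compute capability:  %d.%d\n", prop.major, prop.minor);
        printf("  multiprocessors:     %d\n", prop.multiProcessorCount);
        printf("  ECC enabled:         %s\n", prop.ECCEnabled ? "yes" : "no");
        printf("  dynamic parallelism: %s\n",
               supportsDynamicParallelism(prop.major, prop.minor) ? "yes" : "no");
    }
    return 0;
}
```

A check like this lets an application pick a code path at start-up instead of assuming that the development laptop and the deployment cluster expose identical features.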

Get Started Today

Whether you’re a CUDA novice or ninja, a GeForce 700M-based laptop will give you the cutting-edge tools you need to take advantage of the performance delivered by massively parallel GPUs in your applications.

All the software you need to start developing with CUDA is available for free at http://www.nvidia.com/getcuda. For beginners, Udacity’s free online “Introduction to Parallel Programming” course is a fantastic starting point to build your skills.

We’re always interested in hearing what you’re doing with CUDA and the power of parallelism. Please use the comments section below or join our developer forums community to share how you’re using GPUs.