CloudSight Delivers Visual Cognition Powered by Scalable Deep Learning

“Nothing kills creativity more than knowing that playtime costs money.” This insight, expressed by CloudSight CEO and founder Brad Folkens, captures the essence of why organizations like his turn to NVIDIA DGX to power their deep learning initiatives.
by Tony Paikeday

As humans, we know that the picture in the header of this blog is that of a broken coffee cup. We’ve been conditioned, over time, to recognize the visual cues – the cup’s edges, the handle, the color of the spilled liquid, and even the logo printed on the side. Our brains add it all up and within milliseconds come to the correct conclusion.

But what if you’ve never seen a broken coffee cup before?

Chihuahuas, Muffins and Broken Coffee Cups

The story starts with the problem statement: How do you go beyond traditional image recognition to deliver a deep learning-powered service that helps organizations tap new meaning from their data? In this case, we’re talking about the ability to “see” deeper into a given image and understand, on a near-human level, what the picture is really about, as in our broken cup example.

Photo courtesy @teenybiscuit

Take the example on the left. This is the classic “Chihuahua or Muffin?” dilemma, which has successfully confused some of the best-tuned models as to whether they’re looking at a cute dog or a delicious treat. We’ll come back to this in a minute, but let’s first explore how CloudSight built a differentiated service, powered by NVIDIA DGX, to help its customers derive new insights from data.

Fear of Mistakes Can Stifle Creativity

For CloudSight to see beyond the limits of traditional image recognition, they needed to spend (and continue to spend) serious cycles on deep learning training. That meant over a half-billion images, and lots of experiments and iterations.

CloudSight’s development journey started in the cloud, where elasticity of scale was easy, but compute cycles were expensive. So when it came to experimentation, their data scientists were extremely careful in preparing, verifying and re-verifying the settings for each training run.

The cost of avoiding mistakes in the cloud started to be eclipsed by the cost of practitioner labor spent getting everything “just right.” This mode of operation started to stifle creativity and impede the path to building the remarkable service they offer today.

Driving Down the Cost per Experiment

This is when CloudSight made the strategic decision to bring deep neural net training in-house, and leverage on-premises infrastructure, starting with an NVIDIA DIGITS DevBox, and ultimately landing on the DGX-1.

While DevBox afforded them more control and agility in executing training runs, and exploring more “what-if” scenarios, they soon realized that investing in more GPU-accelerated horsepower would result in more training jobs that could be executed on the platform, helping to quickly reduce the cost per experiment.
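The cost-per-experiment reasoning can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and every dollar figure, hourly rate, and run length are hypothetical placeholders, not CloudSight’s or NVIDIA’s actual numbers. It also ignores power, staffing, and other operating costs.

```python
import math


def on_prem_cost_per_experiment(hardware_cost: float, runs: int) -> float:
    """Spread a fixed, one-time hardware cost evenly over all training runs."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return hardware_cost / runs


def cloud_cost_per_experiment(hourly_rate: float, hours_per_run: float) -> float:
    """Pay-as-you-go cost of one run; it does not shrink as you run more."""
    return hourly_rate * hours_per_run


def break_even_runs(hardware_cost: float, hourly_rate: float,
                    hours_per_run: float) -> int:
    """Smallest number of runs at which on-prem cost per run is no higher than cloud."""
    return math.ceil(hardware_cost / cloud_cost_per_experiment(hourly_rate, hours_per_run))


# Hypothetical placeholder figures, chosen only to make the arithmetic easy.
HARDWARE_COST = 100_000.0   # one-time purchase, in dollars (assumed)
CLOUD_HOURLY_RATE = 20.0    # dollars per hour of cloud GPU time (assumed)
HOURS_PER_RUN = 10.0        # length of one training run (assumed)

if __name__ == "__main__":
    cloud = cloud_cost_per_experiment(CLOUD_HOURLY_RATE, HOURS_PER_RUN)
    for runs in (100, 500, 1000):
        on_prem = on_prem_cost_per_experiment(HARDWARE_COST, runs)
        print(f"{runs:>5} runs: on-prem ${on_prem:,.2f}/run vs cloud ${cloud:,.2f}/run")
```

Under these assumed figures, the cloud cost per run stays flat while the on-premises cost per run keeps falling as more experiments are executed, which is the effect described above.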

60x Faster Training with NVIDIA DGX

One of the early revelations CloudSight experienced as they moved their training runs to DGX was the power of the optimized software stack in terms of delivering a speed-up that goes above and beyond what the hardware alone was capable of.

DGX software includes the latest deep learning framework optimizations, developed by NVIDIA engineers, for maximized GPU performance across the stack – from the framework to the drivers to supporting libraries. When CloudSight started using these containers for training, they noticed a dramatic speed-up in training time – 60x faster than what they were accustomed to.

With this kind of performance now in-house, defraying the cost per experiment over an exponentially growing volume of training runs created a liberating effect for CloudSight’s scientists and engineers. Now they could experiment freely and pursue more angles of investigation, without fear of the cost of failure.

The result of this continued effort and investment is CloudSight API – a DGX-powered service that CloudSight’s customers can use for real-time inferencing of their images. With CloudSight API, organizations have the power to transform processes and bring products to market faster, saving considerable expense that would otherwise be required for building an in-house deep-learning practice and platform, not to mention the op-ex of managing the infrastructure.

Back to our Chihuahua/muffin dilemma. It turns out that CloudSight’s service not only achieved over 81 percent caption accuracy, but also went beyond the goal and identified specific breeds and colors. You can try out CloudSight API with your own image challenge here.

Learn More about Visual Cognition and NVIDIA DGX

To learn more about visual cognition and how CloudSight overcame the barriers to deep learning scale with NVIDIA DGX, check out our webinar. It’ll give you first-hand insights from CloudSight CEO and founder Brad Folkens.

Learn more about NVIDIA DGX at