What’s the Difference Between Developing AI on Premises and in the Cloud?

by Paresh Kharya

Choosing between an on-premises GPU system and the cloud is a bit like deciding between buying or renting a home.

Renting takes less capital up front. It’s pay as you go, and features like the washer-dryer unit or leaky roof repair might be handled by the property owner. If their millennial children finally move out and it’s time to move to a different-sized home, a renter is only obligated to stick around for as long as contract terms dictate.

Those are the key benefits of renting GPUs in the cloud: a low financial barrier to entry, support from cloud service providers and the ability to quickly scale up or down to a different-sized computing cluster.

Buying, on the other hand, is a one-time, fixed cost — once you purchase a property, you can stay there as long as you'd like. Unless there are teenagers in the house, the owner has full sovereignty over what goes on inside. There's no lease agreement, so as long as everyone fits, it's fine to invite a few friends and relatives over for an extended stay.

The same reasoning applies to investing in GPUs on premises. An on-prem system can be used for as much time and as many projects as the hardware can handle, making it easier to iterate and try different methods without worrying about cost. For sensitive data like financial information or healthcare records, keeping everything behind an organization's firewall might be essential.

Depending on the use case at hand and the kind of data involved, developers may choose to build their AI tools on a deskside system, in an on-prem data center or in the cloud. More likely than not, they'll move from one environment to another at different points in the journey from initial experimentation to large-scale deployment.

Using GPUs in the Cloud

Cloud-based GPUs can be used for tasks as diverse as training multilingual AI speech engines, detecting early signs of diabetes-induced blindness and developing media-compression technology. Startups, academics and creators can quickly get started, explore new ideas and experiment without a long-term commitment to a specific size or configuration of GPUs.

NVIDIA data center GPUs can be accessed through all major cloud platforms, including Alibaba Cloud, Amazon Web Services, Google Cloud, IBM Cloud, Microsoft Azure and Oracle Cloud Infrastructure.

Cloud service providers aid users with setup and troubleshooting by offering helpful resources such as development tools, pre-trained neural networks and technical support for developers. When a flood of training data comes in, a pilot program launches or a ton of new users arrive, the cloud lets companies easily scale their infrastructure to cope with fluctuating demand for computing resources.

Adding to cost-effectiveness, developers using the cloud for research, containerized applications, experiments or other projects that aren't time-sensitive can get discounts of up to 90 percent by using excess capacity. These discounted resources, known as "spot instances," effectively sublease cloud GPUs that other customers aren't currently using.
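To see how those savings add up over a long job, the sketch below compares an assumed on-demand rate against spot capacity at the article's up-to-90-percent discount. The hourly rate and job length are illustrative assumptions, not published cloud pricing:

```python
# Illustrative comparison of on-demand vs. spot pricing for a GPU training job.
# The rates below are assumptions for the sake of the arithmetic, not real quotes.

ON_DEMAND_RATE = 3.00   # assumed $/hour for one cloud GPU instance
SPOT_DISCOUNT = 0.90    # "up to 90 percent" off, per the article

def job_cost(hours: float, rate: float) -> float:
    """Total cost of a job that runs for `hours` at `rate` dollars per hour."""
    return hours * rate

training_hours = 200  # an assumed multi-day training run
on_demand = job_cost(training_hours, ON_DEMAND_RATE)
spot = job_cost(training_hours, ON_DEMAND_RATE * (1 - SPOT_DISCOUNT))

print(f"on-demand: ${on_demand:.2f}")
print(f"spot:      ${spot:.2f}")
```

The catch, and the reason spot capacity suits work that isn't time-sensitive, is that the provider can reclaim those GPUs when paying customers need them, so jobs must tolerate interruption.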

Users working on the cloud long term can also upgrade to the latest, most powerful data center GPUs as cloud providers update their offerings — and can often take advantage of discounts for their continued use of the platform.

Using GPUs On Prem 

When building complex AI models with huge datasets, operating costs for a long-term project can sometimes escalate. That might cause developers to be mindful of each iteration or training run they undertake, leaving less freedom to experiment. An on-prem GPU system gives developers unlimited iteration and testing time for a one-time, fixed cost.

Data scientists, students and enterprises using on-prem GPUs don’t have to count how many hours of system use they’re racking up or budget how many runs they can afford over a particular timespan.

If a new methodology fails at first, there’s no added investment required to try a different variation of code, encouraging developer creativity. The more an on-prem system is used, the greater the developer’s return on investment.
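That return on investment can be framed as a simple break-even sketch: divide the one-time hardware cost by an hourly cloud rental rate to estimate how many GPU-hours of use make the purchase pay for itself. All figures below are illustrative assumptions, not real prices:

```python
# Illustrative break-even point between a one-time on-prem purchase
# and pay-as-you-go cloud rental. All figures are assumptions.

def break_even_hours(hardware_cost: float, cloud_rate_per_hour: float) -> float:
    """GPU-hours of use at which buying costs the same as renting."""
    return hardware_cost / cloud_rate_per_hour

workstation_cost = 15_000.0  # assumed price of a GPU workstation
cloud_rate = 3.00            # assumed $/hour for a comparable cloud GPU

hours = break_even_hours(workstation_cost, cloud_rate)
print(f"break-even after {hours:.0f} GPU-hours "
      f"(~{hours / 24:.0f} days of continuous use)")
```

Past that point, every additional training run on the on-prem system is effectively free, which is what gives developers the freedom to experiment.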

From powerful desktop GPUs to workstations and enterprise systems, on-prem AI machines come in a broad spectrum of choices. Depending on the price and performance needs, developers might start off with a single NVIDIA GPU or workstation and eventually ramp up to a cluster of AI supercomputers.

NVIDIA and VMware support modern, virtualized data centers with NVIDIA Virtual Compute Server (vCS) software and the NVIDIA NGC container registry. These help organizations streamline the deployment and management of AI workloads in virtualized environments running on GPU servers.

Healthcare companies, human rights organizations and the financial services industry all have strict standards for data sovereignty and privacy. On-prem deep learning systems can make it easier to adopt AI while following regulations and minimizing cybersecurity risks.

Using a Hybrid Cloud Architecture

For many enterprises, it’s not enough to pick just one method. Hybrid cloud computing combines both, taking advantage of the security and manageability of on-prem systems while also leveraging public cloud resources from a service provider.

The hybrid cloud can be used when demand is high and on-prem resources are maxed out, a tactic known as cloud bursting. Or a business could rely on its on-prem data center for processing its most sensitive data, while running dynamic, computationally intensive tasks in the public cloud.
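The routing logic behind those two tactics can be sketched in a few lines: sensitive work always stays behind the firewall, and everything else bursts to the public cloud once local GPU capacity runs out. Job names and slot counts here are hypothetical, purely for illustration:

```python
# Minimal sketch of a hybrid-cloud placement policy: sensitive data stays
# on-prem, and other jobs "burst" to the public cloud once local GPU
# capacity is exhausted. Job names and capacities are assumptions.

def place_job(sensitive: bool, on_prem_free_slots: int) -> str:
    """Return where a single job should run under the hybrid policy."""
    if sensitive:
        return "on-prem"  # regulated data never leaves the firewall
    # Burst to the public cloud only when local capacity is gone.
    return "on-prem" if on_prem_free_slots > 0 else "cloud"

free_slots = 2
jobs = [("patient-records-etl", True), ("render-a", False),
        ("render-b", False), ("render-c", False)]

for name, sensitive in jobs:
    target = place_job(sensitive, free_slots)
    if target == "on-prem":
        free_slots -= 1  # each local placement consumes a GPU slot
    print(f"{name} -> {target}")
```

A real scheduler would also weigh queue times, data locality and cost, but the priority order is the same: compliance first, then local capacity, then burst.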

Many enterprise data centers are already virtualized and looking to deploy a hybrid cloud that's consistent with the business's existing computing resources. NVIDIA partners with VMware Cloud on AWS to deliver accelerated GPU services for modern enterprise applications, including AI, machine learning and data analytics workflows.

The service will allow hybrid cloud users to seamlessly orchestrate and live-migrate AI workloads between GPU-accelerated virtual servers in data centers and the VMware Cloud.

Get the Best of Both Worlds: A Developer’s AI Roadmap

Making a choice between cloud and on-prem GPUs isn’t a one-time decision taken by a company or research team before starting an AI project. It’s a question developers can ask themselves at multiple stages during the lifespan of their projects.

A startup might do some early prototyping in the cloud, then switch to a desktop system or GPU workstation to develop and train its deep learning models. It could move back to the cloud when scaling up for production, adjusting the number of GPU instances based on customer demand. As the company builds up its global infrastructure, it may invest in a GPU-powered data center on premises.

Some organizations, such as ones building AI models to handle highly classified information, may stick to on-prem machines from start to finish. Others may build a cloud-first company that never builds out an on-prem data center.

One key tenet for organizations is to train where their data lands. If a business’s data lives in a cloud server, it may be most cost-effective to develop AI models in the cloud to avoid shuttling the data to an on-prem system for training. If training datasets are in a server onsite, investing in a cluster of on-prem GPUs might be the way to go.
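The "train where your data lands" tenet often comes down to egress arithmetic: moving a large dataset out of a cloud provider carries a per-gigabyte fee that can rival the compute bill itself. A minimal sketch, with both prices as assumptions rather than any provider's published rates:

```python
# Illustrative "data gravity" check: is it cheaper to train in the cloud
# where the data already lives, or to pull the dataset on-prem first?
# Both prices are assumptions for illustration, not published rates.

EGRESS_PER_GB = 0.09   # assumed $/GB to move data out of a cloud region
CLOUD_GPU_RATE = 3.00  # assumed $/hour for a cloud GPU instance

def egress_cost(dataset_gb: float) -> float:
    """One-time cost of copying the dataset out of the cloud."""
    return dataset_gb * EGRESS_PER_GB

def cloud_training_cost(hours: float) -> float:
    """Cost of running the training job where the data already is."""
    return hours * CLOUD_GPU_RATE

dataset_gb = 50_000    # an assumed 50 TB training corpus
training_hours = 100   # an assumed training run

print(f"egress to on-prem:  ${egress_cost(dataset_gb):,.2f}")
print(f"train in the cloud: ${cloud_training_cost(training_hours):,.2f}")
```

Under these assumed numbers, just moving the corpus costs more than a full cloud training run, which is why data location is often the deciding factor.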

Whichever route a team takes to accelerate their AI development with GPUs, NVIDIA developer resources are available to support engineers with SDKs, containers and open-source projects. Additionally, the NVIDIA Deep Learning Institute offers hands-on training for developers, data scientists, researchers and students learning how to use accelerated computing tools.

Visit the NVIDIA Deep Learning and AI page for more.

Main image by MyGuysMoving.com, licensed from Flickr under CC BY-SA 2.0.