5 Predictions for Data Center AI Infrastructure in 2020

by Tony Paikeday

The AI “gold rush” is on — and in 2020 more enterprises will reap its rewards via bread-and-butter use cases that enhance customer relationships, deliver better outcomes for healthcare patients, and improve the quality of life for citizens.

As the year unfolds, a number of key trends will accelerate the pace at which mainstream enterprises invest in AI infrastructure, while providing a clearer path to faster, more transparent return on investment.

1. AI is the New UI

With the proliferation of AI assistants, virtual support agents and conversational AI, investment in natural language processing infrastructure will grow significantly. This growth is enabled by a dramatic compression in the training time for the most complex algorithms that unlock natural language understanding at superhuman levels. Models that once took weeks to train can now be trained in just under an hour.

This year, more organizations will infuse their business with more interactive human/machine experiences. Simultaneously, this trend will drive a deeper commitment to scaled AI infrastructure that supports the rapid innovation cycle required to deliver these models and continually improve them.

2. More Businesses Will Industrialize AI Development

The process of going from a concept, to a model prototype, to production training and inference has largely been hand-guided and “artisanal” in nature. Organizations are now focusing intently on how to mechanize this workflow while enabling better specialization of AI roles that are part of it.

With the right system design in place, IT teams will empower data scientists to focus on productive experimentation and model prototyping. Data scientists will leave behind the toil of “systems integration” and focus squarely on rapid iteration and innovation.

Meanwhile, data engineers and machine learning operations teams will focus on delivering a streamlined data pipeline that can support the end-to-end model development workflow. This industrialized workflow will move AI into the top tier of workloads managed by enterprise IT. Which brings us to …

3. IT Leaders Will Build AI Centers of Excellence

AI applications and infrastructure will come out of the shadows, with their disparate development teams, budgets and silos. Led by visionary CIOs, businesses will deploy AI centers of excellence that consolidate silos and centralize talent development, tools and best practices.

What used to be found only in the world’s largest supercomputing sites and academia is making its way into enterprise. The latest TOP500 list shows a number of enterprises submitted results from their own highly efficient, high-performance superclusters that not only crush HPC benchmarks but also fuel business innovation. These winning entries were deployed in weeks by using NVIDIA DGX systems as the compute building block to effectively “systemize” the building of large-scale AI infrastructure.

4. Accessing Infrastructure Gets Easier

Not all businesses are ready to revamp their data center to support the facilities demands associated with AI compute. Yet, many need to rein in their escalating AI training operations expenditures to gain better control of how much compute they procure, when to procure it and how to pay for it.

This year we’ll see more service providers deliver AI infrastructure hosting solutions as well as AI compute infrastructure-as-a-service offerings. These will help get companies out of the data center CapEx cycle and onto an affordable OpEx model that’s as convenient as cloud but with the deterministic performance of a dedicated, on-prem system, just not necessarily on their own premises.

For those that have done the math and know that ownership makes sense compared to cloud or cloud-like options, infrastructure try-and-buy programs and test drives will provide a way to kick the tires before committing capital.
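The math behind that ownership decision can be sketched in a few lines. The example below is a rough illustration only: the function name `breakeven_months` and every dollar figure are hypothetical, and a real comparison would also account for utilization, depreciation, staffing and data transit.

```python
import math


def breakeven_months(capex, owned_monthly_opex, cloud_monthly_cost):
    """Return the first whole month at which cumulative ownership cost
    is no higher than cumulative cloud cost, or None if that never happens.

    All inputs are assumed to be in the same currency; the model is a
    deliberately simple linear one.
    """
    monthly_savings = cloud_monthly_cost - owned_monthly_opex
    if monthly_savings <= 0:
        # Owning costs at least as much per month, so the upfront
        # spend is never recovered.
        return None
    return math.ceil(capex / monthly_savings)


# Hypothetical figures, for illustration only.
months = breakeven_months(
    capex=400_000,             # upfront system purchase
    owned_monthly_opex=5_000,  # power, cooling, hosting, support
    cloud_monthly_cost=25_000, # equivalent rented compute
)
print(months)
```

If the break-even point falls well inside the expected life of the system, ownership may be worth a test drive. If it does not, a hosted or as-a-service model may be the better fit.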

5. The Reality of “Data Gravity” Will Force IT to Reconsider Where to Train

“Train where your data lands.” It seems like common sense, and yet many businesses find themselves in an escalating OpEx cycle of snowballing datasets that need to be pushed from their data lake to their compute instance, which is often in the cloud.

Cloud GPUs are often essential when developers need easy access and productive experimentation. But as datasets and model complexity grow, IT managers are seeing rising costs due to data transit, hosting and compute cycles.

Recognizing this inflection point, more businesses will repatriate their AI compute resources to where their datasets live. We’ll see a tiered architecture that supports not only metro core data center training, but even mobile supercomputing scenarios, where a multi-petaflops system might be dropped into place, anywhere you need it, when you need it.

Summing up, in 2020, enterprise AI will unlock transformative use cases that have a tangible impact on the top and bottom lines. The resources required to train these models will demand an unprecedented scale of compute infrastructure, making forward-leaning enterprises appear more and more like supercomputing sites that can streamline the AI development workflow from concept to production.

And putting it all together will no longer mean owning the infrastructure or even owning a data center, thanks to new service offerings that remove the barriers to getting the deterministic performance of an on-prem system with the simplicity and ease of cloud.