AI is the most powerful new technology of our time, but it’s been a force that’s hard to harness for many enterprises — until now.
Many companies lack the specialized skills, access to large datasets or accelerated computing that deep learning requires. Others are realizing the benefits of AI and want to spread them quickly across more products and services.
For both, there’s a new roadmap to enterprise AI. It leverages technology that’s readily available, then simplifies the AI workflow with NVIDIA TAO and NVIDIA Fleet Command to make the trip shorter and less costly.
Grab and Go AI Models
The journey begins with pre-trained models. You don’t have to design and train a neural network from scratch in 2021. You can choose one of many available today in our NGC catalog.
We’ve curated models that deliver skills to advance your business. They span the spectrum of AI jobs from computer vision and conversational AI to natural-language understanding and more.
Models Show Their AI Resumes
So users know what they’re getting, many models in the catalog come with credentials. They’re like the resume for a prospective hire.
Model credentials show you the domain the model was trained for, the dataset that trained it, how often the model was deployed and how it’s expected to perform. They provide transparency and confidence you’re picking the right model for your use case.
Leveraging a Massive Investment
NVIDIA invested hundreds of millions of GPU compute hours over more than five years refining these models. We did this work so you don’t have to.
Here are three quick examples of the R&D you can leverage:
For computer vision, we devoted 3,700 person-years to labeling 500 million objects from 45 million frames. We used voice recordings to train our speech models on GPUs for more than a million hours. A database of biomedical papers packing 6.1 billion words educated our models for natural-language processing.
Transfer Learning, Your AI Tailor
Once you choose a model, you can fine tune it to fit your specific needs using NVIDIA TAO, the next stage of our expedited workflow for enterprise AI.
TAO enables transfer learning, a process that harvests features from an existing neural network and plants them in a new one using NVIDIA’s TAO Toolkit, an integrated part of TAO. It leverages small datasets users have on hand to give models a custom fit without the cost, time and massive datasets required to build and train a neural network from scratch.
Sometimes companies have an opportunity to further enhance models by training them across larger, more diverse datasets maintained by partners outside the walls of their data center.
TAO Lets Partners Collaborate with Privacy
Federating learning, another part of TAO, lets different sites securely collaborate to refine a model for the highest accuracy. With this technique, users share components of models such as their partial weights. Datasets remain inside each company’s data center so data privacy is preserved.
In one recent example, 20 research sites collaborated to raise the accuracy of the so-called EXAM model that predicts whether a patient has COVID-19. After applying federated learning, the model also could predict the severity of the infection and whether the patient would need supplemental oxygen. Patient data stayed safely behind the walls of each partner.
Taking Enterprise AI to Production
Once a model is fine tuned, it needs to be optimized for deployment.
It’s a pruning process that makes models lean, yet robust, so they function efficiently on your target platform whether it’s an array of GPUs in a server or a Jetson-powered robot on the factory floor.
NVIDIA TensorRT, another part of TAO, dials a model’s mathematical coordinates to an optimal balance of the smallest size with the highest accuracy for the system it will run on. It’s a crucial step, especially for real-time services like speech recognition or fraud detection that won’t tolerate system latency.
Then, with the Triton Inference Server, users can select the optimal configuration to deploy, whatever the model’s architecture, the framework it uses or target CPU or GPU it will run on.
Once a model is optimized and ready for deployment, users can easily integrate it with whatever application framework that fits their use case or industry. For example, it could be Riva for conversational AI, Clara for healthcare, Metropolis for video analytics or Isaac for robotics to name just a few that NVIDIA provides.
With the chosen application framework, users can launch NVIDIA Fleet Command to deploy and manage the AI application across a variety of GPU-powered devices. It’s the last key step in the journey.
Zero to AI in Minutes
Fleet Command connects NVIDIA-Certified servers deployed at the network’s edge to the cloud. With it, users can work from a browser to securely pair, orchestrate and manage millions of servers, deploy AI to any remote location and update software as needed.
Administrators monitor health and update systems with one-click to simplify AI operations at scale.
Fleet Command uses end-to-end security protocols to ensure application data and intellectual property remain safe.
Data is sent between the edge and the cloud, fully encrypted, ensuring it’s protected. And applications are scanned for malware and vulnerabilities before they are deployed.
An AI Workflow That’s on the Job
Fleet Command and elements of TAO are already in use in warehouses, in retail, in hospitals and on the factory floor. Users include companies such as Accenture, BMW and Siemens Digital Industries
A demo (below) from the GTC keynote shows how the one-two-three combination of NGC models, TAO and Fleet Command can quickly tailor and deploy an application using multiple AI models.
You can sign up for Fleet Command today.
Core parts of TAO, such as the TAO Toolkit and federated learning, are available today. Apply now for early access to them all, fully integrated into TAO.