How to Get Started with Deep Learning Frameworks

A guide to the top deep learning frameworks for AI research.
by Isha Salian

Think of a deep learning framework as a grocery store.

Rather than laboring in their own backyard farms, most people shop at markets when they want to whip up a meal.

Just as they don’t pick lettuce and uproot carrots when they have a hankering for salad, developers don’t want to start from scratch every time they build a deep learning neural network.

Deep learning models are large and complex, so instead of writing out every function from the ground up, programmers rely on frameworks and software libraries to build neural networks efficiently. The top deep learning frameworks provide highly optimized, GPU-enabled code that is specific to deep neural network computations.
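
To make concrete what "from the ground up" means, here is a minimal sketch of a single fully connected layer written in plain Python with no framework. The function name dense_forward, the ReLU activation and all of the weight values are illustrative assumptions and are not taken from any framework. A real network also needs gradient computation, optimizers and GPU kernels, which is the work a framework takes off a developer's hands.

```python
def dense_forward(inputs, weights, biases):
    """Forward pass of one fully connected layer with a ReLU activation.

    weights holds one row of coefficients per output unit.
    """
    outputs = []
    for row, bias in zip(weights, biases):
        # Weighted sum of the inputs plus the bias for this unit.
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        # ReLU: negative sums are clipped to zero.
        outputs.append(max(0.0, z))
    return outputs


# Hypothetical layer: 3 inputs feeding 2 output units.
weights = [[0.5, -0.25, 1.0],
           [-1.0, 0.5, 2.0]]
biases = [0.5, -4.0]

print(dense_forward([2.0, 4.0, 1.0], weights, biases))  # [1.5, 0.0]
```

Even this toy version covers only inference for one layer on a CPU; training it would require hand-written derivatives for every operation.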

Differences Between Deep Learning Frameworks

Different grocers specialize in unique sets of inventory. You might find basic recipe ingredients at your local supermarket. But you might prefer an upscale grocer for an exotic salad green, and a farmer’s market for organic, tree-ripened fruit. Or a big-box store when cooking for a crowd.

Similarly, though a developer can build most types of networks (like CNNs or RNNs) in any deep learning framework, frameworks vary in the quantity of examples available and the frequency of updates to these examples. There are also differences in the number of contributors actively adding new features and the way the framework exposes functionality through its APIs.

The top frameworks are open source and under active development, and most have been released since 2014.

How to Choose a Deep Learning Framework

Developers may choose a deep learning framework based on how closely its front-end interface matches their skillset, how much community support is available, or how quickly new features are being developed in their particular area of interest.

Frameworks can be accessed through the command line, script interfaces in programming languages such as Python or C/C++, and graphical interfaces like NVIDIA DIGITS, which allow developers to build a deep neural network in a more user-friendly web application.

If you’re a developer integrating your deep learning application with NVIDIA GPUs, check out the NVIDIA Developer Program to learn more.

How to Move Models Between Deep Learning Frameworks

Depending on the application at hand, developers may want to build and train a deep learning model using one framework, then re-train or deploy it for inference using a different framework.

The Open Neural Network Exchange, known as ONNX, is a format for deep learning models that allows developers to move their models between frameworks. ONNX supports conversion between most major frameworks.
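
The idea behind an interchange format can be illustrated with a toy sketch: one program writes a model as a framework-neutral list of operations and parameters, and any other program that understands those operations can load and run it. The JSON layout, the operation names and the functions export_model and run_model below are hypothetical and far simpler than the real ONNX specification; this is not the ONNX API.

```python
import json


def export_model(layers):
    """Serialize a model as a neutral, ordered list of operations (toy format)."""
    return json.dumps({"format_version": 1, "graph": layers})


# Operations a hypothetical "importing" runtime knows how to execute.
OPS = {
    "scale": lambda xs, p: [x * p["factor"] for x in xs],
    "add": lambda xs, p: [x + p["value"] for x in xs],
    "relu": lambda xs, p: [max(0.0, x) for x in xs],
}


def run_model(serialized, inputs):
    """Load the neutral description and apply each operation in order."""
    model = json.loads(serialized)
    values = list(inputs)
    for node in model["graph"]:
        values = OPS[node["op"]](values, node.get("params", {}))
    return values


# The "exporting" side describes the model once.
exported = export_model([
    {"op": "scale", "params": {"factor": 2.0}},
    {"op": "add", "params": {"value": -3.0}},
    {"op": "relu"},
])

# The "importing" side only needs the serialized description.
print(run_model(exported, [1.0, 2.0, 0.5]))  # [0.0, 1.0, 0.0]
```

ONNX itself works along broadly similar lines at a much larger scale, with a standardized operator set and a binary file format; the ONNX documentation describes the actual schema.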

When a deep learning model has been trained and is ready for deployment, our TensorRT software optimizes it for high-performance inference on NVIDIA GPUs. TensorRT is tightly integrated with TensorFlow and MATLAB, and also supports importing from the ONNX format.

Here are some of the most popular frameworks used for deep learning, with examples of how companies and researchers are building GPU-accelerated applications for healthcare, disaster prediction and cell biology.

Apache MXNet

Apache MXNet is a deep learning framework first released in 2015 and later developed under the Apache Software Foundation.

Seattle-based startup Magic AI is using a deep learning model to monitor horse health, built with MXNet and run on NVIDIA GPUs. The neural network analyzes frames from a video feed inside the stable and alerts owners if an animal is about to give birth, a horse is showing colic symptoms or strangers are entering the stable.

For inference, developers can export to ONNX, then optimize and deploy with NVIDIA TensorRT.


Caffe

The Caffe deep learning framework originated at the University of California, Berkeley in 2014, and has led to forks like NVCaffe and new frameworks like Facebook’s Caffe2 (now merged with PyTorch).

Half the people diagnosed with lung cancer, the most common cancer worldwide, die within a year. St. Louis-based startup Innovation DX is using deep learning and NVIDIA GPUs to spot lung cancer sooner using chest X-rays. Early detection tools like this one, powered by neural networks and the Caffe framework, could triple survival rates.

To optimize and deploy models for inference, developers can leverage NVIDIA TensorRT’s built-in Caffe model importer.


Chainer

Created in 2015, Chainer is developed by Japanese venture company Preferred Networks.

The company used this Python-based framework in collaboration with industrial automation giant FANUC for the Amazon Picking Challenge in 2016. The event challenged autonomous robots to figure out how to pick and stow objects. Preferred Networks and FANUC used convolutional neural networks and an NVIDIA GeForce GTX 870M notebook GPU for the competition, in which they came second.

For inference, developers can export to ONNX, then optimize and deploy with NVIDIA TensorRT.


Keras

Keras is a high-level Python API that can run on top of multiple frameworks, such as MXNet, TensorFlow, Theano and Microsoft Cognitive Toolkit. It was created in 2014 by researcher François Chollet with an emphasis on ease of use through a unified and often abstracted API.

A team of Korean researchers is using Keras to improve the speed and accuracy of hurricane predictions. Their deep learning models, built with Keras on TensorFlow and running on NVIDIA GPUs, predict a storm’s path and precipitation levels hours in advance. Because these neural networks can forecast storms further ahead of time, they could give locals more warning time to evacuate before a hurricane hits.


MATLAB

MATLAB allows engineers who are already familiar with the software to develop deep learning tools using MATLAB code.

Researchers at the University of Alberta leveraged MATLAB and NVIDIA GPUs to help avoid unnecessary prostate cancer biopsies. The team’s deep learning model analyzed biomarker data from extracellular vesicles to predict the presence of cancer cells.

For inference, developers can use TensorRT through MATLAB GPU Coder to automatically generate optimized inference engines.

Microsoft Cognitive Toolkit

Originally called CNTK, this deep learning framework from Microsoft was introduced in 2014 and powers the company’s own AI models, like Cortana.

Health tech company IRIS uses NVIDIA Tesla GPUs and Microsoft Cognitive Toolkit to help prevent blindness caused by diabetic retinopathy, a complication of diabetes. Unless spotted by regular eye exams, the condition can sneak up on a patient. IRIS’ neural networks analyze retinal images, recommending whether a patient needs to be referred to a physician.

For inference, developers can export to ONNX, then optimize and deploy with NVIDIA TensorRT.


PyTorch

First there was Torch, a popular deep learning framework released in 2011, based on the programming language Lua. Then in 2017, Facebook introduced PyTorch, which takes Torch features and implements them in Python.

PyTorch was used for the first predictive 3D model of a live human cell, powered by an NVIDIA DGX Station and TITAN Xp GPUs. Developed by researchers at the Allen Institute for Cell Science, the model allows scientists to digitally visualize and manipulate cell behavior in a virtual environment. An alternative to the expensive process of fluorescence microscopy, a cell model built using CNNs gives scientists the potential to understand and predict cell dynamics in a way that was previously nearly impossible.

For inference, developers can export to ONNX, then optimize and deploy with NVIDIA TensorRT.


TensorFlow

TensorFlow is a deep learning framework created in 2015 by Google.

Researchers from the University of Texas MD Anderson Cancer Center are using TensorFlow to determine high-precision radiation therapy treatments. Radiologists typically review a cancer patient’s medical scans to figure out how much radiation should be used to target tumors without damaging normal tissues. With NVIDIA Tesla GPUs, the researchers developed a deep learning model that learned to identify and recreate the patterns physicians designed to identify target areas for radiation.

For inference, developers can either use TensorFlow-TensorRT integration to optimize models within TensorFlow, or export TensorFlow models, then use NVIDIA TensorRT’s built-in TensorFlow model importer to optimize in TensorRT.

A Broad Ecosystem of Frameworks

NVIDIA works with many of the above frameworks, and others like Baidu’s PaddlePaddle, to enable deep learning applications.

New deep learning frameworks are being created all the time, a reflection of the widespread adoption of neural networks by developers. Early players like Theano and Torch have powered many deep learning applications, but in 2017 their creators announced that they would stop developing the frameworks.

NVIDIA’s deep learning frameworks team works on many of these open-source efforts directly, making more than 800 total contributions last year to improve ease of use and performance.

The NGC container registry provides instant access to many of the frameworks mentioned above, delivering optimal GPU-accelerated performance on demand.

For additional resources and install information for deep learning frameworks, see the NVIDIA Developer site. This hub also provides example neural network training scripts for some of the most common deep learning frameworks and applications, such as computer vision, translation and object detection. Our packaged deep learning containers can be found in the NVIDIA GPU Cloud catalog.