If you want to know what the next big thing will be, ask someone at a company that invents it time and again.
“AI is a key tool for the next era, so we are providing the computing resources our developers need to generate great AI results,” said Yuichi Kageyama, general manager of Tokyo Laboratory 16 in the R&D Center of Sony Group Corporation.
Called GAIA internally, the lab’s computing resources act as a digital engine serving all Sony Group companies. And it’s about to get a second fuel injection of accelerated computing for AI efforts across the corporation.
Sony’s engineers are packing machine-learning smarts into products including its Xperia smartphones, its entertainment robot, aibo, and a portfolio of imaging components for everything from professional and consumer cameras to factory automation and satellites. It’s even using AI to build the next generation of advanced imaging chips.
More Zip, Fewer Tolls
To move efficiently into the AI era, Sony is installing a cluster of NVIDIA DGX A100 systems linked on an NVIDIA Mellanox InfiniBand network. The cluster expands an existing system, built on NVIDIA V100 Tensor Core GPUs, that now runs at near full utilization. That system was commissioned in October, when the company brought AI training in house.
“When we were using cloud services, AI developers worried about the costs, but now they can focus on AI development on GAIA,” said Kageyama.
An in-house AI engine torques performance, too. One team designed a deep-learning model for delivering super-resolution images and trained it nearly 16x faster by adding more resources to the job, shortening a month’s workload to a day.
“With the computing power of the DGX A100, its expanded GPU memory and faster InfiniBand networking, we expect to see even greater performance on larger datasets,” said Yoshiki Tanaka, who oversees HPC and distributed deep learning technologies for Sony’s developers.
Powering an AI Pipeline
Sony posted fast speeds in deep learning back in 2018, accelerating its Neural Network Libraries on a system at Japan’s National Institute of Advanced Industrial Science and Technology. And it’s already rolling out products powered by machine learning, such as its Airpeak drone for professional filmmakers, shown at CES this year.
There’s plenty more to come.
“We will see good results in our fiscal 2021 because we have collaborations with many business teams who have started some good projects,” Kageyama said.
NVIDIA is putting its shoulder to the wheel with software and services to “build a culture of using GPUs,” he added.
For example, Sony developers use NGC, NVIDIA’s online container registry, for all the software components they need to get an AI app up and running.
Sony even created a container of its own, now available on NGC, sporting its Neural Network Libraries and other utilities. It supplements NVIDIA’s containers for work in popular environments like PyTorch and TensorFlow.
Drivers Give a Thumbs Up
Developers tell Kageyama’s team that having their code in one place helps simplify and speed their work.
Some researchers use the system for high-performance computing, tapping into NVIDIA’s CUDA software, which accelerates a diverse set of technical applications including AI.
To keep it all running smoothly, NVIDIA provided a job scheduler and, at Sony’s request, additions to its libraries for scaling apps across multiple GPUs.
“Good management software is important for achieving fairness and high utilization on such a complex system,” said Masahiro Hara, who leads development of the GAIA system.
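To make the trade-off between fairness and utilization concrete, here is a minimal Python sketch of a fair-share scheduling pass. It is an illustration only, not the scheduler NVIDIA supplied or anything Sony runs on GAIA; the `Job` type, the `schedule` function, the team names and the usage figures are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Job:
    team: str     # hypothetical team name
    gpus: int     # GPUs the job requests
    hours: float  # estimated run time in hours


def schedule(queue, free_gpus, usage=None):
    """Choose which queued jobs to start right now.

    Fairness: among jobs that fit, the team with the fewest GPU-hours
    consumed so far goes first.
    Utilization: a job too large for the remaining GPUs is skipped so
    that smaller jobs can backfill the idle capacity.
    """
    usage = defaultdict(float, usage or {})
    pending = list(queue)
    started = []
    while pending and free_gpus > 0:
        # Only jobs that fit in the GPUs still free are candidates.
        fitting = [job for job in pending if job.gpus <= free_gpus]
        if not fitting:
            break
        # Least-served team first; min() keeps queue order on ties.
        job = min(fitting, key=lambda j: usage[j.team])
        pending.remove(job)
        started.append(job)
        free_gpus -= job.gpus
        # Charge the team for the GPU-hours it is about to consume.
        usage[job.team] += job.gpus * job.hours
    return started, free_gpus


# Hypothetical queue: "vision" has already used far more GPU time.
queue = [Job("vision", 4, 10.0), Job("audio", 4, 2.0), Job("robotics", 4, 1.0)]
started, idle = schedule(queue, free_gpus=8, usage={"vision": 100.0})
print([job.team for job in started], idle)
```

A production scheduler would also have to handle priorities, preemption, reservations and multi-node placement; the sketch covers only the two goals named in the quote.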
An Eye Toward Analytics
NVIDIA also helped Sony create training programs on how to use NVIDIA’s software on GAIA.
Looking ahead, Sony is interested in expanding its work in data analytics and simulations. It’s evaluating RAPIDS, open-source software NVIDIA helped design to let Python programmers access the power of GPUs for data science.
At the end of a work-from-home day keeping Sony ahead of the pack in AI, Kageyama enjoys playing with his kids, who keep their dad on his digital toes. “I’m a beginner in Minecraft, and they’re much better than me,” he said.