As part of our effort to weave AI into the fabric of modern cities, we brought together 150 of the brightest minds in research and academia for the IEEE Smart World NVIDIA AI City Challenge.
The two-month-long challenge culminated on August 5 in Silicon Valley with the University of Illinois at Urbana-Champaign and the University of Washington, Seattle, winning NVIDIA TITAN Xp GPUs. All 14 teams that worked on Jetson-based inferencing received an NVIDIA Jetson TX2 Developer Kit.
The competition aims to make cities smarter and safer by using technology that’s already ubiquitous — cameras. There are millions of roadside cameras cities can use to extract insights for improving traffic flow and pedestrian safety.
“Our vision for this challenge is to do for urban traffic video analysis what ImageNet, the groundbreaking image recognition competition, did for general-purpose image analysis,” explained Milind Naphade, CTO of NVIDIA AI Cities. “Every transportation department will find value in using these models,” he said.
A Challenge at the Speed of Light
Partnering with IEEE and academia, we designed the AI City Challenge to overcome the transportation hurdles faced in urban areas using an edge-to-cloud solution.
The researchers and scientists taking on this challenge hail from academic labs around the globe – Brazil, China, Greece, India, Italy, Japan, Turkey and the United States. Twenty-nine teams submitted proposals.
To seed the challenge, event organizers captured more than 75 hours of high-quality video from intersections near our Silicon Valley headquarters, combined with 50 hours of data from Nebraska and Virginia.
Organizers captured daytime and nighttime conditions along with rush hour traffic. To preserve privacy, organizers obscured faces that appeared in the images.
Eighteen teams from 15 universities went on to the next phase by collaborating to label the dataset. This resulted in 1.4 million objects in 150,000 keyframes marked with class labels recommended by transportation departments in multiple cities.
Once the data was labeled, the competitors had less than three weeks to build and deploy meaningful models. Organizers gave teams access to two NVIDIA DGX AI supercomputing systems for model training as well as Jetson TX2 modules for deployment. Teams used various deep learning frameworks and experimented with a variety of networks for object detection and classification.

A Winning Proposition: When AI and Traffic Analytics Collide
The event’s judges included John Garofolo, senior advisor for information access programs at the National Institute of Standards and Technology; Dr. Maulin Patel, product general manager for intelligent enterprises at GE; Dr. Jenq-Neng Hwang, professor of electrical engineering at the University of Washington; and Farzin Aghdasi, senior software manager at NVIDIA.
For track 1, judges selected the University of Illinois at Urbana-Champaign for its performance on object detection, localization and classification. For track 2, judges selected the University of Washington for the value and innovation of its approach, along with the success of its demonstration.
“NVIDIA’s tremendous leadership in conceiving and executing this challenge in such a brief time, and its collaboration with academia, is exemplary,” said Garofolo. “These kinds of activities and data are essential to fostering the development of a public-safety-focused academic research community and to generate critical innovations in public safety analytic technologies that will benefit all stakeholders.”