With a new university campus nearby and an airport under construction, the city of Liverpool, Australia, 27 kilometers southwest of Sydney, is growing fast.
More than 30,000 people are expected to make a daily commute to its central business district. Liverpool needed to know the possible impact on traffic flow and on the movement of pedestrians, cyclists and vehicles.
The city already operates closed-circuit television (CCTV) cameras to monitor safety and security. Each camera captures a large volume of video that, due to stringent privacy regulations, is mainly combed through after an incident has been reported.
The challenge before the city was to turn this massive dataset into information that could help it run more efficiently, handle an influx of commuters and keep the place liveable for residents — without compromising anyone’s privacy.
To achieve this goal, the city has partnered with the Digital Living Lab (DLL) of the University of Wollongong. Part of the university's SMART Infrastructure Facility, the DLL has developed what it calls the Versatile Intelligent Video Analytics platform. VIVA, for short, gives owners of CCTV networks access to real-time, privacy-compliant data so they can make better-informed decisions.
VIVA is designed to convert existing CCTV infrastructure into edge-computing devices embedded with the latest AI. The platform's state-of-the-art deep learning algorithms are developed at the DLL on the NVIDIA Metropolis platform. Its video analytics models are trained with transfer learning to adapt them to each use case, optimized with NVIDIA TensorRT software and deployed on NVIDIA Jetson edge AI computers.
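The article doesn't specify DLL's training framework, so the following is a purely illustrative sketch of what the transfer-learning step could look like, assuming PyTorch/torchvision, with a COCO-pretrained detector adapted to hypothetical traffic classes (the framework, class list and hyperparameters are assumptions, not details from DLL):

```python
# Purely illustrative: adapting a COCO-pretrained detector to local traffic
# classes via transfer learning. The framework (PyTorch/torchvision) and the
# class list are assumptions, not details confirmed by the article.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Hypothetical classes for a street-monitoring use case.
CLASSES = ["__background__", "pedestrian", "cyclist", "car", "bus", "truck"]

# Start from a detector pretrained on COCO (recent torchvision API).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Swap the classification head so the detector predicts our classes instead.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, len(CLASSES))

# Fine-tune on the target imagery; the pretrained backbone provides the
# "transfer" part of transfer learning.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad],
    lr=0.005, momentum=0.9, weight_decay=0.0005,
)
```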
“We designed VIVA to process video feeds as close as possible to the source, which is the camera,” said Johan Barthelemy, lecturer at the SMART Infrastructure Facility of the University of Wollongong. “Once a frame has been analyzed using a deep neural network, the outcome is transmitted and the current frame is discarded.”
Disposing of frames maintains privacy as no images are transmitted. It also reduces the bandwidth needed.
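VIVA's code isn't public; purely to illustrate the analyze-then-discard pattern Barthelemy describes, an edge device's main loop might look something like the sketch below, in which only detection metadata leaves the device (the detect_objects placeholder, camera URL and ingest endpoint are invented for the example):

```python
# Illustrative analyze-then-discard loop (not VIVA's actual code). Only
# detection metadata is transmitted; the frames themselves never leave
# the device.
import time

import cv2
import requests


def detect_objects(frame):
    """Hypothetical placeholder for the on-device deep neural network.

    A real implementation would run an optimized detector (for example a
    TensorRT engine) on the frame; returning an empty result keeps this
    sketch runnable.
    """
    return []


INGEST_URL = "https://analytics.example.org/ingest"     # assumed endpoint

cap = cv2.VideoCapture("rtsp://camera.local/stream")    # existing CCTV feed
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    detections = detect_objects(frame)                  # counts, boxes, labels
    requests.post(INGEST_URL, json={
        "camera_id": "cam01",
        "timestamp": time.time(),
        "detections": detections,
    })
    del frame                                           # the image is discarded
```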
Beyond city streets like Liverpool's, VIVA has been adapted for a wide variety of applications, such as identifying and tracking wildlife; detecting culvert blockages for stormwater management and flash-flood early warning; and tracking people with thermal cameras to understand mobility behavior during heat waves. It can also distinguish between firefighters searching a building and other building occupants, helping identify those who may need help to evacuate.
Making Sense of Traffic Patterns
The research collaboration between SMART, Liverpool’s city council and its industry partners is intended to improve the efficiency, effectiveness and accessibility of a range of government services and facilities.
For pedestrians, the project aims to understand where they’re going, their preferred routes and which areas are congested. For cyclists, it’s about the routes they use and ways to improve bicycle usage. For vehicles, the key is understanding movement and traffic patterns, where they stop and where they park.
Understanding mobility within a city formerly required a fleet of costly, fixed sensors, according to Barthelemy. Different models were needed to count each specific type of traffic, and manual processes were used to understand how the different types interacted with each other.
Using computer vision on the NVIDIA Jetson TX2 at the edge, the VIVA platform can count the different types of traffic and capture their trajectories and speeds. Data is gathered using the city’s existing CCTV network, eliminating the need to invest in additional sensors.
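The platform's tracking algorithm isn't described in detail; a minimal sketch of the general idea, assuming per-frame detections and a simple nearest-centroid association, might look like this (the matching threshold and the pixels-to-metres factor are illustrative, not VIVA's actual values):

```python
# Minimal illustration of counting plus trajectory and speed estimation from
# per-frame detections; a greedy nearest-centroid matcher, not VIVA's algorithm.
import math
from collections import defaultdict

MAX_MATCH_DIST = 50.0     # pixels; association threshold (illustrative)
METRES_PER_PIXEL = 0.05   # would come from camera calibration (assumed)


class CentroidTracker:
    def __init__(self):
        self.next_id = 0
        self.tracks = {}                  # track id -> list of (t, x, y, label)
        self.counts = defaultdict(int)    # label -> number of tracks observed

    def update(self, t, detections):
        """detections: list of (label, cx, cy) centroids for one frame."""
        for label, cx, cy in detections:
            best_id, best_dist = None, MAX_MATCH_DIST
            for tid, history in self.tracks.items():
                _, px, py, plabel = history[-1]
                if plabel != label:
                    continue
                dist = math.hypot(cx - px, cy - py)
                if dist < best_dist:
                    best_id, best_dist = tid, dist
            if best_id is None:           # no nearby track: start a new one
                best_id = self.next_id
                self.next_id += 1
                self.tracks[best_id] = []
                self.counts[label] += 1
            self.tracks[best_id].append((t, cx, cy, label))

    def speed(self, tid):
        """Average speed of one track, in metres per second."""
        hist = self.tracks[tid]
        if len(hist) < 2:
            return 0.0
        (t0, x0, y0, _), (t1, x1, y1, _) = hist[0], hist[-1]
        return math.hypot(x1 - x0, y1 - y0) * METRES_PER_PIXEL / (t1 - t0)
```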
Patterns of movement and points of congestion are identified and predicted to help improve street and footpath layout and connectivity, traffic management and guided pathways. The data has been invaluable in helping Liverpool plan the urban design and traffic management of its central business district.
Machine Learning Application Built Using NVIDIA Technologies
SMART trained the machine learning applications on the VIVA platform for Liverpool on four workstations powered by a variety of NVIDIA TITAN GPUs, and used six workstations equipped with NVIDIA RTX GPUs to generate synthetic data and run experiments.
In addition to using open datasets such as Open Images, COCO and Pascal VOC for training, DLL created synthetic data via an in-house application based on the Unity Engine. Synthetic data lets the models learn from scenarios that might not otherwise be captured at any given time, like rainstorms or large groups of cyclists.
“This synthetic data generation allowed us to generate 35,000-plus images per scenario of interest under different weather, time of day and lighting conditions,” said Barthelemy. “The synthetic data generation uses ray tracing to improve the realism of the generated images.”
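The generator itself is an in-house Unity application, so the following is only a rough illustration of how such a scenario sweep over weather, time of day and lighting could be enumerated before rendering (all parameter names and counts are invented for the example):

```python
# Invented illustration of a synthetic-data scenario sweep; the real generator
# is an in-house Unity application with ray tracing, not this script.
from itertools import product

WEATHER = ["clear", "overcast", "rain", "storm"]
TIME_OF_DAY = ["dawn", "morning", "midday", "dusk", "night"]
LIGHTING = ["natural", "street_lights", "headlights_only"]
IMAGES_PER_VARIANT = 600   # 60 variants x 600 images is roughly the 35,000+ quoted

scenarios = [
    {"weather": w, "time_of_day": t, "lighting": l, "images": IMAGES_PER_VARIANT}
    for w, t, l in product(WEATHER, TIME_OF_DAY, LIGHTING)
]
print(f"{len(scenarios)} variants, "
      f"{len(scenarios) * IMAGES_PER_VARIANT} images for one scenario of interest")
```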
Inference is performed on the NVIDIA Jetson Nano, Jetson TX2 or Jetson Xavier NX, depending on the use case and the processing required.
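The article names TensorRT and the Jetson family but not the exact optimization path; one common pattern, sketched here under the assumption of an ONNX export and the TensorRT 8.x Python API, is to build a device-specific FP16 engine on each Jetson (file names are placeholders):

```python
# Hedged sketch: building a device-specific FP16 TensorRT engine from an ONNX
# export. API names follow the TensorRT 8.x Python bindings; the file names
# are placeholders, not DLL's artifacts.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("detector.onnx", "rb") as f:           # model exported after training
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)            # FP16 suits Jetson GPUs

engine_bytes = builder.build_serialized_network(network, config)
with open("detector_fp16.engine", "wb") as f:    # deployed and loaded on the Jetson
    f.write(engine_bytes)
```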