Into the Omniverse: How Smart City AI Agents Transform Urban Operations

Cities are deploying AI agents, digital twins and computer vision to turn fragmented urban infrastructure into intelligent, responsive spaces.
by Esther Lee

Editor’s note: This post is part of Into the Omniverse, a series focused on how developers, 3D practitioners and enterprises can transform their workflows using the latest advancements in OpenUSD and NVIDIA Omniverse.

Cities worldwide face unprecedented challenges as urban populations surge and infrastructure strains to keep pace.

Operational challenges like traffic congestion and coordinating emergency services are compounded by fragmented data pipelines, siloed local government processes and disparate systems. Technical barriers prevent cities from accessing the comprehensive, real-time insights needed for effective decision-making and city management.

Leading cities and technology partners are deploying the NVIDIA Blueprint for smart city AI, a reference application that provides the complete software stack to build, test and operate AI agents in simulation-ready (SimReady) digital twins.

OpenUSD is an open and extensible framework that connects to each stage of this physical AI workflow. OpenUSD-enabled digital twins serve as SimReady environments where cities can simulate “what if” scenarios and generate physically accurate sensor data.

The blueprint powers a three-stage workflow: 1) simulate with the NVIDIA Cosmos platform and NVIDIA Omniverse libraries to generate synthetic data, 2) train and fine-tune vision AI models, and 3) deploy real-time video analytics AI agents with the NVIDIA Metropolis platform and the NVIDIA Blueprint for video search and summarization (VSS). This enables cities to move from reactive to proactive operations.​

Based on these simulations, cities can deploy operational platforms where weather data, traffic sensors and emergency response systems converge, supporting rapid testing of rare scenarios, real-time monitoring, city infrastructure planning and optimization of urban systems.

From Kaohsiung City, Taiwan, cutting incident response times by 80% with street-level AI to Raleigh, North Carolina, achieving 95% vehicle detection accuracy and French rail networks optimizing energy consumption by 20%, cities across the globe are using digital twins and AI agents to transform urban operations at scale.

Smart Cities in Action

Akila, With SNCF Gares&Connexions, Uses Digital Twins to Improve Rail Operations

Akila’s digital twin application helps French rail operator SNCF Gares&Connexions optimize its network of nearly 14,000 daily trains with live scenario planning for solar heating, air flow and crowd movement. The OpenUSD-enabled digital twins deliver a 20% reduction in energy consumption, 100% on-time preventive maintenance and a 50% reduction in downtime and response times.

Linker Vision Taps Physical AI for Street-Level Intelligence

Linker Vision’s physical AI system recognizes infrastructure events in Kaohsiung City, including damaged streetlights and fallen trees, eliminating manual city inspections and enabling faster emergency response. To scale its street-level intelligence to more cities, Linker Vision uses Omniverse libraries for simulation, Cosmos Reason for world understanding and the VSS blueprint for deployment powered by OpenUSD.

Esri and Microsoft Enable Comprehensive Urban Intelligence in the City of Raleigh

The City of Raleigh achieved 95% vehicle detection accuracy using the NVIDIA DeepStream software development kit, boosting traffic analysis workflows for engineers. This data enhances Raleigh’s digital twin, enabled by Esri’s ArcGIS geospatial platform to support visualization and analysis for critical infrastructure planning and management. Integrating this computer vision pipeline with a vision AI agent powered by the NVIDIA VSS blueprint provides comprehensive real-time visibility and insights in ArcGIS on Azure Cloud.

Milestone Systems’ VLM Automates Video Review

Milestone Systems is soon launching its Hafnia VLM, which will include a VLM plug-in for its video management software XProtect as well as a VLM-as-a-service. Fine-tuned on more than 75,000 hours of video data, the Hafnia VLM can reduce operator alarm fatigue by up to 30% by automating video review and filtering out false alarms. It was developed with NVIDIA Cosmos Reason VLMs and Metropolis. The Hafnia VLM plug-in for XProtect will make generative AI more easily accessible for XProtect operators and users.

K2K Analyzes Italy Video Streams

K2K’s platform uses NVIDIA Cosmos Reason and the VSS blueprint to analyze over 1,000 video streams in Palermo, Italy, processing 7 billion events annually and automatically notifying city officials through natural language queries and video events when critical conditions are extracted and analyzed.

Learn more about how cities are transforming with simulation, vision AI and digital twins by watching this on-demand NVIDIA GTC session, “Leadership Strategies to Transform Public Services.”

Get Started With Smart City AI

Learn more about OpenUSD and computer vision workflows through these resources:

Stay up to date by subscribing to NVIDIA Omniverse news, joining the Omniverse community and following Omniverse on Discord, Instagram, LinkedIn, Threads, X and YouTube.