All Over the Map: Startup Uses AI to Recreate the World, One Photograph at a Time

by Serge Lemonde

The days of wrestling with fold-up paper maps to find your way are long gone. Most people rely on digital maps to point them in the right direction. Autonomous vehicles need highly detailed maps as well.

However, future maps will need to provide more than just directions. People want easy access to up-to-date visual information about their highways and roads. They want to know whether infrastructure such as bike lanes and bike racks is in place, so they can easily cycle to their destination. And, for safety, self-driving cars will need a 360-degree understanding of the traffic environment.

Mapillary, a Sweden-based startup and member of our Inception program, is working to help develop these maps by integrating computer vision technology with community collaboration. Using data from street-level images from any camera, the company is visualizing the world to improve maps, help cities plan their development and contribute to the progress of the automotive industry.

Not All Those Who Wander Are Lost

From mapping 300 Banyan trees in Karnataka, India, to documenting Disneyland in 3D, Mapillary is stitching the world together, one photograph at a time. In fact, the company receives hundreds of thousands of images every day from its community of individual contributors, nonprofit organizations, companies and governments.

Processing this large amount of data into a useful result is no mean feat. To do so, the team at Mapillary uses a deep learning technique called semantic segmentation. This process involves breaking down images into semantically meaningful parts and then classifying them. For Mapillary’s map data extraction, two subtypes of semantic image segmentation are needed: standard and HD.
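To make the idea concrete, here is a minimal sketch of semantic segmentation in PyTorch. The pretrained DeepLabV3 model and the input file name are illustrative stand-ins, not Mapillary's own networks, which aren't public in this form:

```python
import torch
from torchvision import models, transforms
from PIL import Image

# A pretrained model stands in for Mapillary's own networks here.
model = models.segmentation.deeplabv3_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("street_scene.jpg").convert("RGB")  # hypothetical input
batch = preprocess(image).unsqueeze(0)                 # (1, 3, H, W)

with torch.no_grad():
    logits = model(batch)["out"]                       # (1, classes, H, W)

# Classify every pixel: the result is a map of semantically meaningful
# regions (road, sidewalk, vehicle, sign, ...), one class ID per pixel.
labels = logits.argmax(dim=1).squeeze(0)               # (H, W)
```

From per-pixel class maps like this, map features such as lane markings, signs and bike racks can then be located and placed on the map.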

Standard segmentation is applied to every image in Mapillary’s database in a process tuned for cost-effectiveness, while accepting some loss in accuracy. An HD segmentation model is applied only to select images where a high level of accuracy is required. But this focus on maximizing accuracy means longer runtimes and higher memory demands.
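One rough way to picture that tradeoff, suggested by Mapillary's own comment below about inference resolution, is the same model run at different input resolutions. The helper and the 50 percent scale factor here are assumptions for the sketch, not the company's published configuration:

```python
import torch
import torch.nn.functional as F

def segment(batch, model, hd=False):
    """Segment a (1, 3, H, W) batch; the standard pass trades accuracy
    for speed and memory by downscaling (the 0.5 factor is illustrative)."""
    if not hd:
        batch = F.interpolate(batch, scale_factor=0.5,
                              mode="bilinear", align_corners=False)
    with torch.no_grad():
        logits = model(batch)["out"]
    if not hd:
        # Upsample the class scores back to full resolution before argmax.
        logits = F.interpolate(logits, scale_factor=2.0,
                               mode="bilinear", align_corners=False)
    return logits.argmax(dim=1)
```

Calling `segment(batch, model)` would be the cheap default for every image, while `segment(batch, model, hd=True)` reserves the slower, memory-hungry full-resolution pass for images that warrant it.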

Mapillary’s challenge was to run HD semantic segmentation cost-effectively, providing its customers with the most detailed map data possible, while keeping up with the ever-increasing number of images flowing into its platform.

The company was already using NVIDIA Tesla GPU accelerators on Amazon EC2 P2 instances in production, with NVIDIA TITAN Xp GPUs used to train its models. More recently, Mapillary benchmarked TensorRT 3.0 running on Tesla V100 GPUs via Amazon Web Services EC2 P3 instances.
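The optimization step looks roughly like the sketch below. Two caveats: it uses today's TensorRT Python API, which differs from the 3.0 release benchmarked here, and "segmentation.onnx" is a hypothetical export of a trained model:

```python
import tensorrt as trt

# Build an optimized TensorRT engine from an ONNX export of the model.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("segmentation.onnx", "rb") as f:  # hypothetical ONNX export
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # exploit the V100's FP16 Tensor Cores

# Serialize the optimized engine for deployment on the inference fleet.
engine_bytes = builder.build_serialized_network(network, config)
with open("segmentation.engine", "wb") as f:
    f.write(engine_bytes)
```

Much of the speed-up on V100 comes from optimizations like the FP16 flag above, which lets TensorRT fuse layers and run them on the GPU's Tensor Cores.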

The result was a 27x speed-up of HD segmentation while reducing memory demands by 81 percent. Standard segmentation was boosted by 18x, with a 74 percent memory reduction.

“With the optimization of TensorRT together with Tesla V100, we are able to increase the image resolution during inference for semantic segmentation at the same processing cost,” said Yubin Kuang, computer vision lead at Mapillary. “This allows us to recover fine details and smaller objects with semantic segmentation.”

These performance gains enable Mapillary to produce map data more cost-effectively, helping create better, more detailed and smarter maps.

NVIDIA Inception Program

Mapillary is one of more than 2,000 startups in our Inception program. The virtual accelerator program provides startups with access to technology, expertise and marketing support.