Smartphones are the Swiss Army knives of the digital age, serving as cell phone, camera, map, social networking hub, game console and more. Now one more amazing utility can be added to the list: eyesight.
Aipoly is one smartphone app that offers it. It can identify more than 4,000 objects around the home, such as tools, cutlery and bathroom items, almost instantly. It displays the object’s name as text on screen and speaks it aloud to the user.
Aipoly’s vocabulary is now about as complex as that of a five-year-old child. The app offers paid upgrades that expand its available vocabulary, and new words and subjects are constantly being added.
Comprehensive image recognition training is key to accurate identification, and accuracy is essential to assisting people with visual impairments.
“With a blind person, you can’t have 70 percent accuracy and call it a day,” said Alberto Rizzoli, co-founder of San Francisco-based Aipoly.
Consistently accurate image identification requires extensive training, and this is where NVIDIA GPUs make a crucial difference. Comparing the training time required by GPUs with that of their alternatives is “like the difference between baking a cake and aging whiskey,” said Rizzoli.
Deep Learning-Powered Image Recognition
It’s the ability to give people freedom that they’ve never had before. — Brad Folkens, co-founder of CloudSight, Inc.
Brad Folkens, co-founder of Los Angeles-based CloudSight, agrees. He co-developed TapTapSee, a free, open-source app for the visually impaired. Users double-tap their phone’s screen to take a photo of anything, at any angle. The app then says aloud what the object is.
Folkens cited NVIDIA technology as a key element behind the app’s deep learning-powered image recognition.
“We could take the mass library of images we have and, using the (NVIDIA DIGITS) DevBox, we used a sample of those images to train neural networks,” said Folkens.
Folkens was particularly enamored of CloudSight’s recently acquired NVIDIA DGX-1 supercomputer.
“We can handle a tremendous amount of images and training volume with the DGX-1 that we weren’t able to do before,” he said. “Now we can do much, much larger batches of images, and actually get the training done in a reasonable amount of time.”
Both Rizzoli and Folkens find their true passion in the effect their apps have on users.
“One person told us that they were able to go to the grocery store for the first time totally unassisted,” said Folkens.
“To have narration around you is something special for someone who has lost their eyesight,” added Rizzoli.