How GPUs Help Machines Understand What You’re Saying
People talking to machines figure in nearly every sci-fi movie and TV series.
From “Knight Rider” to “Lost in Space” to “Star Trek,” Hollywood has long envisioned a future where people interact naturally with computers and other devices.
This anticipated future isn’t that far off, thanks to Nuance, the leader in speech recognition technology.
An age of ubiquitous “intelligent systems” is coming, according to the company’s CTO, Vlad Sejnoha, in a recent story in Forbes magazine. It’s one where computers will “communicate with people through voice, text, vision, touch and gestures, and will factor in ambient information like location or motion to understand context, giving greater relevance of every interaction,” he wrote.
Sejnoha isn’t talking about advances 5, 10 or 20 years down the line. You can see them today.
Training these systems to recognize speech accurately takes a massive amount of computational power, which is why the company has turned to NVIDIA’s GPUs.
Training Neural Networks Using GPUs
Nuance’s speech recognition software has already been adopted by companies across a broad range of industries, including the entertainment, financial, healthcare and mobile markets.
Nuance’s technology powers 12 billion customer service calls, 5 billion mobile cloud transactions and 25 million voice-enabled cars each year. Behind it, the company uses NVIDIA GPUs to train “artificial neural networks,” models that mimic the structure of the human brain.
Neural network models learn much as children learn new words: by being presented with lots of examples.
Nuance trains its neural network models to recognize different words by using vast quantities of audio data.
The larger the training set and the larger the neural network, the better the speech recognition. But teaching these large neural networks to identify meaning despite the variability introduced by differing environments and accents can take many weeks on a traditional CPU-only computer. With GPUs, Nuance cuts this time to days.
“GPUs are significantly speeding up the training on very large amounts of our data, and allow us to rapidly explore novel algorithms and training techniques,” said Sejnoha. “The resulting enhanced models deliver improved accuracy across all of Nuance’s core speech technologies used in healthcare, enterprise and mobile-consumer markets.”
Nuance wants to fundamentally change how technology adapts to people. And Sejnoha envisions a day soon when giving a voice command to your phone will be like having your own personal assistant standing by.
Couple this with speech recognition technology and self-driving cars, and “Knight Rider” will become a reality.
I am a big science fiction fan. I’d love to hear from other sci-fi fans on what new capabilities they think this will enable.