Eleven years ago, Carnegie Mellon University alumni Anthony Gadient, Edward Lin and Rob Rutenbar were hunkered down in a garage, chowing pizza over late nights of coding. Eighteen months later, voice startup Voci emerged as a spinout from CMU.
Voci, like the ventures of many early AI researchers, became a reality as a startup because of breakthroughs in deep neural networks paired with advances in GPU computing.
“Our academic roots are based in this idea that you can do better by taking advantage of application-specific hardware such as NVIDIA GPUs,” said Gadient, Voci’s chief strategy officer and co-founder.
Automated Speech Recognition
Voci’s V-Blaze automated speech recognition offers real-time speech-to-text and audio analytics for conversations between customers and call center representatives. Customers can use the data to understand the sentiment and emotion of speakers.
Voci can provide customers with an open API to pipe the data into customer experience and sales applications.
Companies can use Voci to track what customers are saying about competing products and features offered elsewhere.
“There’s valuable data in those call center communications,” said Gadient.
AI Closes Deal
Voci’s automated speech recognition provides data that indicates how well sales representatives are handling calls. Companies can use it to improve interactions with real-time feedback on best practices, drawn from the Voci metadata that feeds products from analytics companies.
“Sales is very interesting in terms of understanding what message is effective and what is the reaction emotionally on the part of the potential buyer to different messaging,” he said.
Understanding the underlying emotion and sentiment is valuable for a number of these applications, said Gadient.
Voci’s customers include analytics companies such as Clarabridge, Call Journey and EpiAnalytics, which tap into the startup’s API for metadata that can highlight issues for customers.
Biometrics for Voice
Voci is also addressing a problem that plagues automated customer service systems: caller verification. Many of these systems ask callers a handful of verification questions and then ask those same questions again if live support is required or if the call gets transferred.
Instead, Voci has developed an API for “voiceprints” that can identify people by voice, bypassing the maze of verification questions.
“Biometrics for voice is a problem worth solving, if only for our collective sanity. It enables machine verification of callers in the background instead of those maddening repeated questions you can face when handed off from operator to operator in a call center,” said Gadient.
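The article does not describe how Voci's voiceprint API works internally. As a rough illustration of the general idea, the sketch below assumes each speaker is represented by a fixed-length embedding vector and that a caller is accepted when the cosine similarity to the enrolled vector clears a threshold. The function names, the toy vectors and the 0.8 threshold are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

def verify_caller(enrolled, candidate, threshold=0.8):
    """Accept the caller if the two voiceprints are similar enough.

    The threshold is an illustrative assumption, not a value from Voci.
    """
    if len(enrolled) != len(candidate):
        raise ValueError("voiceprints must have the same length")
    return cosine_similarity(enrolled, candidate) >= threshold

# Toy embeddings; a real system would derive these from audio.
enrolled = [0.9, 0.1, 0.4]
same_speaker = [0.85, 0.15, 0.38]
other_speaker = [0.1, 0.9, -0.3]

print(verify_caller(enrolled, same_speaker))
print(verify_caller(enrolled, other_speaker))
```

Because the check runs on vectors alone, it could happen in the background during a call, which is the property Gadient highlights.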
GPU-Accelerated NLP
Voci uses a multitude of neural networks and techniques to offer its natural language processing services. The service is offered either on premises or in the cloud and taps into NVIDIA V100 Tensor Core GPUs for inference.
For example, the company uses convolutional neural networks to process audio data and recurrent neural networks for language modeling to make predictions about text.
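To make the two building blocks concrete, here is a minimal pure-Python sketch of a one-dimensional convolution over audio samples and a single-unit recurrent loop. It is not Voci's architecture: the kernel, the weights and the function names are hypothetical, and the toy feeds the convolution output into the recurrent loop only to show both operations in a few lines, whereas the article describes the recurrent networks as working on language modeling.

```python
import math

def conv1d(signal, kernel):
    """Valid-mode 1D convolution (cross-correlation, as deep learning libraries define it)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def rnn_step(x, h_prev, w_x, w_h, b):
    """One Elman-style recurrent update with scalar input and scalar hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Carry the hidden state across the sequence and return every state."""
    h = 0.0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

# Toy "audio" samples and a hand-picked difference kernel (both assumptions).
audio = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
features = conv1d(audio, [1.0, 0.0, -1.0])
states = run_rnn(features)

print(features)
print(states)
```

The convolution looks at a short window of neighboring samples at a time, while the recurrent loop carries a state forward so that each output depends on everything seen so far.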
Developers at Voci trained their networks on more than 20,000 hours of audio from customers seeking results for their businesses.
“It took approximately one month to train the neural nets on a network of machines running a combination of NVIDIA P100 and V100 GPUs,” said Gadient.
Voci is a member of NVIDIA Inception, a virtual accelerator program that helps startups get to market faster.
“Better access to GPUs and NVIDIA technology helped us train models quicker to scale our business faster when we were a scrappy startup strapped for resources,” said Gadient.