Speaking of AI: Startup Empowers Indian Language Speakers with Deep Learning

by Isha Salian

A flood of new smartphone users will come online in the next couple years — and many don’t speak or read a word of English, the internet’s most common language.

To make web adoption smoother for hundreds of millions of these new users, one Bangalore-based startup is building AI speech tools for 10 different languages spoken in India. India will have more than 600 million smartphone owners by 2020, but the country has just 125 million English speakers — most of whom speak it as a second language.

“While internet adoption is increasing in India, there’s still a gap in the market for users who don’t know how to read and write English,” said Ananth Nagaraj, co-founder of Gnani.ai, a member of the NVIDIA Inception program. “Even if something is written in their own language, it may not necessarily be easy for every user to read. We can empower those customers to interact with voice in their native language.”

India’s linguistic diversity presents a challenge for government agencies and private companies trying to communicate with the country’s 1.37 billion people. The country has 22 major languages and around 100 other languages that each have 10,000 or more speakers.

AI speech engine tools that process multiple languages can facilitate conversation by serving as a voice assistant, fielding customer service calls or conducting voice-based transactions.

Gnani.ai provides APIs and voice assistant solutions to e-commerce enterprises, insurance companies, and banking and finance firms. Developed using cloud-based NVIDIA GPUs, its tools support languages spoken across the entire subcontinent: Indian English, Hindi, Bengali, Gujarati, Kannada, Malayalam, Marathi, Punjabi, Tamil and Telugu.

Now AI’s Speaking My Language 

Although the linguistic makeup of online content has shifted from 80 percent English in the 1990s to just over 25 percent English today, there’s still a dearth of user-friendly interfaces for Indian language speakers.

Even Indians who speak English as a second language often prefer to consume online content in their native language. But keyboards on computers and mobile devices largely default to the QWERTY keyboard layout, making it slower to type in Indian scripts like Devanagari, used for several languages including Hindi — which is spoken by half a billion people.

Local governments in India have to publish every communication in English and the official language of a given state. Gnani.ai’s voice-to-text tools could speed up this process by up to 4x, Nagaraj said.

The startup’s voice assistant software can integrate with a business’ mobile apps and websites, or be used as an interactive voicebot on customer service telephone lines.

Gnani.ai has collected more than 50,000 hours of annotated audio data to build its AI models. The startup develops its algorithms on Amazon EC2 P3 instances powered by NVIDIA V100 Tensor Core GPUs, accelerating the training process up to 20x compared to using CPUs.

The company chose AWS cloud-based GPUs because they made it easier to spin up multiple clusters at once for training on large-scale data, Nagaraj said. Gnani.ai uses CUDA matrix libraries and NVIDIA’s automatic mixed precision feature for TensorFlow, which is designed to speed up neural network training by up to 3x.
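
For readers curious about what mixed precision involves, here is a minimal sketch of one idea behind it: loss scaling. Training in 16-bit floating point is faster, but very small gradient values can round to zero in that format. Scaling them up before the 16-bit step and back down afterward keeps them alive. This is an illustration only. It is not Gnani.ai’s training code and does not call the TensorFlow feature; the names to_float16 and LOSS_SCALE and the example gradient value are assumptions made up for the demonstration, using only Python’s standard library.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Hypothetical helper for this sketch: it round-trips the number through
    the standard library's 16-bit float format ("e").
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Assumed values for illustration only.
LOSS_SCALE = 1024.0    # a power of two, so scaling itself adds no rounding error
tiny_gradient = 1e-8   # smaller than anything float16 can represent

# Without scaling, the gradient underflows to zero in 16-bit precision.
unscaled = to_float16(tiny_gradient)

# With loss scaling, the value is multiplied up before the 16-bit step,
# then divided back down in higher precision afterward.
scaled = to_float16(tiny_gradient * LOSS_SCALE)
recovered = scaled / LOSS_SCALE

print("float16 without scaling:", unscaled)
print("recovered with scaling: ", recovered)
```

In an actual training run the framework manages the scale factor and decides which operations run in 16-bit precision, which is the “automatic” part of the feature.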

Starting the Conversation 

Nagaraj said the team believes that AI voice assistants can make the customer support experience more efficient and personalized. With multilingual bots, enterprises can tailor service to each customer and free human agents to devote more time to complex queries from callers.

Bank clients incorporating Gnani.ai’s software could allow the automated system to help customers access their account statements or freeze a credit card, while passing more detailed processes on to staffers. The voice assistant could even reach out to insurance customers in their preferred language to coordinate policy payments, help elderly clients book taxis or provide farmers with pricing information for their crops.

As a Bangalore-based company, Nagaraj said, “we have a significantly higher accuracy compared to some of the global providers because we understand the nuances of the languages and dialects of a diverse country. That helps us tune our AI algorithms to perform better for this market.”

Since its founding in 2016, Gnani.ai has piloted or deployed voice assistant solutions with more than 20 large enterprises in India. The company — which recently received funding from Samsung’s investment arm — plans to expand its call center automation AI tools to other countries, including the United States, in 2020.