Enterprises Build LLMs for Indian Languages With NVIDIA AI

Namaste, vanakkam, sat sri akaal — these are just three forms of greeting in India, a country with 22 constitutionally recognized languages and over 1,500 more recorded by the country’s census. Around 10% of its residents speak English, the internet’s most common language.

As India, the world’s most populous country, forges ahead with rapid digitalization efforts, its enterprises and local startups are developing multilingual AI models that enable more Indians to interact with technology in their primary language. It’s a case study in sovereign AI — the development of domestic AI infrastructure that is built on local datasets and reflects a region’s specific dialects, cultures and practices.

These projects are building language models for Indic languages and English that can power customer service AI agents for businesses, rapidly translate content to broaden access to information, and enable services to more easily reach a diverse population of over 1.4 billion individuals.

To support initiatives like these, NVIDIA has released a small language model for Hindi, India’s most prevalent language with over half a billion speakers. Now available as an NVIDIA NIM microservice, the model, dubbed Nemotron-4-Mini-Hindi-4B, can be easily deployed on any NVIDIA GPU-accelerated system for optimized performance.

Tech Mahindra, an Indian IT services and consulting company, is the first to use the Nemotron Hindi NIM microservice to develop an AI model called Indus 2.0, which is focused on Hindi and dozens of its dialects. Indus 2.0 harnesses Tech Mahindra’s high-quality fine-tuning data to further boost model accuracy, unlocking opportunities for clients in banking, education, healthcare and other industries to deliver localized services.

Tech Mahindra will showcase Indus 2.0 at the NVIDIA AI Summit, taking place Oct. 23-25 in Mumbai. The company also uses NVIDIA NeMo to develop its sovereign large language model (LLM) platform, TeNo.

NVIDIA NIM Makes AI Adoption for Hindi as Easy as Ek, Do, Teen

The Nemotron Hindi model has 4 billion parameters and is derived from Nemotron-4 15B, a 15-billion parameter multilingual language model developed by NVIDIA. The model was pruned, distilled and trained with a combination of real-world Hindi data, synthetic Hindi data and an equal amount of English data using NVIDIA NeMo, an end-to-end, cloud-native framework and suite of microservices for developing generative AI.

The dataset was created with NVIDIA NeMo Curator, which improves generative AI model accuracy by processing high-quality multimodal data at scale for training and customization. NeMo Curator uses NVIDIA RAPIDS libraries to accelerate data processing pipelines on multi-node GPU systems, lowering processing time and total cost of ownership. It also provides pre-built pipelines and building blocks for synthetic data generation, data filtering, classification and deduplication to process high-quality data.

After fine-tuning with NeMo, the final model leads on multiple accuracy benchmarks for AI models with up to 8 billion parameters. Packaged as a NIM microservice, it can be easily harnessed to support use cases across industries such as education, retail and healthcare.

It’s available as part of the NVIDIA AI Enterprise software platform, which gives businesses access to additional resources, including technical support and enterprise-grade security, to streamline AI development for production environments.

Bevy of Businesses Serves Multilingual Population

Innovators, major enterprises and global systems integrators across India are building customized language models using NVIDIA NeMo.

Companies in the NVIDIA Inception program for cutting-edge startups are using NeMo to develop AI models for several Indic languages.

Sarvam AI offers enterprise customers speech-to-text, text-to-speech, translation and data parsing models. The company developed Sarvam 1, India’s first homegrown, multilingual LLM, which was trained from scratch on domestic AI infrastructure powered by NVIDIA H100 Tensor Core GPUs.

Sarvam 1 — developed using NVIDIA AI Enterprise software including NeMo Curator and NeMo Framework — supports English and 10 major Indian languages, including Bengali, Marathi, Tamil and Telugu.

Sarvam AI also uses NVIDIA NIM microservices, NVIDIA Riva for conversational AI, NVIDIA TensorRT-LLM software and NVIDIA Triton Inference Server to optimize and deploy conversational AI agents with sub-second latency.

Another Inception startup, Gnani.ai, built a multilingual speech-to-speech LLM that powers AI customer service assistants that handle around 10 million real-time voice interactions daily for over 150 banking, insurance and financial services companies across India and the U.S. The model supports 14 languages and was trained on over 14 million hours of conversational speech data using NVIDIA Hopper GPUs and NeMo Framework.

Gnani.ai uses TensorRT-LLM, Triton Inference Server and Riva NIM microservices to optimize its AI for virtual customer service assistants and speech analytics.

Large enterprises building LLMs with NeMo include:

Flipkart, a major Indian ecommerce company majority-owned by Walmart, is integrating NeMo Guardrails, an open-source toolkit that enables developers to add programmable guardrails to LLMs, to enhance the safety of its conversational AI systems.
Krutrim, part of the Ola Group of businesses that includes one of India’s top ride-booking platforms, is developing a multilingual Indic foundation model using Mistral NeMo 12B, a state-of-the-art LLM developed by Mistral AI and NVIDIA.
Zoho Corporation, a global technology company based in Chennai, will use NVIDIA TensorRT-LLM and NVIDIA Triton Inference Server to optimize and deliver language models for its over 700,000 customers. The company will use NeMo running on NVIDIA Hopper GPUs to pretrain narrow, small, medium and large models from scratch for over 100 business applications.

India’s top global systems integrators are also offering NVIDIA NeMo-accelerated solutions to their customers.

Infosys will work on specific tools and solutions using the NVIDIA AI stack. The company’s center of excellence is also developing AI-powered small language models that will be offered to customers as a service.
Tata Consultancy Services has developed AI solutions based on NVIDIA NIM Agent Blueprints for the telecommunications, retail, manufacturing, automotive and financial services industries. TCS’ offerings include NeMo-powered, domain-specific language models that can be customized to address customer queries and answer company-specific questions for employees for all enterprise functions such as IT, HR or field operations.
Wipro is using NVIDIA AI Enterprise software including NIM Agent Blueprints and NeMo to help businesses easily develop custom conversational AI solutions such as digital humans to support customer service interactions.

Wipro and TCS also use NeMo Curator’s synthetic data generation pipelines to generate data in languages other than English to customize LLMs for their clients.

To learn more about NVIDIA’s collaboration with businesses and developers in India, watch the replay of company founder and CEO Jensen Huang’s fireside chat at the NVIDIA AI Summit.