NVIDIA has introduced Nemotron-4-Mini-Hindi-4B, a compact language model designed specifically for Hindi. The model is available as an NVIDIA NIM microservice and can be deployed on NVIDIA GPU-accelerated systems for optimized performance. The first company to adopt it is Tech Mahindra, which has built Indus 2.0, a platform focused on Hindi and its dialects.
The Nemotron Hindi model has 4 billion parameters and is derived from Nemotron-4, a 15-billion-parameter multilingual model. It was trained on a mix of real-world Hindi data and synthetic data, along with English data. After fine-tuning with NVIDIA NeMo, it achieved top scores on multiple accuracy benchmarks among AI models with up to 8 billion parameters. Packaged as a microservice, it can be used in industries such as education and healthcare.
In India, many innovators and enterprises are using NVIDIA NeMo to develop customized language models. For instance, Sarvam AI has created Sarvam 1, the country’s first multilingual language model, which was trained on NVIDIA H100 GPUs and supports English and ten major Indian languages. Gnani.ai has developed a multilingual speech-to-speech model that serves as an AI customer service assistant, handling approximately 10 million real-time interactions daily.
Large enterprises are also using NeMo for their language models. For example, Flipkart has integrated NeMo Guardrails into its conversational AI systems to enhance safety. Krutrim is building a multilingual foundation model using Mistral NeMo 12B. Zoho Corporation plans to use NVIDIA TensorRT-LLM and Triton Inference Server for its language models. Additionally, Tata Consultancy Services (TCS) and Wipro are offering NVIDIA NeMo-accelerated solutions across various industries. TCS is creating domain-specific language models for telecommunications, retail, and financial services, while Wipro is developing custom conversational AI solutions for customer service interactions.