Earlier this year, Meta released the Llama 3.1 405B model with an updated license that allows developers to use outputs from Llama models to improve other models. The change was particularly beneficial for developers and AI startups in India building Indic LLMs. Yann LeCun, Meta’s chief AI scientist, acknowledged the issues developers had previously faced when using Llama to create new models. Speaking at Meta’s Build with AI Summit in Bengaluru, LeCun said the company had taken note of these concerns and addressed them in the newer version of Llama.
During a fireside chat with Nandan Nilekani, the influential Indian entrepreneur and co-founder of Infosys, LeCun said the change would make it easier for Indian AI startups to use LLMs and that India did not need to build LLMs from scratch. Nilekani added that India should focus on becoming the use-case capital of the world and on building small models quickly. India, he said, could use Llama to create synthetic data, build small language models, and train them on appropriate data. Nilekani has also invested in AI4Bharat, a research lab dedicated to creating open-source datasets, tools, models, and applications for Indian languages.
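The workflow Nilekani outlines can be sketched in a few lines. This is an illustrative pipeline only: the teacher-model call is abstracted as a plain `generate_fn` callable (running Llama 3.1 405B is infeasible here), and all names below (`generate_fn`, `seed_prompts`, `generate_synthetic_pairs`) are hypothetical, not an official Meta or Sarvam API.

```python
from typing import Callable, Iterable


def generate_synthetic_pairs(
    generate_fn: Callable[[str], str],
    seed_prompts: Iterable[str],
) -> list[dict]:
    """Turn seed prompts into (instruction, response) training records.

    `generate_fn` stands in for a large teacher model such as
    Llama 3.1 405B, whose license now permits using its outputs to
    train other models. The prompt template is an arbitrary example.
    """
    records = []
    for prompt in seed_prompts:
        instruction = f"Translate to Hindi and answer briefly: {prompt}"
        response = generate_fn(instruction)  # teacher-model call
        # Keep only non-empty generations; a real pipeline would also
        # add deduplication, language-ID filtering, and quality scoring.
        if response.strip():
            records.append({"instruction": instruction, "response": response})
    return records


# Usage with a stub in place of the real teacher model:
stub = lambda text: "उदाहरण उत्तर"  # placeholder Hindi output
data = generate_synthetic_pairs(stub, ["What is the capital of India?"])
print(len(data))  # 1
```

The resulting instruction/response records could then feed supervised fine-tuning of a small Indic model, which is the pattern Sarvam describes for its own models.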
At Cypher 2024, Vivek Raghavan, the chief of Sarvam AI, revealed that the company used Llama 3.1 405B to build Sarvam 2B, a 2-billion-parameter model trained on 4 trillion tokens, of which 2 trillion are Indian-language tokens. Sarvam 2B belongs to a class of small language models (SLMs) that includes Microsoft’s Phi series, Llama 3 8B, and Google’s Gemma models. Raghavan explained that the model is a viable alternative to large models from OpenAI and Anthropic, while being more efficient for specific use cases. He also mentioned that the company recently launched its latest model, Sarvam-1, which outperforms Google’s Gemma-2 and Llama 3.2 on Indic tasks, attributing this success to the model’s secret sauce: its 2 trillion Indian-language tokens.
In conclusion, the release of Llama 3.1 405B has been a significant development for Indian AI startups, letting them use LLMs more easily and efficiently. Sarvam AI demonstrates this: it has used Llama to build models that outperform larger models from big players in the industry. With the backing of influential figures like Nandan Nilekani and the availability of open-source datasets and tools, India is well on its way to becoming a leader in the use of small language models for Indian languages.