Databricks, a leading data and AI company, has recently announced a new feature that aims to improve the efficiency of large language model (LLM) inference. This feature, called Mosaic AI Model Serving, allows for simple, fast, and scalable batch processing of LLMs, making it easier for organizations to deploy these models in production environments and analyze unstructured data.
The new capability supports batch inference, meaning that many requests are processed together rather than one at a time. According to Databricks, this improves throughput and cuts the total processing time for large workloads. The feature is designed for ease of use, giving users a straightforward interface to set up and manage LLM inference jobs without extensive coding.
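The core idea of batch inference can be sketched in plain Python. This is an illustrative mock only, not the actual Mosaic AI Model Serving API: `mock_llm_call` and `batch_infer` are invented names, and a thread pool stands in for the managed, auto-scaled infrastructure Databricks describes.

```python
# Illustrative sketch: submit many prompts concurrently instead of one at a time.
# mock_llm_call simulates a single model-serving request with fixed latency.
from concurrent.futures import ThreadPoolExecutor
import time

def mock_llm_call(prompt: str) -> str:
    """Stand-in for one LLM request (simulated 50 ms of latency)."""
    time.sleep(0.05)
    return f"response to: {prompt}"

def batch_infer(prompts: list[str], max_workers: int = 8) -> list[str]:
    """Process many requests concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(mock_llm_call, prompts))

prompts = [f"summarize document {i}" for i in range(16)]

start = time.perf_counter()
results = batch_infer(prompts)
elapsed = time.perf_counter() - start

print(f"{len(results)} results in {elapsed:.2f}s")
```

With 8 workers, the 16 simulated calls complete in roughly two "rounds" (about 0.1 s) instead of the ~0.8 s a one-at-a-time loop would take, which is the throughput gain batching is meant to deliver.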
One of the key benefits of Mosaic AI Model Serving is its ability to efficiently scale with demand. This means that organizations can dynamically adjust resources based on workload, ensuring optimal performance during peak times. The feature also integrates with the Databricks platform, using existing data lakes and collaborative notebooks to enhance model training and deployment workflows.
In a recent blog post, Databricks stated, “No more exporting data as CSV files to unmanaged locations—now you can run batch inference directly within your workflows, with full governance through Unity Catalog.” This development solidifies Databricks’ position as a leader in the LLM space, addressing the growing demand for efficient AI solutions across industries.
In addition to this new feature, Databricks has also recently announced a partnership with Amazon Web Services (AWS). The five-year deal focuses on using Amazon’s Trainium AI chips, which could significantly reduce costs for businesses looking to build their GenAI apps. Databricks also acquired AI startup MosaicML last year in a $1.3 billion deal, further expanding its services to democratize AI and establish its Lakehouse as the top platform for GenAI and LLMs.
Before the acquisition, MosaicML had raised $37 million and claimed its training technology could be up to 15 times cheaper than competitors', serving clients such as AI2, Replit, and Hippocratic AI. The company also claimed that its MPT-30B, a 30-billion-parameter model, offered superior quality to GPT-3 while being more cost-effective to deploy locally. With the introduction of batch inference in Mosaic AI Model Serving, Databricks builds on that foundation to meet the growing enterprise demand for efficient, production-ready AI.