Friday, September 29, 2023

Mistral AI's First Large Language Model Is Now Available to Everyone


While most widely used language models are accessible only via API, open models are gaining ground. French AI startup Mistral, which raised a sizable seed round in June, has just released its first model, which it claims outperforms others of its size, and it is free to use without restrictions.

The Mistral 7B model is available for download by various means, including a 13.4-gigabyte torrent (currently with a few hundred seeders). The company has also started a GitHub repository and a Discord channel for collaboration and troubleshooting.
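For readers who want to try the model locally, here is a minimal sketch of loading it for inference with the Hugging Face transformers library. This is an assumption on my part: the article only points to the torrent, the GitHub repository, and Discord, so the Hub ID mistralai/Mistral-7B-v0.1 and the transformers route are illustrative rather than Mistral's documented path, and in half precision the weights alone occupy roughly 14 GB of GPU or CPU memory.

```python
# Minimal sketch: local inference with Mistral 7B via Hugging Face transformers.
# Assumption: the weights are mirrored on the Hugging Face Hub under the ID
# "mistralai/Mistral-7B-v0.1"; the article itself only mentions the torrent,
# the GitHub repository, and Discord.
# Requires: pip install torch transformers accelerate
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed Hub ID, for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: ~14 GB of weights in memory
    device_map="auto",          # spread layers across available GPU(s)/CPU
)

# Generate a short completion to confirm the model loaded correctly.
prompt = "Open-weight language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```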

Crucially, the model has been released under the Apache 2.0 license, a highly permissive license with no restrictions on use or reproduction beyond attribution. That means the model could be used by a hobbyist, a multinational corporation, or even a government entity like the Pentagon, as long as they have the local computing infrastructure to run it or are willing to pay for the necessary cloud resources.

Mistral 7B is a step up from other 'small' large language models like Llama 2, offering similar capabilities (according to certain standard benchmarks) at a considerably lower compute cost. Foundation models like GPT-4 can do far more, but they are far more expensive and complex to run, which is why they are made available primarily through APIs and remote access.

In a blog post accompanying the release, Mistral's team wrote that it aspires to become a leading supporter of the open generative AI community and to bring open models to state-of-the-art performance. Mistral 7B's performance, they say, demonstrates what small models can do with enough conviction; getting there took three months of intense work, during which they assembled the Mistral AI team, rebuilt a top-performance MLOps stack, and designed a sophisticated data processing pipeline from scratch.

That list of tasks may sound like more than three months of work for most people, but the founders had a head start: they built similar models during their time at Meta and Google DeepMind. That experience doesn't necessarily make the work easier, but it did give them a thorough understanding of the task ahead.

Of course, while the model is free for anyone to download and use, that doesn't make it "open source" or some variant thereof, as we discussed at length last week at Disrupt. Although the license is exceedingly permissive, the model was developed privately, with private money, and its training datasets and methodology remain undisclosed.

This appears to be the heart of Mistral's business model: the core model is free to use, while those who want deeper integration will pay for its premium offerings. As the blog post puts it, '[Our commercial offering] will be distributed as white-box solutions, making both model weights and source code available. We are also hard at work developing hosted solutions and dedicated deployments for enterprises.'

I've reached out to Mistral with further questions about its open approach and its release plans, and will update this post if I hear back.
