Mistral AI has introduced two new language models, Ministral 3B and Ministral 8B, collectively referred to as les Ministraux. The models are tailored for local inference and, according to Mistral AI, outperform other models of similar size on several large language model (LLM) benchmarks. The release comes roughly a year after Mistral's first model, Mistral 7B, which marked the company's entry into the market.

Both Ministral 3B and 8B come in base and instruct versions and support a 128k context length. Ministral 8B additionally uses an interleaved sliding-window attention mechanism, which enables faster and more memory-efficient inference. Unlike Mistral 7B, which was released under the permissive Apache 2.0 license, les Ministraux require a commercial license for self-hosted use, although a research-access route is available for the 8B model. Both models can also be accessed through Mistral AI's API, broadening their usability.
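The interleaved sliding-window attention mentioned above can be illustrated with a small sketch. Mistral has not published the exact window size or layer pattern in the material covered here, so the values below (a small window, even layers local, odd layers full) are assumptions chosen purely for illustration:

```python
import numpy as np

def causal_mask(seq_len: int) -> np.ndarray:
    """Full causal mask: each token attends to itself and all earlier tokens."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Causal mask restricted to the most recent `window` tokens."""
    i = np.arange(seq_len)[:, None]  # query positions
    j = np.arange(seq_len)[None, :]  # key positions
    return (j <= i) & (j > i - window)

def mask_for_layer(layer: int, seq_len: int, window: int) -> np.ndarray:
    """Interleave the two mask types across layers (one plausible scheme:
    even layers use the sliding window, odd layers use full causal attention)."""
    if layer % 2 == 0:
        return sliding_window_mask(seq_len, window)
    return causal_mask(seq_len)
```

The point of the interleaving is cost: a sliding-window layer attends to at most `window` keys per query, scaling linearly with sequence length rather than quadratically, while the interleaved full-attention layers preserve long-range information flow.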

Mistral AI stated, "Our most innovative customers and partners have increasingly been asking for local, privacy-first inference for critical applications such as on-device translation, internet-less smart assistants, local analytics, and autonomous robotics." The company designed les Ministraux to provide compute-efficient, low-latency solutions for such applications. When paired with larger models such as Mistral Large, les Ministraux are also positioned as efficient intermediaries, handling input parsing, task routing, and API calls based on user intent, while minimizing latency and cost.
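The intermediary pattern described above — a small model parsing user intent, then dispatching to a larger model or an external API — can be sketched as follows. The intent labels and handler functions here are hypothetical; in a real deployment, `classify_intent` would itself be a call to a small model such as Ministral 3B, and the handlers would invoke model APIs or tools:

```python
from typing import Callable, Dict

# Hypothetical handlers; real ones would call a model endpoint or a tool API.
def call_large_model(query: str) -> str:
    return f"[large model answers] {query}"

def call_weather_tool(query: str) -> str:
    return f"[weather tool invoked for] {query}"

def classify_intent(query: str) -> str:
    """Stand-in for the small model's intent classification step.
    A real router would prompt the small model and parse its label."""
    if "weather" in query.lower():
        return "tool:weather"
    return "general"

ROUTES: Dict[str, Callable[[str], str]] = {
    "tool:weather": call_weather_tool,
    "general": call_large_model,
}

def route(query: str) -> str:
    """Dispatch the user query to a handler based on classified intent."""
    return ROUTES[classify_intent(query)](query)
```

The latency and cost savings come from running the cheap classification step on every request and reserving the expensive large-model call for the queries that actually need it.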

Since the launch of Mistral 7B, the company has released several other specialized models, mostly under the Apache 2.0 license. Earlier this year, InfoQ reported on Mixtral 8x7B, a sparse mixture of experts (SMoE) LLM that competes closely with larger models such as Llama 2 70B and GPT-3.5. Other releases include Codestral, Mistral AI's first code-focused model, as well as three open-weight models: Mistral NeMo, a 12B-parameter general-purpose LLM; Codestral Mamba, a 7B-parameter code-generation model; and Mathstral, a 7B-parameter model fine-tuned for mathematical reasoning.

Mistral AI claims that les Ministraux "consistently outperform their peers," backed by performance metrics on benchmarks such as MMLU, Winogrande, and GSM8k. The reported results indicate that Ministral 3B surpasses Llama 3.2 3B and Gemma 2 2B, while Ministral 8B outperforms Llama 3.1 8B and even Mistral 7B itself. Evaluations by the independent site Artificial Analysis corroborate these findings, suggesting that both the 3B and 8B models compare favorably with their peers, especially on the HumanEval coding benchmark, while also delivering faster inference than comparably sized models.

Public discussion of les Ministraux, particularly on platforms such as Hacker News, showed mixed reactions to the requirement of a commercial license for local hosting. Some users pointed to the API as a viable alternative, noting that Mistral AI is one of the few LLM API providers compliant with GDPR regulations in Europe. Observers also questioned whether Mistral AI can effectively compete with larger players such as Meta. Lee Harris, head of R&D at Rev.AI, voiced concerns about Mistral's competitive standing, suggesting the company may need to improve its API offering: "I think for Mistral to compete with Meta they need a better API."

The model weights for Ministral 8B Instruct are available for download on Hugging Face for research use, allowing the academic and developer communities to explore the model further. The release underscores the growing demand for compact, efficient models suited to on-device and privacy-sensitive applications.

Source: Noah Wire Services