Today, we are thrilled to announce that the Mixtral-8x22B large language model (LLM) from Mistral AI is available for customers through Amazon SageMaker JumpStart to deploy with one click for running inference. You can try out this model with SageMaker JumpStart, a machine learning (ML) hub that provides access to algorithms and models so you can quickly get started with ML. In this post, we walk through how to discover and deploy the Mixtral-8x22B model.
What is Mixtral 8x22B?
Mixtral 8x22B is Mistral AI’s latest open-weights model and sets a new standard for performance and efficiency among available foundation models. It is a sparse mixture-of-experts (SMoE) model that activates only 39 billion of its 141 billion parameters for any given token, offering cost-efficiency for its size. Released under the Apache 2.0 license, Mixtral 8x22B promotes innovation and collaboration by making the model weights freely available for exploration, testing, and deployment. It is a strong choice for customers who want the quality of a mid-sized model while maintaining high throughput.
Mixtral 8x22B offers the following advantages:
– Multilingual capabilities in English, French, Italian, German, and Spanish
– Strong mathematics and coding capabilities
– Function calling for application development and tech stack modernization at scale
– 64,000-token context window for precise information recall from large documents
About Mistral AI:
Mistral AI is a Paris-based company founded by experienced researchers from Meta and Google DeepMind, with a track record of developing cutting-edge ML technology at major research labs. The company focuses on compact foundation models that deliver strong performance, and it is committed to open model development. Mixtral 8x22B is part of Mistral AI’s family of publicly available models, designed to deliver cost-efficient performance.
What is SageMaker JumpStart?
SageMaker JumpStart gives ML practitioners access to a broad selection of best-performing foundation models. You can deploy these models to dedicated Amazon SageMaker instances within a network-isolated environment and customize them with SageMaker for model training and deployment. With SageMaker JumpStart, you can discover and deploy Mixtral-8x22B in a few clicks in Amazon SageMaker Studio or programmatically through the SageMaker Python SDK. The model is deployed in an AWS secure environment and under your VPC controls, helping to support data security and compliance.
Discovering models:
You can access Mixtral-8x22B foundation models through SageMaker JumpStart in the SageMaker Studio UI and the SageMaker Python SDK. In SageMaker Studio, navigate to JumpStart in the navigation pane to search for models like Mixtral 8x22B. You can view details about the model, its license, training data, and deployment process. The Deploy button allows you to create an endpoint for the model.
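You can also search the catalog programmatically with the SageMaker Python SDK. The following is a minimal sketch that lists the available model IDs and filters for Mixtral; the exact Mixtral-8x22B model IDs may change over time, so filtering the returned list is safer than hard-coding one:

```python
# Requires the SageMaker Python SDK: pip install sagemaker
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models

# List every model ID in the JumpStart catalog, then keep the Mixtral entries.
all_models = list_jumpstart_models()
mixtral_models = [m for m in all_models if "mixtral" in m]
print(mixtral_models)
```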
Deploying a model:
To deploy the model, choose Deploy and wait until the endpoint is in service. You can then test it with sample inference requests in the Studio UI or invoke it through the SDK. SageMaker provides seamless logging, monitoring, and auditing for deployed models, giving you visibility into resource utilization and API calls.
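If you prefer to deploy programmatically, the SDK’s JumpStartModel class wraps the same flow. The sketch below assumes the model ID huggingface-llm-mixtral-8x22b-instruct (confirm the exact ID with the discovery snippet above) and lets JumpStart choose its default instance type; sufficient GPU capacity must be available in your account:

```python
from sagemaker.jumpstart.model import JumpStartModel

# The model ID is an assumption; confirm it in the JumpStart catalog first.
model = JumpStartModel(model_id="huggingface-llm-mixtral-8x22b-instruct")

# deploy() provisions an endpoint on the model's default instance type;
# pass instance_type=... to override it.
predictor = model.deploy()

# Send a sample inference request to the new endpoint.
payload = {
    "inputs": "Write a haiku about machine learning.",
    "parameters": {"max_new_tokens": 64, "temperature": 0.7},
}
print(predictor.predict(payload))

# Delete the endpoint when you are done to stop incurring charges.
# predictor.delete_model()
# predictor.delete_endpoint()
```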
Example prompts:
You can interact with the Mixtral-8x22B model using natural language prompts for text generation. The Instruct version of Mixtral-8x22B expects a formatted prompt in which conversation roles alternate between user instructions and model answers, ending with the user’s latest instruction. The following sketch shows one way to build such a prompt.
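This is a minimal helper based on the Mistral instruct convention ([INST] ... [/INST] tags, with answers closed by an end-of-sequence token); the exact template can vary between model versions, so verify it against the model card for the Mixtral-8x22B Instruct model you deploy:

```python
def format_instructions(instructions):
    """Build a Mixtral-8x22B Instruct prompt from alternating
    user/assistant turns, ending with a user turn."""
    prompt = ["<s>"]
    for turn in instructions:
        if turn["role"] == "user":
            prompt.append(f"[INST] {turn['content'].strip()} [/INST]")
        else:
            # Assistant answers are closed with an end-of-sequence token.
            prompt.append(f" {turn['content'].strip()}</s>")
    return "".join(prompt)

# Example: a short conversation ending with a new user question.
prompt = format_instructions([
    {"role": "user", "content": "What is a mixture-of-experts model?"},
    {"role": "assistant", "content": "A model that routes each token to a few expert subnetworks."},
    {"role": "user", "content": "Why does that improve efficiency?"},
])
payload = {"inputs": prompt, "parameters": {"max_new_tokens": 128}}
# predictor.predict(payload)  # using the predictor from the deployment step
```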
In conclusion, Mistral AI’s Mixtral-8x22B model brings high performance and efficiency to your ML projects, and SageMaker JumpStart makes deploying and testing the model simple and secure.