Amazon Web Services (AWS) has announced the availability of the first models in Meta’s new Llama 4 series—Llama 4 Scout 17B and Llama 4 Maverick 17B—as fully managed offerings on Amazon Bedrock. These cutting-edge multimodal models are accessible via a single API, enabling developers to easily integrate powerful AI capabilities into their applications. Designed to handle both text and image inputs, the Llama 4 models offer improved performance and reduced cost compared to their Llama 3 predecessors, while supporting a broader range of languages for global use cases.
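For illustration, here is a minimal sketch of calling Llama 4 Scout through the Amazon Bedrock Converse API with the AWS SDK for Python (boto3). The model ID and inference settings shown are assumptions; check the Amazon Bedrock console or documentation for the exact identifiers available in your Region.

```python
# Minimal sketch: invoking Llama 4 Scout on Amazon Bedrock via the Converse API.
# The model ID below is an assumption -- verify it in the Bedrock console.
import boto3

# Bedrock runtime client in a Region where the model is available
client = boto3.client("bedrock-runtime", region_name="us-east-1")

MODEL_ID = "meta.llama4-scout-17b-instruct-v1:0"  # assumed model ID

response = client.converse(
    modelId=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Summarize the trade-offs of a mixture-of-experts design."}
            ],
        }
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.5},
)

# The Converse API returns the assistant reply as a list of content blocks
print(response["output"]["message"]["content"][0]["text"])
```

Because the Converse API presents a uniform request and response shape across Bedrock models, switching between Scout and Maverick is largely a matter of changing the model ID.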
Built on a mixture-of-experts (MoE) architecture, the Llama 4 models deliver high efficiency on multimodal tasks, better compute utilization, and stronger safety measures. According to Meta, Llama 4 Scout 17B is the most powerful multimodal model in its class. It is a general-purpose model with 17 billion active parameters, 16 experts, and 109 billion total parameters. Scout extends the context length from 128K tokens in Llama 3 to an industry-leading 10 million tokens, unlocking practical applications such as multi-document summarization, reasoning over large codebases, and parsing extensive user activity for personalized tasks.
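To make the active-versus-total parameter distinction concrete, the toy sketch below (not Meta's implementation) shows how an MoE layer routes each token to a single expert, so only that expert's weights participate in the forward pass:

```python
# Toy illustration of MoE routing: a router scores all experts per token,
# but only the selected expert's parameters are used for that token.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts = 8, 16  # 16 experts, mirroring Scout's expert count

router_w = rng.normal(size=(d_model, n_experts))                 # routing weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(token: np.ndarray) -> np.ndarray:
    scores = token @ router_w        # one routing score per expert
    chosen = int(np.argmax(scores))  # top-1 routing
    # Only the chosen expert's weights are "active" for this token,
    # which is why active parameters are a fraction of total parameters.
    return token @ experts[chosen]

print(moe_forward(rng.normal(size=d_model)).shape)  # (8,)
```

This is why a 109-billion-parameter model can run a forward pass that touches only about 17 billion parameters per token.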
Llama 4 Maverick 17B, also a general-purpose model, uses 128 experts and 400 billion total parameters, with a context length of one million tokens. It offers exceptional image and text comprehension across 12 languages, making it well suited for advanced virtual assistants and multilingual chat applications.
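Because the Converse API accepts mixed content blocks, a multimodal request can pass an image alongside a text prompt in a single message. The sketch below assumes a hypothetical model ID and local image file:

```python
# Hedged sketch: sending an image plus a text prompt to Llama 4 Maverick
# via the Bedrock Converse API. Model ID and image path are assumptions.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

MODEL_ID = "meta.llama4-maverick-17b-instruct-v1:0"  # assumed model ID

with open("chart.png", "rb") as f:  # hypothetical local image
    image_bytes = f.read()

response = client.converse(
    modelId=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": [
                # Image and text blocks travel together in one user turn
                {"image": {"format": "png", "source": {"bytes": image_bytes}}},
                {"text": "Describe what this chart shows, in French."},
            ],
        }
    ],
)

print(response["output"]["message"]["content"][0]["text"])
```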
Meta’s Llama 4 models are available in Amazon Bedrock in the US East (N. Virginia) and US West (Oregon) AWS Regions, with cross-Region inference support in US East (Ohio). Developers can find further details in the launch blog, product documentation, and pricing page. To get started with Llama 4 on Amazon Bedrock, visit the Amazon Bedrock console.
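Cross-Region inference is typically used by passing an inference profile ID in place of the foundation model ID, letting Bedrock route requests across the supported Regions. The profile ID below is an assumption for illustration:

```python
# Minimal sketch, assuming a US cross-Region inference profile exists for
# Llama 4 Scout. The profile ID replaces the model ID in the same API call.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-2")  # US East (Ohio)

INFERENCE_PROFILE_ID = "us.meta.llama4-scout-17b-instruct-v1:0"  # assumed profile ID

response = client.converse(
    modelId=INFERENCE_PROFILE_ID,
    messages=[{"role": "user", "content": [{"text": "Hello from Ohio!"}]}],
)

print(response["output"]["message"]["content"][0]["text"])
```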