OpenAI has just launched something huge, and it’s free, open, and incredibly powerful. Say hello to gpt-oss-120b and gpt-oss-20b, two open-weight models that bring near-state-of-the-art performance into your hands, all under the highly permissive Apache 2.0 license. Whether you’re a solo developer looking to build something amazing or a large enterprise needing full control over your AI stack, this release is for you.
Smarter Models, Open Weights
These aren’t just any open models. gpt-oss-120b achieves near-parity with OpenAI’s o4-mini on reasoning benchmarks while running efficiently on a single 80 GB GPU. Meanwhile, gpt-oss-20b hits benchmarks similar to o3-mini and runs in just 16 GB of memory, making it ideal for on-device use or cost-sensitive environments. Both were trained with reinforcement learning techniques informed by OpenAI’s latest proprietary systems, including o3 and other frontier models.
Built for Developers: Tool Use, CoT, Structured Output
Both models shine in agentic workflows: they support full Chain-of-Thought (CoT) reasoning, few-shot function calling, and structured outputs. You can easily adjust the reasoning effort (low, medium, high) to trade off performance and latency depending on your task. Web browsing, Python code execution, and tool chaining are all built in.
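If you serve the weights behind an OpenAI-compatible endpoint, the effort level is set in the system message using the models’ harmony chat format. A minimal sketch, assuming a local Ollama server at its default port; the base URL and model tag are assumptions to adapt to your own deployment:

```python
# Steering gpt-oss reasoning effort from the system prompt, via any
# OpenAI-compatible endpoint. The base_url below assumes a local Ollama
# server; swap in your own host and model name as needed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

response = client.chat.completions.create(
    model="gpt-oss:20b",
    messages=[
        # The harmony chat format reads the effort level (low/medium/high) here.
        {"role": "system", "content": "Reasoning: high"},
        {"role": "user", "content": "Outline a 3-step plan to profile a slow SQL query."},
    ],
)
print(response.choices[0].message.content)
```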
Under the Hood: Sparse and Efficient
Here’s what makes these models tick:
| Model | Layers | Total Params | Active Params/Token | Experts/Layer | Active Experts | Context Length |
|---|---|---|---|---|---|---|
| gpt-oss-120b | 36 | 117B | 5.1B | 128 | 4 | 128k |
| gpt-oss-20b | 24 | 21B | 3.6B | 32 | 4 | 128k |
They use Mixture-of-Experts (MoE), Rotary Positional Embeddings (RoPE), and Grouped Multi-Query Attention for memory efficiency and scaling. Tokenization is done via a new open-source tokenizer: o200k_harmony.
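For a quick feel of the tokenizer, here’s a minimal sketch, assuming your installed tiktoken release is recent enough to ship the o200k_harmony encoding:

```python
# Tokenizing text with o200k_harmony via tiktoken (pip install tiktoken).
# Assumes a tiktoken version that includes this encoding.
import tiktoken

enc = tiktoken.get_encoding("o200k_harmony")
tokens = enc.encode("Mixture-of-Experts keeps most weights asleep per token.")
print(len(tokens), tokens[:8])   # token count and a peek at the IDs
print(enc.decode(tokens))        # round-trips back to the original string
```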
Training & Post-Training Like the Pros
Just like OpenAI’s best internal models, these were trained on a curated, mostly English text dataset rich in STEM, code, and general knowledge. Post-training included supervised fine-tuning and high-compute reinforcement learning, aligning them with the OpenAI Model Spec. That means these models don’t just sound smart—they think through problems before answering.
Real-World Performance: Benchmarks That Matter
Let’s talk numbers. Across coding, health, math, and reasoning tasks:
- gpt-oss-120b outperforms o3-mini, and often matches or beats o4-mini.
- gpt-oss-20b, despite its size, competes toe-to-toe with o3-mini and excels in math and health use cases.
This includes tests like MMLU, AIME, HealthBench, and Codeforces—and yes, they perform well with and without tools.
Safety First—Even for Open Models
Safety isn’t an afterthought. These models were built with:
- CBRN-filtered pretraining
- Deliberative alignment
- Instructional refusal tuning
Plus, OpenAI went further by adversarially fine-tuning the models to simulate real-world misuse in domains like cybersecurity and bioengineering. The findings? Even with aggressive fine-tuning, the models remained within acceptable safety margins according to the Preparedness Framework.
Red Teaming Challenge: $500K Up for Grabs
To crowdsource safety research, OpenAI is launching a Red Teaming Challenge with a whopping $500,000 prize pool. If you can discover vulnerabilities or novel safety issues, you can contribute to safer AI and get rewarded for it.
Deployment-Ready, Anywhere
These models are designed to run anywhere:
- Locally on your laptop or GPU
- In the cloud via Hugging Face, Azure, AWS, Cloudflare, and more
- On Windows via ONNX Runtime and VS Code
Reference implementations are available in PyTorch, Rust, and Apple Metal. You can also use the provided Harmony renderer to adapt to the model’s prompt format.
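As a hedged sketch of what the Harmony renderer looks like in Python, using the openai-harmony package (`pip install openai-harmony`); the exact API below follows the package README and may evolve:

```python
# Rendering a conversation into the token format gpt-oss expects.
from openai_harmony import (
    HarmonyEncodingName,
    load_harmony_encoding,
    Conversation,
    Message,
    Role,
)

encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)

convo = Conversation.from_messages([
    Message.from_role_and_content(Role.USER, "What is the capital of France?"),
])

# Token IDs to feed to your own inference stack; generation should stop
# at the encoding's assistant stop tokens.
tokens = encoding.render_conversation_for_completion(convo, Role.ASSISTANT)
print(tokens[:16])
```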
Why This Matters: Lowering Barriers, Driving Innovation
This release democratizes access to top-tier AI. It brings enterprise-grade reasoning and tool use to developers in underserved markets, research institutions, and startups. It’s a step toward a more open, equitable, and safe AI future.
Whether you’re fine-tuning for local use, deploying for a customer, or just tinkering with edge AI, gpt-oss puts powerful tools in your hands, without vendor lock-in.
Ollama Partners with OpenAI to Launch gpt-oss 20B & 120B Models
In a groundbreaking collaboration, Ollama and OpenAI have joined forces to bring state-of-the-art open-weight models directly to your local machine. Say hello to gpt-oss-20b and gpt-oss-120b, designed for blazing-fast, local AI experiences without compromising on reasoning power, developer flexibility, or agentic capabilities.
Whether you’re a developer building intelligent agents, a researcher exploring complex reasoning tasks, or a business looking for secure on-prem AI, this update is a game-changer.
Feature Highlights
Agentic Capabilities Out of the Box
Both models come pre-equipped for advanced function-calling, structured outputs, Python tool use, and even optional web browsing through Ollama’s built-in web search. This means you can augment the models with real-time information for highly dynamic tasks.
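A minimal function-calling sketch with the ollama Python client (`pip install ollama`); `get_weather` is a toy tool of our own, and passing plain Python functions as tools assumes a recent client release:

```python
# Function calling with the ollama Python client against gpt-oss.
import ollama

def get_weather(city: str) -> str:
    """Toy stand-in tool: return a canned weather report for a city."""
    return f"Sunny and 22°C in {city}"

response = ollama.chat(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[get_weather],  # recent ollama clients accept plain Python functions
)

# If the model decided to call the tool, run it and inspect the call.
for call in response.message.tool_calls or []:
    result = get_weather(**call.function.arguments)
    print(call.function.name, "->", result)
```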
Transparent Chain-of-Thought Reasoning
You get full visibility into the models’ reasoning process. This not only boosts debuggability and trust but also enables deeper insights for researchers and developers alike.
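A hedged sketch of reading that reasoning trace with the ollama Python client; the `think` flag and `message.thinking` field follow recent ollama releases and are assumptions for older versions:

```python
# Surfacing the model's chain of thought alongside its final answer.
import ollama

response = ollama.chat(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "Is 1001 prime?"}],
    think=True,  # ask the server to return the reasoning trace separately
)

print("thinking:", response.message.thinking)  # the model's reasoning trace
print("answer:  ", response.message.content)   # the user-facing reply
```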
Adjustable Reasoning Effort
Optimize for latency or depth: you can configure the reasoning effort (low, medium, high) depending on your use case. Whether you’re building a lightweight assistant or a deep-thinking agent, you’re in control.
Fine-Tuning Flexibility
These models are fine-tunable, allowing you to tailor them for domain-specific tasks with ease, from customer support bots to scientific research tools.
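As one possible route (not the only one), a hedged LoRA fine-tuning sketch using the Hugging Face TRL and PEFT libraries with the openai/gpt-oss-20b checkpoint on the Hub; the dataset file and hyperparameters are placeholders to adapt:

```python
# LoRA supervised fine-tuning sketch with TRL + PEFT
# (pip install trl peft datasets transformers).
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: a JSONL file with a "messages" (chat) or "text" column.
dataset = load_dataset("json", data_files="my_domain_data.jsonl", split="train")

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",  # the open-weight checkpoint on the Hub
    train_dataset=dataset,
    peft_config=LoraConfig(r=8, lora_alpha=16, target_modules="all-linear"),
    args=SFTConfig(
        output_dir="gpt-oss-20b-custom",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
    ),
)
trainer.train()
```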
Apache 2.0 License Freedom
Built under a permissive Apache 2.0 license, you’re free to experiment, modify, and commercially deploy without copyleft concerns or patent issues. Innovation, unchained.
Performance Meets Portability with MXFP4 Quantization
To make local inference possible without heavy hardware, OpenAI and Ollama employ MXFP4 quantization, a roughly 4.25-bit-per-weight format applied to the MoE (Mixture-of-Experts) weights, which account for about 90% of the parameter count.
- gpt-oss-20b can run on machines with just 16 GB of RAM.
- gpt-oss-120b fits on a single 80 GB NVIDIA GPU.

This is made possible by:

- Native MXFP4 format support in Ollama (no extra conversions needed)
- Custom kernels developed for Ollama’s engine
- Benchmarked parity with OpenAI’s reference implementations
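As a back-of-the-envelope check on those memory claims (pure arithmetic; real footprints also include activations and the non-MoE tensors kept at higher precision):

```python
# Why ~4.25 bits/weight makes these models fit: FP4 values plus
# shared per-block scales (the "MX" in MXFP4) average out to ~4.25 bits.
def approx_weight_gb(params_billion: float, bits_per_weight: float = 4.25) -> float:
    """Approximate weight memory in GB for a given parameter count."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

print(f"gpt-oss-120b: ~{approx_weight_gb(117):.0f} GB of weights")  # ~62 GB < 80 GB GPU
print(f"gpt-oss-20b:  ~{approx_weight_gb(21):.0f} GB of weights")   # ~11 GB < 16 GB RAM
```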
Two Models, Endless Possibilities
gpt-oss:20b

- 20 billion parameters
- Low latency and high efficiency
- Perfect for local assistants, embedded AI, or specialized tasks

gpt-oss:120b

- 120 billion parameters
- Designed for general-purpose reasoning and agentic intelligence
- Ideal for production-grade apps, AI copilots, and decision support systems
Boosted by NVIDIA RTX: Local AI at Its Best
Ollama’s deepening partnership with NVIDIA ensures that both models are optimized for GeForce RTX and RTX PRO GPUs. This unlocks top-tier performance on your local RTX-powered PC — no cloud dependency, just raw AI horsepower at your fingertips.
How to Get Started
Download the latest version of Ollama and start running the models right from your terminal or the new app interface:
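```
ollama run gpt-oss:20b
ollama run gpt-oss:120b
```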
Tip: Use `:20b` for fast, responsive tasks and `:120b` for advanced reasoning and complex workflows.
Introducing OpenAI GPT-OSS Models on Amazon Bedrock and SageMaker JumpStart
AWS is thrilled to introduce two powerful open-weight OpenAI models, gpt-oss-120b and gpt-oss-20b, now available on Amazon Bedrock and Amazon SageMaker JumpStart. These models are not just powerful; they’re practical. They’re designed to empower developers, startups, and enterprises to build AI-native applications with greater transparency, fine-tuning capability, and full infrastructure control.
What’s New?
Both models are optimized for:
- Natural language generation
- Scientific and mathematical reasoning
- Coding tasks
- Tool-augmented and agentic workflows
They come with a 128K context window, adjustable reasoning levels (low, medium, high), and external tool support. You can plug them into frameworks like Strands Agents, making them ideal for building autonomous, chain-of-thought agents.
Access the Models Where You Build
1. Amazon Bedrock — Serverless Simplicity
In Amazon Bedrock, you can:
- Request access to `gpt-oss-120b` and `gpt-oss-20b` via the Model Access section
- Use the Chat/Test playground to prototype and explore
- Deploy with OpenAI SDK compatibility using a Bedrock endpoint
- Invoke the model with the OpenAI Python SDK (see the sketch below)
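A hedged sketch of that call through Bedrock’s OpenAI-compatible Chat Completions endpoint; the region, model ID, and API-key environment variable are assumptions, so check the Bedrock console for the values in your account:

```python
# Calling gpt-oss on Amazon Bedrock with the OpenAI Python SDK.
import os
from openai import OpenAI

client = OpenAI(
    # Assumed region; Bedrock exposes an OpenAI-compatible path per region.
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key=os.environ["AWS_BEDROCK_API_KEY"],  # a Bedrock API key, not an OpenAI key
)

response = client.chat.completions.create(
    model="openai.gpt-oss-120b-1:0",  # assumed Bedrock model ID; verify in Model Access
    messages=[{"role": "user", "content": "Summarize MXFP4 quantization in two sentences."}],
)
print(response.choices[0].message.content)
```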
Agent Support Example with Strands:
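A minimal sketch using the open-source Strands Agents SDK (`pip install strands-agents`); the Bedrock model ID below is an assumption to verify in your account:

```python
# A Strands Agents agent backed by gpt-oss on Bedrock.
from strands import Agent
from strands.models import BedrockModel

model = BedrockModel(model_id="openai.gpt-oss-120b-1:0")  # assumed model ID
agent = Agent(model=model, system_prompt="You are a concise research assistant.")

# Invoking the agent runs the full reasoning/tool loop and returns the result.
result = agent("Compare gpt-oss-20b and gpt-oss-120b for an on-prem deployment.")
print(result)
```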
Once satisfied, you can deploy your agent using Amazon Bedrock AgentCore for full runtime, memory, and identity management.
2. Amazon SageMaker JumpStart — ML Flexibility
With SageMaker JumpStart, you can:
- Select and deploy either model in just a few clicks
- Choose your instance type and scale based on your workload
- Fine-tune the models for specific use cases
- Run inference directly in SageMaker Studio or using the AWS SDK (see the sketch below)
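For the deploy-and-invoke path, a hedged sketch with the SageMaker Python SDK’s JumpStart interface (`pip install sagemaker`); the model ID placeholder, instance type, and payload shape are assumptions to verify in the JumpStart model hub:

```python
# Deploying a JumpStart model to a real-time endpoint and invoking it.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="<gpt-oss-120b-jumpstart-id>")  # placeholder ID
predictor = model.deploy(initial_instance_count=1, instance_type="ml.p5.48xlarge")

# Assumed chat-style payload; check the model's JumpStart card for the schema.
payload = {"messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 256}
print(predictor.predict(payload))

# Clean up the endpoint when you're done paying for it:
# predictor.delete_endpoint()
```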
JumpStart provides full control over your deployment environment and model lifecycle, making it ideal for production-scale AI applications and experimentation.
Why This Matters: Open Weight, Open Innovation
Here’s why open-weight models on AWS are a game-changer:
- Transparency: see the chain-of-thought output to understand how the model reasons
- Customization: fine-tune and adapt the models to your domain or data
- Security first: built-in safety mechanisms and compatibility with AWS security best practices
- Open ecosystem: leverage the OpenAI API or Bedrock-native tools for deployment, orchestration, and monitoring
Region Availability
| Platform | Available Regions |
|---|---|
| Amazon Bedrock | US West (Oregon) |
| SageMaker JumpStart | US East (N. Virginia, Ohio), APAC (Mumbai, Tokyo) |