Monthly roundup
Latest AI model releases and industry news: February 2026
February 2026 was loud. Four headline models in one month, a round of retirements, and a distillation scandal. Most recaps stop at the benchmark tables. This one dates every item to a primary source, then does the thing none of them do: it points out that the three model families that defined the month (Claude, Codex, and Gemini) are the exact three backends one native Mac agent can run side by side, and shows you the one-line function that picks between them.
Direct answer - verified June 22, 2026
In February 2026, Anthropic shipped Claude Opus 4.6 (Feb 5) and Sonnet 4.6 (Feb 17), OpenAI introduced GPT-5.3-Codex (Feb 5), and Google released Gemini 3.1 Pro in preview (Feb 19). The two biggest non-model stories were OpenAI retiring GPT-4o, GPT-4.1, GPT-4.1 mini, and o4-mini from the ChatGPT app (Feb 13, with no API changes at the time) and Anthropic accusing DeepSeek, Moonshot AI, and MiniMax of industrial-scale distillation attacks against Claude (Feb 23-24).
Primary sources are linked on each row below, starting with the Anthropic release notes and OpenAI's retirement post.
February 2026, dated to the source
Six entries: four model releases and the two industry stories that actually moved planning. Models carry a badge for the agent backend they map to, which is the thread this page pulls on later.
Claude Opus 4.6
AnthropicClaude backendAnthropic's flagship update. 1M-token context in beta, 80.8% on SWE-bench Verified, 72.7% on OSWorld for agentic computer use, priced at $5 / $25 per million tokens. The agentic-coding leader for most of the month until GPT-5.3-Codex landed the same day.
support.claude.com release notesGPT-5.3-Codex
OpenAICodex backendIntroduced February 5 with its system card dated the same day. OpenAI's most capable agentic coding model at the time, roughly 25% faster than GPT-5.2-Codex, a 400K-token context, and pricing from $1.75 / $14 per million tokens. The first OpenAI model the Codex team used to help build itself.
openai.com - introducing GPT-5.3-CodexChatGPT retires GPT-4o, GPT-4.1, GPT-4.1 mini, o4-mini
OpenAIindustry newsOpenAI removed GPT-4o, GPT-4.1, GPT-4.1 mini, and o4-mini from the ChatGPT consumer app, alongside the earlier GPT-5 Instant and Thinking retirement. Read the fine print: in the API there were no changes at the time. The models people lost in the dropdown were still callable by developers. GPT-4o stays in Custom GPTs for Business, Enterprise, and Edu until it is fully retired across plans after April 3, 2026.
openai.com - retiring GPT-4o and older modelsClaude Sonnet 4.6
AnthropicClaude backendThe mid-tier model that ate the flagship's lunch. Near-Opus performance at $3 / $15 per million tokens, 79.6% on SWE-bench Verified, 1M-token context in beta. Anthropic reported developers preferred it over Sonnet 4.5 roughly 70% of the time, and over Opus 4.5 around 59% of the time, in real coding tests.
support.claude.com release notesGemini 3.1 Pro (Preview)
Google DeepMindGemini backendPitched as a sturdier default for complex, long-horizon problem-solving. It posted 77.1% on ARC-AGI-2, more than double Gemini 3 Pro's 31.1%, and 94.3% on GPQA Diamond. It also added a three-tier thinking control (a new Medium setting) to trade latency for reasoning depth.
deepmind.google - Gemini 3.1 Pro model cardAnthropic accuses DeepSeek, Moonshot, MiniMax of distillation attacks
Anthropicindustry newsAnthropic published evidence that three Chinese labs ran coordinated distillation campaigns against Claude: an estimated 16 million-plus exchanges from around 24,000 fraudulently created accounts, with MiniMax driving the most traffic at over 13 million. Both Anthropic and OpenAI have framed this kind of distillation as a national-security concern.
anthropic.com - detecting and preventing distillation attacksWhat every February recap leaves out
Line the month up and a pattern appears that no ranking mentions. The three families that defined February (Anthropic's Claude 4.6 pair, OpenAI's GPT-5.3-Codex, and Google's Gemini 3.1 Pro) are not three products you have to pick between. They are three agent backends that a single native Mac app already drives: Claude Code through claude-agent-acp, Codex through codex-acp, and Gemini through gemini-cli.
That reframes the whole month. You are not choosing the February winner and living with it. In Fazm one window can run GPT-5.3-Codex on a refactor while another runs Sonnet 4.6 on a review, and a third runs Gemini 3.1 Pro on a long-context read, all at once. The model that topped a chart on February 19 stops being a commitment and becomes a dropdown.
The swap is a model-id prefix, not a separate app
Here is the uncopyable part. Fazm does not make you choose a backend and then choose a model inside it. You pick a model, and the bridge infers the backend from the id. The rule lives in Desktop/Sources/Providers/ChatProvider.swift in the open-source repo, so this is checkable, not a slogan.
ChatProvider.swift - the bridge routes by prefix
// gpt-*, codex-*, o[0-9]-* -> Codex backend (codex-acp)
static func isCodexModelId(_ modelId: String) -> Bool {
let lower = modelId.lowercased()
if lower.hasPrefix("gpt-") || lower.hasPrefix("codex-") { return true }
if lower.count >= 2,
lower.hasPrefix("o"),
lower[lower.index(after: lower.startIndex)].isNumber {
return true
}
return false
}
// gemini-* / auto-gemini-* -> Gemini (gemini-cli)
// everything else -> Claude (claude-agent-acp)The selected model is stored per window, not globally, so two windows can sit on two different backends at the same time. That state lives in FloatingControlBarState.swift:
// Per-window workspace + model selection. Pop-out windows track
// their own model independently.
@MainActor
final class WorkspaceSettingsState: ObservableObject {
@Published var selectedModel: String = ShortcutSettings.shared.selectedModel
@Published var workspaceDirectory: String = ""
}So when GPT-5.3-Codex shipped on February 5 and Gemini 3.1 Pro on February 19, nothing in the app had to change to adopt them. A new id with a known prefix routes to the right adapter. The picker merges all three backends' models into one list, ordered Claude, then Codex, then Gemini.
One window, three of February's models
The same query path handles all three backends. Change the model in the picker and the prefix sends the request to a different adapter.
How a model choice routes to a backend
Why the February 13 retirement is the real lesson
The model news that ages best from February is not a benchmark. It is the retirement. On February 13 OpenAI pulled GPT-4o and three other models out of the ChatGPT app, and the fine print mattered: the API was untouched. The models people lost in the dropdown were still callable by anything that talks to the backend directly.
That is the difference between a tool tied to a consumer chat UI and a tool tied to the agent loop. Fazm runs the backends through API and CLI adapters, not the ChatGPT app that did the retiring. A model leaving a vendor's consumer surface does not pull it out from under your sessions.
February's churn, two ways
| Feature | Single-vendor chat app | Fazm (agent loop) |
|---|---|---|
| Run Feb's new Claude, GPT, and Gemini together | One model family per app; switching means a different product | Pick any of the three per window; backend is inferred from the model id |
| A model is retired (OpenAI, Feb 13) | Disappears from the dropdown | Backends run via API/CLI adapters, not the consumer UI that retired it |
| A long session after a Mac restart | Browser tab; depends on the vendor | Persistent sessions auto-restore every window with full history |
| Branch a conversation to try a different model | Copy-paste into a new chat | One-click fork: new window, full prior context, original untouched |
| Context length on a long task | Auto-compacts when the vendor decides | No auto-compacting; full history stays live for the window's lifetime |
Fazm is macOS 14+ only and brings your own Claude, Codex, or Gemini access. It is not a hosted chat service.
The harness outlasts the month
February proved the point on its own. Opus 4.6 and GPT-5.3-Codex both on the 5th, retirements on the 13th, Sonnet 4.6 on the 17th, Gemini 3.1 Pro on the 19th, a distillation scandal on the 23rd. The leader changed week to week. If adopting each one means a new app or a rebuilt workflow, you fall behind by inertia.
That is why Fazm is built around the harness, not a model choice. The backend is a prefix. The loop on the other side is the real Claude Code (or Codex, or Gemini) over ACP, so when you switch models your persistent sessions still survive a restart, any conversation still forks in one click into a new window with the full prior context, and nothing auto-compacts for the window's lifetime. The model you point it at can change every February. That part does not.
Want to run February's models side by side on your Mac?
Walk through per-window backend routing, persistent forkable sessions, and bringing your own Claude, Codex, or Gemini access in a real agent loop.
Questions people searched alongside this
Frequently asked questions
What AI models were released in February 2026?
Four headline models. Anthropic shipped Claude Opus 4.6 on February 5 and Claude Sonnet 4.6 on February 17. OpenAI introduced GPT-5.3-Codex on February 5 (its system card carries the same date). Google DeepMind released Gemini 3.1 Pro in preview on February 19. February also saw smaller drops (xAI's Grok updates among them), but those four are the ones the daily agent crowd actually adopted.
What was the biggest AI industry news in February 2026 besides model releases?
Two stories. On February 13 OpenAI retired GPT-4o, GPT-4.1, GPT-4.1 mini, and o4-mini from the ChatGPT consumer app. On February 23-24 Anthropic publicly accused DeepSeek, Moonshot AI, and MiniMax of running industrial-scale distillation attacks against Claude, estimating over 16 million exchanges from roughly 24,000 fake accounts.
Did OpenAI's February 13 retirement remove GPT-4o from the API too?
No. The retirement was about the ChatGPT consumer app, not the API. At the time of the announcement OpenAI said there were no API changes, so developers could keep calling those models even after they vanished from the chat dropdown. GPT-4o remained available in Custom GPTs for Business, Enterprise, and Edu and was scheduled for full retirement across all plans after April 3, 2026. This distinction is the whole reason an agent that runs on API or CLI backends is more durable than one tied to a consumer chat UI.
How is Claude Sonnet 4.6 different from Opus 4.6?
Opus 4.6 (February 5) is the flagship: 80.8% on SWE-bench Verified, priced at $5 / $25 per million tokens. Sonnet 4.6 (February 17) is the mid-tier: 79.6% on SWE-bench Verified at $3 / $15 per million tokens, so close to Opus on coding that Anthropic reported developers preferred it over Opus 4.5 around 59% of the time. Both carry a 1M-token context in beta. For most agent work the gap is small and the price gap is not.
Can I run February's Claude, GPT, and Gemini releases in one app?
Yes, that is what Fazm does. Claude Opus 4.6 / Sonnet 4.6, GPT-5.3-Codex, and Gemini 3.1 Pro map to the three agent backends Fazm wraps: Claude Code via claude-agent-acp, Codex via codex-acp, and Gemini via gemini-cli. Each Fazm window stores its own model choice, so one window can run GPT-5.3-Codex while another runs Sonnet 4.6, side by side on the same Mac.
How does Fazm decide which backend to run for a given model?
By the model-id prefix, not a manual backend switch. In Desktop/Sources/Providers/ChatProvider.swift the function isCodexModelId returns true for ids starting with gpt-, codex-, or o followed by a digit; isGeminiModelId matches gemini- and auto-gemini-; everything else routes to Claude. The selected model lives per window in WorkspaceSettingsState.selectedModel, so switching the model in the picker switches the backend for that window without an app restart.
Why does a model-news page spend half its time on a routing function?
Because February 2026 made the churn obvious. Four headline models in one month, plus a wave of retirements, plus a distillation scandal. The model you point at changes constantly. The harness around it (which backend runs, whether your session survives a restart, whether you can fork it) is the part you keep. When the swap is a one-line prefix, adopting next month's leader is free.
Keep reading
AI model releases in 2026: the verified list so far
The full-year frontier timeline, dated to primary sources.
Large language model releases, May 2026
The following month's drops, dated and what each meant for a daily agent.
Gemini, Claude, and Qwen: new model releases (2026)
Cross-vendor releases and why the backend you run matters less than the harness.
Comments (••)
Leave a comment to see what others are saying.Public and anonymous. No signup.