Claude Mythos (Anthropic, April 2026): why the gated preview cannot land in a consumer Mac app, and why Fazm's action surface would not change if it did
Every SERP result covers the numbers: 73 percent on expert CTF tasks, a 27-year-old OpenBSD weak spot, 22 of 32 steps on corporate-attack simulation, about 40 orgs under Project Glasswing. None cover the shipping consumer Mac app story. The Fazm source says it plainly: Mythos never reaches a consumer OAuth plan's availableModels, the client-side family-map has three tuples, the tool whitelist is five names, and the AX permission wall caps what any Claude can do. A more dangerous reasoner inside this shell clicks the same buttons.
What Claude Mythos actually is, in two paragraphs
Claude Mythos Preview is Anthropic's April 2026 frontier model positioned as an offensive-security research preview rather than a general-purpose upgrade. Public specs, aggregated from red.anthropic.com, Google Cloud Vertex AI, AWS Bedrock's model card, and the UK AI Safety Institute's evaluation: 73 percent solve rate on expert-level Capture the Flag challenges that no pre-2025 model could complete; first model to finish a 32-step simulated corporate-network attack end-to-end (3 of 10 attempts, 22 of 32 steps on average); discovery of a previously undocumented OpenBSD weak spot that had gone unfound for 27 years.
Access is not general availability. It is a research preview gated to approximately 40 organizations through an initiative Anthropic calls Project Glasswing: Microsoft, Apple, Google, specific cybersecurity firms, several banks. The stated goal is defensive use, not an offensive-security SaaS. A consumer user with a paying Pro or Max OAuth plan cannot request access, and there is no button in a consumer Mac app that flips the entitlement. That is the factual starting point for everything below.
The anchor fact: three tuples, five tools
A consumer Mac AI app's architecture is its real answer to the Mythos question. Everything in the product that decides "can this model drive this action" lives in two places: the client-side model family-map, and the built-in MCP whitelist. Both are small, both are literal, and both are checkable in the Fazm repository.
The picker literally has three named slots. A model id that does not contain the substring 'haiku', 'sonnet', or 'opus' does not get a label, does not get an order, and does not get rendered. Mythos's canonical id, whatever Anthropic eventually publishes it as, has none of those substrings. That is before you even ask whether the Claude Agent SDK returned it in the first place (it did not).
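To make the substring rule concrete, here is a minimal sketch of the family map and the picker filter. The real map lives in Swift; this is a TypeScript rendering for illustration only. The three tuples are the ones this article cites, while `familyFor`, `pickerEntries`, and the example model ids are hypothetical names invented for the sketch.

```typescript
// Illustrative TypeScript rendering of the three-tuple family map.
// Tuple values come from the article; all function and type names are hypothetical.
interface ModelFamily {
  substring: string; // matched against the model id
  label: string;     // label shown in the picker
  family: string;    // family display name
  order: number;     // sort position in the picker
}

const MODEL_FAMILIES: ModelFamily[] = [
  { substring: "haiku", label: "Scary", family: "Haiku", order: 0 },
  { substring: "sonnet", label: "Fast", family: "Sonnet", order: 1 },
  { substring: "opus", label: "Smart", family: "Opus", order: 2 },
];

interface PickerEntry {
  id: string;
  label: string;
  order: number;
}

// Returns the first family whose substring occurs in the id, or undefined.
function familyFor(modelId: string): ModelFamily | undefined {
  return MODEL_FAMILIES.find((f) => modelId.includes(f.substring));
}

// Keeps only ids that map to a family and sorts them by picker order.
function pickerEntries(availableModels: string[]): PickerEntry[] {
  const entries: PickerEntry[] = [];
  for (const id of availableModels) {
    const fam = familyFor(id);
    if (fam) entries.push({ id, label: fam.label, order: fam.order });
  }
  return entries.sort((a, b) => a.order - b.order);
}
```

Fed the made-up ids `example-opus-id`, `example-mythos-id`, and `example-haiku-id`, `pickerEntries` returns two entries, Scary then Smart. The Mythos-style id is dropped because no tuple matches it.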
The action surface, drawn as a beam
Whichever Claude the ACP bridge picks (Haiku, Sonnet, Opus, or hypothetically Mythos) pushes actions through the same funnel: one of five named MCP servers. The routing rule at ChatPrompts.swift:55-61 pins desktop work to macos-use and web work to playwright. That is the ceiling.
Five-tool funnel, independent of model family
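A hedged sketch of what a closed whitelist check can look like. The five names are the ones the article cites for BUILTIN_MCP_NAMES. How the bridge actually applies the set is not shown here, so `serverOf` and `isAllowedTool` are hypothetical helpers, and the `mcp__<server>__<tool>` naming is assumed from the tool names that appear in this article.

```typescript
// The five whitelisted MCP server names cited in the article.
const BUILTIN_MCP_NAMES = new Set([
  "fazm_tools",
  "playwright",
  "macos-use",
  "whatsapp",
  "google-workspace",
]);

// Hypothetical helper: extracts the server segment from a tool name,
// assuming the mcp__<server>__<tool> shape.
function serverOf(toolName: string): string | undefined {
  const parts = toolName.split("__");
  if (parts.length < 3 || parts[0] !== "mcp") return undefined;
  return parts[1];
}

// Hypothetical helper: a tool call passes only if its server is whitelisted.
// The model that issued the call is not an input to this check.
function isAllowedTool(toolName: string): boolean {
  const server = serverOf(toolName);
  return server !== undefined && BUILTIN_MCP_NAMES.has(server);
}
```

Under these assumptions, `isAllowedTool("mcp__macos-use__macos-use_click_and_traverse")` is true, while a made-up `mcp__shell__exec` is false regardless of which model asked for it.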
Routing prompt: macos-use or playwright, nothing else
The system prompt that ships with every Fazm session makes the routing literal. Every model, including a Mythos we do not have, would be told the same thing in the same five lines.
Naive app vs. shell-capped app, if Mythos somehow arrived
The interesting comparison is not Fazm-with-Mythos vs. Fazm-with-Opus. It is capability-gated-app vs. architecture-gated-app. A consumer app that imagines Mythos as an unlocking event is thinking about the wrong boundary.
What a new frontier model changes, by app shape
A naive capability-first app treats the model id as the product's ceiling. Its marketing implies that Mythos access unlocks broader automation. Its tool surface is often open: arbitrary shell, raw filesystem writes, unfettered AppleScript. The user's exposure scales with model capability. Mythos inside this shell is, in fact, more dangerous, because the shell is dangerous first.
- Tool surface scales up as models get smarter
- Marketing implies more capable models unlock more product
- Unreachable model ids tend to surface as red errors
Five steps the code actually walks when any Claude runs
Reconstructed from acp-bridge/src/index.ts and Desktop/Sources/. The same sequence applies for Haiku answering a trivial prompt and for an imagined Mythos answering a hard one. The constraint surface does not change across steps 1 through 5.
User selects a model in Fazm's floating bar
The picker reads from ShortcutSettings.updateModels, which filters the availableModels list emitted from the ACP bridge. Only model ids whose substring matches 'haiku', 'sonnet', or 'opus' render. Mythos would not appear even if injected, because there is no family tuple that maps it.
The prompt lands at the ACP bridge
acp-bridge/src/index.ts forwards the prompt to the Claude Agent SDK on whichever lane is active (personal OAuth or built-in shared). The bridge does not know or care which Claude model family will answer. All it enforces is the BUILTIN_MCP_NAMES whitelist: five allowed tools, line 1266.
The model decides to call a tool
Say the answer requires opening Mail and copying a subject line. The model picks mcp__macos-use__macos-use_open_application_and_traverse('Mail') and then mcp__macos-use__macos-use_click_and_traverse(element='subject'). Both calls go through the macos-use binary, which calls AXUIElementCreateApplication and reads kAXFocusedWindowAttribute to walk Mail's accessibility tree.
The AX permission wall runs
macOS checks the Accessibility TCC database. If Fazm is not trusted, the call fails before it reaches Mail's process. If it is trusted, the call returns a structured tree. No model, Mythos included, can bypass this. Capability is irrelevant to the check.
The answer returns, rendered as chat text
The model reads the AX row for the subject line, replies with it, and the chat renders. From the user's perspective, Smart (Opus, latest) found a subject in a Mac-native Mail window. The same flow with Mythos in place would look identical: different reasoning trace, same click, same AX call, same rendered text.
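The permission wall in the fourth step can be modeled as a gate that never sees the model. This is a toy simulation rather than the macos-use implementation: `runAXAction`, the `trusted` flag, and the `readTree` callback are hypothetical stand-ins for the TCC check and the AX traversal.

```typescript
// Toy model of the Accessibility (TCC) gate. All names are hypothetical.
type AXResult =
  | { ok: true; rows: string[] }
  | { ok: false; reason: string };

// `trusted` stands in for the OS-level Accessibility trust decision.
// `readTree` stands in for the AX traversal, which runs only when trusted.
// Note that no model id or capability score is a parameter here.
function runAXAction(trusted: boolean, readTree: () => string[]): AXResult {
  if (!trusted) {
    return { ok: false, reason: "accessibility permission not granted" };
  }
  return { ok: true, rows: readTree() };
}
```

Because the gate takes no model argument, swapping Haiku for an imagined Mythos cannot change its outcome. Only the user's permission grant can.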
The fallback that would catch Mythos specifically
If a user force-pasted a Mythos model id into a custom setting, or if Anthropic one day flipped it into a consumer OAuth plan's availableModels with no entitlement, the isModelAccessError detector at ChatProvider.swift:1362-1368 would catch the resulting 400 and silently reroute the request to the built-in lane. It is model-agnostic by construction.
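The detector's logic is small enough to sketch. The original is Swift. This TypeScript version mirrors the three substring patterns the article lists for isModelAccessError, tested against the lowercased error message. The sample error strings in the usage note are invented for illustration and are not captured API responses.

```typescript
// Sketch of the three-pattern model-access detector described in the article.
// Matching is done on the lowercased message, as the article states.
function isModelAccessError(message: string): boolean {
  const m = message.toLowerCase();
  return (
    (m.includes("may not exist") && m.includes("not have access")) ||
    (m.includes("model") && m.includes("not found")) ||
    (m.includes("model") && m.includes("not available"))
  );
}
```

With this sketch, an invented message such as "The model may not exist or you may not have access to it" matches the first pattern, while "rate limit exceeded" matches none. No pattern mentions a specific model family, which is what makes the check generic.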
What the log shows on a consumer plan, today
This is a cleaned paste from a Fazm dev build on 2026-04-19, tailing /tmp/fazm-dev.log. The availableModels block never contains a Mythos entry. A manual attempt to send with a Mythos id falls into the isModelAccessError path and quietly resolves on the built-in lane to the Opus id the builtin account holds.
Side-by-side on the question the SERP does not ask
The question most Mythos coverage sidesteps is what the gated preview means downstream, in a consumer app the reader actually uses. Here is the comparison in one table.
| Feature | Naive capability-first app | Fazm |
|---|---|---|
| Mythos model access on a consumer OAuth plan | Claimed or implied on marketing pages | Not in availableModels. Fazm cannot show what it cannot receive. |
| Tool surface exposed to the model | Often: arbitrary shell, filesystem, or developer framework | Closed BUILTIN_MCP_NAMES whitelist of 5 MCP servers at index.ts:1266 |
| Action boundary on the user's Mac | Pixel-coordinate clicks from screenshot VLMs; no AX check | AXUIElement tree under macOS Accessibility permission (TCC) |
| What a frontier security model unlocks in the product | Implied: broader capability, more automation | The same 5 MCP tools, the same AX-constrained action vocabulary |
| Graceful fallback when a model id is unreachable | Red error banner, user retypes query against a different model | isModelAccessError at ChatProvider.swift:1362-1368 swaps lanes silently |
| Number of model family slots in the picker | Usually unbounded: every id the API returns surfaces | Three. Haiku -> Scary, Sonnet -> Fast, Opus -> Smart. ShortcutSettings.swift:159-163. |
Why accessibility-API apps are the wrong target for a red-team model
Mythos's 73 percent expert-CTF number is about finding remote or local exploits in code and systems. Fazm does not execute arbitrary code on the user's Mac, does not read privileged system files, and does not reach across app sandboxes. It walks the accessibility tree of running foreground apps, returns structured rows to the model, and executes the exact click or keystroke the model names. That loop is bounded by three concrete constructs, all in the repo:
- macOS Accessibility (TCC) permission: granted once in System Settings, revocable, audited by the OS, not by Fazm.
- BUILTIN_MCP_NAMES whitelist: five strings at acp-bridge/src/index.ts:1266. No shell. No raw filesystem. No AppleScript dial.
- System-prompt routing: macos-use for desktop apps, playwright for web pages inside Chrome. Everything else refused.
A model that is better at finding bugs is not better at clicking through a Mail window. The two skills do not share a bottleneck. That is why a gated cybersecurity preview is structurally orthogonal to a consumer Mac AI app, and why coverage of Mythos that treats frontier access as a product advantage is usually coverage of a different product shape.
Anthropic April 2026 releases, ordered by where they can actually run
Dates reconstructed from public Anthropic posts, UK AISI evaluation, and the Fazm git log. Opus 4.7 is the relevant release for consumer apps this month; Mythos is the relevant release for the ~40 orgs in Project Glasswing and for defensive-security teams studying its output traces.
Three numbers that stay stable across the model tier
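The three numbers are the three family tuples in the picker, the five names in the MCP whitelist, and the zero Mythos entries in a consumer plan's availableModels. The first two are architectural decisions Fazm made before Mythos was announced. The third is a fact about Anthropic's rollout, checkable per session by tailing /tmp/fazm-dev.log after ACPBridge: session/new returned availableModels.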
The one sentence takeaway
Claude Mythos is a significant model for Project Glasswing's ~40 organizations, and it is irrelevant to the shape of any consumer Mac AI app that routes actions through macOS accessibility APIs and a closed MCP whitelist, because the action ceiling in that shape is the TCC permission plus the whitelist, not the model's reasoning ability. Roundup pages that treat new model access as the lever of consumer product capability are measuring against a different ceiling than the one Fazm is built on.
Want to see BUILTIN_MCP_NAMES and the family-map in a live Fazm build?
Book a 20-minute walk-through. We tail the log on a consumer OAuth account, confirm availableModels never returns Mythos, and force the isModelAccessError path on a fake model id so you can see the silent fallback fire.
Book a call →
Frequently asked questions
What is Claude Mythos, and when did Anthropic release it?
Claude Mythos Preview is Anthropic's offensive-security-focused frontier model, released as a gated research preview on 2026-04-07 after a brief March 2026 CMS misconfiguration exposed pre-launch pages. It is not an Opus upgrade but a separate model tier. Public benchmarks from Anthropic, AWS Bedrock, and the UK AISI: 73 percent success on expert-level Capture the Flag tasks that no pre-2025 model could complete, 22 of 32 steps on average on a 32-step corporate-network attack simulation (3 out of 10 runs completed the full chain), and discovery of an OpenBSD weak spot that had gone undetected for 27 years. Access is limited to approximately 40 organizations including Microsoft, Apple, Google, several banks, and specific cybersecurity firms, coordinated through Anthropic's Project Glasswing initiative for defensive use cases.
Can a consumer Mac AI app like Fazm actually ship Claude Mythos?
No, and not for lack of wiring. The Claude Agent SDK's availableModels response on a consumer OAuth account (Pro, Max, or the free tier) never contains a Mythos model id. Fazm's acp-bridge/src/index.ts line 1271 calls emitModelsIfChanged with whatever the SDK returned, line 1274 filters the 'default' pseudo-model, and then forwards the rest to the Swift app. On every consumer plan in April 2026, that list is some subset of haiku, sonnet, and opus. No Mythos. Even if a user wanted it, there is no OAuth flow that grants the entitlement; Project Glasswing access is org-to-org, not self-serve. A consumer Mac app that claims Mythos support on launch day is almost certainly lying about the identifier it is sending.
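The forwarding step described above amounts to "drop the pseudo-model, diff, emit". Here is a minimal sketch under stated assumptions: `makeModelEmitter` and the `emit` callback are hypothetical, the `ModelInfo` fields follow the modelId/name/description shape mentioned in this article, and the change check uses a JSON string comparison as the article describes.

```typescript
// Shape assumed from the modelId/name/description tuples the article mentions.
interface ModelInfo {
  modelId: string;
  name: string;
  description?: string;
}

// Hypothetical factory: returns an emitModelsIfChanged function bound to
// an `emit` callback (standing in for the hand-off to the Swift app).
function makeModelEmitter(emit: (models: ModelInfo[]) => void) {
  let lastJson = ""; // serialized form of the last emitted list

  return function emitModelsIfChanged(models: ModelInfo[]): boolean {
    // Drop the 'default' pseudo-model before forwarding.
    const filtered = models.filter((m) => m.modelId !== "default");
    // JSON diff: skip the emit when nothing changed since last time.
    const json = JSON.stringify(filtered);
    if (json === lastJson) return false;
    lastJson = json;
    emit(filtered);
    return true;
  };
}
```

Nothing in this path adds a model the SDK did not return. If the list never contains a Mythos id, neither does what reaches the picker.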
If Mythos did land in a consumer Mac app, would the app become more dangerous?
Almost certainly not in any interesting way, because model capability is not the binding constraint on what the app can do on the user's Mac. Fazm's action surface is capped three times over: first by the macOS accessibility (AX) permission the user explicitly granted (System Settings > Privacy & Security > Accessibility), second by the closed BUILTIN_MCP_NAMES whitelist at acp-bridge/src/index.ts line 1266 (exactly five tools: fazm_tools, playwright, macos-use, whatsapp, google-workspace), and third by the system-prompt routing at ChatPrompts.swift lines 55 to 61 that restricts macos-use to Finder/Settings/Mail-style apps and playwright to pages inside Chrome. A Mythos-class model inside this shell can only click the same buttons, read the same AX tree, and type into the same text fields as Haiku. It cannot shell out, it cannot read system files, it cannot write outside the sandbox. A better reasoner does not widen the action vocabulary; it just reshuffles the order of calls.
What would have to change in Fazm to surface Mythos in the picker at all?
Two client-side edits plus one spot that already works, and none of it is sufficient without server-side access. First, the family map at Desktop/Sources/FloatingControlBar/ShortcutSettings.swift lines 159 to 163 is currently three tuples: ('haiku','Scary','Haiku',0), ('sonnet','Fast','Sonnet',1), ('opus','Smart','Opus',2). A fourth tuple would be needed, something like ('mythos','Red','Mythos',3), and the substring .contains() checks at lines 171 to 173 would have to be extended. Second, ShortcutSettings would need a corresponding case in getModelFamily at lines 168 to 174. Third, the ACPBridge.swift callback at lines 1202 to 1212 already decodes arbitrary modelId/name/description tuples, so no changes are needed there. But none of this matters if the Claude Agent SDK never returns 'mythos' for a consumer plan, which it does not. The client-side surface area for a fourth family is small; the server-side entitlement is the wall.
What happens if a user on a Project Glasswing org account somehow picks Mythos in Fazm?
The existing model-access fallback path would probably catch it. ChatProvider.swift isModelAccessError at lines 1362 to 1368 tests three substring patterns against the lowercased error message: ('may not exist' AND 'not have access'), ('model' AND 'not found'), ('model' AND 'not available'). Any of those triggers the branch at lines 2943 to 2957 which sets pendingBridgeModeSwitch = 'builtin', retryAfterModelFallback = true, clears errorMessage so no red banner renders, and replays pendingRetryMessage on Fazm's built-in shared-Claude lane. Since the built-in lane is Fazm's own API account and does not have Project Glasswing access either, the user would fall back to whatever Opus model id the built-in account holds, and their query would land on Smart (Opus, latest). They would see one reply, no error, and no Mythos. The fallback is generic over model ids by design.
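The retry branch can be read as a small state transition. The sketch below reuses the field names quoted above (pendingBridgeModeSwitch, retryAfterModelFallback, errorMessage, pendingRetryMessage). The `ChatState` type, the `handleSendError` function, and passing the detector in as a predicate are all hypothetical simplifications of the Swift code.

```typescript
// Hypothetical state shape built from the field names the article quotes.
interface ChatState {
  pendingBridgeModeSwitch: "builtin" | null;
  retryAfterModelFallback: boolean;
  errorMessage: string | null;
  pendingRetryMessage: string | null;
}

interface FallbackOutcome {
  next: ChatState;
  replay: string | null; // message to resend on the built-in lane, if any
}

// Hypothetical handler: on a model-access error, switch lanes silently and
// replay the pending message; on any other error, surface it to the user.
function handleSendError(
  state: ChatState,
  errorText: string,
  isAccessError: (msg: string) => boolean,
): FallbackOutcome {
  if (!isAccessError(errorText)) {
    return { next: { ...state, errorMessage: errorText }, replay: null };
  }
  const next: ChatState = {
    ...state,
    pendingBridgeModeSwitch: "builtin",
    retryAfterModelFallback: true,
    errorMessage: null, // cleared so no red banner renders
  };
  return { next, replay: state.pendingRetryMessage };
}
```

The model id that triggered the error never appears in the transition, which is the sense in which the fallback is generic over model ids.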
Does Fazm's accessibility-API approach benefit from a security-tuned model like Mythos?
No. Fazm reads the macOS accessibility tree (AXUIElement) and returns structured rows like '[AXButton] "Send" x:842 y:712 w:68 h:32 visible' to the model. The model's job is to pick the right row and call macos-use_click_and_traverse with those coordinates. This is a fundamentally different problem from finding a 27-year-old OpenBSD bug. It is closer to semantic UI understanding on structured input, a task Sonnet and Opus already handle at the ceiling of current utility. Mythos's offensive-security capability neither unlocks new macOS APIs nor makes the AX tree richer. The bottleneck is what the user's apps expose through AX, not how cleverly the model reasons about known exploits.
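To show how little reasoning the row format demands, here is a sketch that parses the example row above into a structured record. The regular expression is inferred from that single example, so the general row grammar is an assumption. `parseAXRow` and `centerOf` are hypothetical helpers, and clicking the center of the frame is an illustrative choice rather than documented macos-use behavior.

```typescript
// Record shape inferred from the one example row in the article.
interface AXRow {
  role: string;
  title: string;
  x: number;
  y: number;
  w: number;
  h: number;
  visible: boolean;
}

// Pattern assumed from: [AXButton] "Send" x:842 y:712 w:68 h:32 visible
const ROW_PATTERN = /^\[(\w+)\] "([^"]*)" x:(-?\d+) y:(-?\d+) w:(\d+) h:(\d+)( visible)?$/;

// Hypothetical parser: returns undefined for lines that do not fit the pattern.
function parseAXRow(line: string): AXRow | undefined {
  const m = ROW_PATTERN.exec(line.trim());
  if (!m) return undefined;
  const [, role = "", title = "", x = "0", y = "0", w = "0", h = "0", vis] = m;
  return {
    role,
    title,
    x: Number(x),
    y: Number(y),
    w: Number(w),
    h: Number(h),
    visible: vis !== undefined,
  };
}

// Hypothetical helper: midpoint of the element's frame.
function centerOf(row: AXRow): { x: number; y: number } {
  return { x: row.x + row.w / 2, y: row.y + row.h / 2 };
}
```

For the example row, `parseAXRow` yields role `AXButton`, title `Send`, and `centerOf` gives (876, 728). Picking that row is a lookup over structured text, not an exploit search.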
Where is this verifiable in the Fazm source?
Five locations across four files, all in /Users/matthewdi/fazm. First, acp-bridge/src/index.ts line 1266 for BUILTIN_MCP_NAMES = new Set(['fazm_tools','playwright','macos-use','whatsapp','google-workspace']). Second, acp-bridge/src/index.ts lines 1271 to 1281 for emitModelsIfChanged, including the 'default' filter at line 1274 and the JSON-diff at line 1277. Third, Desktop/Sources/FloatingControlBar/ShortcutSettings.swift lines 159 to 163 for the three family-map tuples. Fourth, Desktop/Sources/Chat/ChatPrompts.swift lines 55 to 61 for the macos-use vs playwright routing and the NEVER use browser_take_screenshot rule. Fifth, Desktop/Sources/Providers/ChatProvider.swift lines 1362 to 1368 for the three isModelAccessError substring patterns and lines 2943 to 2973 for the detect-and-retry branch. All of this is live in the repo as of 2026-04-19.