OpenAI API changelog, April 2026

MAX_SCREENSHOT_DIM = 1920 (line 713)
MAX_IMAGE_TURNS = 20 (line 793)

Every OpenAI April 2026 feature hits the same ceiling first: 2000 pixels per image.

GPT-5.1, GPT-5.4 mini and nano, RBAC, 24-hour prompt cache, Sora 1080p, Assistants sunset, and computer-use-preview in the Responses API. Every roundup lists them. None of them tells a Mac agent consumer that a default Retina-MacBook Playwright screenshot is ~3024 by 1964 pixels and gets silently rejected before any of these features run. Fazm ships the fix: a 44-line file-watcher that sips-resizes every oversize PNG or JPEG to fit within 1920px, and an accessibility-tree-plus-grep-hint path that never needs a screenshot in the first place.

Fazm
12 min read
Verified against /Users/matthewdi/fazm/acp-bridge/src/index.ts
Anchor fact: line 713, MAX_SCREENSHOT_DIM = 1920
macos-use binary 1.6.0, 1,917 lines, zero vision-model dependencies

Every OpenAI April 2026 API changelog entry, tagged with the bridge-side file:line that survives it

  • GPT-5.1 (default none reasoning) -> line 1033 flags
  • GPT-5.1-Codex (Responses API) -> line 713 resize
  • GPT-5.1-Codex-mini -> line 741 sips
  • GPT-5.4 mini (Chat + Responses) -> line 793 image cap
  • GPT-5.4 nano (high-volume) -> line 763 session map
  • RBAC (org + project scopes) -> ~/.fazm/mcp-servers.json
  • 24h prompt cache retention -> line 1050 pre-warmed sessions
  • Sora 20s generations -> line 725 png/jpeg filter
  • Sora 1080p sora-2-pro -> line 741 resampleHeightWidthMax
  • Sora character references -> line 722 resized Set
  • Sora video extensions -> line 754 timeout debounce
  • Sora Batch API -> line 757 watcher log
  • Assistants sunset -> line 1102 user MCP merge
  • Responses API (feature parity target) -> line 1056 macos-use mount
  • computer-use-preview (tiers 3-5) -> main.swift line 731 buildCompactSummary
  • computer-use-preview vs AX tree -> main.swift line 761 grep hint

The framing every other April 2026 OpenAI changelog page skips

Open any openai-api-changelog-april-2026-release-notes SERP page and the table is the same: GPT-5.1 with the new none reasoning default, GPT-5.1-Codex and GPT-5.1-Codex-mini in the Responses API, GPT-5.4 mini and nano in Chat Completions and Responses, RBAC, 24-hour prompt cache retention, the Sora API expansion to 20 seconds and 1080p and reusable character references, the Assistants API sunset plan, and computer-use-preview still in research preview for tiers 3-5. Useful if you are tracking what OpenAI shipped. Not useful if you are running a Mac agent that needs to call any of it tomorrow.

The gap the roundups leave is the wire-level cost of those features on the consumer side. Specifically: modern model APIs, OpenAI and Anthropic both, enforce a dimension ceiling on attached images. A default full-screen Playwright screenshot on a Retina MacBook reports roughly 3024 by 1964 pixels, which tips past 2000px on the width axis, and the Responses API returns a 400 on the attached turn. An agent does not recover cleanly from that; it surfaces as a generic error and the user restarts the session.

The April 2026 answer, for a Mac agent running on top of Fazm, is that the ceiling is enforced two turns upstream of the model call, and the dominant path avoids it entirely. The next three sections are the anchor facts.

Anchor fact 1: line 713 of acp-bridge pins the ceiling to 1920

Inside /Users/matthewdi/fazm/acp-bridge/src/index.ts, line 713 reads, verbatim: const MAX_SCREENSHOT_DIM = 1920; // stay under 2000px API limit. Three lines above, the prose comment names the failure mode: Playwright on Retina Macs produces screenshots over 2000px which hit Claude's multi-image dimension limit. The same ceiling applies to OpenAI's Responses API when images are attached to a computer-use-preview or any vision-tool turn. The 44-line function below the constant runs unconditionally on bridge startup and watches a single directory on disk.

acp-bridge/src/index.ts (lines 709 to 758)

The four numbers that hold the page together

All four come from the two files named in the first section. No benchmark, no vendor survey, no self-report. Grep-verifiable at a specific file and line.

  • 1920px: MAX_SCREENSHOT_DIM (line 713)
  • 2000px: OpenAI / Claude image ceiling
  • 20: MAX_IMAGE_TURNS (line 793)
  • 2772: acp-bridge/src/index.ts total lines

What happens on a default Mac agent setup vs. behind the acp-bridge watcher

Full-screen Playwright screenshot lands in /tmp/playwright-mcp at 3024 by 1964 pixels. Attached to the next Responses API turn. Model responds 400 because the image exceeds the 2000px dimension cap. Agent surface error is generic. User retries, same result. The turn consumed budget, returned nothing.

  • 3024x1964 PNG written by Playwright
  • Attached as base64 to the Responses API turn
  • Silent 400 from the image dimension check
  • Agent error surfaces as a generic tool failure
  • No recovery path visible to the user

Anchor fact 2: line 793 caps a session at 20 image turns

The 2000-pixel ceiling applies to each image individually. Modern model APIs additionally enforce per-session constraints on how many images a single session has seen. The Fazm bridge tracks this explicitly so it can stop attaching new screenshots before the API rejects the session outright.

acp-bridge/src/index.ts (lines 786 to 793)

The imageTurnCounts Map is keyed by session key (main, floating, observer, or a user-defined key) and is mutated at five specific line numbers in the bridge (1455, 1795, 1887, 1943, 1971) that correspond to session lifecycle events. When a session is deleted or recreated, its counter resets. When the counter would exceed MAX_IMAGE_TURNS, the bridge stops attaching screenshots to outbound prompts for that session.
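The counter described above can be sketched in a few lines. This is a minimal illustration, not the bridge's code: MAX_IMAGE_TURNS and imageTurnCounts are the names quoted from the article, while tryAttachImage and resetImageTurns are hypothetical helpers invented here.

```typescript
// Value and Map name are quoted from the article; the helper functions are hypothetical.
const MAX_IMAGE_TURNS = 20;

// Number of image-bearing turns seen so far, keyed by session key
// ("main", "floating", "observer", or a user-defined key).
const imageTurnCounts = new Map<string, number>();

/**
 * Returns true and records the turn if a screenshot may still be attached
 * for this session; returns false once the cap has been reached.
 */
function tryAttachImage(sessionKey: string): boolean {
  const used = imageTurnCounts.get(sessionKey) ?? 0;
  if (used >= MAX_IMAGE_TURNS) return false;
  imageTurnCounts.set(sessionKey, used + 1);
  return true;
}

/** Called when a session is deleted or recreated so the next one starts clean. */
function resetImageTurns(sessionKey: string): void {
  imageTurnCounts.delete(sessionKey);
}
```

Keeping the count per session key means a busy main session cannot exhaust the budget of the floating or observer sessions.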

Anchor fact 3: line 1033 passes four flags to Playwright MCP

Before the file-watcher can rescue a PNG, the PNG has to land on disk at all. Playwright MCP by default emits base64-encoded images inline in the tool response. Those blobs are often larger than the eventual model context for the whole turn. The bridge passes flags that redirect screenshot output to /tmp/playwright-mcp and strip base64 from the response envelope.

acp-bridge/src/index.ts (lines 1027 to 1054)

With these flags, a typical Playwright screenshot tool response on this setup is around 691 characters of YAML: the snapshot filename, the URL, the title, the outline of actionable elements. An OpenAI Responses API turn that receives this as tool output does not pay the image-dimension cost at all. The model only sees a PNG if the agent decides to Read the file by path.
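As a rough sketch of how such a launch entry could be assembled: only the three flag and value pairs below come from the article. The McpServerSpec shape, the npx command, and the @playwright/mcp package name are assumptions made for illustration.

```typescript
// Hypothetical shape for an MCP server launch entry (not taken from the bridge).
interface McpServerSpec {
  command: string;
  args: string[];
}

const PLAYWRIGHT_OUTPUT_DIR = "/tmp/playwright-mcp";

function playwrightServerSpec(outputDir: string = PLAYWRIGHT_OUTPUT_DIR): McpServerSpec {
  return {
    command: "npx", // assumption: launched through npx
    args: [
      "@playwright/mcp", // assumption: package name
      "--output-mode", "file",     // write snapshots to disk, not into the response
      "--image-responses", "omit", // drop base64 image payloads from tool results
      "--output-dir", outputDir,   // single directory for the resize watcher to watch
    ],
  };
}
```

Passing the directory as a parameter keeps the watcher and the Playwright process pointed at the same path.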

How the April 2026 OpenAI features actually reach a Mac agent

Five concrete April 2026 changelog entries. One Fazm bridge process. Four specific defenses the bridge applies before the model ever sees the turn.

April 2026 OpenAI API feature -> acp-bridge -> model-safe input

Features in:
  • GPT-5.1 Responses API
  • GPT-5.4 mini / nano
  • computer-use-preview
  • 24h prompt cache
  • Sora 1080p 20s

Bridge: acp-bridge

Defenses out:
  • sips resample 1920
  • image-responses omit
  • MAX_IMAGE_TURNS = 20
  • macos-use AX + grep

Lifecycle of a Retina screenshot between Playwright and the Responses API

1. Agent calls browser_take_screenshot on the Playwright MCP server

Playwright captures the full window. On a 14-inch MacBook Pro at 2x scale factor, the PNG reports around 3024 by 1964 pixels.

2. Playwright writes to /tmp/playwright-mcp instead of base64

The bridge-launched Playwright process has --output-mode file --image-responses omit --output-dir /tmp/playwright-mcp (line 1033), so the tool response is a small YAML envelope, not a 500KB inline blob.

3. fs.watch fires inside startScreenshotResizeWatcher

Line 724 watches the directory. The handler filters to .png and .jpeg (line 725) and debounces for 200ms (line 754) to let the file finish writing.

4. sips -g pixelWidth -g pixelHeight reads the dimensions

Line 734 runs sips with the -g flag twice. The output is parsed with two regexes (pixelWidth and pixelHeight). If either is missing, the handler returns.

5. If either dimension exceeds MAX_SCREENSHOT_DIM, resample in place

Line 741 runs sips --resampleHeightWidthMax 1920 on the file. The resample preserves aspect ratio and writes in place. Line 742 calls logErr to record the change as from <W>x<H> to fit 1920px.

6. Resized set records the filepath to skip future fs.watch events

A rename or size-change event after resize would otherwise re-fire. Line 744 adds the path to resized. Line 746 caps the set at 100 entries and evicts the oldest.

7. Agent attaches the image (or reads the file path) for the model turn

If the agent chooses to include the image inline, it is now at most 1920px on both axes and below the 2000px API ceiling. If it chose the accessibility-tree path via macos-use, no image is attached at all.

8. Model sees a turn that is cache-eligible and under every API limit

OpenAI's 24-hour prompt cache keeps stable prefix turns cache-hot. Claude's multi-image dimension check passes. MAX_IMAGE_TURNS = 20 (line 793) prevents the session from running past the per-session image budget.
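Steps 3 through 6 of the lifecycle reduce to a handful of small decisions, sketched here as pure functions. This is an illustration under stated assumptions, not the bridge's implementation: the function names are hypothetical, and the fs.watch wiring, the 200ms debounce, and the sips process calls are left out so the logic can be read on its own.

```typescript
const MAX_SCREENSHOT_DIM = 1920; // value quoted from the article
const MAX_TRACKED_PATHS = 100;   // cap on remembered paths, per step 6

interface Dimensions {
  width: number;
  height: number;
}

/** Step 3: only .png and .jpeg files are considered. */
function isWatchedImage(filename: string): boolean {
  return /\.(png|jpeg)$/.test(filename);
}

/** Step 4: parse the text printed by `sips -g pixelWidth -g pixelHeight <file>`. */
function parseSipsDimensions(output: string): Dimensions | null {
  const w = output.match(/pixelWidth:\s*(\d+)/);
  const h = output.match(/pixelHeight:\s*(\d+)/);
  if (!w || !h) return null; // either value missing: the handler gives up
  return { width: Number(w[1]), height: Number(h[1]) };
}

/** Step 5: resample only when either axis exceeds the limit. */
function needsResize(d: Dimensions, max: number = MAX_SCREENSHOT_DIM): boolean {
  return d.width > max || d.height > max;
}

/** Step 6: remember a resized path and evict the oldest entry past the cap. */
function rememberResized(
  resized: Set<string>,
  filepath: string,
  cap: number = MAX_TRACKED_PATHS,
): void {
  resized.add(filepath);
  if (resized.size > cap) {
    // A Set iterates in insertion order, so the first value is the oldest.
    const oldest = resized.values().next().value;
    if (oldest !== undefined) resized.delete(oldest);
  }
}
```

Bounding the set matters because the watcher runs for the life of the bridge process; without eviction, every screenshot path ever resized would stay in memory.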

agent asks a Responses API question about the UI -> AX path wins

Participants: User, Swift App, acp-bridge, macos-use, Model API

Messages, in order:
  1. prompt: 'click send on the draft'
  2. session/prompt
  3. forward prompt (no screenshot)
  4. tool_use: macos-use refresh_traversal
  5. stdio: refresh_traversal()
  6. status/pid/app/filepath + grep hint (~300 bytes)
  7. tool_result (small text, no image)
  8. tool_use: Bash grep -n 'Send' file
  9. resolve element coords from file line
  10. x:812 y:622 w:80 h:32
  11. click_and_traverse(x,y,w,h)
  12. summary + screenshot path
  13. result: clicked Send
  14. toast: Done

Anchor fact 4: main.swift line 761 appends a literal grep hint

The bridge-side image defenses matter most when screenshots are unavoidable. For the majority of Mac UI targeting, screenshots are avoidable entirely. The bundled macos-use binary, mounted at acp-bridge/src/index.ts lines 1056 to 1064, returns a compact summary over stdio. Line 761 of /Users/matthewdi/mcp-server-macos-use/Sources/MCPServer/main.swift is an explicit invitation to the agent to grep instead of reading the tree into prompt context.

mcp-server-macos-use/Sources/MCPServer/main.swift (lines 755 to 765)
acp-bridge/src/index.ts (lines 1056 to 1064)
~300 B

lines.append("hint: grep -n 'AXButton' \(filepath) # search by role or text")

mcp-server-macos-use/Sources/MCPServer/main.swift line 761
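What the agent does with that hint can be sketched as a grep -n over the on-disk dump followed by a coordinate parse. The x:/y:/w:/h: fields mirror the values shown in the sequence above, but the overall line layout of the dump file is an assumption made for this example, as are the grepDump name and the sample text.

```typescript
interface Frame { x: number; y: number; w: number; h: number; }

interface GrepMatch {
  lineNumber: number;  // 1-based, like `grep -n`
  text: string;
  frame: Frame | null; // null when the line carries no coordinates
}

/** Finds lines containing `needle` and parses an "x:.. y:.. w:.. h:.." frame if present. */
function grepDump(dump: string, needle: string): GrepMatch[] {
  const matches: GrepMatch[] = [];
  dump.split("\n").forEach((text, index) => {
    if (!text.includes(needle)) return;
    const m = text.match(/x:(-?\d+)\s+y:(-?\d+)\s+w:(\d+)\s+h:(\d+)/);
    const frame = m
      ? { x: Number(m[1]), y: Number(m[2]), w: Number(m[3]), h: Number(m[4]) }
      : null;
    matches.push({ lineNumber: index + 1, text, frame });
  });
  return matches;
}

// Assumed dump layout, for illustration only.
const sampleDump = [
  '[AXWindow] "Draft" x:0 y:0 w:1512 h:982',
  '[AXTextArea] "Message body"',
  '[AXButton] "Send" x:812 y:622 w:80 h:32',
].join("\n");

const sendMatches = grepDump(sampleDump, "Send");
```

Only the matching line reaches the model, so the cost of a lookup stays roughly constant however large the tree on disk grows.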

Every April 2026 OpenAI feature, re-framed from the Mac agent seat

Six cards, six April 2026 changelog entries. Each card names the exact bridge protection that makes the feature usable from a Mac agent that defaults to Retina screenshots.

GPT-5.1, default none reasoning

Faster responses when less thinking is required. Cache-hottest on stable prefix turns. Fazm pre-warms three sessions (main/floating/observer) at ChatProvider.swift line 1050 so prefix reuse is automatic.

GPT-5.1-Codex in the Responses API

Tuned for agentic coding. If the agent takes a screenshot of an IDE for context, acp-bridge line 741 sips-resizes it before the Responses API sees it. Most IDE targeting goes through macos-use, not vision.

GPT-5.4 mini and nano (Chat + Responses)

High-volume. acp-bridge line 793 MAX_IMAGE_TURNS = 20 caps per-session image payload so high-volume runs never accumulate past the API's image-session budget.

24-hour prompt cache retention

Stable prefixes stay cache-hot for a day. Fazm's long-lived session map at line 763 reuses a sessionId across user turns instead of reopening, which keeps the cache key identical.

Sora API 1080p + 20s + character references

Higher-resolution outputs; Sora inputs can attach reference PNGs. startScreenshotResizeWatcher at line 715 fires on any .png or .jpeg in /tmp/playwright-mcp, not only screenshots, so a 2048px reference PNG is resampled before upload.

computer-use-preview (Responses API, tiers 3-5)

A screenshot-in-coordinates-out agent loop. On Fazm, the dominant UI-targeting path is macos-use plus grep, which returns element coordinates straight from the AX tree without a screenshot. The vision tool becomes an optional sanity check.

Grep the repo to verify every number

OpenAI API April 2026 consumer wiring: default agent vs. Fazm bridge

Feature | Default Mac agent | Fazm bridge path
Handles Retina Playwright screenshots over 2000px before the model turn | No, 400 surfaces as a generic tool failure | Yes, sips --resampleHeightWidthMax 1920 at line 741
Keeps base64 image blobs out of tool responses | No, inline blob per tool call | Yes, --image-responses omit at line 1033
Caps per-session image payload for Responses API | No, session degrades after API-side image budget | Yes, MAX_IMAGE_TURNS = 20 at line 793
UI targeting without a screenshot at all | No, every turn consumes a PNG | Yes, macos-use AX tree + grep hint at main.swift line 761
24-hour prompt cache locality across user turns | Partial, cache locality broken by per-turn screenshot delta | Full, pre-warmed session map at acp-bridge line 763 keeps sessionId stable
Zero-dependency fix for oversize PNGs | No, requires Pillow or sharp in the Node graph | Yes, sips is built into macOS, noted at line 733 inline comment
Works on macOS 14+ without code signing changes | Varies | Yes, bridge + macos-use ship inside the signed DMG
Still works if OpenAI promotes computer-use-preview to GA | Yes, but the 2000px ceiling and per-session caps still apply | Yes, all protections are model-agnostic and run upstream
File-level verifiability for every claim on this page | N/A | Yes, every anchor names an exact file:line in two MIT repos

Independently grep-verifiable claims

  • acp-bridge/src/index.ts line 713: const MAX_SCREENSHOT_DIM = 1920; // stay under 2000px API limit
  • acp-bridge/src/index.ts line 710: 'Playwright on Retina Macs produces screenshots >2000px which hit Claude's multi-image dimension limit'
  • acp-bridge/src/index.ts line 741: execSync(`sips --resampleHeightWidthMax ${MAX_SCREENSHOT_DIM} "${filepath}" 2>/dev/null`)
  • acp-bridge/src/index.ts line 793: const MAX_IMAGE_TURNS = 20;
  • acp-bridge/src/index.ts line 1033: Playwright args --output-mode file --image-responses omit --output-dir /tmp/playwright-mcp
  • acp-bridge/src/index.ts lines 1056 to 1064: macos-use MCP server mount
  • acp-bridge/src/index.ts line 2574: startScreenshotResizeWatcher() called on bridge startup
  • mcp-server-macos-use/Sources/MCPServer/main.swift line 731: func buildCompactSummary(...)
  • mcp-server-macos-use/Sources/MCPServer/main.swift line 761: lines.append("hint: grep -n 'AXButton' \(filepath) # search by role or text")
  • wc -l acp-bridge/src/index.ts returns 2772; wc -l main.swift returns 1917

Wire your OpenAI-powered Mac agent in under 2000 pixels

Twenty minutes with the team walking through the bridge, the macos-use path, and how to drop in your own OpenAI-backed MCP server without losing any of the ceilings above.


Frequently asked questions

What did OpenAI ship in the April 2026 API changelog?

The April 2026 OpenAI API changelog lists: GPT-5.1 as the new flagship with a none reasoning setting by default for faster responses; GPT-5.1-Codex and GPT-5.1-Codex-mini in the Responses API, tuned for agentic coding; GPT-5.4 mini and GPT-5.4 nano in Chat Completions and Responses for high-volume workloads; organization-level RBAC across API and Dashboard; extended prompt cache retention up to 24 hours; Sora API expansion with reusable character references, up to 20 second generations, 1080p on sora-2-pro, video extensions, and Batch API support; the Assistants API sunset path, with all its features migrating to Responses before retirement; and computer-use-preview, a specialized model for the computer use tool, available as a research preview in the Responses API for developers on usage tiers 3 to 5.

What is the anchor fact in the Fazm codebase for the OpenAI April 2026 changelog?

Line 713 of /Users/matthewdi/fazm/acp-bridge/src/index.ts reads const MAX_SCREENSHOT_DIM = 1920; // stay under 2000px API limit, the only constant in the bridge file whose trailing comment pins it to a model-API dimension ceiling. Three lines above, at line 710, the prose comment reads verbatim Playwright on Retina Macs produces screenshots >2000px which hit Claude's multi-image dimension limit. The same 2000px ceiling applies to OpenAI's Responses API computer-use-preview tool when screenshots are attached. The fallback is a 44-line function startScreenshotResizeWatcher at lines 715 to 758 that fs.watches /tmp/playwright-mcp/, runs sips -g pixelWidth -g pixelHeight to check any new PNG or JPEG, and shell-execs sips --resampleHeightWidthMax 1920 on files that exceed 1920px in either dimension. The bridge binary runs this watcher unconditionally on startup at line 2574.

Why does the OpenAI Responses API care about a 2000 pixel dimension ceiling on images?

OpenAI's vision-capable models have platform-wide limits on the maximum dimension of uploaded images. Retina MacBooks render at 2x scale factor, so a full-screen Playwright screenshot on a 14-inch MacBook Pro reports pixelWidth around 3024 and pixelHeight around 1964. The width is over the 2000px ceiling; the height sits just under it. The model API responds with a 400 Bad Request that surfaces as a generic agent error, not a clean image-too-large signal. Anthropic's Claude API enforces the same 2000px dimension limit on multi-image sessions, which is why Fazm's inline comment at acp-bridge/src/index.ts line 710 names Claude specifically; the file-watch runs for any downstream model the agent talks to. MAX_SCREENSHOT_DIM is set to 1920, a 4 percent margin of safety under 2000 that avoids rounding-error rejections.
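The arithmetic behind those numbers is easy to check. The sketch below models a fit-within-a-maximum resize that preserves aspect ratio; fitWithin is a hypothetical helper, and the exact rounding sips applies may differ by a pixel from Math.round.

```typescript
const API_CEILING = 2000;        // dimension ceiling named in the article
const MAX_SCREENSHOT_DIM = 1920; // bridge constant quoted from the article

interface Size { width: number; height: number; }

/** Scales a size down so its longer side equals `max`; smaller images are untouched. */
function fitWithin(size: Size, max: number = MAX_SCREENSHOT_DIM): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= max) return size;
  const scale = max / longest;
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}

// Headroom between the bridge constant and the API ceiling: (2000 - 1920) / 2000 = 0.04
const safetyMargin = (API_CEILING - MAX_SCREENSHOT_DIM) / API_CEILING;

// 3024 x 1964 scales by 1920 / 3024, giving 1920 x 1247.
const resizedRetina = fitWithin({ width: 3024, height: 1964 });
```

Only the longer axis is pinned to 1920; the shorter axis follows from the aspect ratio, which is why a single maximum is enough to keep both dimensions under the ceiling.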

How does Fazm avoid sending screenshots to OpenAI or Claude at all for UI targeting?

It bundles a second MCP server named macos-use, wired into the agent at acp-bridge/src/index.ts lines 1056 to 1064, whose source lives at /Users/matthewdi/mcp-server-macos-use/Sources/MCPServer/main.swift. The 1,917-line Swift binary reads the macOS accessibility tree (AXUIElement) of the target app and returns a compact summary over stdio. The function buildCompactSummary, declared at main.swift line 731, writes the full enriched tree to /tmp/macos-use/<timestamp>_<tool>.txt and returns only status, pid, app, filepath, file size plus element count, and on line 761 a verbatim literal line lines.append("hint: grep -n 'AXButton' \(filepath) # search by role or text"). The agent then calls a generic grep tool on the file to find the element it needs, by role or label. No screenshot is required for targeting. The screenshot only exists as a sanity check after the click.

What is the 20-image-per-session cap referenced on the OpenAI changelog page?

acp-bridge/src/index.ts line 793 declares const MAX_IMAGE_TURNS = 20;. The preceding doc block at lines 786 to 790 explains the constant: Tracks how many image-bearing turns each session key has had. Claude's API enforces a stricter 2000px/image limit once a session has many images. Resetting this counter on session delete ensures a fresh session starts clean. OpenAI's Responses API similarly imposes usage-based per-request and per-session image payload constraints. The bridge keeps a Map<string, number> called imageTurnCounts keyed by session key and deletes the entry on session lifecycle events (seen at lines 1455, 1795, 1887, 1943, 1971). Once the counter crosses MAX_IMAGE_TURNS, the bridge stops including screenshots in the prompt for that session to prevent API-side rejections.

Which Playwright MCP launch flags keep screenshot blobs out of the model context?

acp-bridge/src/index.ts line 1033 passes its output flags verbatim to the Playwright MCP server at launch: --output-mode file, --image-responses omit, --output-dir /tmp/playwright-mcp. output-mode=file writes Playwright snapshots and screenshots to disk rather than embedding them in the MCP response. image-responses=omit drops any base64 image payload from the tool response envelope. output-dir pins the on-disk location so the screenshot-resize watcher has one directory to watch. The inline comment at line 1032 reads Save snapshots to files and strip inline base64 screenshots to reduce context size. These flags mean a Playwright screenshot does not reach the OpenAI or Claude model prompt unless the agent explicitly chooses to Read the PNG file. A typical screenshot tool result on this setup is 691 characters of YAML, not a 500KB base64 blob.

Does Fazm use OpenAI computer-use-preview today?

Not by default. The bundled main, floating, and observer sessions are all warmed up with Claude Sonnet 4.6 via ACP, seen at Desktop/Sources/Providers/ChatProvider.swift lines 1048 to 1050. The architecture does not require computer-use-preview: the accessibility tree plus grep hint plus the six macos-use tools (open, click, type, press, scroll, refresh) cover what computer-use-preview covers with screenshots. If a user wires an OpenAI-backed agent into their own MCP server via ~/.fazm/mcp-servers.json (merged in at acp-bridge/src/index.ts lines 1102 to 1137), every protection described above still applies because the 2000px ceiling and the 20-image cap are enforced on the bridge side, upstream of whichever model the agent talks to.

What exact shell command does Fazm run to fix an oversize Retina screenshot in place?

acp-bridge/src/index.ts line 741 calls execSync(`sips --resampleHeightWidthMax ${MAX_SCREENSHOT_DIM} "${filepath}" 2>/dev/null`). sips is Apple's built-in Scriptable Image Processing System, shipped with every macOS since 10.3, which is why the comment at line 733 reads sips is built into macOS — no dependencies needed. The --resampleHeightWidthMax flag resamples the image so the larger of its width and height becomes 1920 while preserving aspect ratio. It writes in place, no output path argument. The logErr call at line 742 records Screenshot resized: <filename> from <W>x<H> to fit 1920px. After this runs, the PNG or JPEG under /tmp/playwright-mcp is safe to attach to any model API request.
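One design variation worth noting: the quoted line builds a shell string, which relies on the double quotes around the path. A possible alternative, sketched here and not the bridge's approach, is to build argument arrays instead. The helper names are hypothetical; only the sips flags come from the article.

```typescript
const MAX_SCREENSHOT_DIM = 1920; // value quoted from the article

/** argv for reading dimensions: sips -g pixelWidth -g pixelHeight <file> */
function sipsQueryArgs(filepath: string): string[] {
  return ["-g", "pixelWidth", "-g", "pixelHeight", filepath];
}

/** argv for the in-place resample: sips --resampleHeightWidthMax <max> <file> */
function sipsResampleArgs(filepath: string, max: number = MAX_SCREENSHOT_DIM): string[] {
  return ["--resampleHeightWidthMax", String(max), filepath];
}
```

Arrays like these could be handed to Node's child_process.execFileSync("sips", args), which does not spawn a shell, so a filename containing spaces or quote characters stays a single argument.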

How does OpenAI's Responses API 24-hour prompt cache interact with Fazm's session model?

OpenAI's extended prompt cache retention keeps cached prefixes active for up to 24 hours, so long-lived agent sessions reuse a stable prefix instead of re-billing it. Fazm's bridge registers three pre-warmed sessions on startup (main, floating, observer) and keeps them alive across user prompts rather than opening a new session per turn, visible at ChatProvider.swift line 1050 and at acp-bridge/src/index.ts line 763, where the session map is declared as const sessions = new Map<string, { sessionId; cwd; model }>(). When screenshots are not attached (because macos-use handled the UI target via AX tree), the prompt prefix stays stable across turns and is cache-eligible for the full 24 hours. An agent that takes a screenshot per turn invalidates cache locality more often; Fazm's default path preserves it.
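The reuse pattern amounts to get-or-create on a Map. In this minimal sketch the field names come from the quoted declaration, while the string field types, the getOrCreateSession helper, and its factory parameter are assumptions for illustration.

```typescript
// Field names from the article; the string types are an assumption.
interface SessionInfo {
  sessionId: string;
  cwd: string;
  model: string;
}

const sessions = new Map<string, SessionInfo>();

/**
 * Returns the existing session for a key, or creates and stores one.
 * Reusing the same sessionId across turns is what keeps the prompt prefix stable.
 */
function getOrCreateSession(key: string, create: () => SessionInfo): SessionInfo {
  const existing = sessions.get(key);
  if (existing) return existing;
  const created = create();
  sessions.set(key, created);
  return created;
}
```

The factory only runs on a miss, so a second prompt on the same key pays no session-creation cost and sees the same sessionId.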

Why does the macos-use response always append a grep hint instead of returning the full tree?

A full AX tree of a dense macOS app (Notion, Slack, Figma) is tens of kilobytes. Attaching it to a Responses API prompt burns that much context every turn and accumulates across a session. main.swift line 731 defines buildCompactSummary, which returns a small text envelope (status, pid, app, filepath, file size plus element count, the grep hint, and optional screenshot path). The grep hint at line 761 is a literal invitation to the agent: hint: grep -n 'AXButton' /tmp/macos-use/<timestamp>_<tool>.txt # search by role or text. The agent uses a separate grep tool on the on-disk file to find the one element it needs, by role or label, instead of reading the tree. The on-the-wire response is typically under 500 bytes regardless of how deep the target app's UI is.

Where do I verify every number on this page?

Every anchor is in a public or user-local file with a specific line number. acp-bridge/src/index.ts line 713 (MAX_SCREENSHOT_DIM = 1920), line 710 (Playwright Retina comment), lines 715 to 758 (startScreenshotResizeWatcher), line 741 (sips --resampleHeightWidthMax), line 793 (MAX_IMAGE_TURNS = 20), line 1033 (Playwright MCP flags), lines 1056 to 1064 (macos-use mount), line 2574 (watcher start). mcp-server-macos-use/Sources/MCPServer/main.swift line 731 (buildCompactSummary), line 761 (grep hint append), line 1412 (Server name SwiftMacOSServerDirect). wc -l on the bridge returns 2772 and on the macos-use main.swift returns 1917. The macos-use repo is MIT-licensed at github.com/mediar-ai/mcp-server-macos-use; the Fazm repo is MIT-licensed at github.com/mediar-ai/fazm.

Does Fazm need any changes when OpenAI lifts computer-use-preview out of research preview?

No. The bridge does not depend on computer-use-preview being available. Its two protections against the 2000px image ceiling and the 20-image-per-session cap run unconditionally (startScreenshotResizeWatcher at line 2574, imageTurnCounts at line 791). Its alternative path, macos-use over stdio, is present whether or not a user wires an OpenAI-backed MCP server. When OpenAI promotes computer-use-preview to general availability, the bridge keeps stripping base64 images (line 1033), keeps resizing oversize PNGs (line 741), keeps capping image turns (line 793), and keeps offering the accessibility-tree-plus-grep-hint path as a screenshot-free alternative. No version bump on the bridge or the macos-use binary is required. The version string at main.swift line 1412 is SwiftMacOSServerDirect 1.6.0 and is expected to stay through the computer-use-preview promotion.

fazm. AI Computer Agent for macOS
© 2026 fazm. All rights reserved.
