The Sanitization Tax
The accessibility tree is how desktop agents see your screen: every button, text field, menu item, and label is exposed as structured data. The problem is that raw accessibility data is messy. It is full of redundant labels, invisible elements, container nodes that carry no useful information, and ARIA attributes that say nothing.
Clean It or Ship It
You have two options. Sanitize the tree before sending it to the LLM: strip empty nodes, merge redundant labels, and filter invisible elements. Or send the raw tree and let the model figure it out.
Sanitization produces cleaner input. The model sees only relevant elements. Responses are more accurate because the signal-to-noise ratio is higher. But sanitization costs compute time and risks removing elements that looked useless but were actually important.
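To make "merge redundant labels" concrete, here is a minimal sketch in Python. It assumes nodes are plain dicts and that label text may be repeated across `title`, `description`, and `help` fields. Those field names, and the `merge_labels` function itself, are hypothetical and chosen for illustration; real accessibility APIs name their attributes differently.

```python
from typing import Any

# Hypothetical field names that may carry overlapping label text.
LABEL_FIELDS = ("title", "description", "help")


def merge_labels(node: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the node with duplicate label strings collapsed."""
    seen: set[str] = set()
    unique: list[str] = []
    for field in LABEL_FIELDS:
        text = (node.get(field) or "").strip()
        key = text.casefold()  # compare case-insensitively
        if text and key not in seen:
            seen.add(key)
            unique.append(text)

    # Copy every other attribute untouched; only the label fields are merged.
    merged = {
        k: v for k, v in node.items()
        if k not in LABEL_FIELDS and k != "children"
    }
    if unique:
        merged["label"] = " | ".join(unique)

    children = [merge_labels(child) for child in node.get("children", [])]
    if children:
        merged["children"] = children
    return merged


button = {
    "role": "button",
    "title": "Save",
    "description": "save",
    "help": "Saves the file",
}
print(merge_labels(button))
```

Note that the distinct help text survives. Only exact repeats (ignoring case and surrounding whitespace) are dropped, which keeps the merge on the safe side of the "looked useless but was actually important" risk.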
The Token Overhead Problem
Raw accessibility trees are huge. A moderately complex application window produces thousands of tokens. A full desktop with multiple windows can easily hit tens of thousands. At current token prices and context window limits, this matters.
Sanitization reduces token count, often by 40 to 60 percent. That means more room in the context window for conversation history, more requests per dollar, and faster response times. The cost is the engineering effort to build and maintain the sanitization pipeline, plus the occasional bug where an important element gets filtered out.
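It is worth measuring the reduction on your own trees rather than trusting a rule of thumb. The sketch below estimates token counts from the serialized size, assuming roughly four characters per token. That ratio is a crude heuristic, not a property of any particular tokenizer, and the function names are made up for this example. Swap in your model's real tokenizer for accurate numbers.

```python
import json

# Assumption: about 4 characters per token. Real tokenizers vary.
CHARS_PER_TOKEN = 4


def estimate_tokens(tree) -> int:
    """Rough token estimate for a tree serialized as compact JSON."""
    serialized = json.dumps(tree, separators=(",", ":"))
    return max(1, len(serialized) // CHARS_PER_TOKEN)


def savings_percent(raw_tokens: int, clean_tokens: int) -> float:
    """Percentage of tokens removed by sanitization."""
    return 100.0 * (raw_tokens - clean_tokens) / raw_tokens


# Illustrative numbers only: a raw tree of 10,000 tokens cut to 5,000.
print(savings_percent(10_000, 5_000))
```

Logging this percentage per application window makes it easy to spot both windows where sanitization barely helps and windows where it removes suspiciously much.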
Finding the Balance
The practical approach is aggressive structural sanitization with conservative content preservation. Remove empty containers, collapse single-child wrapper nodes, filter elements with zero dimensions. But keep all text content, all interactive elements, and all elements with meaningful labels.
This gets you most of the token savings with little risk of losing actionable information. It is a tax you pay mostly up front in engineering time, and it keeps paying back in lower runtime costs.
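The rules in this section can be sketched as a single recursive pass. This is one possible reading under stated assumptions, not a definitive implementation: nodes are plain dicts, the `role`, `label`, `value`, `width`, `height`, and `children` keys are hypothetical, and so is the set of interactive roles. One design choice to flag: a zero-sized node is only removed if it has no text and is not interactive, and any surviving children are hoisted to its parent, so content is never dropped along with it.

```python
from typing import Any

# Hypothetical role names; adjust to whatever your accessibility API reports.
INTERACTIVE_ROLES = {"button", "textfield", "checkbox", "menuitem", "link"}
TEXT_FIELDS = ("label", "value")


def is_meaningful(node: dict[str, Any]) -> bool:
    """True for interactive elements and anything carrying text or a label."""
    if node.get("role") in INTERACTIVE_ROLES:
        return True
    return any((node.get(f) or "").strip() for f in TEXT_FIELDS)


def is_zero_sized(node: dict[str, Any]) -> bool:
    # Missing dimensions are treated as unknown, not as zero.
    return node.get("width") == 0 or node.get("height") == 0


def sanitize(node: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the nodes that replace `node` in its parent (possibly none)."""
    children = [
        kept
        for child in node.get("children", [])
        for kept in sanitize(child)
    ]
    structural = not is_meaningful(node)
    if structural and (is_zero_sized(node) or len(children) <= 1):
        # Empty container -> []; single-child wrapper -> [child];
        # zero-sized wrapper -> its surviving children, hoisted.
        return children

    cleaned = {k: v for k, v in node.items() if k != "children"}
    if children:
        cleaned["children"] = children
    return [cleaned]


window = {
    "role": "window", "label": "Editor", "width": 800, "height": 600,
    "children": [
        {"role": "group", "width": 800, "height": 40, "children": [
            {"role": "button", "label": "Save", "width": 60, "height": 30},
        ]},
        {"role": "group", "width": 100, "height": 100, "children": []},
        {"role": "group", "width": 0, "height": 0, "children": [
            {"role": "text", "value": "Hidden hint", "width": 0, "height": 0},
            {"role": "image", "width": 0, "height": 0},
        ]},
    ],
}
result = sanitize(window)
print([child["role"] for child in result[0]["children"]])
```

Because `sanitize` returns a list, a removed wrapper can be replaced by zero, one, or several nodes without special cases. For a meaningful root such as a labeled window, the result is a one-element list.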
Fazm is an open source macOS AI agent, available on GitHub.