Accessibility Tree vs DOM - Which Approach Works Better for Browser Agents?
When you build a browser agent, the first design decision is how the agent perceives the page. There are two main options: parse the raw DOM or read the accessibility tree. The choice determines how much input the model has to sift through and how reliably it picks the right element.
The DOM Approach
The DOM gives you everything: every div, every span, every data attribute. You get the complete HTML structure as the browser has parsed it.
The problem is that most of that information is noise. A modern web page can have thousands of DOM nodes, and most are layout containers, styling wrappers, or framework artifacts. The actual interactive elements (buttons, links, form fields) are buried among them. Your agent has to figure out which elements matter and what they do.
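To make the signal-to-noise point concrete, here is a minimal sketch that counts how many elements in a small, made-up HTML snippet are interactive. The snippet, the `NodeCounter` class, and the `INTERACTIVE_TAGS` set are assumptions for illustration, and the tag-based check is deliberately naive: it ignores ARIA roles and script-attached handlers.

```python
from html.parser import HTMLParser

# Tags treated as interactive for this illustration only. Real pages also
# rely on ARIA roles and script-attached handlers, which this ignores.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class NodeCounter(HTMLParser):
    """Counts every start tag and, separately, the interactive ones."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.interactive = 0

    def handle_starttag(self, tag, attrs):
        self.total += 1
        if tag in INTERACTIVE_TAGS:
            self.interactive += 1


# Hypothetical markup: one button wrapped in layout and styling containers.
SAMPLE_HTML = """
<div class="app-root">
  <div class="layout"><div class="row"><div class="col">
    <span class="icon-wrapper"><span class="icon"></span></span>
    <div class="card"><div class="card-body">
      <button>Save</button>
    </div></div>
  </div></div></div>
</div>
"""


def count_nodes(html):
    """Return (total elements, interactive elements) for an HTML string."""
    counter = NodeCounter()
    counter.feed(html)
    return counter.total, counter.interactive


total, interactive = count_nodes(SAMPLE_HTML)
print(f"{interactive} interactive element(s) out of {total}")
# prints: 1 interactive element(s) out of 9
```

Even in this toy case, eight of the nine elements exist only for layout or styling. On a real page the ratio is typically far less favorable.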
The Accessibility Tree Approach
The accessibility tree is what screen readers use. It contains only the semantically meaningful elements: buttons with their labels, links with their text, form fields with their purposes, headings with their hierarchy.
Instead of parsing <div class="btn-primary-lg" role="button" aria-label="Submit form">, your agent just sees a button labeled "Submit form." The semantic meaning is already extracted.
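As a rough picture of what the agent receives, the sketch below models an accessibility node as a role, a name, and a list of children, then prints the tree as an indented outline. `AXNode` and `render` are hypothetical names, and the page is hand-built for illustration; real accessibility trees carry more properties (states, values, descriptions) than this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AXNode:
    """Hypothetical, simplified accessibility node: role, name, children."""
    role: str
    name: str = ""
    children: List["AXNode"] = field(default_factory=list)


def render(node, depth=0):
    """Return the subtree as a list of indented 'role "name"' lines."""
    label = f'{node.role} "{node.name}"' if node.name else node.role
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hand-built example page; not extracted from a real browser.
page = AXNode("main", children=[
    AXNode("heading", "Contact us"),
    AXNode("textbox", "Email address"),
    AXNode("button", "Submit form"),
])

print("\n".join(render(page)))
# prints:
# main
#   heading "Contact us"
#   textbox "Email address"
#   button "Submit form"
```

An outline like this is what gets handed to the model: no class names, no wrappers, only roles and names.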
Why Semantics Win
Browser agents need to understand what things do, not how they are built. When you tell an agent to "click the submit button," it needs to find an element that acts as a submit button. The accessibility tree labels it as such.
The DOM approach requires the agent to infer semantics from structure. Maybe the submit button is a <button>, maybe it is a <div> with an onClick handler, maybe it is an <a> tag styled to look like a button. As long as the element exposes a button role, natively or through ARIA, the accessibility tree reports every one of these variants the same way: a button with a name.
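The normalization can be sketched with a toy role resolver. This is not how a browser computes its accessibility tree; it is a simplified approximation that assumes an explicit `role` attribute overrides the tag's implicit role, and that the name comes from `aria-label` or else the text content. `describe`, `RoleNameExtractor`, and `IMPLICIT_ROLES` are hypothetical names.

```python
from html.parser import HTMLParser

# Tiny subset of implicit tag-to-role mappings, for illustration only.
IMPLICIT_ROLES = {"button": "button", "a": "link"}


class RoleNameExtractor(HTMLParser):
    """Collects the role, aria-label, and text of the first element seen."""

    def __init__(self):
        super().__init__()
        self.role = None
        self.label = None
        self.text = []
        self.seen_root = False

    def handle_starttag(self, tag, attrs):
        if self.seen_root:
            return
        self.seen_root = True
        attr_map = dict(attrs)
        # An explicit ARIA role overrides the tag's implicit role.
        self.role = attr_map.get("role") or IMPLICIT_ROLES.get(tag, "generic")
        self.label = attr_map.get("aria-label")

    def handle_data(self, data):
        self.text.append(data)


def describe(html):
    """Return (role, name) for a single-element HTML snippet."""
    parser = RoleNameExtractor()
    parser.feed(html)
    parser.close()
    name = parser.label or "".join(parser.text).strip()
    return parser.role, name


# Three ways of building the same control.
VARIANTS = [
    '<button type="submit">Submit</button>',
    '<div role="button" tabindex="0">Submit</div>',
    '<a href="/submit" role="button">Submit</a>',
]

for html in VARIANTS:
    print(describe(html))  # each prints: ('button', 'Submit')

# A bare div with only a click handler exposes no button role.
print(describe('<div onclick="send()">Submit</div>'))
# prints: ('generic', 'Submit')
```

The last case shows the limit of the approach: the accessibility tree is only as good as the page's markup, so an agent may still need a DOM fallback for controls that expose no role.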
For Fazm, we use the accessibility tree as the primary perception layer. It reduces the token count sent to the LLM, improves action accuracy, and handles modern web frameworks that generate complex DOM structures from simple UI components.
Fazm is an open source macOS AI agent, available on GitHub.