Accessibility Tree vs DOM - Which Approach Works Better for Browser Agents?

Matthew Diakonov

Updated March 19, 2026


Accessibility Tree vs DOM for Browser Agents

When you build a browser agent, the first design decision is how the agent perceives the page. There are two main options: parse the raw DOM or read the accessibility tree. The choice determines how much noise the agent has to filter out and how reliably it finds the right element.

The DOM Approach

The DOM gives you everything. Every div, every span, every data attribute. You get the complete document structure as the browser has built it.

The problem is that most of that information is noise. A modern web page has thousands of DOM nodes. Most are layout containers, styling wrappers, or framework artifacts. The actual interactive elements - buttons, links, form fields - are buried in the noise. Your agent has to figure out which elements matter and what they do.
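To make the noise problem concrete, here is a minimal sketch that counts element nodes in a small HTML snippet and how many of them look interactive. The snippet, the `NodeCounter` class, and the `INTERACTIVE_TAGS` set are all made up for illustration; a real page would have far more nodes and needs a more careful definition of "interactive".

```python
from html.parser import HTMLParser

# Tags treated as interactive in this sketch (an assumption, not a complete list).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class NodeCounter(HTMLParser):
    """Counts all element nodes and the subset that looks interactive."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.interactive = 0

    def handle_starttag(self, tag, attrs):
        self.total += 1
        attr_map = dict(attrs)
        # Count native controls plus elements that declare a button role.
        if tag in INTERACTIVE_TAGS or attr_map.get("role") == "button":
            self.interactive += 1


# Hypothetical markup: mostly layout wrappers around three controls.
SAMPLE_HTML = """
<div class="app"><div class="row"><div class="col">
  <span class="icon"></span>
  <button>Save</button>
  <div class="wrapper"><a href="/help">Help</a></div>
  <div class="btn" role="button" aria-label="Submit form"></div>
</div></div></div>
"""

counter = NodeCounter()
counter.feed(SAMPLE_HTML)
print(counter.total, counter.interactive)  # 8 3
```

Even in this tiny example, most nodes are wrappers that an agent has to skip over before it reaches anything it can act on.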

The Accessibility Tree Approach

The accessibility tree is what screen readers use. It contains only the semantically meaningful elements: buttons with their labels, links with their text, form fields with their purposes, headings with their hierarchy.

Instead of parsing <div class="btn-primary-lg" role="button" aria-label="Submit form">, your agent just sees a button labeled "Submit form." The semantic meaning is already extracted.
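A rough sketch of that extraction, under heavy simplifying assumptions: the role comes from an explicit `role` attribute or a tiny hand-written tag mapping, and the name comes from `aria-label` or the element's text. Real browsers follow the much more detailed ARIA and accessible-name rules, and the `AxSketch` class and `IMPLICIT_ROLES` table here are hypothetical names, not any browser's API. The sketch also assumes well-nested markup without void elements.

```python
from html.parser import HTMLParser

# Implicit roles for two native tags (a tiny, assumed subset of the real mapping).
IMPLICIT_ROLES = {"button": "button", "a": "link"}


class AxSketch(HTMLParser):
    """Collects (role, name) pairs for elements that have a role."""

    def __init__(self):
        super().__init__()
        self.nodes = []   # finished (role, name) pairs
        self._open = []   # stack of elements still collecting text

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        role = attr_map.get("role") or IMPLICIT_ROLES.get(tag)
        self._open.append(
            {"role": role, "label": attr_map.get("aria-label"), "text": ""}
        )

    def handle_data(self, data):
        # Text contributes to every element that is currently open.
        for node in self._open:
            node["text"] += data

    def handle_endtag(self, tag):
        if not self._open:
            return
        node = self._open.pop()
        if node["role"]:
            # Prefer aria-label, then fall back to whitespace-normalized text.
            name = node["label"] or " ".join(node["text"].split())
            self.nodes.append((node["role"], name))


extractor = AxSketch()
extractor.feed(
    '<div class="card">'
    '<div class="btn-primary-lg" role="button" aria-label="Submit form"></div>'
    "<button>Cancel</button>"
    "</div>"
)
print(extractor.nodes)  # [('button', 'Submit form'), ('button', 'Cancel')]
```

The wrapper `div` has no role, so it drops out of the result entirely, while the styled `div` and the native `button` come out in the same shape.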

Why Semantics Win

Browser agents need to understand what things do, not how they are built. When you tell an agent to "click the submit button," it needs to find an element that acts as a submit button. The accessibility tree literally labels it as such.

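The DOM approach requires the agent to infer semantics from structure. Maybe the submit button is a <button>, maybe it is a <div> with an onClick handler, maybe it is an <a> tag styled to look like a button. When these elements expose a button role, either natively or through ARIA, the accessibility tree presents them all the same way: as a button with a name. The exception is a clickable <div> with no role, which shows up as a generic node, so the tree is only as good as the page's markup.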
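Once elements are represented by role and name, resolving an instruction like "click the submit button" becomes a lookup rather than an inference. A minimal sketch, assuming a hand-written flat list of nodes and a hypothetical `find_by_role_and_name` helper; a real agent would read the nodes from the browser or OS accessibility API.

```python
# Hypothetical flattened accessibility nodes for one page.
AX_NODES = [
    {"role": "heading", "name": "Contact us"},
    {"role": "textbox", "name": "Email"},
    {"role": "link", "name": "Privacy policy"},
    {"role": "button", "name": "Submit form"},
    {"role": "button", "name": "Cancel"},
]


def find_by_role_and_name(nodes, role, name_fragment):
    """Return nodes with the given role whose name contains the fragment."""
    fragment = name_fragment.lower()
    return [
        node
        for node in nodes
        if node["role"] == role and fragment in node["name"].lower()
    ]


matches = find_by_role_and_name(AX_NODES, "button", "submit")
print(matches)  # [{'role': 'button', 'name': 'Submit form'}]
```

The lookup never touches tag names or class names, which is why it keeps working whether the button was built from a <button> or from a <div> with a button role.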

For Fazm, we use the accessibility tree as the primary perception layer. It reduces the token count sent to the LLM, improves action accuracy, and handles modern web frameworks that generate complex DOM structures from simple UI components.
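The size difference is easy to see on a toy example. The sketch below compares the raw markup for one button with a one-line role-and-name rendering of it. The `serialize` function and its output format are assumptions for illustration, not Fazm's actual serialization, and character count is only a rough stand-in for LLM tokens.

```python
# Hypothetical raw markup for a single button inside two layout wrappers.
RAW_HTML = (
    '<div class="form-row"><div class="btn-wrapper">'
    '<div class="btn-primary-lg" role="button" aria-label="Submit form"></div>'
    "</div></div>"
)

# The same element as it might appear in a simplified accessibility tree.
AX_NODES = [{"role": "button", "name": "Submit form"}]


def serialize(nodes):
    """Render each node as one line: the role followed by its quoted name."""
    return "\n".join(f'{node["role"]} "{node["name"]}"' for node in nodes)


compact = serialize(AX_NODES)
print(compact)  # button "Submit form"

# Character counts as a rough proxy for prompt size.
print(len(RAW_HTML), len(compact))
```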

Fazm is an open source macOS AI agent, available on GitHub.
