How Accessibility-Based Desktop Automation Fixes Flaky Browser Tests

Matthew Diakonov

Updated March 30, 2026



If you have spent any time with browser automation, you know the pain. Tests pass locally but fail in CI. Selectors break when a developer changes a class name. Elements exist in the DOM but are not clickable because an overlay has not finished animating. The flake rate on any non-trivial browser automation suite makes you question why you bothered writing the tests.

Why Browser Automation Breaks

Most browser automation tools - Selenium, Playwright, Puppeteer - interact with the DOM directly. They find elements by CSS selectors, XPaths, or test IDs, then simulate clicks and keystrokes. This works until it does not.

DOM changes are the obvious culprit. A frontend refactor renames components, restructures the element hierarchy, or changes how elements are rendered. Every selector-based test that touches those elements breaks. A developer changes class="submit-btn primary" to class="btn btn-primary" and your test suite lights up red.

But the subtler problem is timing. Modern web apps load content asynchronously, render progressively, and animate transitions. The element your automation is looking for might exist in the DOM but not be visible, not be interactable, or not yet be in its final position. Hardcoded sleep(500) statements paper over this temporarily - until the CI runner is slower than expected and the sleep is not enough.
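As a rough illustration of waiting for a specific condition instead of a fixed sleep, here is a minimal polling sketch. The helper name `waitFor` and its `timeoutMs` and `intervalMs` options are assumptions made for this example, not part of any tool mentioned in this post.

```typescript
// Hypothetical helper: poll `check` until it returns something other than
// undefined, or fail once the timeout has elapsed. `check` always runs at
// least once, even with a very small timeout.
async function waitFor<T>(
  check: () => T | undefined | Promise<T | undefined>,
  options: { timeoutMs?: number; intervalMs?: number } = {},
): Promise<T> {
  const timeoutMs = options.timeoutMs ?? 5000;
  const intervalMs = options.intervalMs ?? 50;
  const deadline = Date.now() + timeoutMs;

  for (;;) {
    const result = await check();
    if (result !== undefined) return result;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Sleep briefly between polls rather than guessing one long delay.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A fast machine returns as soon as the condition holds, and a slow CI runner gets the full timeout instead of a guess that happens to be too short.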

Five common root causes of flaky browser automation:

Race conditions between test actions and application responses
Hardcoded waits that guess at timing instead of waiting for specific conditions
Shared state between tests - databases, cookies, browser storage
DOM rendering delays from async frameworks like React or Next.js
Unstable CSS selectors that break when markup changes

Teams can end up spending more time chasing flakes than on the feature work the tests are supposed to protect.

The Accessibility Layer Alternative

Desktop automation through the macOS Accessibility API (AXUIElement) sidesteps most of these problems. Instead of querying the DOM, the automation reads the accessibility tree - a semantic representation of what is on screen. Buttons are identified as buttons with labels. Text fields are identified as text fields. The underlying HTML structure matters only when it changes those roles and labels.

This means a frontend refactor that changes the underlying HTML but keeps the same visible UI does not break the automation. The accessibility tree still shows the same buttons, text fields, and labels because the semantic meaning has not changed - only the implementation details have.

Here is a concrete comparison. Finding a submit button:

DOM-based (fragile):

// Breaks if class name changes, element moves in the hierarchy,
// or the button is wrapped in a new container
await page.click('.checkout-form .submit-btn.primary');

Accessibility-based (resilient):

// Works as long as there is a button labeled "Submit" that is interactable
// Does not care about class names, DOM depth, or HTML structure
await agent.click({ role: 'button', label: 'Submit' });

The accessibility query will still find the button after a complete UI redesign - as long as the visual button labeled "Submit" still exists and is enabled. That is the resilience that matters for production automation.
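To make the difference concrete, the following sketch models the same button before and after a class rename. The `UiElement` shape and both lookup functions are simplified stand-ins invented for this example. A real accessibility tree is read through the AXUIElement API rather than from plain objects.

```typescript
// Hypothetical, simplified element model for illustration only.
interface UiElement {
  classes: string[]; // DOM-level detail that changes during refactors
  role: string;      // semantic role exposed to the accessibility layer
  label: string;     // accessible label
  enabled: boolean;
}

// The same submit button before and after the class rename described earlier.
const beforeRefactor: UiElement[] = [
  { classes: ["submit-btn", "primary"], role: "button", label: "Submit", enabled: true },
];
const afterRefactor: UiElement[] = [
  { classes: ["btn", "btn-primary"], role: "button", label: "Submit", enabled: true },
];

// Selector-style lookup: every required class must be present.
function byClasses(elements: UiElement[], required: string[]): UiElement | undefined {
  return elements.find((el) => required.every((c) => el.classes.includes(c)));
}

// Semantic lookup: role and label must match and the element must be enabled.
function byRoleAndLabel(elements: UiElement[], role: string, label: string): UiElement | undefined {
  return elements.find((el) => el.role === role && el.label === label && el.enabled);
}
```

In this model, `byClasses(afterRefactor, ["submit-btn", "primary"])` finds nothing, while `byRoleAndLabel(afterRefactor, "button", "Submit")` still returns the button.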

When Accessibility Automation Applies

The accessibility layer works best for:

Automating native macOS applications where you have no DOM access at all
Cross-application workflows that span browser, Finder, and native apps
Long-running automations where UI changes would break fragile selectors
Agents that need to understand UI semantics rather than just simulate clicks

It is less suited for:

Fine-grained DOM manipulation that requires JavaScript execution
Testing specific HTML structure as part of a contract (e.g., verifying a specific aria-label was applied)
Browsers with unusual accessibility tree implementations

For most day-to-day desktop automation - filling forms, clicking buttons, reading text from applications - the accessibility layer is more reliable than DOM selectors.

Open Source Matters Here

Open source desktop automation frameworks let you inspect exactly how element detection works, understand the fallback behavior when elements are not found, and contribute fixes for edge cases specific to your environment. No vendor lock-in, no waiting for a commercial tool to fix a bug that blocks your workflow.

The Fazm approach uses the macOS accessibility API for all desktop interactions. The element matching logic is in the open repository - you can see exactly how it resolves ambiguous element lookups, handles disabled elements, and deals with accessibility trees that change mid-action.
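The sketch below shows one way such a resolution step could be structured. It is not Fazm's actual matching logic, which lives in its repository. The `Candidate` and `Resolution` types and the `resolve` function are hypothetical names chosen for this example.

```typescript
// Hypothetical candidate: an element that already matched a role/label query.
interface Candidate {
  role: string;
  label: string;
  enabled: boolean;
}

// Each outcome is explicit so the caller can decide how to react.
type Resolution =
  | { kind: "found"; element: Candidate }
  | { kind: "not_found" }
  | { kind: "all_disabled"; count: number }
  | { kind: "ambiguous"; count: number };

function resolve(candidates: Candidate[]): Resolution {
  if (candidates.length === 0) return { kind: "not_found" };

  // Disabled elements cannot be acted on, so they never count as a match.
  const enabled = candidates.filter((c) => c.enabled);
  if (enabled.length === 0) return { kind: "all_disabled", count: candidates.length };

  // More than one enabled match: report it instead of guessing.
  if (enabled.length > 1) return { kind: "ambiguous", count: enabled.length };

  return { kind: "found", element: enabled[0] };
}
```

Returning a distinct result for each failure mode, rather than silently clicking the first match, keeps the fallback behavior visible and easy to inspect.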

The Practical Trade-Off

You give up the precision of DOM-level interaction for the resilience of semantic-level interaction. You cannot use CSS selectors to distinguish two visually identical buttons in different parts of the DOM - you need to use context (which container they are in, what precedes them) to disambiguate.
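Disambiguating by container might look roughly like the sketch below. The `UiNode` shape, the `findAll` and `findWithin` helpers, and the sample tree are all assumptions made for illustration, not an existing API.

```typescript
// Hypothetical, simplified accessibility node for illustration only.
interface UiNode {
  role: string;
  label?: string;
  children?: UiNode[];
}

interface Query {
  role: string;
  label: string;
}

// Collect every descendant of `root` that matches the query, in tree order.
function findAll(root: UiNode, query: Query): UiNode[] {
  const found: UiNode[] = [];
  for (const child of root.children ?? []) {
    if (child.role === query.role && child.label === query.label) found.push(child);
    found.push(...findAll(child, query));
  }
  return found;
}

// Look for `target` only inside containers that match `container`.
function findWithin(root: UiNode, container: Query, target: Query): UiNode[] {
  const found: UiNode[] = [];
  for (const scope of findAll(root, container)) {
    found.push(...findAll(scope, target));
  }
  return found;
}

// Two identical "Edit" buttons that differ only by their container.
const sampleWindow: UiNode = {
  role: "window",
  label: "Checkout",
  children: [
    { role: "group", label: "Billing address", children: [{ role: "button", label: "Edit" }] },
    { role: "group", label: "Shipping address", children: [{ role: "button", label: "Edit" }] },
  ],
};
```

Searching the whole window for an "Edit" button returns two matches here, while `findWithin(sampleWindow, { role: "group", label: "Shipping address" }, { role: "button", label: "Edit" })` returns only the one inside the shipping group.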

For most automation tasks, that is a trade worth making. The reduction in flake rate can be significant. An automation suite with a 15-20% flake rate on DOM selectors can drop below 5% when converted to accessibility-based targeting - because much of the flake comes from selector instability and timing issues around DOM mutations, both of which the accessibility layer largely avoids.

This post was inspired by a discussion on r/AI_Agents.

Fazm is an open source macOS AI agent, available on GitHub.

