Testing
4 articles about testing.
Testing AI Agents with Accessibility APIs Instead of Screenshots
·2 min read
Most agent testing relies on screenshots, which break constantly. Accessibility APIs give you the actual UI structure: buttons, labels, states. Tests that check the accessibility tree survive UI redesigns.
testing, accessibility-api, screenshots, reliability, qa
Explicit Acceptance Criteria in CLAUDE.md to Stop Premature Victory
·2 min read
How adding explicit acceptance criteria to CLAUDE.md stops Claude Code from declaring victory prematurely. Tests must pass, files must exist, no regressions.
claude-md, acceptance-criteria, claude-code, testing, developer-workflow, quality
Screenshots Are Better Than LLM Self-Reports for Multi-Agent Verification
·2 min read
Judge-reflection patterns in multi-agent systems sound good, but the judge LLM can be fooled. Screenshots provide ground truth for verifying whether an action actually changed the screen.
multi-agent, verification, screenshots, reliability, testing
Non-Deterministic Agents Need Deterministic Feedback Loops
·2 min read
AI agents are inherently unpredictable, but their feedback loops should not be. Why deterministic verification is the key to reliable agent systems.
feedback-loops, reliability, ai-agents, deterministic, verification, testing