Optimization

15 articles about optimization.

100M Tokens Tracked: 99.4% Were Input and Parallel Agents Make It Worse

·2 min read

Of 100M tokens tracked, 99.4% were input tokens. Running parallel Claude Code agents multiplies the input cost problem. Here is how CLAUDE.md scoping helps.

tokens, api-costs, parallel-agents, claude-code, claude-md, optimization

Accessibility Tree Dumps Overflow LLM Context Windows - How to Fix It

·3 min read

Raw accessibility tree data can consume 24KB or more per dump, flooding AI agent context windows. The fix: write to temp files and return concise summaries instead.

accessibility-tree, context-window, llm, macos, optimization, desktop-agent
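
As a rough illustration of the temp-file approach summarized above, here is a minimal Python sketch. It is not taken from the article: the function name `summarize_dump`, the `ax_tree_` file prefix, and the fields of the returned summary are all assumptions chosen for illustration.

```python
import tempfile


def summarize_dump(tree_dump: str, preview_chars: int = 200) -> dict:
    """Write a large dump to a temp file and return a short summary.

    Hypothetical helper: the name and summary fields are illustrative only.
    """
    # delete=False keeps the file on disk so the agent can read it on demand.
    with tempfile.NamedTemporaryFile(
        mode="w",
        suffix=".txt",
        prefix="ax_tree_",  # assumed prefix, purely cosmetic
        delete=False,
        encoding="utf-8",
    ) as handle:
        handle.write(tree_dump)
        path = handle.name

    # Only this small dict goes back into the context window.
    return {
        "path": path,
        "bytes": len(tree_dump.encode("utf-8")),
        "lines": len(tree_dump.splitlines()),
        "preview": tree_dump[:preview_chars],
    }
```

The caller returns the summary dict to the model instead of the raw dump; the full data stays available at `path` if a later step needs it.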

Running 5 Parallel AI Agents Is Making My API Bill a Second Rent Payment

·2 min read

Running multiple Claude Code agents in parallel on a macOS app makes API costs add up fast. Model routing, context pruning, and local models all help reduce the bill.

api-costs, parallel-agents, claude-code, budget, optimization

Why the Accessibility Tree Beats Screenshots for Desktop Automation: Lessons From Amazon Checkout

·2 min read

We use the accessibility tree instead of screenshots for desktop automation. Here is why AXUIElement hierarchy is faster, cheaper, and more reliable - with lessons from automating Amazon checkout.

accessibility-tree, desktop-automation, macos, axuielement, optimization

Browser Automation: Accessibility Snapshots vs Screenshots - Saving Tokens by Skipping Pixels

·2 min read

Switching from screenshots to accessibility snapshots for browser automation saved us massive token costs. Here is why structured data beats pixel analysis for AI agents.

browser-automation, accessibility, tokens, optimization, playwright

MCP Tool Responses Are the Biggest Context Hog - How to Compress Them

·3 min read

MCP server tool responses silently eat your context window. Here is how to compress accessibility tree data and other MCP outputs before they fill your token budget.

mcp, context-window, accessibility-api, optimization, token-management

What Half a Million Desktop Agent Actions Taught Us About Failure

·2 min read

Lessons from analyzing 500K desktop agent actions - the most common failures, successes, and what to optimize first.

telemetry, analytics, desktop-agent, failure-modes, optimization

Inference Optimization Is a Distraction for AI Agent Builders

·2 min read

Why optimizing API call speed barely matters for AI agents - the real bottleneck is action execution, not model inference.

inference, optimization, distraction, bottleneck, performance

How Much Are You Actually Spending on LLMs Every Month?

·2 min read

A breakdown of typical developer LLM spending, where the money goes, and how local models and context pruning can cut costs dramatically.

llm-costs, api-spending, optimization, local-models, budget

How to Cut AI Agent Costs 50-70% with Model Routing

·2 min read

Route simple tasks to local Ollama models, complex ones to Claude. Combine that with aggressive state summarization and context pruning to keep token usage down without losing important information.

model-routing, cost-reduction, ollama, claude, optimization
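
To make the routing idea concrete, here is a minimal Python sketch under stated assumptions. The model labels, the `Task` type, the 500-character threshold, and the "keep the first message plus the last few" pruning rule are all hypothetical; the article may use different criteria, and no real Ollama or Claude API calls are made here.

```python
from dataclasses import dataclass

# Hypothetical labels, not real model identifiers.
LOCAL_MODEL = "local-ollama-model"
REMOTE_MODEL = "claude"


@dataclass
class Task:
    prompt: str
    needs_tools: bool = False  # assumed signal that a task is "complex"


def route(task: Task, max_local_chars: int = 500) -> str:
    """Send short, tool-free tasks to the local model, everything else remote."""
    if task.needs_tools or len(task.prompt) > max_local_chars:
        return REMOTE_MODEL
    return LOCAL_MODEL


def prune_context(messages: list, keep_last: int = 4) -> list:
    """Keep the first message (e.g. system prompt) plus the last keep_last."""
    if len(messages) <= keep_last + 1:
        return list(messages)
    return [messages[0]] + list(messages[-keep_last:])
```

A real router would likely use a better complexity signal than prompt length, and the dropped middle messages would be replaced by a summary rather than discarded outright.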

The Engineer's Trap - Optimizing Everything Like Debugging Code

·2 min read

Software engineers try to optimize meditation, relationships, and life like debugging code. Sometimes the best approach is to stop optimizing and let things work.

engineer-mindset, optimization, productivity, debugging, automation

Why Removing Unused MCP Servers Speeds Up Claude Code More Than Removing Skills

·3 min read

Trimming unused MCP servers made way more difference than removing skills. MCP servers are actual processes that all have to handshake on startup.

claude-code, mcp, performance, developer-tools, optimization

Real-Time AI Agent Performance - Fixing the Screenshot Pipeline

·2 min read

Your AI agent is slow because of screenshot capture, not LLM inference. Here are practical techniques to speed up the capture pipeline.

real-time-ai, performance, screenshot-pipeline, optimization, macos

Fixing SwiftUI LazyVGrid Performance Issues on macOS

·2 min read

LazyVGrid jitter and stuttering on macOS come from view identity instability. Here are practical fixes: stable .id() values, extracted cell views, async image loading, and avoiding inline closures.

swiftui, lazyvgrid, performance, macos, optimization

I Installed 20 MCP Servers and Everything Got Worse - Why Fewer Is Better

·2 min read

More MCP servers mean hundreds of tool definitions competing for attention. Stripping down to 3 servers made Claude pick the right tool on the first try.

mcp, claude-code, developer-tools, optimization, best-practices
