Performance
16 articles about performance.
Size Queen Energy - Does 1M Context Actually Work?
1-million-token context windows sound impressive, but you never use them all at once. The real pattern is loading files on demand, not stuffing everything in.
A/B Testing Claude Code Hooks - Optimizing Token Usage
Cache read jumps show that hooks front-load context effectively. How to A/B test Claude Code hooks for performance and measure the impact on token consumption.
The Small Delay Between Agent and Human - API Latency and the Perception Gap
The small delay between agent and human is measured in API latency and context loading time. How these delays shape the experience of working with AI agents.
Data Availability Transfer Notes: The Hidden Bottleneck
Data availability is the hidden bottleneck in AI agent systems. Agents stall not because they lack capability, but because the data they need is not available when they need it.
Half a Million Computer Actions in Seven Days: What the Data Revealed
What 500,000 logged desktop automation actions reveal about failure rates, action type distribution, verification overhead, and how to build reliable agents at scale.
The Noise Floor Problem in AI Agent Context Windows
Every irrelevant token in your agent's context window raises the noise floor and degrades decision quality. Learn how to keep context clean and signal-rich.
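A minimal sketch of the idea: score each candidate context item for relevance to the current task and drop anything below a threshold before it reaches the prompt. The keyword-overlap scoring and the threshold value here are illustrative assumptions, not the article's implementation.

```python
# Illustrative sketch: keep only context items whose keyword overlap with the
# current task clears a relevance threshold, so low-signal tokens never reach
# the prompt. Scoring and threshold are stand-ins for a real relevance model.

def relevance(task: str, item: str) -> float:
    """Fraction of task words that also appear in the context item."""
    task_words = set(task.lower().split())
    item_words = set(item.lower().split())
    if not task_words:
        return 0.0
    return len(task_words & item_words) / len(task_words)

def build_context(task: str, items: list[str], threshold: float = 0.3) -> list[str]:
    """Drop items below the relevance threshold to keep the noise floor low."""
    return [item for item in items if relevance(task, item) >= threshold]

items = [
    "refactor the payment retry logic in billing.py",
    "notes from last week's standup about hiring",
    "stack trace from the payment retry failure",
]
kept = build_context("fix payment retry bug", items)
```

In this toy run the standup notes share no words with the task and are filtered out, while both payment-related items survive.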
Smart Caching Strategies for AI Agent Tool Results
TTL-based caching gives AI agents stale data. Learn about dependency-tracking caches that invalidate when upstream data changes, keeping agent decisions fresh.
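A dependency-tracking cache can be sketched in a few lines: each cached result records the upstream keys it was derived from, and invalidating an upstream key transitively evicts everything built on it. The class and method names below are hypothetical, not the article's API.

```python
# Illustrative sketch of a dependency-tracking cache: invalidating an upstream
# key evicts every entry derived from it, instead of waiting for a TTL to expire.

class DependencyCache:
    def __init__(self):
        self._values = {}      # key -> cached result
        self._dependents = {}  # upstream key -> set of keys derived from it

    def put(self, key, value, depends_on=()):
        self._values[key] = value
        for upstream in depends_on:
            self._dependents.setdefault(upstream, set()).add(key)

    def get(self, key):
        return self._values.get(key)

    def invalidate(self, key):
        """Evict a key and, transitively, everything derived from it."""
        self._values.pop(key, None)
        for dependent in self._dependents.pop(key, set()):
            self.invalidate(dependent)

cache = DependencyCache()
cache.put("repo_files", ["a.py", "b.py"])
cache.put("summary", "2 files", depends_on=["repo_files"])
cache.invalidate("repo_files")  # upstream changed: the summary is evicted too
```

The contrast with a TTL cache is that staleness is driven by actual upstream changes rather than by a clock.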
First Speculative Decoding Across GPU and Neural Engine on Apple Silicon
Running two models on the same Apple Silicon chip - a 1B draft model on the Neural Engine and a larger model on GPU for faster local inference.
Sub-Agents Spawn Overhead - Batching Tasks in Multi-Agent Systems
Spawning one sub-agent per task creates massive overhead in multi-agent systems. Batching related tasks into fewer agents with scoped responsibilities cuts that cost.
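The batching idea reduces to grouping tasks by the scope they touch and spawning one agent per group rather than one per task. The grouping key and task shape below are assumptions for illustration; any real orchestrator would supply its own.

```python
# Illustrative sketch: group (scope, description) tasks so each scope gets a
# single sub-agent, turning N spawns into one spawn per distinct scope.

from collections import defaultdict

def batch_tasks(tasks):
    """Group tasks by scope; each scope becomes one agent's batched workload."""
    batches = defaultdict(list)
    for scope, description in tasks:
        batches[scope].append(description)
    return dict(batches)

tasks = [
    ("frontend", "fix button alignment"),
    ("backend", "add retry to API client"),
    ("frontend", "update color tokens"),
]
batches = batch_tasks(tasks)
# Two spawns (frontend, backend) instead of three, each with a scoped task list.
```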
Inference Optimization Is a Distraction for AI Agent Builders
Why optimizing API call speed barely matters for AI agents - the real bottleneck is action execution, not model inference.
385ms Tool Selection Running Fully Local - No Pixel Parsing Needed
Local agents using macOS accessibility APIs skip the screenshot-parse-click cycle. Structured app data means instant element targeting and sub-second tool selection.
Native Swift Means Your AI Agent Launches Instantly
Electron apps take seconds to start. Native Swift apps launch in under a second. For an always-on agent activated by hotkey, that speed difference matters.
Why Removing Unused MCP Servers Speeds Up Claude Code More Than Removing Skills
Trimming unused MCP servers made way more difference than removing skills. MCP servers are actual processes that all have to handshake on startup.
Scaling Real-Time AI - Why the Screenshot Capture Pipeline Is Always the Bottleneck
Building real-time AI agents that react to screen content? The screenshot capture pipeline is where performance hits a wall. Here's how to fix it.
Real-Time AI Agent Performance - Fixing the Screenshot Pipeline
Your AI agent is slow because of screenshot capture, not LLM inference. Here are practical techniques to speed up the capture pipeline.
Fixing SwiftUI LazyVGrid Performance Issues on macOS
LazyVGrid jitter and stuttering on macOS comes from view identity instability. Here are practical fixes: stable .id() values, extracted cell views, and asynchronous image loading.