Performance
16 articles about performance.
Size Queen Energy - Does 1M Context Actually Work?
1-million-token context windows sound impressive, but you never use them all at once. The real pattern is loading files on demand, not stuffing everything in.
A/B Testing Claude Code Hooks - Optimizing Token Usage
Cache read jumps show that hooks front-load context effectively. How to A/B test Claude Code hooks for performance and measure the impact on token consumption.
The Small Delay Between Agent and Human - API Latency and the Perception Gap
The small delay between agent and human is measured in API latency and context loading time. How these delays shape the experience of working with AI agents.
Data Availability Transfer Notes: The Hidden Bottleneck
Data availability is the hidden bottleneck in AI agent systems. Agents stall not because they lack capability, but because the data they need is not available when they need it.
Half a Million Computer Actions in Seven Days: What the Data Revealed
What 500,000 logged desktop automation actions reveal about failure rates, action type distribution, verification overhead, and how to build reliable agents at scale.
The Noise Floor Problem in AI Agent Context Windows
Every irrelevant token in your agent's context window raises the noise floor and degrades decision quality. Learn how to keep context clean and signal-rich.
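A minimal sketch of the idea: score each candidate context item for relevance to the current task and drop anything below a threshold before it reaches the prompt. The keyword-overlap scoring and the threshold value here are illustrative assumptions, not the article's implementation.

```python
# Illustrative sketch: keep only context items whose keyword overlap with the
# current task clears a relevance threshold, so low-signal tokens never reach
# the prompt. Scoring and threshold are stand-ins for a real relevance model.

def relevance(task: str, item: str) -> float:
    """Fraction of task words that also appear in the context item."""
    task_words = set(task.lower().split())
    item_words = set(item.lower().split())
    if not task_words:
        return 0.0
    return len(task_words & item_words) / len(task_words)

def build_context(task: str, items: list[str], threshold: float = 0.3) -> list[str]:
    """Drop items below the relevance threshold to keep the noise floor low."""
    return [item for item in items if relevance(task, item) >= threshold]

items = [
    "refactor the payment retry logic in billing.py",
    "notes from last week's standup about hiring",
    "stack trace from the payment retry failure",
]
kept = build_context("fix payment retry bug", items)
```

In this toy run the standup notes share no words with the task and are filtered out, while both payment-related items survive.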
Smart Caching Strategies for AI Agent Tool Results
TTL-based caching gives AI agents stale data. Learn about dependency-tracking caches that invalidate when upstream data changes, keeping agent decisions fresh.
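A dependency-tracking cache can be sketched in a few lines: each cached result records the upstream keys it was derived from, and invalidating an upstream key transitively evicts everything built on it. The class and method names below are hypothetical, not the article's API.

```python
# Illustrative sketch of a dependency-tracking cache: invalidating an upstream
# key evicts every entry derived from it, instead of waiting for a TTL to expire.

class DependencyCache:
    def __init__(self):
        self._values = {}      # key -> cached result
        self._dependents = {}  # upstream key -> set of keys derived from it

    def put(self, key, value, depends_on=()):
        self._values[key] = value
        for upstream in depends_on:
            self._dependents.setdefault(upstream, set()).add(key)

    def get(self, key):
        return self._values.get(key)

    def invalidate(self, key):
        """Evict a key and, transitively, everything derived from it."""
        self._values.pop(key, None)
        for dependent in self._dependents.pop(key, set()):
            self.invalidate(dependent)

cache = DependencyCache()
cache.put("repo_files", ["a.py", "b.py"])
cache.put("summary", "2 files", depends_on=["repo_files"])
cache.invalidate("repo_files")  # upstream changed: the summary is evicted too
```

The contrast with a TTL cache is that staleness is driven by actual upstream changes rather than by a clock.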
First Speculative Decoding Across GPU and Neural Engine on Apple Silicon
Running two models on the same Apple Silicon chip - a 1B draft model on the Neural Engine and a larger model on GPU for faster local inference.
Sub-Agents Spawn Overhead - Batching Tasks in Multi-Agent Systems
Spawning one sub-agent per task creates massive overhead in multi-agent systems. Batching related tasks into fewer agents with scoped responsibilities cuts that cost.
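The batching idea reduces to grouping tasks by the scope they touch and spawning one agent per group rather than one per task. The grouping key and task shape below are assumptions for illustration; any real orchestrator would supply its own.

```python
# Illustrative sketch: group (scope, description) tasks so each scope gets a
# single sub-agent, turning N spawns into one spawn per distinct scope.

from collections import defaultdict

def batch_tasks(tasks):
    """Group tasks by scope; each scope becomes one agent's batched workload."""
    batches = defaultdict(list)
    for scope, description in tasks:
        batches[scope].append(description)
    return dict(batches)

tasks = [
    ("frontend", "fix button alignment"),
    ("backend", "add retry to API client"),
    ("frontend", "update color tokens"),
]
batches = batch_tasks(tasks)
# Two spawns (frontend, backend) instead of three, each with a scoped task list.
```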
Inference Optimization Is a Distraction for AI Agent Builders
Why optimizing API call speed barely matters for AI agents - the real bottleneck is action execution, not model inference.
385ms Tool Selection Running Fully Local - No Pixel Parsing Needed
Local agents using macOS accessibility APIs skip the screenshot-parse-click cycle. Structured app data means instant element targeting and sub-second tool selection.
Native Swift Means Your AI Agent Launches Instantly
Electron apps take seconds to start. Native Swift apps launch in under a second. For an always-on agent activated by hotkey, that speed difference matters.
Why Removing Unused MCP Servers Speeds Up Claude Code More Than Removing Skills
Trimming unused MCP servers made way more difference than removing skills. MCP servers are actual processes that all have to handshake on startup.
Scaling Real-Time AI - Why the Screenshot Capture Pipeline Is Always the Bottleneck
Building real-time AI agents that react to screen content? The screenshot capture pipeline is where performance hits a wall. Here's how to fix it.
Real-Time AI Agent Performance - Fixing the Screenshot Pipeline
Your AI agent is slow because of screenshot capture, not LLM inference. Here are practical techniques to speed up the capture pipeline.
Fixing SwiftUI LazyVGrid Performance Issues on macOS
LazyVGrid jitter and stuttering on macOS comes from view identity instability. Here are practical fixes: stable .id() values, extracted cell views, and asynchronous image loading.