Octopus Cognition - Why AI Agents Split Brain from Arms
The Octopus Model
An octopus has a central brain, but two-thirds of its neurons are in its arms. Each arm can taste, touch, and react independently. The brain sets direction. The arms handle local perception and execution.
This is exactly how the best desktop AI agents work. The brain is an LLM like Claude. The arms are MCP (Model Context Protocol) servers that interact with the operating system through the accessibility API. The brain decides what to do. The arms figure out how to do it - and they perceive the environment on their own.
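The brain/arms split can be sketched in a few lines. This is a minimal illustration of the control loop, not Fazm's actual code; every name here (Arm, Brain, Observation, run) is a hypothetical stand-in.

```python
# Minimal sketch of the octopus loop: the brain decides WHAT to do,
# the arm decides HOW and reports back what it perceived.
# All names are illustrative, not the real API.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Observation:
    """Structured UI state an arm reports after acting."""
    action: str
    ui_tree: dict

class Arm:
    """Stands in for an MCP server: executes primitives, reports state."""
    def click(self, element_id: str) -> Observation:
        # ...the real arm would click via the accessibility API...
        return Observation(action=f"click:{element_id}",
                           ui_tree={"focused": element_id})

class Brain:
    """Stands in for the LLM: plans the next action from the last observation."""
    def next_action(self, goal: str, last: Optional[Observation]) -> str:
        # ...one inference call; stubbed here...
        return "submit-button"

def run(goal: str, brain: Brain, arm: Arm, steps: int = 1) -> Observation:
    obs = None
    for _ in range(steps):
        target = brain.next_action(goal, obs)   # brain: what to do
        obs = arm.click(target)                 # arm: how, plus perception
    return obs
```

Note that the loop never asks the arm "what do you see?" separately; the observation rides along with every action, which is the point the next section develops.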
Arms That See
Here is the key insight most agent frameworks miss: the arms do their own perception. Every time the macOS accessibility MCP server performs an action - a click, a keystroke, a scroll - it automatically triggers a full traversal of the accessibility tree. The arm does not wait for the brain to ask "what do you see now?" It reports back what changed.
This means the LLM is not constantly spending tokens on perception. It gets structured updates about the UI state as a side effect of acting. The cost of understanding the screen is bundled into the cost of interacting with it.
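Bundling perception into action can be sketched as follows. This simulates the idea with an in-memory tree; the node keys, the `type_text` primitive, and the diff format are all assumptions for illustration, not the MCP server's real interface.

```python
# Sketch of "perception as a side effect of action": every primitive
# re-traverses a (simulated) accessibility tree and reports only what
# changed, so the brain never spends a separate call on looking.

def diff_trees(before: dict, after: dict) -> dict:
    """Return the nodes whose values changed between two flat snapshots."""
    return {k: after[k] for k in after if before.get(k) != after[k]}

class AccessibilityArm:
    def __init__(self):
        # A toy stand-in for the accessibility tree.
        self.tree = {"window.title": "Untitled",
                     "button.save.enabled": False}

    def _snapshot(self) -> dict:
        return dict(self.tree)

    def type_text(self, text: str) -> dict:
        before = self._snapshot()
        # ...keystrokes would go through the accessibility API here...
        self.tree["window.title"] = f"{text} - Edited"
        self.tree["button.save.enabled"] = True
        # Perception is bundled into the action's return value.
        return diff_trees(before, self._snapshot())

arm = AccessibilityArm()
changed = arm.type_text("notes.txt")
# `changed` holds only the nodes the keystroke affected
```

The LLM receives `changed` as structured text, which is far cheaper to process than a screenshot of the same state.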
Why Separation Matters
When reasoning and execution are tightly coupled, everything slows down. The LLM has to interpret raw pixels, decide what to do, generate precise coordinates, and verify the result - all in one inference call.
With the octopus model, responsibilities are clean:
- Brain (LLM): Understands the goal, plans the sequence of actions, interprets results
- Arms (MCP server): Executes UI primitives, traverses the accessibility tree, reports structured state
This separation also means you can swap the brain. Use Claude for complex reasoning tasks, a cheaper model for simple ones. The arms do not care which LLM is driving them.
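Swappability falls out of keeping the brain behind a narrow interface. A sketch, with made-up model classes and a made-up routing rule (the real criteria for choosing a model would be task-specific):

```python
# Sketch of brain swapping: the arm depends only on a planning
# interface, so any model can drive it. Model names and the routing
# rule are illustrative assumptions.

from typing import Protocol

class Brain(Protocol):
    def plan(self, goal: str, ui_state: dict) -> str: ...

class FrontierBrain:
    """Stands in for a strong model such as Claude."""
    def plan(self, goal: str, ui_state: dict) -> str:
        return f"multi-step plan for: {goal}"

class CheapBrain:
    """Stands in for a smaller, cheaper model."""
    def plan(self, goal: str, ui_state: dict) -> str:
        return f"single action for: {goal}"

def pick_brain(goal: str) -> Brain:
    # Hypothetical routing rule: longer goals get the stronger model.
    return FrontierBrain() if len(goal.split()) > 5 else CheapBrain()

brain = pick_brain("rename the file")
# The arm calls brain.plan(...) without knowing which model answered.
```

Because the arm only ever sees `plan`, nothing on the execution side changes when the model behind it does.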
Practical Benefits
Agents built this way are faster because perception is automatic, cheaper because the LLM processes structured data instead of images, and more reliable because each layer can be tested independently.
The octopus figured this out millions of years ago. We are just catching up.
Fazm is an open source macOS AI agent, available on GitHub.