The Gap Between Agent Memory and Agent Execution - You Need Both
Building Fazm taught us that an AI agent has two halves that are equally important and usually developed unevenly: memory (what the agent knows and remembers) and arms (what the agent can actually do).
Most projects start with one and neglect the other.
The Arms Problem
The first version of Fazm was essentially Claude with access to the macOS accessibility API through an MCP (Model Context Protocol) server built in Swift. We called it "building the arms": giving the AI the ability to physically interact with the desktop.
This MCP server wraps macOS accessibility APIs so the agent can:
- Enumerate every UI element in any running application
- Click buttons, fill text fields, select menu items
- Read the current state of any window or dialog
- Capture screenshots for visual verification
Without these arms, the agent is just a chatbot. It can talk about opening Slack and sending a message, but it cannot actually do it.
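To make the shape of that tool surface concrete, here is a minimal sketch in Python. It is not Fazm's Swift server: `UIElement`, `DesktopTools`, and every method name are hypothetical, and a simulated in-memory tree stands in for the real macOS accessibility calls.

```python
from dataclasses import dataclass, field


@dataclass
class UIElement:
    """One node in a simulated accessibility tree (hypothetical shape)."""
    role: str                      # e.g. "window", "button", "text_field"
    title: str
    value: str = ""
    children: "list[UIElement]" = field(default_factory=list)
    clicks: int = 0                # how many simulated clicks it received


class DesktopTools:
    """Hypothetical tool surface; a real server would call the OS here."""

    def __init__(self, root: UIElement) -> None:
        self.root = root

    def enumerate_elements(self) -> list:
        """Return every element in the tree, depth-first."""
        found, stack = [], [self.root]
        while stack:
            element = stack.pop()
            found.append(element)
            # Reverse so children are visited in their original order.
            stack.extend(reversed(element.children))
        return found

    def find(self, role: str, title: str) -> UIElement:
        """Return the first element matching role and title."""
        for element in self.enumerate_elements():
            if element.role == role and element.title == title:
                return element
        raise LookupError(f"no {role} titled {title!r}")

    def click(self, title: str) -> None:
        self.find("button", title).clicks += 1

    def set_text(self, title: str, text: str) -> None:
        self.find("text_field", title).value = text

    def read_state(self) -> dict:
        """Snapshot of every text field's current value, keyed by title."""
        return {e.title: e.value for e in self.enumerate_elements()
                if e.role == "text_field"}


# Simulated Slack window with one text field and one button.
window = UIElement("window", "Slack", children=[
    UIElement("text_field", "Message"),
    UIElement("button", "Send"),
])
tools = DesktopTools(window)
tools.set_text("Message", "Running late, start without me")
tools.click("Send")
```

The point of the sketch is the boundary: the model only ever sees a handful of verbs (enumerate, find, click, set text, read state), and everything OS-specific stays behind them.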
The Memory Problem
Arms without memory create a different problem: an agent that forgets everything. Every session starts from zero. It does not remember your preferences, your workflow patterns, which apps you use for what, or what it tried last time that did not work.
This is the gap most desktop agents fall into. They can execute actions but have no persistent understanding of the user's environment.
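One way to close that gap is a small on-disk store that outlives the process. The sketch below uses Python's standard `sqlite3` module; the `AgentMemory` class, its table names, and its methods are assumptions made for illustration, not a description of how Fazm stores memory.

```python
import json
import sqlite3
import time


class AgentMemory:
    """Hypothetical persistent store for preferences and action history."""

    def __init__(self, path: str) -> None:
        # A file path makes the data survive across sessions.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS preferences ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS actions ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, ts REAL NOT NULL, "
            "app TEXT NOT NULL, action TEXT NOT NULL, "
            "succeeded INTEGER NOT NULL)"
        )
        self.db.commit()

    def set_preference(self, key: str, value) -> None:
        # Values are stored as JSON so lists and dicts round-trip.
        self.db.execute(
            "INSERT OR REPLACE INTO preferences (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.db.commit()

    def get_preference(self, key: str, default=None):
        row = self.db.execute(
            "SELECT value FROM preferences WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else default

    def record_action(self, app: str, action: str, succeeded: bool) -> None:
        self.db.execute(
            "INSERT INTO actions (ts, app, action, succeeded) "
            "VALUES (?, ?, ?, ?)",
            (time.time(), app, action, int(succeeded)),
        )
        self.db.commit()

    def failed_actions(self, app: str) -> list:
        """Actions that did not work for this app, oldest first."""
        rows = self.db.execute(
            "SELECT action FROM actions WHERE app = ? AND succeeded = 0 "
            "ORDER BY id",
            (app,),
        ).fetchall()
        return [row[0] for row in rows]

    def close(self) -> None:
        self.db.close()
```

Opening `AgentMemory("memory.db")` in a later session returns the same preferences and the same list of failed actions, which is exactly the "what it tried last time that did not work" the agent otherwise loses.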
Bridging the Gap
The solution is to build both systems and connect them through a planning layer:
- Execution layer. MCP servers that wrap OS-level APIs for actual desktop control.
- Memory layer. Persistent storage of user preferences, workflow patterns, and action history that carries across sessions.
- Planning layer. The LLM that sits between memory and execution, using what it knows about you to decide what to do and how to do it.
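The three layers can be sketched as one loop. This is an illustrative toy under stated assumptions: `Memory`, `Executor`, `plan`, and `run_agent` are hypothetical names, the executor only records actions instead of touching the desktop, and `plan` is a hard-coded placeholder where a real agent would call the LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Stand-in memory layer: preferences plus (action, succeeded) history."""
    preferences: dict = field(default_factory=dict)
    history: list = field(default_factory=list)


class Executor:
    """Stand-in execution layer: records actions instead of performing them."""

    def __init__(self, failing=()) -> None:
        self.failing = set(failing)   # actions that should simulate failure
        self.performed = []

    def run(self, action: str) -> bool:
        if action in self.failing:
            return False
        self.performed.append(action)
        return True


def plan(goal: str, memory: Memory) -> list:
    """Placeholder for the LLM call: choose steps using remembered context."""
    app = memory.preferences.get("messaging_app", "Mail")
    # If opening the preferred app failed before, use the remembered fallback.
    if (f"open {app}", False) in memory.history:
        app = memory.preferences.get("fallback_app", app)
    return [f"open {app}", f"type '{goal}'", "click Send"]


def run_agent(goal: str, memory: Memory, executor: Executor) -> bool:
    """Plan from memory, act through the executor, write outcomes back."""
    for action in plan(goal, memory):
        succeeded = executor.run(action)
        memory.history.append((action, succeeded))
        if not succeeded:
            return False
    return True
```

The design choice worth noting is the write-back: every outcome lands in memory, so a failed first attempt changes what the planner proposes on the next one.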
An agent with great memory and no arms is a note-taking app. An agent with great arms and no memory is a macro recorder. You need both to build something that actually feels like an assistant.
Fazm is an open source macOS AI agent, available on GitHub.