Memory Filters - Why AI Agents Need Aggressive Pruning

Fazm Team · 2 min read

Most agent memory systems are hoarders. They save everything - every conversation, every fact, every preference the user mentioned once six months ago. The result is a bloated context that slows everything down and confuses the agent with outdated information.

The fix is aggressive filtering: only keep facts that were retrieved and actually used in the last 30 days.

The Problem with Keeping Everything

Context windows are finite. Even with models that support long contexts, stuffing 100K tokens of "memory" into every request is wasteful and counterproductive. The agent spends tokens processing irrelevant information, and important recent context gets diluted.

Worse, stale memories can actively mislead the agent. If you changed your preferred email six months ago but the agent still has the old one in memory, it will use the wrong address.

A Simple Filtering Strategy

Track two things for every memory:

  1. Last retrieved - when was this memory last pulled into context?
  2. Last used - did the agent actually reference this memory in its response?

If a memory has not been both retrieved and used in 30 days, archive it. Do not delete it entirely - move it to cold storage so it can be recalled with an explicit search. But remove it from the active memory set.
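The rule above can be sketched as a simple predicate. This is a minimal illustration, not Fazm's actual schema - the `Memory` fields and names here are assumptions:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

ACTIVE_WINDOW = timedelta(days=30)  # the post's 30-day threshold

@dataclass
class Memory:
    """One stored fact. Field names are illustrative, not Fazm's schema."""
    text: str
    created_at: datetime
    last_retrieved: Optional[datetime] = None  # last pulled into context
    last_used: Optional[datetime] = None       # last cited in a response

def should_archive(mem: Memory, now: datetime) -> bool:
    """Archive unless the memory was BOTH retrieved and used in the window.

    A memory that was never retrieved (or never used) counts as stale.
    """
    cutoff = now - ACTIVE_WINDOW
    recently_retrieved = mem.last_retrieved is not None and mem.last_retrieved >= cutoff
    recently_used = mem.last_used is not None and mem.last_used >= cutoff
    return not (recently_retrieved and recently_used)
```

Note the conjunction: a memory that keeps getting retrieved by similarity search but is never actually cited in a response still ages out.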

Implementation Details

  • Store memories with timestamps for creation, last retrieval, and last use
  • Run a daily cleanup job that archives anything past the threshold
  • Keep the active memory set small - under 50 key facts is ideal
  • Use semantic search on the archive for rare retrievals
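A minimal sketch of that daily cleanup job, assuming memories are stored as plain dicts with the timestamp keys above (an illustrative layout, not Fazm's storage format):

```python
from datetime import datetime, timedelta

THRESHOLD = timedelta(days=30)
MAX_ACTIVE = 50  # the post's suggested cap on active facts

def daily_cleanup(active, archive, now):
    """One pass of the daily job: archive stale memories, then trim
    the active set to MAX_ACTIVE, dropping the least recently used.

    Returns the new (active, archive) pair. Nothing is deleted -
    archived memories remain recallable via explicit search.
    """
    cutoff = now - THRESHOLD
    fresh, stale = [], []
    for m in active:
        lr, lu = m.get("last_retrieved"), m.get("last_used")
        if lr and lu and lr >= cutoff and lu >= cutoff:
            fresh.append(m)
        else:
            stale.append(m)  # moved to cold storage, not deleted
    # Enforce the size cap: overflow also goes to the archive.
    fresh.sort(key=lambda m: m["last_used"], reverse=True)
    fresh, overflow = fresh[:MAX_ACTIVE], fresh[MAX_ACTIVE:]
    return fresh, archive + stale + overflow
```

Because the job is a pure function of the memory set and the clock, it is easy to run idempotently from any scheduler (a launchd timer, cron, or an in-process loop).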

The counterintuitive insight is that less memory makes agents smarter. A focused set of 20 relevant facts outperforms a sprawling collection of 500 facts every time.

Fazm is an open source macOS AI agent, available on GitHub.
