Persistent Memory and Multi-Model Contamination in AI Agents

Fazm Team · 3 min read


Modern AI agents often use multiple models in a single workflow. Haiku handles screen reading, Opus does planning, a local model runs quick classifications. Each model contributes to the agent's persistent memory - the running context of what has happened and what was decided.

The problem is attribution. When three models contribute to the same memory store, the agent loses track of which model said what - and therefore how reliable each piece of stored information is.

How Contamination Happens

Model A reads a screen and stores "the user's email is john@example.com." Model B later retrieves that memory and uses it to fill out a form. But Model A was a cheap, fast model that occasionally hallucinates details. Model B - a more capable model - treats the stored memory as ground truth because it has no way to know the original source's reliability.

This is multi-model contamination: unreliable outputs from one model become trusted inputs for another through the shared memory layer.

It gets worse with chains. Model A produces a summary. Model B refines it. Model C acts on the refined version. By the time Model C makes a decision, the original observation has been filtered through two layers of interpretation. Any error in Model A's output is now baked into Model C's action.

Attribution Tracking

The fix is tagging every memory entry with its source:

  • Which model produced it - "haiku-3.5" vs "opus-4"
  • What type of task generated it - screen reading, planning, verification
  • Confidence level - Did the model express uncertainty?
  • Timestamp - How old is this information?
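A minimal sketch of what such a tagged entry might look like. The field names and model identifiers here are illustrative, not part of any real Fazm API:

```python
from dataclasses import dataclass, field
import time

@dataclass
class MemoryEntry:
    """A single memory record tagged with its provenance."""
    content: str
    source_model: str      # which model produced it, e.g. "haiku-3.5" or "opus-4"
    task_type: str         # e.g. "screen-reading", "planning", "verification"
    confidence: float      # 0.0-1.0, from the producing model's expressed uncertainty
    created_at: float = field(default_factory=time.time)

entry = MemoryEntry(
    content="the user's email is john@example.com",
    source_model="haiku-3.5",
    task_type="screen-reading",
    confidence=0.6,
)
```

With provenance attached at write time, any later reader can decide how much weight the entry deserves instead of treating every memory as equally trustworthy.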

When a high-capability model retrieves a memory tagged as coming from a low-capability model on a complex task, it should treat that information as potentially unreliable and verify it independently.
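That retrieval rule could be sketched as a simple tier comparison. The tier table and task set below are hypothetical placeholders; a real deployment would maintain its own capability ranking:

```python
# Hypothetical capability tiers - higher means more capable.
MODEL_TIER = {"haiku-3.5": 1, "sonnet-4": 2, "opus-4": 3}

# Tasks considered complex enough that a weak model's output warrants re-checking.
COMPLEX_TASKS = {"screen-reading", "planning"}

def needs_verification(entry_model: str, entry_task: str, reader_model: str) -> bool:
    """A higher-tier reader should independently re-verify memories that a
    lower-tier model produced on a complex task, not treat them as ground truth."""
    producer = MODEL_TIER.get(entry_model, 0)  # unknown models get lowest trust
    reader = MODEL_TIER.get(reader_model, 0)
    return producer < reader and entry_task in COMPLEX_TASKS

needs_verification("haiku-3.5", "screen-reading", "opus-4")  # → True
needs_verification("opus-4", "planning", "opus-4")           # → False
```

Defaulting unknown models to the lowest tier errs on the side of verification, which is the safe failure mode for contaminated memories.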

Practical Strategies

  • Separate memory stores per model tier - Fast models write to a "tentative" store. Frontier models write to a "verified" store. Actions should prefer verified memories.
  • Re-verification cycles - Periodically have a capable model review and validate entries from cheaper models.
  • Decay old memories - Information from 50 steps ago is less reliable than information from 5 steps ago, regardless of which model produced it.
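The decay strategy can be sketched as an exponential discount over step age. The half-life value is an arbitrary assumption for illustration:

```python
def trust_score(confidence: float, age_steps: int, half_life_steps: int = 20) -> float:
    """Discount a memory's confidence by its age in agent steps, so a fresh
    observation outranks a stale one from the same source.
    half_life_steps: assumed tuning knob - after this many steps, weight halves."""
    return confidence * 0.5 ** (age_steps / half_life_steps)

# A memory from 5 steps ago keeps most of its weight...
trust_score(0.8, age_steps=5)    # ≈ 0.67
# ...while one from 50 steps ago is heavily discounted.
trust_score(0.8, age_steps=50)   # ≈ 0.14
```

Combining this age discount with the source-model tier gives a single scalar that retrieval can sort by, rather than returning memories in insertion order.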

Trust in memory should be proportional to the reliability of its source.

Fazm is an open-source macOS AI agent. The code is available on GitHub.
