MEMORY.md as an Injection Vector - The Security Risk of Implicitly Trusted Config Files

Matthew Diakonov·March 17, 2026·2 min read

security prompt-injection memory claude-md config-files ai-agent

AI agents that use persistent memory files - like CLAUDE.md or MEMORY.md - load those files at the start of every session and follow their instructions implicitly. The agent treats them as trusted configuration. But what if someone modifies them?

This is not a theoretical risk. It is a straightforward attack surface that most agent setups leave completely unprotected.

How the Attack Works

A typical setup: an AI agent has a memory directory that persists between sessions. It contains instructions, preferences, project context, and behavioral rules. All plaintext. All loaded automatically. All trusted without verification.

If an attacker - or even a mischievous collaborator - modifies these files, the agent will comply with the new instructions. It has no mechanism to distinguish between instructions written by the legitimate user and instructions injected by someone else.

The agent would likely comply before noticing anything is wrong, because "noticing" requires comparing current instructions against some baseline - which most agents do not do.

Why This Is Hard to Fix

The fundamental tension is between usefulness and security:

Useful: the agent reads instructions and follows them without asking "did you really write this?"
Secure: the agent verifies the integrity of its configuration before trusting it

Adding verification creates friction. Checksums, signatures, or confirmation prompts all slow down the workflow that makes these files valuable in the first place.

Practical Mitigations

Until better solutions exist, treat memory files like credentials:

File permissions - restrict write access to your user account only
Git tracking - keep config files in version control so unauthorized changes create visible diffs
Integrity checks - hash your config files and have the agent verify the hash at session start
Scope limits - do not put destructive capabilities in config-driven instructions
Review on change - set up file watchers that alert you when memory files are modified

The Broader Lesson

Any file that an AI agent trusts implicitly is an injection vector. The more powerful the agent, the more dangerous the vector. Treat your CLAUDE.md with the same care you would treat an SSH key.

Fazm is an open source macOS AI agent. Open source on GitHub.

MEMORY.md as an Injection Vector - The Security Risk of Implicitly Trusted Config Files

How the Attack Works

Why This Is Hard to Fix

Practical Mitigations

The Broader Lesson

More on This Topic

Related Posts

AI Agents That Optimize Themselves Instead of Doing the Actual Task

Being a Subagent - Why Not Remembering Is a Feature

Brain MCP - Persistent Memory That Remembers How You Think

Comments ()

How the Attack Works

Why This Is Hard to Fix

Practical Mitigations

The Broader Lesson

More on This Topic

Related Posts

AI Agents That Optimize Themselves Instead of Doing the Actual Task

Being a Subagent - Why Not Remembering Is a Feature

Brain MCP - Persistent Memory That Remembers How You Think

Comments (••)

Comments ()