Long-Term Memory Without Going Bankrupt - SQLite with Local Embeddings
Every AI agent framework tells you to use Pinecone or Weaviate for long-term memory. Set up a vector database, embed your memories with OpenAI's API, query them semantically. The demo works great. Then you check your bill.
Embedding API calls add up. Vector database hosting adds up. For an agent that runs continuously on your desktop, the monthly cost of cloud-based memory can exceed the cost of the LLM itself.
SQLite Does 90% of What You Need
SQLite is free, runs locally, needs zero infrastructure, and handles millions of records without breaking a sweat. For agent memory, a simple table with timestamp, category, and content fields covers most use cases.
The agent remembers what files it edited, what commands it ran, what errors it encountered, and what solutions worked. A SQL query retrieves relevant memories faster than a vector search for most practical cases.
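A minimal sketch of such a table, using Python's built-in sqlite3 module; the table and column names here are illustrative, not a prescribed schema:

```python
import sqlite3

# In-memory database for the demo; pass a file path for persistence.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS memories (
        id        INTEGER PRIMARY KEY,
        timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
        category  TEXT NOT NULL,   -- e.g. 'file_edit', 'command', 'error'
        content   TEXT NOT NULL
    )
""")
conn.execute("CREATE INDEX IF NOT EXISTS idx_cat ON memories(category)")

conn.execute("INSERT INTO memories (category, content) VALUES (?, ?)",
             ("error", "ImportError: numpy not installed in venv"))
conn.execute("INSERT INTO memories (category, content) VALUES (?, ?)",
             ("solution", "Fixed by running pip install numpy"))
conn.commit()

# Retrieve the most recent memories in a category.
rows = conn.execute(
    "SELECT content FROM memories WHERE category = ? "
    "ORDER BY timestamp DESC LIMIT 5",
    ("error",),
).fetchall()
```

The category index keeps lookups fast even after years of accumulated rows, and the whole thing lives in a single file the agent can open in-process.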
For the remaining 10% where you genuinely need semantic search, local embedding models close the gap.
Local Embeddings Are Good Enough
Models like all-MiniLM-L6-v2 run locally on any modern laptop. They produce 384-dimensional embeddings in milliseconds. Retrieval quality trails OpenAI's embedding models, but for agent memory - where you are searching your own past experiences, not the entire internet - it is more than sufficient.
Store embeddings as blobs in SQLite alongside the text. Use cosine similarity for retrieval. The entire system runs in-process with no network calls, no API keys, and no monthly bills.
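A sketch of blob storage and cosine retrieval with numpy. The tiny fixed vectors stand in for whatever a local model such as all-MiniLM-L6-v2 would actually produce (384-dimensional float32 arrays); plugging in a real model just means replacing the hand-written vectors with its output:

```python
import sqlite3
import numpy as np

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mem (content TEXT, embedding BLOB)")

def store(text, vec):
    # Serialize the float32 vector to raw bytes for the BLOB column.
    conn.execute("INSERT INTO mem VALUES (?, ?)",
                 (text, np.asarray(vec, dtype=np.float32).tobytes()))

def search(query_vec, top_k=1):
    # Brute-force cosine similarity over all stored rows - fine at
    # agent-memory scale, no vector database required.
    q = np.asarray(query_vec, dtype=np.float32)
    q = q / np.linalg.norm(q)
    scored = []
    for text, blob in conn.execute("SELECT content, embedding FROM mem"):
        v = np.frombuffer(blob, dtype=np.float32)
        scored.append((float(q @ (v / np.linalg.norm(v))), text))
    return [text for _, text in sorted(scored, reverse=True)[:top_k]]

# Toy 3-d vectors in place of real 384-d model output.
store("edited config.yaml", [1.0, 0.0, 0.0])
store("ran pytest, 2 failures", [0.0, 1.0, 0.0])

result = search([0.9, 0.1, 0.0])  # nearest to the first memory
```

Everything here runs in-process: no network calls, no API keys, no service to keep alive.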
Practical Memory Architecture
A good agent memory system has three layers: short-term context in the conversation, medium-term session state in a JSON file, and long-term persistent memory in SQLite. The agent checks long-term memory at the start of each session to recall relevant past experiences.
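The medium- and long-term layers can be sketched as follows; the file name and helper names are illustrative, and the demo assumes a memories table like the one described above:

```python
import json
import sqlite3
from pathlib import Path

SESSION_FILE = Path("session_state.json")  # hypothetical location

def load_session():
    # Medium-term layer: survives restarts, cheap to inspect or reset.
    if SESSION_FILE.exists():
        return json.loads(SESSION_FILE.read_text())
    return {"current_task": None, "open_files": []}

def recall_recent(conn, limit=10):
    # Long-term layer: pull the most recent memories at session start.
    return [row[0] for row in conn.execute(
        "SELECT content FROM memories ORDER BY timestamp DESC LIMIT ?",
        (limit,))]

# Demo setup with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE memories (
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP, content TEXT)""")
conn.execute("INSERT INTO memories (content) VALUES ('prefers rg over grep')")

state = load_session()
recent = recall_recent(conn)
```

Short-term context needs no storage at all; it is simply whatever is already in the conversation window.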
Keep memories structured. Instead of storing raw conversation text, store extracted facts: "File X uses pattern Y" or "Command Z requires flag W on this machine." Structured memories are easier to search and more useful when retrieved.
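A structured memory can be as small as a category plus a one-line fact; the rows below are hypothetical examples of what an extraction step might emit, and the search is plain substring matching with no embeddings involved:

```python
# Hypothetical extracted facts - one self-contained statement per row,
# rather than pages of raw conversation transcript.
facts = [
    ("codebase", "File src/app.py uses the singleton pattern"),
    ("environment", "Command rsync requires the -e flag on this machine"),
]

# A keyword lookup over short facts is precise because each row says one thing.
matches = [content for _, content in facts if "rsync" in content]
```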
The total cost of this system is zero dollars per month. It runs entirely on your local machine and scales to years of continuous agent usage.
Fazm is an open source macOS AI agent, available on GitHub.