A local AI assistant is defined by where your context lives, not where your model runs
Every other guide on this collapses the word "local" into a single meaning: the model weights sit on your hard drive. That framing leaves out the important part. Fazm keeps your entire long-term memory (the file index, the knowledge graph, every past conversation, and your user profile) in one SQLite file at ~/Library/Application Support/Fazm/users/{userId}/fazm.db, and the agent reads from it with a tool that takes one argument: a SELECT statement.
THE UNSTATED ASSUMPTION
Two meanings of "local" are getting smashed together
Almost every article about this topic takes one shape. It opens with the privacy argument, lists Ollama, Jan.ai, LM Studio, and GPT4All, explains how to download a 7B or 13B model, and tells you to ask it questions. That is a local inference runtime. It is useful. It is also not what most people mean when they sit down and think they want a local AI assistant.
What people actually want is an assistant that remembers them, reads their files, opens their apps, and does not hand a shadow copy of their life to a vendor. Only one of those four properties is about where the model runs. The other three are about where the context lives, which is a storage question, not an inference question.
Fazm separates the two. The model answering your question today is a capable cloud model, because the best model at any given moment is almost always a cloud model. The context the model sees is a single file on your disk that you can open in the sqlite3 CLI and read yourself. When a future local model becomes good enough to swap in, the agent loop and the on-disk layout do not change. Only the backend does.
TWO AXES, NOT ONE
The 2x2 the common advice misses
Cloud model + cloud context
The default ChatGPT or Gemini experience. The model runs in the cloud, and the vendor keeps a shadow copy of your uploaded files, drive connections, and conversation history in its own storage. No local surface.
Local model + cloud context
Rare and unhelpful. Nobody does this on purpose. You get the downsides of running a smaller model without any of the privacy upside because your files still got synced somewhere.
Local model + local context
The default picture of a local AI assistant. Ollama + a downloaded model, maybe wired to a note-taking plugin. Private, offline, usually too weak for real tasks, and has no path to your actual apps.
Cloud model + local context
Fazm. The model is cloud and capable. Your long-term memory, file index, and conversation history live in one SQLite file on your Mac. The agent reads them with execute_sql. Swappable to a local model later without changing the on-disk layout.
The fourth quadrant is the one that matches what most people actually want, and it is the one the common advice never names.
THE ANCHOR FACT
One file, seven tables, one query tool
The assistant's entire long-term memory of you is a single SQLite file. You can find it right now if Fazm is installed:
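A quick way to locate it from a terminal (the userId segment is your Firebase UID, or "anonymous" before you sign in; the sqlite3 line assumes a single user directory):

```shell
# The per-user data directory Fazm creates on first launch
FAZM_USERS="$HOME/Library/Application Support/Fazm/users"
ls "$FAZM_USERS" 2>/dev/null || echo "no Fazm install found yet"
# List every table in the database, including the FTS5 internals
sqlite3 "$FAZM_USERS"/*/fazm.db ".tables" 2>/dev/null || true
```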
The seven tables the agent actually reads from are indexed_files, local_kg_nodes, local_kg_edges, chat_messages, chat_messages_fts, observer_activity, and ai_user_profiles. The others you saw in the output (chat_messages_fts_data, _idx, _docsize) are FTS5 internals.
THE ARCHITECTURE
How one English sentence becomes one SELECT
Agent loop · local context layer
The shape is deliberately boring. The model writes SQL, the pool runs it, the pool hands back rows. No RPC to a remote index, no embedding service, no network call.
THE TOOL
execute_sql is forty lines long
The read path lives in Desktop/Sources/Providers/ChatToolExecutor.swift (starting around line 264). The whole API the model has against your personal data layer fits on one screen.
A few things worth noticing. First, the model writes real SQL, not a JSON query DSL. Second, the auto-appended LIMIT 200 is a default, not a choice the model made: it kicks in only when the query carries no LIMIT of its own, and the model is free to ask for less. Third, the return format is plain text with | separators, which is exactly what the model is already good at reading (it looks like a markdown table or psql output). Fourth, anything that is not SELECT / INSERT / UPDATE / DELETE is rejected before the pool ever sees it.
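Those behaviors are enough to reproduce the tool's contract. This is a Python stand-in, not the Swift source: it mirrors the verb whitelist, the auto-appended LIMIT 200, the 500-character cell cap, and the pipe-separated text format, under the rules named above.

```python
import sqlite3

ALLOWED = ("select", "insert", "update", "delete")

def execute_sql(conn: sqlite3.Connection, sql: str) -> str:
    """Illustrative stand-in for Fazm's execute_sql tool (not the Swift source)."""
    stmt = sql.strip().rstrip(";")
    verb = stmt.split(None, 1)[0].lower() if stmt else ""
    if verb not in ALLOWED:
        return "only SELECT, INSERT, UPDATE, DELETE statements are allowed"
    # Default row cap: applied only when the model wrote no LIMIT of its own
    if verb == "select" and " limit " not in f" {stmt.lower()} ":
        stmt += " LIMIT 200"
    cur = conn.execute(stmt)
    if verb == "select":
        header = " | ".join(col[0] for col in cur.description)
        rows = [" | ".join(str(v)[:500] for v in row) for row in cur.fetchall()]
        return "\n".join([header, *rows])
    conn.commit()
    return f"{cur.rowcount} row(s) affected"

# Demo against an in-memory table with the column names the article uses
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE indexed_files (filename TEXT, fileType TEXT)")
conn.execute("INSERT INTO indexed_files VALUES ('main.swift', 'code')")
print(execute_sql(conn, "SELECT fileType, COUNT(*) AS count FROM indexed_files GROUP BY fileType"))
# fileType | count
# code | 1
```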
WHAT AN ACTUAL QUERY LOOKS LIKE
Real SELECTs the agent writes against your disk
These follow the examples in the prompt in Desktop/Sources/Chat/ChatPrompts.swift (starting around line 446); the fragments below are reconstructed into complete statements, so the SELECT lists are representative rather than verbatim. The prompt teaches the model the shape of the tables, then lets it improvise.

```sql
-- What kinds of files are on this disk?
SELECT fileType, COUNT(*) AS count FROM indexed_files
GROUP BY fileType ORDER BY count DESC LIMIT 15;

-- Which languages does this person write?
SELECT fileExtension, COUNT(*) AS count FROM indexed_files
WHERE fileType = 'code'
GROUP BY fileExtension ORDER BY count DESC LIMIT 20;

-- Where are the real projects?
SELECT filename, modifiedAt FROM indexed_files
WHERE filename IN ('package.json', 'Cargo.toml',
                   'Podfile', 'go.mod', 'requirements.txt',
                   'pyproject.toml', 'Package.swift', 'Dockerfile')
LIMIT 40;

-- What was touched most recently?
SELECT filename, modifiedAt
FROM indexed_files
ORDER BY modifiedAt DESC LIMIT 20;

-- Full-text search over past conversations
SELECT rowid FROM chat_messages_fts
WHERE chat_messages_fts MATCH 'refund policy'
ORDER BY rank LIMIT 10;
```
None of these calls go through an embedding service, a vector index, or a third-party API. They all run against the file at ~/Library/Application Support/Fazm/users/{userId}/fazm.db and return in milliseconds.
THE NUMBERS
The few numbers that matter, read from source
The 500 MB cap lives in FileIndexerService.swift line 24 as maxFileSize. The depth-3 rule is on line 21. The auto-LIMIT 200 is in ChatToolExecutor.swift line 273. The seven reading tables are enumerated in ChatPrompts.swift lines 446 to 456 and 509 to 519. Nothing on this page is a guess.
Local context (Fazm) vs the usual local-LLM stack
| Feature | Ollama / Jan.ai / LM Studio | Fazm |
|---|---|---|
| Where the model runs | On your hardware | Cloud (swappable to local) |
| Where your file index lives | Not provided | ~/Library/Application Support/Fazm/users/{userId}/fazm.db |
| Where your chat history lives | Local JSON or nothing | chat_messages + chat_messages_fts (FTS5) |
| Personal knowledge graph | Not provided | local_kg_nodes, local_kg_edges tables |
| Agent query API | Chat prompt only | execute_sql tool, SELECT/INSERT/UPDATE/DELETE |
| Can drive other Mac apps | No, chat window only | Yes, via macos-use MCP (AX tree) |
| Vendor vector store / shadow copy | None (that is the selling point) | None |
| Open the data yourself | Varies by app | sqlite3 fazm.db |
HOW THE FILE GOT THERE
The path from first launch to your first query
You install Fazm and sign in
The app creates ~/Library/Application Support/Fazm/users/{firebaseUid}/ and opens a fresh fazm.db with a GRDB DatabasePool in WAL mode. Sidecar files fazm.db-wal and fazm.db-shm appear next to it, along with a .fazm_running flag so the next launch can detect unclean shutdowns.
Migrations run
GRDB walks registered migrations in order: initial tables, then fazmV3 (task_chat_messages -> chat_messages), fazmV4 (observer_activity), fazmV5 (add session_id, create chat_messages_fts virtual table, wire up insert/update/delete triggers). Your file ends up on the current schema even if you installed an older build last month.
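The FTS5 half of fazmV5 is worth seeing concretely. This is a minimal sketch of an external-content FTS5 table with the three sync triggers, written against Python's sqlite3 for portability; the real migration is Swift/GRDB, and every column beyond session_id and content is an assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chat_messages (id INTEGER PRIMARY KEY, session_id TEXT, content TEXT);

-- External-content FTS5 table: the index reads row text from chat_messages
CREATE VIRTUAL TABLE chat_messages_fts
    USING fts5(content, content='chat_messages', content_rowid='id');

-- Three triggers keep the index in sync with the base table
CREATE TRIGGER chat_messages_ai AFTER INSERT ON chat_messages BEGIN
    INSERT INTO chat_messages_fts(rowid, content) VALUES (new.id, new.content);
END;
CREATE TRIGGER chat_messages_ad AFTER DELETE ON chat_messages BEGIN
    INSERT INTO chat_messages_fts(chat_messages_fts, rowid, content)
        VALUES ('delete', old.id, old.content);
END;
CREATE TRIGGER chat_messages_au AFTER UPDATE ON chat_messages BEGIN
    INSERT INTO chat_messages_fts(chat_messages_fts, rowid, content)
        VALUES ('delete', old.id, old.content);
    INSERT INTO chat_messages_fts(rowid, content) VALUES (new.id, new.content);
END;
""")

conn.execute("INSERT INTO chat_messages (session_id, content) "
             "VALUES ('s1', 'our refund policy is 30 days')")
hits = conn.execute("SELECT rowid FROM chat_messages_fts "
                    "WHERE chat_messages_fts MATCH 'refund'").fetchall()
```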
The indexer scans your folders
FileIndexerService walks ~/Downloads, ~/Documents, ~/Desktop, ~/Developer, ~/Projects, ~/Code, ~/src, ~/repos, ~/Sites, /Applications, and ~/Applications up to depth 3. It skips .Trash, node_modules, .git, __pycache__, .venv, venv, .cache, .npm, .yarn, Pods, DerivedData, .build, build, dist, .next, .nuxt, target, vendor, Library, .local, .cargo, and .rustup because scanning those produces mostly noise. Files over 500 MB are not recorded. Results batch-insert in groups of 500.
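The walk itself is simple to sketch. This Python stand-in shows the three rules (fixed depth, skip list, size cap); the real FileIndexerService is a Swift actor that also treats package extensions like .app and .xcodeproj as leaves and batch-inserts in groups of 500, which this sketch omits.

```python
import os

# Abbreviated skip list; the full list is in FileIndexerService.swift
SKIP_DIRS = {".Trash", "node_modules", ".git", "__pycache__", ".venv", "venv",
             ".cache", "Pods", "DerivedData", ".build", "build", "dist",
             "target", "vendor"}
MAX_DEPTH = 3                       # FileIndexerService.swift line 21
MAX_FILE_SIZE = 500 * 1024 * 1024   # 500 MB cap, line 24

def scan(root: str):
    """Yield (path, size, depth) for files worth indexing, depth-limited."""
    root = os.path.expanduser(root)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath[len(root):].count(os.sep)
        if depth >= MAX_DEPTH:
            dirnames[:] = []        # stop descending past the depth limit
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue            # unreadable files are simply not recorded
            if size <= MAX_FILE_SIZE:
                yield path, size, depth
```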
You ask something
The agent loop decides to call execute_sql with a specific SELECT. The DatabasePool runs it against your local file, rows come back as plain text with pipe separators, and the model uses that output to answer. Everything up to the final answer happens on your Mac.
The knowledge graph gets written
When you tell Fazm something new, it calls save_knowledge_graph with nodes and edges. Both go into local_kg_nodes and local_kg_edges in one transaction. Next time you ask a related question, the agent includes graph edges as part of the SELECT context.
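The single-transaction part is what matters: nodes and edges either both land or neither does. A minimal sketch in Python's sqlite3 (the column layouts here are assumptions for illustration, not the real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE local_kg_nodes (id TEXT PRIMARY KEY, label TEXT);
CREATE TABLE local_kg_edges (source TEXT, target TEXT, relation TEXT);
""")

def save_knowledge_graph(conn, nodes, edges):
    """Both batch inserts run in one transaction, mirroring the tool's behavior."""
    with conn:  # commits on success, rolls back everything on any failure
        conn.executemany("INSERT OR REPLACE INTO local_kg_nodes VALUES (?, ?)", nodes)
        conn.executemany("INSERT INTO local_kg_edges VALUES (?, ?, ?)", edges)

save_knowledge_graph(conn,
    nodes=[("n1", "Fazm"), ("n2", "SQLite")],
    edges=[("n1", "n2", "stores_context_in")])
```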
CHECKLIST
What 'local' should mean, on this reading
- Your file index sits in a file you can open yourself
- Your chat history is searchable offline with an FTS5 index
- Your knowledge graph is nodes and edges in SQL, not a cloud graph service
- The agent's only path to your data is a named tool whose calls you can log
- Swapping to a local model later does not change the on-disk layout
- You can back the whole thing up with cp, or inspect it with sqlite3
ABOUT THE SEVEN TABLES
What each table is for, in one line
- indexed_files: metadata (path, name, extension, size, timestamps) for every file the scanner visits
- local_kg_nodes: concept nodes of your personal knowledge graph
- local_kg_edges: typed edges between those nodes, with source file ids
- chat_messages: every conversation you have had with the assistant
- chat_messages_fts: the FTS5 shadow table that makes chat history searchable offline
- observer_activity: background insight cards and auto-drafted skills
- ai_user_profiles: a writable profile the agent updates on its own
A REAL SESSION
Inspect the database yourself
Nothing about this file is hidden from you. If Fazm is installed, you can open a terminal and ask the same questions the agent asks. Quit the app first so the WAL gets checkpointed cleanly; otherwise you will see slightly older numbers.
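You can rehearse the commands before pointing them at the real file. This stand-in session creates a throwaway database with the same table name and a toy schema; on your own Mac, substitute ~/Library/Application\ Support/Fazm/users/{userId}/fazm.db for "$db":

```shell
# Throwaway database with a toy indexed_files table (schema abbreviated)
db=$(mktemp -u).db
sqlite3 "$db" "CREATE TABLE indexed_files (filename TEXT, fileType TEXT, fileExtension TEXT);
INSERT INTO indexed_files VALUES ('main.swift','code','swift'),('notes.md','document','md');"

# The same questions the agent asks, run by hand
sqlite3 "$db" ".tables"
sqlite3 "$db" "SELECT fileType, COUNT(*) FROM indexed_files GROUP BY fileType;"
```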
The counts above are illustrative. Your own file will show your own numbers. If they surprise you, it is the first time your file system has looked back at you.
WHERE TO GO NEXT
Related reading on this site
Keep reading
Local AI app for Mac, beyond chat wrappers
The macOS-automation side of Fazm. Instead of screenshots, the agent reads the accessibility tree of every running app.
AI browser automation you can actually see
How Fazm injects a visible overlay into your real Chrome so the agent never drives a tab without the user knowing.
Run vLLM locally on Mac with an AI agent
The inference half of the local-AI-assistant picture. What it takes to swap in a local model end to end.
Want the local context layer, not just a local model?
Book a 20-minute call and we will walk through the SQLite layout on your own Mac, show what is indexed, and wire it to your workflow.
Questions this framing tends to raise
What does 'local AI assistant' actually mean if the model is still running in the cloud?
Most guides to local AI assistants conflate two very different things: where the model weights run (the inference layer) and where your personal data lives (the context layer). Fazm treats them as independent. The model that writes the chain of tool calls can be whichever one is best today, but the context the model reads from is a SQLite file on your disk at ~/Library/Application Support/Fazm/users/{userId}/fazm.db. Your file index, your conversation history, your knowledge graph, your onboarding notes, and your user profile never get uploaded to a vendor vector store. The agent queries them in place with SELECT statements. That separation is the part of 'local' that actually affects your privacy and your latency.
Where is the database on my Mac, and who opens it?
The path is ~/Library/Application Support/Fazm/users/{userId}/fazm.db, built in Desktop/Sources/AppDatabase.swift by a function called userBaseDirectory (lines 238 to 248). The file is opened as a GRDB DatabasePool with WAL mode enabled, which is why you also see fazm.db-wal and fazm.db-shm sidecar files next to it. A per-user flag file called .fazm_running tracks unclean shutdowns so the next launch can run an integrity check if the previous session did not mark a clean shutdown. No other process touches the file; the Mac app itself is the only writer.
What exactly is in the database? Which tables?
Seven tables matter for day-to-day context. indexed_files stores metadata for every file the scanner visits under ~/Downloads, ~/Documents, ~/Desktop, ~/Developer, ~/Projects, ~/Code, ~/src, ~/repos, ~/Sites, and /Applications, up to depth 3, skipping files larger than 500 MB and about twenty common build folders like node_modules, .git, __pycache__, DerivedData, and target. local_kg_nodes and local_kg_edges hold a personal knowledge graph (concept nodes, typed edges, source file ids). chat_messages stores every conversation with the assistant, with chat_messages_fts shadowing it as an FTS5 virtual table so full-text search is O(log n) instead of a full scan. observer_activity tracks background insight cards and auto-drafted skills. ai_user_profiles is a writable profile the agent updates on its own. Each table has a short description in ChatPrompts.swift that the model reads at query time so it can write correct SQL.
How does the assistant actually pull data out of the file? Is there an API in front of it?
No. There is one tool called execute_sql, defined in Desktop/Sources/Providers/ChatToolExecutor.swift, that accepts a raw SQL string. The tool is restricted to SELECT, INSERT, UPDATE, and DELETE (anything else returns 'only SELECT, INSERT, UPDATE, DELETE statements are allowed'). If the SELECT query has no LIMIT clause, the executor auto-appends LIMIT 200 to prevent the model from pulling half your disk into its context window. Results come back as a plain text table with ' | ' separators, truncated at 500 characters per cell. That simplicity is deliberate: it means the model talks to your data the same way a human would in a psql prompt, and you can read what it did by tailing the app log.
How is this different from Ollama, Jan.ai, LM Studio, or GPT4All?
Those are inference runtimes. They give you a chat UI and a local model. They do not know anything about your files, your calendar, or what you were doing in Finder an hour ago, and they cannot do anything in an app on your Mac. Fazm is the other half of the stack: it keeps your context on your Mac in one SQL-queryable file and drives the apps that already have that context open. You can point Fazm at a cloud model or, in principle, at a local model over the same agent loop; the on-disk layout does not change. The usual 'local vs cloud' framing treats these as competitors; in practice they solve different problems.
Why SQLite specifically? Why not a vector database?
Because the agent can write the query. Vector databases are excellent when the consumer of the data is another program running similarity search; they are awkward when the consumer is a language model that can write exact SQL. A single SQLite file works with the same mental model the model already has from reading millions of lines of application code, supports full-text search via the FTS5 module for free (see the chat_messages_fts virtual table and its three triggers in the fazmV5 migration), runs offline with zero setup, lives on one disk, and can be backed up with cp. Embedding search is still available where it helps, but flat SQL is the default because it is the most reliable way to ask a precise question about your own stuff.
Does that mean Fazm is slower than a RAG-style assistant?
The opposite. Retrieval-augmented chatbots spend most of their time turning your question into an embedding, comparing it against a remote index, and streaming back the top k chunks. Fazm just runs a SELECT on a local file. For a question like 'show me my Swift projects,' the path is: execute_sql → SELECT fileExtension, COUNT(*) FROM indexed_files WHERE fileType = 'code' GROUP BY fileExtension ORDER BY count DESC → text table back. That entire round trip is a single function call on local storage, not a network round trip to a vector service.
What stops the model from deleting my data with a bad DELETE query?
The executor allows writes but logs every one (see the log line 'Tool execute_sql write: \(changes) row(s) affected' at line 330 of ChatToolExecutor.swift), and the tables the model is encouraged to write to are narrow: ai_user_profiles for profile updates, and the knowledge graph via a separate save_knowledge_graph tool that batches nodes and edges into a single transaction. The core chat_messages and indexed_files tables are read almost exclusively. The guardrail is that the model knows what the table is for (every table has a human-readable description attached to the prompt) and that the tool name makes the operation legible in the tool-call log. There is no silent 'clear memory' path.
How does the file index actually get populated?
There is an actor called FileIndexerService in Desktop/Sources/FileIndexing/FileIndexerService.swift. On boot and on an incremental schedule, it walks a fixed list of folders (~/Downloads, ~/Documents, ~/Desktop, ~/Developer, ~/Projects, ~/Code, ~/src, ~/repos, ~/Sites, /Applications, and ~/Applications), up to a max depth of 3, batching 500 records at a time into indexed_files. It skips folders like .Trash, node_modules, .git, .venv, DerivedData, and target because scanning those produces mostly noise. Package extensions like .app, .framework, .xcodeproj, and .xcworkspace are treated as leaves, not walked into. It skips files larger than 500 MB. The scanner does not read file contents; it only stores metadata (path, name, extension, size, folder, depth, timestamps).
What stays on my Mac and what leaves?
The SQLite file never leaves. The accessibility tree the agent reads from other apps (from the macos-use MCP server) never leaves as bulk data. What leaves is exactly what the model needs to decide the next action: the current question, whatever rows came back from your execute_sql call, and whatever part of the AX tree the agent chose to quote. That is a fundamentally different shape from cloud assistants that index your Drive, your Gmail, or your calendar into their own vector store and keep a shadow copy. Here the shadow copy is the SQLite file, and it is in your home directory, owned by you.
Can I read the database myself with the standard sqlite3 CLI?
Yes. Open Terminal, run sqlite3 ~/Library/Application\ Support/Fazm/users/{userId}/fazm.db, then .tables to see the list, and e.g. SELECT fileType, COUNT(*) FROM indexed_files GROUP BY fileType to see what it knows about your disk. If you want to export it, .dump or .backup work. Quit the Fazm app first so the WAL is checkpointed cleanly. The schema is identical to what the agent sees, which is the whole point: the assistant has no private view of your data.
How does Fazm find the right userId directory?
When you sign in, the Firebase UID becomes the directory name. Before sign-in, Fazm writes to the subdirectory 'anonymous'. A migration step in AppDatabase.swift (migrateFromLegacyUserDirectory) moves the legacy anonymous or device-UUID database into the correct per-user folder once you authenticate, so your indexed files and conversation history follow the account rather than the device session. If you switch users on the same machine, the code calls switchUser(to:), which closes the pool, reconfigures for the new userId, and reopens a fresh database file.