Rolling Your Own Agent Logging - SQLite Locally, Postgres in the Cloud
Most LLM observability tools are built for cloud APIs serving web requests. When you are building a desktop agent that runs locally, the architecture is different. You need logging that works offline, syncs when connected, and does not add latency to an already complex pipeline.
The Setup
SQLite handles immediate local logging: every tool call, every model response, every action the agent takes. This gives you zero-latency writes, and the data is always available even without internet. A background job then syncs to Postgres for aggregation, dashboards, and historical analysis.
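A minimal sketch of that local-first pattern. The class name, table shape, and injected `upload` callable are all assumptions for illustration; in practice `upload` would wrap a Postgres client (e.g. psycopg), injected here so the sketch stays self-contained:

```python
import sqlite3
import threading
import time


class AgentLog:
    """Local-first log: zero-latency SQLite writes, batched sync to the cloud.

    `upload` is a hypothetical callable that pushes a batch of rows to
    Postgres; if it raises, rows simply stay unsynced until the next pass.
    """

    def __init__(self, path, upload):
        self.conn = sqlite3.connect(path, check_same_thread=False)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY, ts REAL, payload TEXT, "
            "synced INTEGER DEFAULT 0)")
        self.upload = upload
        self.lock = threading.Lock()

    def write(self, payload):
        # Immediate local write -- never blocks on the network.
        with self.lock:
            self.conn.execute(
                "INSERT INTO events (ts, payload) VALUES (?, ?)",
                (time.time(), payload))
            self.conn.commit()

    def sync_once(self):
        # Push all unsynced rows in one batch, then mark them synced.
        with self.lock:
            rows = self.conn.execute(
                "SELECT id, ts, payload FROM events WHERE synced = 0"
            ).fetchall()
        if not rows:
            return 0
        self.upload(rows)  # raises on network failure; rows stay unsynced
        with self.lock:
            self.conn.executemany(
                "UPDATE events SET synced = 1 WHERE id = ?",
                [(r[0],) for r in rows])
            self.conn.commit()
        return len(rows)
```

`sync_once` would typically run on a timer in a background thread, so a flaky connection never touches the hot path.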
The SQLite schema is simple: timestamp, action type, input tokens, output tokens, model, latency, success/failure, and the raw request/response for debugging.
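In DDL form, that schema might look like the following. Table and column names here are assumptions based on the fields listed above:

```python
import sqlite3

conn = sqlite3.connect("agent_log.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS actions (
        id            INTEGER PRIMARY KEY,
        ts            TEXT NOT NULL,      -- ISO-8601 timestamp
        action_type   TEXT NOT NULL,      -- e.g. 'tool_call', 'model_response'
        input_tokens  INTEGER,
        output_tokens INTEGER,
        model         TEXT,
        latency_ms    REAL,
        success       INTEGER NOT NULL,   -- 1 = success, 0 = failure
        raw_request   TEXT,               -- full payload, kept for debugging
        raw_response  TEXT
    )
""")
conn.commit()
```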
The Discovery
Looking at the actual data revealed something I did not expect: 40% of total token spend was going to retries. The model would misunderstand the accessibility tree data, attempt an action that failed, and then retry, resending the full context window each time.
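The kind of query that surfaces this is a simple group-by over the action log. The table shape and the `retry` action type are assumptions for illustration (the post does not specify how retries are tagged), with a few hypothetical sample rows so the sketch runs standalone:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actions "
    "(action_type TEXT, input_tokens INTEGER, output_tokens INTEGER)")
# Hypothetical sample data; in a real log, 'retry' marks a re-attempted action.
conn.executemany(
    "INSERT INTO actions VALUES (?, ?, ?)",
    [("first_attempt", 1000, 200),
     ("retry", 900, 150),
     ("first_attempt", 500, 100)])

# Share of total token spend per action type.
rows = conn.execute("""
    SELECT action_type,
           SUM(input_tokens + output_tokens) AS tokens,
           ROUND(100.0 * SUM(input_tokens + output_tokens)
                 / (SELECT SUM(input_tokens + output_tokens) FROM actions),
                 1) AS pct
    FROM actions
    GROUP BY action_type
    ORDER BY tokens DESC
""").fetchall()
for action_type, tokens, pct in rows:
    print(f"{action_type}: {tokens} tokens ({pct}%)")
```

One query like this against the local SQLite file is all it took to see where the spend was going.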
The root cause was how UI state was being passed to the model. The accessibility tree dump was raw and unstructured - a massive nested XML-like blob. The model had to parse element roles, labels, positions, and hierarchy from this soup, and it frequently got confused.
The Fix
Restructuring how UI state was passed to the model cut costs in half. Instead of dumping the raw accessibility tree, I preprocessed it into a flat list of actionable elements with clear labels:
[button] "Submit" at (450, 320) - enabled
[textfield] "Email" at (300, 280) - focused, value: ""
[menu] "File" at (20, 25) - collapsed
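The preprocessing step can be sketched as a recursive walk that keeps only actionable elements. The dict shape here is a hypothetical stand-in (real macOS accessibility dumps differ), as is the set of roles considered actionable:

```python
def flatten_tree(node, out=None):
    """Flatten a nested accessibility tree into one actionable element
    per line, in the format shown above."""
    if out is None:
        out = []
    role = node.get("role")
    if role in ("button", "textfield", "menu"):  # keep only actionable roles
        x, y = node.get("position", (0, 0))
        states = ", ".join(node.get("states", []))
        line = f'[{role}] "{node.get("label", "")}" at ({x}, {y})'
        if states:
            line += f" - {states}"
        out.append(line)
    for child in node.get("children", []):  # containers contribute nothing
        flatten_tree(child, out)
    return out
```

Containers, groups, and decorative elements drop out entirely, so the model sees a short flat list instead of a deep hierarchy.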
The model understood this format immediately: the retry rate dropped from 40% to under 8%, and token spend fell with it because the model got the action right on the first try.
The Lesson
You cannot optimize what you do not measure. Off-the-shelf observability tools would have shown me total token usage. Only custom logging at the action level showed me where the waste was actually happening.
Fazm is an open source macOS AI agent, available on GitHub.