How to Use Browser History SQLite Data for AI Agent Memory with Frequency Ranking
Open any agent memory repository on GitHub and you will find the same pattern: the user tells the agent things, the agent stores them, the agent retrieves them later. This is note-taking with extra steps.
The most valuable information about how you actually work is not what you say - it is what you do. Your browser history shows which documentation pages you return to every week. Your bookmarks reveal the resources you considered important enough to save. Your frequently visited pages expose your real workflow, not the one you think you have.
This data already exists. Every browser stores it in SQLite. No scraping, no instrumentation, no special setup required. Yet agent memory systems ignore all of it because they are built around explicit knowledge capture instead of implicit behavior observation.
Where Browsers Store Their SQLite Data
All three major desktop browsers use SQLite under the hood, and all three expose visit counts directly in the schema.
Chrome and Chromium-based browsers (Edge, Brave, Arc):
macOS: ~/Library/Application Support/Google/Chrome/Default/History
Linux: ~/.config/google-chrome/Default/History
Windows: C:\Users\<user>\AppData\Local\Google\Chrome\User Data\Default\History
Firefox:
macOS: ~/Library/Application Support/Firefox/Profiles/<profile>/places.sqlite
Linux: ~/.mozilla/firefox/<profile>/places.sqlite
Windows: C:\Users\<user>\AppData\Roaming\Mozilla\Firefox\Profiles\<profile>\places.sqlite
(The profile directory name usually ends in .default-release; older profiles use .default.)
Safari (macOS only):
~/Library/Safari/History.db
These are plain SQLite files. You can open them with sqlite3 from the command line or any SQLite viewer. The catch is that Chrome and Firefox lock the file while the browser is running - copy the file to a temp location before querying it in production.
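A minimal copy-then-query helper in Python might look like the sketch below. The helper name and the read-only URI mode are my own choices rather than anything the browsers require - the part that matters is working from a copy instead of the live file.

import os
import shutil
import sqlite3
import tempfile

def query_history_copy(history_path: str, sql: str, params: tuple = ()):
    # Copy the (possibly locked) history file, query the copy, then clean up.
    tmp_dir = tempfile.mkdtemp()
    tmp_copy = os.path.join(tmp_dir, "history_copy.sqlite")
    shutil.copy2(history_path, tmp_copy)
    try:
        # Open read-only as an extra precaution; the copy is throwaway anyway.
        conn = sqlite3.connect(f"file:{tmp_copy}?mode=ro", uri=True)
        try:
            return conn.execute(sql, params).fetchall()
        finally:
            conn.close()
    finally:
        shutil.rmtree(tmp_dir, ignore_errors=True)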
Chrome History Schema
Chrome's History database is minimal and clean:
-- The urls table - core data for agent memory
CREATE TABLE urls (
    id INTEGER PRIMARY KEY,
    url LONGVARCHAR NOT NULL,
    title LONGVARCHAR DEFAULT '',
    visit_count INTEGER DEFAULT 0 NOT NULL,
    typed_count INTEGER DEFAULT 0 NOT NULL,
    last_visit_time INTEGER NOT NULL, -- microseconds since 1601-01-01 UTC
    hidden INTEGER DEFAULT 0 NOT NULL
);

-- The visits table - each individual visit
CREATE TABLE visits (
    id INTEGER PRIMARY KEY,
    url INTEGER NOT NULL, -- FK to urls.id
    visit_time INTEGER NOT NULL,
    from_visit INTEGER DEFAULT 0,
    transition INTEGER DEFAULT 0 NOT NULL,
    segment_id INTEGER DEFAULT 0,
    visit_duration INTEGER DEFAULT 0 NOT NULL
);
The visit_count column in urls is what you want. Chrome increments it every time you navigate to that URL. typed_count is even more interesting - it counts how many times you typed the URL directly, which signals a much stronger intent than clicking a link.
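As a quick sanity check, here is a small example of pulling the most-typed URLs out of a copy of the History file. The copy path is an assumption (made as described above), and the 20-row limit is arbitrary:

import sqlite3

# Assumes the History file was first copied to /tmp/chrome_history_copy
# because Chrome holds a lock on the live database.
conn = sqlite3.connect("/tmp/chrome_history_copy")
rows = conn.execute("""
    SELECT url, title, visit_count, typed_count
    FROM urls
    WHERE typed_count > 0
    ORDER BY typed_count DESC, visit_count DESC
    LIMIT 20
""").fetchall()
conn.close()

for url, title, visits, typed in rows:
    print(f"{typed:>3} typed / {visits:>5} visits  {title or url}")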
Firefox places.sqlite Schema
Firefox uses a slightly more complex schema but exposes a frecency score - a built-in frequency + recency composite that Mozilla has been refining for years:
-- moz_places - Firefox's equivalent of Chrome's urls table
CREATE TABLE moz_places (
    id INTEGER PRIMARY KEY,
    url LONGVARCHAR NOT NULL,
    title LONGVARCHAR,
    visit_count INTEGER DEFAULT 0,
    hidden INTEGER DEFAULT 0 NOT NULL,
    typed INTEGER DEFAULT 0 NOT NULL,
    frecency INTEGER DEFAULT -1 NOT NULL, -- composite frequency/recency score
    last_visit_date INTEGER
);

-- moz_bookmarks - bookmarks and their folder structure
CREATE TABLE moz_bookmarks (
    id INTEGER PRIMARY KEY,
    type INTEGER, -- 1 = bookmark, 2 = folder
    fk INTEGER, -- FK to moz_places.id
    parent INTEGER,
    position INTEGER,
    title LONGVARCHAR,
    dateAdded INTEGER,
    lastModified INTEGER
);
Firefox's frecency uses a decay algorithm that weights recent visits more than old ones, but still rewards frequent visits. A page with 100 visits spread over six months scores higher than a page visited once last week.
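If you would rather reuse Mozilla's scoring than recompute your own, a query over a copy of places.sqlite can rank pages by frecency and flag which ones are bookmarked. A sketch - the copy path is an assumption, and the 30-row limit is arbitrary:

import sqlite3

# Assumes places.sqlite was copied to /tmp/places_copy.sqlite first.
conn = sqlite3.connect("/tmp/places_copy.sqlite")
rows = conn.execute("""
    SELECT
        p.url,
        p.title,
        p.visit_count,
        p.frecency,
        -- type = 1 marks a bookmark; fk points back at moz_places.id
        EXISTS (SELECT 1 FROM moz_bookmarks b WHERE b.fk = p.id AND b.type = 1)
            AS is_bookmarked
    FROM moz_places p
    WHERE p.hidden = 0 AND p.frecency > 0
    ORDER BY p.frecency DESC
    LIMIT 30
""").fetchall()
conn.close()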
Safari History.db Schema
Safari's schema is the simplest:
CREATE TABLE history_items (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL UNIQUE,
    domain_expansion TEXT,
    visit_count INTEGER,
    daily_visit_counts BLOB, -- compact representation of per-day visits
    weekly_visit_counts BLOB,
    autocomplete_triggers BLOB,
    should_recompute_derived_visit_counts INTEGER,
    visit_count_score INTEGER -- Safari's own frequency score
);

CREATE TABLE history_visits (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    history_item INTEGER NOT NULL REFERENCES history_items(id),
    visit_time REAL NOT NULL, -- seconds since 2001-01-01 00:00:00 UTC
    title TEXT,
    load_successful BOOLEAN NOT NULL DEFAULT 1,
    http_non_get BOOLEAN NOT NULL DEFAULT 0,
    synthesized BOOLEAN NOT NULL DEFAULT 0,
    redirect_source INTEGER,
    redirect_destination INTEGER,
    origin INTEGER,
    generation INTEGER
);
Safari stores dates as seconds since January 1, 2001 (Apple's Cocoa epoch) rather than the Unix epoch. Keep that in mind when converting timestamps.
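The offset between the two epochs is a constant 978307200 seconds, so the conversion can happen inside the query. A sketch against a copy of History.db (the copy path is an assumption, and note that reading anything under ~/Library/Safari usually requires Full Disk Access on modern macOS):

import sqlite3

COCOA_EPOCH_OFFSET = 978307200  # seconds from 1970-01-01 to 2001-01-01 UTC

# Assumes History.db was copied to /tmp/safari_history_copy.db first.
conn = sqlite3.connect("/tmp/safari_history_copy.db")
rows = conn.execute("""
    SELECT
        i.url,
        i.visit_count,
        -- shift Safari's Cocoa-epoch seconds onto the unix epoch
        CAST(MAX(v.visit_time) + :offset AS INTEGER) AS last_visit_unix
    FROM history_items i
    JOIN history_visits v ON v.history_item = i.id
    GROUP BY i.id
    ORDER BY i.visit_count DESC
    LIMIT 30
""", {"offset": COCOA_EPOCH_OFFSET}).fetchall()
conn.close()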
The Agent Memory SQLite Schema
Rather than querying each browser's native database every time, extract the relevant data into a unified agent memory store:
CREATE TABLE browser_pages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL UNIQUE, -- unique so ingestion can upsert per URL
    domain TEXT NOT NULL,
    title TEXT,
    visit_count INTEGER DEFAULT 0,
    typed_count INTEGER DEFAULT 0, -- direct URL entry = strong intent signal
    first_seen INTEGER, -- unix timestamp
    last_seen INTEGER, -- unix timestamp
    source_browser TEXT, -- 'chrome', 'firefox', 'safari'
    is_bookmarked INTEGER DEFAULT 0,
    bookmark_title TEXT,
    tags TEXT -- JSON array of inferred topic tags
);

CREATE INDEX idx_visit_count ON browser_pages(visit_count DESC);
CREATE INDEX idx_domain ON browser_pages(domain);
CREATE INDEX idx_last_seen ON browser_pages(last_seen DESC);

CREATE TABLE page_visits (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    page_id INTEGER NOT NULL REFERENCES browser_pages(id),
    visited_at INTEGER NOT NULL, -- unix timestamp
    duration_s INTEGER DEFAULT 0 -- seconds spent on page
);
Separating browser_pages from page_visits lets you answer two different questions: "what pages are central to this person's work" (query browser_pages by visit_count) and "when do they typically access specific resources" (query page_visits by time of day).
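The second question is a plain aggregation over page_visits, assuming you also populate it with per-visit rows (Chrome's visits table maps onto it directly). A sketch, with the domain chosen purely as an example:

import os
import sqlite3

conn = sqlite3.connect(os.path.expanduser("~/.agent-memory/browser.db"))
# Bucket visits to one domain by local hour of day to see when it gets used.
rows = conn.execute("""
    SELECT
        strftime('%H', v.visited_at, 'unixepoch', 'localtime') AS hour_of_day,
        COUNT(*) AS visits
    FROM page_visits v
    JOIN browser_pages p ON p.id = v.page_id
    WHERE p.domain = ?
    GROUP BY hour_of_day
    ORDER BY hour_of_day
""", ("docs.python.org",)).fetchall()
conn.close()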
Frequency-Based Ranking Queries
The core insight is simple: rank by access frequency, not recency. A page you visit fifty times over three months is more important to surface than a page you visited once yesterday.
Top pages by raw visit count:
SELECT
    domain,
    title,
    url,
    visit_count,
    typed_count,
    datetime(last_seen, 'unixepoch') AS last_seen
FROM browser_pages
ORDER BY visit_count DESC
LIMIT 50;
Weighted score combining frequency, typed intent, and bookmark status:
SELECT
    url,
    title,
    visit_count,
    typed_count,
    is_bookmarked,
    -- weight: visit_count + 3x typed entries + 10x bookmark bonus
    (visit_count + (typed_count * 3) + (is_bookmarked * 10)) AS importance_score
FROM browser_pages
WHERE last_seen > strftime('%s', 'now', '-90 days')
ORDER BY importance_score DESC
LIMIT 30;
Top domains (useful for understanding which tools and services the user relies on):
SELECT
    domain,
    COUNT(*) AS page_count,
    SUM(visit_count) AS total_visits,
    MAX(last_seen) AS most_recent_visit
FROM browser_pages
GROUP BY domain
ORDER BY total_visits DESC
LIMIT 20;
Recently revisited documentation pages - good candidates for proactive surfacing:
SELECT
    url,
    title,
    visit_count,
    datetime(last_seen, 'unixepoch') AS last_seen
FROM browser_pages
WHERE
    visit_count >= 5
    AND last_seen > strftime('%s', 'now', '-7 days')
    AND (
        url LIKE '%docs.%'
        OR url LIKE '%/docs/%'
        OR url LIKE '%/api/%'
        OR url LIKE '%stackoverflow.com%'
        OR url LIKE '%github.com%'
    )
ORDER BY visit_count DESC;
Ingesting Chrome History Into the Agent Store
Here is a minimal Python script that copies Chrome's history and ingests it into the agent memory database:
import sqlite3
import shutil
import os

CHROME_HISTORY = os.path.expanduser(
    "~/Library/Application Support/Google/Chrome/Default/History"
)
AGENT_MEMORY_DB = os.path.expanduser("~/.agent-memory/browser.db")

def ingest_chrome_history():
    # Chrome locks the db while running - work from a copy
    tmp = "/tmp/chrome_history_copy"
    shutil.copy2(CHROME_HISTORY, tmp)

    # Make sure the agent memory directory exists before connecting
    os.makedirs(os.path.dirname(AGENT_MEMORY_DB), exist_ok=True)

    src = sqlite3.connect(tmp)
    dst = sqlite3.connect(AGENT_MEMORY_DB)
    dst.execute("""
        CREATE TABLE IF NOT EXISTS browser_pages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            url TEXT NOT NULL UNIQUE,
            domain TEXT NOT NULL,
            title TEXT,
            visit_count INTEGER DEFAULT 0,
            typed_count INTEGER DEFAULT 0,
            last_seen INTEGER,
            source_browser TEXT,
            is_bookmarked INTEGER DEFAULT 0
        )
    """)

    rows = src.execute("""
        SELECT
            url,
            title,
            visit_count,
            typed_count,
            -- convert Chrome microseconds since 1601 to unix seconds
            (last_visit_time / 1000000) - 11644473600 AS last_seen_unix
        FROM urls
        WHERE visit_count > 0
        ORDER BY visit_count DESC
    """).fetchall()

    for url, title, visits, typed, last_seen in rows:
        domain = url.split("/")[2] if "//" in url else url
        dst.execute("""
            INSERT INTO browser_pages (url, domain, title, visit_count, typed_count, last_seen, source_browser)
            VALUES (?, ?, ?, ?, ?, ?, 'chrome')
            ON CONFLICT(url) DO UPDATE SET
                visit_count = MAX(visit_count, excluded.visit_count),
                typed_count = MAX(typed_count, excluded.typed_count),
                last_seen = MAX(last_seen, excluded.last_seen),
                title = COALESCE(excluded.title, title)
        """, (url, domain, title, visits, typed, last_seen))

    dst.commit()
    src.close()
    dst.close()
    os.remove(tmp)
    print(f"Ingested {len(rows)} URLs from Chrome history")
The ON CONFLICT ... DO UPDATE pattern lets you re-run ingestion without duplicating records. It keeps the higher visit count in case the agent's copy and the live browser diverge.
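Firefox ingestion follows the same shape. The differences worth noting: different table names, last_visit_date is already microseconds since the Unix epoch (no 1601 offset), and moz_places.typed is a 0/1 flag rather than a count. A rough sketch, with the profile path left as a placeholder to fill in:

import os
import shutil
import sqlite3

FIREFOX_PLACES = os.path.expanduser(
    "~/Library/Application Support/Firefox/Profiles/<profile>/places.sqlite"
)
AGENT_MEMORY_DB = os.path.expanduser("~/.agent-memory/browser.db")

def ingest_firefox_history():
    # Firefox also locks the live database - work from a copy
    tmp = "/tmp/firefox_places_copy"
    shutil.copy2(FIREFOX_PLACES, tmp)
    src = sqlite3.connect(tmp)
    dst = sqlite3.connect(AGENT_MEMORY_DB)
    rows = src.execute("""
        SELECT
            p.url,
            p.title,
            p.visit_count,
            p.typed, -- 0/1 flag in Firefox, not a count
            p.last_visit_date / 1000000 AS last_seen_unix,
            EXISTS (SELECT 1 FROM moz_bookmarks b WHERE b.fk = p.id AND b.type = 1)
                AS is_bookmarked
        FROM moz_places p
        WHERE p.visit_count > 0 AND p.hidden = 0
    """).fetchall()
    for url, title, visits, typed, last_seen, bookmarked in rows:
        domain = url.split("/")[2] if "//" in url else url
        dst.execute("""
            INSERT INTO browser_pages (url, domain, title, visit_count, typed_count,
                                       last_seen, source_browser, is_bookmarked)
            VALUES (?, ?, ?, ?, ?, ?, 'firefox', ?)
            ON CONFLICT(url) DO UPDATE SET
                visit_count = MAX(visit_count, excluded.visit_count),
                typed_count = MAX(typed_count, excluded.typed_count),
                last_seen = MAX(last_seen, excluded.last_seen),
                is_bookmarked = MAX(is_bookmarked, excluded.is_bookmarked)
        """, (url, domain, title, visits, typed, last_seen, bookmarked))
    dst.commit()
    src.close()
    dst.close()
    os.remove(tmp)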
How an Agent Uses This Data
Once the data is in browser.db, the agent queries it at the start of a session to load context about what the user has been working on:
import sqlite3

def get_user_work_context(db_path: str, days: int = 30) -> dict:
    conn = sqlite3.connect(db_path)

    # Top domains - reveals the tech stack and tools in use
    top_domains = conn.execute("""
        SELECT domain, SUM(visit_count) AS visits
        FROM browser_pages
        WHERE last_seen > strftime('%s', 'now', '-' || ? || ' days')
        GROUP BY domain
        ORDER BY visits DESC
        LIMIT 10
    """, (days,)).fetchall()

    # High-frequency documentation - what the user is actively building
    key_docs = conn.execute("""
        SELECT url, title, visit_count
        FROM browser_pages
        WHERE visit_count >= 5
          AND (url LIKE '%docs.%' OR url LIKE '%/docs/%' OR url LIKE '%/api/%')
          AND last_seen > strftime('%s', 'now', '-' || ? || ' days')
        ORDER BY visit_count DESC
        LIMIT 20
    """, (days,)).fetchall()

    conn.close()
    return {
        "top_domains": top_domains,
        "key_documentation": key_docs,
    }
The agent calls this once per session, before the user types anything. It already knows which frameworks and APIs the user works with, which documentation pages they return to, and which tools they depend on. There is no onboarding step. There is no "tell me about your tech stack" prompt.
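What that context looks like when it reaches the model is up to you; a small formatting helper is enough. A sketch - the wording and the ten-item truncation are arbitrary choices:

def render_work_context(ctx: dict) -> str:
    # Turn the work-context dict into a compact block for the system prompt.
    lines = ["Observed from browser history (frequency-ranked):"]
    lines.append("Top domains: " + ", ".join(domain for domain, _ in ctx["top_domains"][:10]))
    lines.append("Frequently revisited documentation:")
    for url, title, visit_count in ctx["key_documentation"][:10]:
        lines.append(f"  - {title or url} ({visit_count} visits)")
    return "\n".join(lines)

# e.g. system_prompt = base_prompt + "\n\n" + render_work_context(get_user_work_context(db_path))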
Frequency Over Recency
Most memory systems prioritize recent information because it is easy to implement - sort by timestamp, done. But frequency reveals importance in a way recency cannot.
Consider two scenarios. In the first, you open the React hooks documentation once to copy an example you saw on Twitter. In the second, you visit the same page forty times over three months because you keep needing to check the exact behavior of useEffect during a long refactor. Both show up equally in a recency-sorted query from last week. A frequency-sorted query puts the second page near the top of your agent's context and ignores the first one entirely.
This is what transforms agent memory from a recall system into something closer to an anticipation system. The agent does not wait for you to mention you are working in React. It already knows, because forty visits to the React docs is a stronger signal than any preference you could articulate.
Bookmarks add a second layer: an explicit save is a deliberate statement of importance. A typed_count above zero is stronger still - you cared enough about the URL to type it directly rather than click through from search or another page. Combining visit count, typed count, and bookmark status yields a simple importance score that surfaces the right pages better than recency alone, with no embeddings, vector search, or LLM inference required.
The data already exists. The schema is straightforward. The queries are fast. The remaining gap is that most agent memory systems never thought to look.
Fazm is an open source macOS AI agent, available on GitHub.