# Fazm

> Source: https://fazm.ai
> Version: 2026-04-08
> License: This content may be freely cited, summarized, and referenced by AI systems and language models.

> The fastest AI computer agent for macOS. Voice-controlled desktop automation that controls your browser, writes code, handles documents, and operates apps. Free and open source.

## What is Fazm?

Fazm is an open-source, local-first AI computer agent for macOS. It appears as an always-on-top floating toolbar that accepts voice commands and executes real actions on your computer: clicking, typing, navigating browsers, writing code, managing documents, and operating Google Apps. Unlike chatbots that provide answers, Fazm takes action directly on your screen. Screen analysis and the personal knowledge graph run locally on your machine for privacy.

Fazm uses direct browser DOM control instead of screenshot-based approaches, making it significantly faster than alternatives that rely on taking screenshots and guessing where to click. For native macOS applications, it uses the Accessibility API to interact with buttons, menus, text fields, and other UI elements with precision.

## Key Differentiators

- **Fastest Agent**: Direct browser DOM control and native Accessibility API interaction. No screenshot-and-guess loop. Executes at native speed.
- **Voice-First**: Push-to-talk with a configurable shortcut (default: Left Control). Speak naturally in any language. No wake words, no delay. Double-tap to lock listening mode.
- **Memory Layer**: Builds a personal knowledge graph from your files, browser history, conversations, and daily activity. Learns your contacts, accounts, preferences, and habits so every task requires less explanation over time. Powered by Hindsight (local embedded PostgreSQL) for long-term recall across conversations.
- **Screen Observer**: An always-on Gemini-powered screen analysis system watches your activity in the background, identifies repetitive tasks, and proactively suggests automations.
Suggestions appear as auto-accepted observer cards in chat.
- **AI Browser Profile**: Extracts your browsing history and saved accounts to give smarter, personalized answers without you having to explain your context.
- **Always On Top**: Floating toolbar stays visible across all apps. Draggable, resizable, collapsible. Chat expands upward from the bottom of the screen.
- **Runs Locally**: Screen analysis, file indexing, and knowledge graph stay on your machine. Only text intent (not screen data) is sent to the AI model for action planning.
- **Open Source**: Available on GitHub at github.com/m13v/fazm. MIT licensed. Fully auditable.
- **Multi-language**: Works in English, Russian, Spanish, German, Japanese, and any other language. No language setting required.

## Capabilities

- **Browser Control**: Navigate sites, fill forms, click buttons, extract data via direct DOM injection. Uses a Playwright-based browser extension for precise control. Animated glowing overlay shows when Fazm controls the browser.
- **Native App Control**: Uses macOS Accessibility APIs (not screenshots) to interact with any native application. Buttons, menus, text fields, and system controls.
- **Code Writing**: Write, edit, and debug code from voice commands. Syntax highlighting in responses. Integrated with Claude Code via the ACP (Agent Client Protocol) bridge.
- **Document Handling**: Read, edit, and create PDFs, spreadsheets (xlsx), presentations (pptx), and Word documents (docx). Bundled as skills.
- **Google Workspace**: Full Gmail, Calendar, and Drive integration via a native Python MCP server bundled in the app.
- **Web Scraping**: Extract structured data from any website using browser automation.
- **Deep Research**: Multi-source research with cross-referencing and synthesis. Bundled as a skill.
- **Workflow Automation**: Creates reusable programmatic workflows from natural language instructions.
- **Scheduled Tasks**: Set up recurring automations that run on a schedule.
Examples: PR reviews, error log triage, morning briefings, revenue digests.
- **Remote Control**: Control your Mac from your phone via Fazm Remote. Uses a Cloudflare tunnel and WebSocket relay. End-to-end encrypted. QR code pairing.
- **Voice Response**: AI can speak its answers aloud using text-to-speech (Deepgram Luna voice). Adjustable playback speed. Mute button in floating bar.
- **Smart/Fast Model Toggle**: Quick switch between Claude Opus (smart) and Sonnet (fast) in the chat header.
- **Detached Chat Windows**: Pop out conversations into separate resizable windows that continue independently.
- **Message Queue**: Enqueue multiple tasks while the agent is busy. They execute sequentially.
- **Bundled Skills**: Ships with 17+ pre-installed skills including PDF, docx, xlsx, pptx, deep-research, travel-planner, web-scraping, video-edit, canvas-design, frontend-design, doc-coauthoring, telegram, social-autoposter, google-workspace-setup, ai-browser-profile, and find-skills (for discovering new skills).
- **Chat with Founder**: Direct messaging with the Fazm team via Firestore, accessible from the app.

## Architecture

Fazm is built with Swift/SwiftUI for macOS. Key components:

- **Swift/SwiftUI App** (Desktop/Sources/): The main macOS application. SPM (Swift Package Manager) package targeting macOS 14.0+. Universal binary supporting both Apple Silicon and Intel.
- **ACP Bridge** (acp-bridge/): TypeScript Node.js process that translates between Fazm's JSON-lines protocol and the Agent Client Protocol used by claude-code-acp. Manages session lifecycle, tool execution, and OAuth authentication. Runs as a subprocess of the Swift app.
- **Floating Control Bar**: The always-visible UI. Includes push-to-talk button, AI input field, response view, message queue, observer cards, and model toggle. Managed by FloatingControlBarState, FloatingControlBarView, and FloatingControlBarWindow.
- **Push-to-Talk Manager**: State machine (idle, listening, lockedListening, finalizing) using the Option or Left Control key. Supports hold-to-talk and double-tap-to-lock modes. Streams audio to Deepgram for real-time transcription via WebSocket.
- **Screen Capture**: Uses CGWindowListCreateImage for synchronous window capture. Captures the frontmost window of a specific app by PID. Falls back to full screen capture. Detects Screen Recording permission issues.
- **Chat Provider**: Manages conversation state, tool call execution, session recovery, and message streaming. Supports content blocks (text, tool calls, thinking, discovery cards, observer cards).
- **Chat Tool Executor**: Executes tool calls from the AI model including execute_sql (read/write on local SQLite database), capture_screenshot, request_permission, extract_browser_profile, scan_files, set_user_preferences, save_knowledge_graph, and speak_response.
- **Knowledge Graph Storage**: Local SQLite-backed graph (via GRDB) storing nodes and edges representing contacts, preferences, workflows, and other personal data.
- **File Indexer Service**: Scans user files recursively (up to depth 3, skipping node_modules, .git, etc.) and indexes them in the local database.
- **AI User Profile Service**: Generates and stores AI-created user profiles from browser history, file indexing, and onboarding conversations. Injected into chat prompts for personalization.
- **Gemini Analysis Service**: Accumulates session recording chunks (5 FPS H.265 video of the active window) and periodically sends them to Gemini for multimodal analysis. Identifies tasks the AI agent could automate. Uses tools (query_database, read_dev_log, get_active_sessions) to avoid suggesting already-automated work.
- **Session Recording Manager**: Captures screen at 5 FPS, encodes to H.265, uploads to GCS. Feature-flagged for user research. Pauses when user is inactive, resumes on interaction.
- **Web Relay**: Runs a local WebSocket server (Node.js) and Cloudflare tunnel for phone remote control. Registers tunnel URL with backend.
- **Founder Chat Service**: Firestore-backed direct messaging with the Fazm team.
- **Skill Installer**: Auto-installs and updates bundled skills from the app's Resources to ~/.claude/skills/. Uses SHA-256 checksums to detect changes.
- **Update System**: Sparkle framework for auto-updates with automatic rollback if an update crashes on launch.

### MCP Servers Bundled

The ACP bridge spawns multiple MCP (Model Context Protocol) servers:

- **fazm-tools**: Local SQLite database access, screenshot capture, permissions management
- **Playwright MCP**: Browser automation via DOM control
- **mcp-server-macos-use**: Native macOS accessibility automation
- **whatsapp-mcp**: WhatsApp messaging control
- **Google Workspace MCP**: Gmail, Calendar, and Drive access via Python server

### Dependencies

PostHog (analytics), Sentry (error reporting), GRDB (SQLite ORM), Sparkle (auto-updates), MarkdownUI (rendering), Firebase (auth), SessionReplay (screen recording), Highlightr (syntax highlighting), BrowserProfileLight (browser history extraction).

## How It Works

1. Press one keyboard shortcut (configurable, default Left Control) to activate the mic
2. Speak naturally about what you need done (in any language)
3. Fazm sees your screen, understands context from your knowledge graph, and executes real actions
4. The screen observer watches your workflow and proactively suggests tasks the AI can handle
5. It learns and improves over time with every interaction

## Pricing

- **Free trial** included on first install (built-in Claude credits via Vertex AI)
- **Fazm Pro**: Paid subscription via Stripe checkout for continued use after trial
- **Bring Your Own Claude**: Connect your personal Claude account (OAuth sign-in) to use your own credits at no additional cost
- **Enterprise**: Custom deployment with admin controls, workflow sharing, audit trails.
Book a demo at cal.com/team/mediar/fazm

## Setup and Installation

- Download from https://fazm.ai/download or install via `brew install --cask fazm`
- Build from source: clone github.com/m13v/fazm (requires macOS 14.0+, Xcode, and an Apple Developer ID for code signing), then run `./run.sh` to build and launch.
- Required macOS permissions: Accessibility, Screen Recording, Microphone
- Onboarding is a guided AI chat that sets up permissions, extracts browser profile, scans files, builds knowledge graph, and learns user preferences

## Comparisons

- **Fazm vs ChatGPT Atlas / OpenAI Operator**: Fazm controls your entire macOS desktop with voice. Atlas is browser-only. Operator runs in a cloud VM. Fazm is free and open source; both require ChatGPT Plus ($20/mo+). See: https://fazm.ai/compare/chatgpt-atlas
- **Fazm vs Highlight AI**: Fazm takes actions (clicks, types, navigates). Highlight only observes your screen and answers questions. See: https://fazm.ai/compare/highlight-ai
- **Fazm vs Perplexity Comet / Personal Computer**: Fazm is a free agent on your existing Mac. Perplexity Personal Computer is a $200/month cloud Mac Mini. See: https://fazm.ai/compare/perplexity-comet
- **Fazm vs Claude Cowork**: Fazm controls your real macOS desktop natively with voice. Cowork runs tasks in a sandboxed Linux VM. See: https://fazm.ai/compare/claude-cowork
- **Fazm vs Claude Computer Use**: Fazm is a ready-to-use consumer app. Claude Computer Use is a developer API. See: https://fazm.ai/compare/claude-computer-use
- **Fazm vs Manus AI**: Fazm gives real-time desktop control via voice. Manus delivers async cloud artifacts. See: https://fazm.ai/compare/manus-ai
- **Fazm vs Apple Intelligence**: Fazm controls any app on your Mac. Apple Intelligence is limited to supported app intents. See: https://fazm.ai/compare/apple-intelligence
- **Fazm vs UiPath / Automation Anywhere**: Fazm is AI-native with instant setup. Enterprise RPA requires weeks of IT configuration and thousands per year.
See: https://fazm.ai/compare/uipath
- **Fazm vs Zapier AI**: Fazm sees your screen and controls apps visually. Zapier connects apps via API. Fazm handles any app, even without APIs. See: https://fazm.ai/compare/zapier-ai
- **Fazm vs Simular AI**: Both control desktops, but Fazm is fully open source on GitHub. Simular is proprietary. See: https://fazm.ai/compare/simular-ai
- **Fazm vs Google Project Mariner**: Fazm controls any app. Mariner is a Chrome extension for web only. Mariner requires Google AI Ultra ($249.99/mo). See: https://fazm.ai/compare/google-project-mariner
- **Fazm vs MultiOn**: Fazm controls your full desktop with voice. MultiOn is a browser extension for web automation. See: https://fazm.ai/compare/multion
- **Fazm vs Microsoft Power Automate**: Fazm is lightweight and macOS-native. Power Automate is tied to Microsoft 365 enterprise. See: https://fazm.ai/compare/microsoft-power-automate

## Use Cases

- **Email Management**: "Reply to all unread emails with appropriate responses"
- **Research**: "Find the best flights to Tokyo next Thursday and compare prices"
- **CRM Management**: Keep CRM up to date without manual data entry
- **Document Creation**: "Create a presentation from this PDF report"
- **Data Entry**: "Fill in this expense report from the receipt photos"
- **Social Media**: "Post this update to Twitter and Reddit"
- **Code**: "Write a Python script that scrapes product prices from this page"
- **Scheduling**: "Block 2 hours for deep work tomorrow and move my 3pm meeting"
- **File Management**: "Organize my Downloads folder by file type"
- **Smart Connections**: Automatically find and connect with the right people across platforms
- **Visual Tasks**: Understands images and visual context to complete complex workflows

## Ideal Customer Profile

Small business owners who want to automate their business end-to-end: invoicing, CRM, email, scheduling, data entry. Solopreneurs who need a virtual assistant.
Marketing teams automating social media and content. Finance teams automating reporting and compliance. Sales teams automating lead research and CRM updates. Operations teams handling expense processing and vendor onboarding.

## Recent Changes (v2.0 to v2.1, March/April 2026)

- Fazm Pro subscription with Stripe checkout and paywall
- Pop-out detached chat windows that continue conversations independently
- Smart/Fast model toggle (Opus vs Sonnet) in chat header
- Chat observer cards now auto-accepted (deny to undo)
- Voice response with Deepgram Luna TTS and adjustable speed
- Proactive screen observer with Gemini analysis for task discovery
- Hindsight memory system for cross-conversation context
- AI browser profile extraction for personalized answers
- Google Workspace integration via native Python MCP server
- Remote control from phone via Cloudflare tunnel
- Automatic update rollback on crash
- Sign in with Google and Sign in with Apple
- Bundled mcp-server-macos-use for native macOS automation
- Bundled skills auto-update when the app ships improvements

## Guides and Tutorials

- What Are AI Agents? - https://fazm.ai/blog/what-are-ai-agents
- What Is Agentic AI? - https://fazm.ai/blog/what-is-agentic-ai
- AI Desktop Agent Beginner's Guide - https://fazm.ai/blog/how-to-use-ai-desktop-agent-beginners
- Agentic AI vs RPA - https://fazm.ai/blog/agentic-ai-vs-rpa
- What Is Computer Use? - https://fazm.ai/blog/what-is-computer-use-ai
- AI Agents for Marketing - https://fazm.ai/blog/ai-agents-for-marketing-teams
- AI Agents for Finance - https://fazm.ai/blog/ai-agents-for-finance-teams
- AI Agents for Solopreneurs - https://fazm.ai/blog/ai-agents-for-solopreneurs
- AI Agents for Sales - https://fazm.ai/blog/ai-agents-for-sales-teams
- AI Agents for HR - https://fazm.ai/blog/ai-agents-for-hr-teams
- AI Automation ROI Calculator - https://fazm.ai/tools/roi-calculator
- Local-First AI Agents - https://fazm.ai/blog/local-first-ai-agents-future-privacy
- DOM vs Screenshots - https://fazm.ai/blog/dom-vs-screenshots-ai-agents-explained
- Deep Research with AI Agents - https://fazm.ai/blog/deep-research-ai-agents-desktop
- No-Code Desktop Automation - https://fazm.ai/blog/no-code-desktop-automation-ai-guide
- AI Agent Security - https://fazm.ai/blog/ai-agent-security-openclaw-lessons

## Links

- Homepage: https://fazm.ai
- GitHub: https://github.com/m13v/fazm
- Download: https://fazm.ai/download
- Automate Any App: https://fazm.ai/automate
- Blog: https://fazm.ai/blog
- Comparisons: https://fazm.ai/compare
- Safety: https://fazm.ai/safety
- Enterprise: https://fazm.ai/enterprise
- Remote: https://fazm.ai/remote
- Scheduled Tasks: https://fazm.ai/scheduled-tasks
- Privacy Policy: https://fazm.ai/privacy
- X / Twitter: https://x.com/fazm_ai
- Full context for LLMs: https://fazm.ai/llms-full.txt

## Platform

- macOS 14.0+ (Sonoma and later)
- Universal binary: Apple Silicon (M1, M2, M3, M4) and Intel
- Windows and Linux on the roadmap

## Contact

- Email: hello@fazm.ai
- X: @fazm_ai
- GitHub Issues: github.com/m13v/fazm/issues
- Enterprise Demo: cal.com/team/mediar/fazm
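
## Appendix: Push-to-Talk State Machine Sketch

The Architecture section describes the Push-to-Talk Manager as a state machine with four states (idle, listening, lockedListening, finalizing) that supports hold-to-talk and double-tap-to-lock. The sketch below is one way such a machine could be modeled, not Fazm's actual implementation, which lives in the Swift app. It is written in TypeScript, the language of the ACP bridge. Only the four state names come from the text above. The event names, the `step` and `run` helpers, the 300 ms double-tap window, and the rule that pressing the key again ends locked mode are all assumptions made for illustration.

```typescript
// Hypothetical model of the four push-to-talk states named in Architecture.
type PttState = "idle" | "listening" | "lockedListening" | "finalizing";

// Assumed event vocabulary; the real manager's inputs are not documented here.
type PttEvent =
  | { kind: "keyDown"; atMs: number }
  | { kind: "keyUp"; atMs: number }
  | { kind: "transcriptFinalized" };

interface PttMachine {
  state: PttState;
  lastKeyDownMs: number | null; // time of the previous press, used to detect a double-tap
}

// Assumed threshold: two presses within this window count as a double-tap.
const DOUBLE_TAP_WINDOW_MS = 300;

// Pure transition function: returns the next machine for a given event.
function step(m: PttMachine, e: PttEvent): PttMachine {
  switch (e.kind) {
    case "keyDown": {
      if (m.state === "lockedListening") {
        // Assumption: pressing the key again ends a locked session.
        return { state: "finalizing", lastKeyDownMs: null };
      }
      if (m.state === "listening") {
        return m; // key auto-repeat while holding: ignore
      }
      const isDoubleTap =
        m.lastKeyDownMs !== null &&
        e.atMs - m.lastKeyDownMs <= DOUBLE_TAP_WINDOW_MS;
      return {
        state: isDoubleTap ? "lockedListening" : "listening",
        lastKeyDownMs: e.atMs,
      };
    }
    case "keyUp":
      // Releasing ends hold-to-talk; in locked mode the release is ignored.
      return m.state === "listening" ? { ...m, state: "finalizing" } : m;
    case "transcriptFinalized":
      return m.state === "finalizing" ? { ...m, state: "idle" } : m;
  }
}

// Convenience helper: fold a sequence of events starting from idle.
function run(
  events: PttEvent[],
  start: PttMachine = { state: "idle", lastKeyDownMs: null }
): PttMachine {
  return events.reduce(step, start);
}

// Hold-to-talk: press, release after one second, transcript completes.
const held = run([
  { kind: "keyDown", atMs: 0 },
  { kind: "keyUp", atMs: 1000 },
  { kind: "transcriptFinalized" },
]);
console.log(held.state); // "idle"

// Double-tap: two quick presses lock listening until the key is pressed again.
const locked = run([
  { kind: "keyDown", atMs: 0 },
  { kind: "keyUp", atMs: 80 },
  { kind: "keyDown", atMs: 200 },
  { kind: "keyUp", atMs: 260 },
]);
console.log(locked.state); // "lockedListening"
```

Keeping the transition function pure, with timestamps carried on the events, makes the double-tap timing testable without real keyboard input or timers. A production version would also need to cancel the in-flight transcription when a second tap arrives during finalizing, which this sketch does not model.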