Open Source Computer Use Agent GitHub Repos Worth Watching in 2026
If you search GitHub for "computer use agent" today, you get hundreds of results. Most are abandoned after a weekend hackathon. A few have real momentum, active maintainers, and production users. This guide separates the signal from the noise by looking at what actually matters: commit frequency, issue response time, release cadence, and whether the thing works on a real desktop.
We tracked 15+ repos through Q1 2026 and narrowed it down to the ones worth your time.
Why GitHub Repo Health Matters for Agent Projects
Picking an AI agent based on a demo video is a mistake. Demo videos show cherry-picked runs. The GitHub repo tells you the truth: Is the project still maintained? Are bugs getting fixed? Does the team ship releases, or just push to main and hope?
For computer use agents specifically, repo health is even more important than for typical open source software. Operating systems change constantly. macOS Sequoia broke half the accessibility APIs that agents relied on. Windows 11 24H2 changed the UI Automation tree structure. If a project's last commit was three months ago, it probably doesn't work on the latest OS update.
Note
All GitHub stats in this post were collected in early April 2026. Star counts and commit activity change daily, so treat these as directional indicators, not exact figures.
Top Open Source Computer Use Agent Repos on GitHub
Here is our shortlist of the repos that are actively maintained, have meaningful community traction, and solve real problems.
| Project | GitHub Stars | Commit Cadence | Primary OS | Architecture | License |
|---|---|---|---|---|---|
| Fazm | 3,000+ | Daily | macOS | Accessibility API + Vision | MIT |
| Browser Use | 55,000+ | Daily | Cross-platform | Browser CDP | MIT |
| Open Interpreter | 58,000+ | Weekly | Cross-platform | Code execution | AGPL-3.0 |
| UFO | 10,000+ | Weekly | Windows | UI Automation | MIT |
| OS-Copilot | 3,500+ | Monthly | Linux | Mixed | Apache-2.0 |
| OpenAdapt | 2,000+ | Weekly | Cross-platform | Screen recording + replay | MIT |
| Computer Use OOTB | 2,000+ | Monthly | Cross-platform | Anthropic API wrapper | Apache-2.0 |
Fazm
Fazm takes the accessibility-first approach on macOS. Instead of sending screenshots to a vision model and waiting 2-3 seconds per action, it reads the native accessibility tree to understand what's on screen, then uses targeted vision only when the accessibility data is incomplete. The result is sub-second action cycles on most tasks.
The project ships as a native macOS app with a one-click installer. To build from source instead, clone the repo and run cargo build --release. No Docker, no Python environment, no configuration files to edit. The README walks you through setup in under two minutes.
What makes the repo stand out on GitHub:
Browser Use
Browser Use is the most starred browser automation agent on GitHub. It wraps Playwright with an LLM layer that interprets page content and decides what to click, type, or scroll. The project moved fast in 2025 and now supports multi-tab workflows, file downloads, and form filling.
The GitHub repo is well-organized with a clear contributing guide, automated CI, and tagged releases. The issue tracker is active, though response times vary since the maintainer team is small.
Open Interpreter
Open Interpreter lets an LLM run code on your machine. It is not a "computer use" agent in the strict sense (it does not click buttons or navigate UIs), but it shows up in every GitHub search for the term because it controls your computer through code execution. The repo has strong documentation, a plugin system, and a large contributor base.
UFO (Microsoft)
UFO is Microsoft's research project for Windows UI automation via LLMs. It reads the Windows UI Automation tree and uses GPT-4V to plan actions. The repo includes benchmark datasets and evaluation scripts, which is rare for agent projects. If you are building a Windows-specific agent, UFO's codebase is worth studying even if you do not use it directly.
How to Evaluate a Computer Use Agent Repo
Stars are a vanity metric. Here is what actually predicts whether a repo will still work six months from now:
Commit frequency
A repo with daily or weekly commits is alive. A repo with nothing for 60+ days is effectively dead for agent projects, because OS updates will have broken something in the meantime. Check the Contributors graph under the repo's "Insights" tab to see if commits come from multiple people or just one.
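As a rough sketch, the commit-age rule above fits in a few lines of Python. The 60-day cutoff comes from the paragraph above; the 7-day "active" cutoff and the three labels are our own assumptions, not an established standard.

```python
from datetime import datetime, timezone

STALE_AFTER_DAYS = 60  # cutoff taken from the text above
ACTIVE_WITHIN_DAYS = 7  # assumption: "weekly" commits count as active


def commit_status(last_commit: datetime, now: datetime) -> str:
    """Classify a repo by the age of its most recent commit."""
    age_days = (now - last_commit).days
    if age_days <= ACTIVE_WITHIN_DAYS:
        return "active"
    if age_days < STALE_AFTER_DAYS:
        return "slowing"
    return "stale"


if __name__ == "__main__":
    now = datetime(2026, 4, 1, tzinfo=timezone.utc)
    print(commit_status(datetime(2026, 1, 1, tzinfo=timezone.utc), now))
```

You can get the last commit date for a local clone with git log -1 --format=%cI and feed it to datetime.fromisoformat.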
Issue response time
Open the "Issues" tab, sort by newest, and look at how fast maintainers respond. A healthy project responds within 48 hours, even if the response is just a label or a "we're looking into this." Projects where issues sit unanswered for weeks will leave you stuck when you hit a bug.
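If you want a number instead of a gut feeling, here is a minimal sketch that summarizes first-response times. It assumes you have already collected (opened_at, first_response_at) pairs yourself, for example by reading the issue tracker by hand; the function name and the returned keys are illustrative, and the 48-hour target is the one from the text above.

```python
from datetime import datetime
from statistics import median
from typing import Optional

TARGET_HOURS = 48.0  # response window suggested in the text above


def response_health(issues: list[tuple[datetime, Optional[datetime]]]) -> dict:
    """Summarize first-response times for (opened_at, first_response_at) pairs.

    An issue with no response yet has None as its second element.
    """
    hours = [
        (responded - opened).total_seconds() / 3600
        for opened, responded in issues
        if responded is not None
    ]
    median_hours = median(hours) if hours else None
    return {
        "median_hours": median_hours,
        # Unanswered issues are counted separately so they are not hidden
        # by a good-looking median.
        "unanswered": len(issues) - len(hours),
        "healthy": median_hours is not None and median_hours <= TARGET_HOURS,
    }
```

The median is used rather than the mean so that one issue ignored for months does not swamp an otherwise responsive tracker.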
Release cadence
Tagged releases with changelogs mean the maintainers care about stability. If the only way to get the latest version is git pull origin main, expect breakage. Look for semantic versioning and a CHANGELOG file.
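To check whether a repo's tags look like semantic versions, a small filter is enough. This is a simplified sketch: the pattern accepts an optional leading "v" (a common tagging convention, not part of the SemVer spec) and does not enforce every SemVer rule, such as the ban on leading zeros.

```python
import re

# Simplified MAJOR.MINOR.PATCH pattern with optional pre-release and build
# metadata. The leading "v" is a tagging convention, not part of SemVer.
SEMVER_TAG = re.compile(
    r"^v?(\d+)\.(\d+)\.(\d+)(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def semver_tags(tags: list[str]) -> list[str]:
    """Return only the tags that look like semantic versions."""
    return [tag for tag in tags if SEMVER_TAG.match(tag)]


if __name__ == "__main__":
    print(semver_tags(["v1.2.0", "1.0.0-rc.1", "latest", "release-3", "2.1"]))
```

Feed it the output of git tag --list, split into lines. A repo where most tags pass the filter is probably versioning deliberately.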
CI/CD pipeline
Check for a .github/workflows/ directory. Projects with automated tests catch regressions before they ship. Projects without CI are shipping on faith.
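On a local clone, the directory check can be scripted. This sketch only confirms that workflow files exist; it says nothing about whether they actually run tests, so open one or two and read them.

```python
from pathlib import Path


def workflow_files(repo_root: Path) -> list[Path]:
    """List GitHub Actions workflow files in a cloned repo, if any."""
    workflows = repo_root / ".github" / "workflows"
    if not workflows.is_dir():
        return []
    # Workflow definitions are YAML files with a .yml or .yaml extension.
    return sorted(p for p in workflows.iterdir() if p.suffix in {".yml", ".yaml"})


if __name__ == "__main__":
    for path in workflow_files(Path(".")):
        print(path)
```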
Contributor diversity
A single-maintainer project is a bus-factor-one risk. Check the contributors graph. If 90% of commits come from one person, the project dies when that person gets a new job or burns out.
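The 90% rule above is easy to compute once you have commit counts per author, which git shortlog -sn prints for any clone. A minimal sketch, with hypothetical function names and the threshold taken from the paragraph above:

```python
def top_contributor_share(commits_by_author: dict[str, int]) -> float:
    """Fraction of all commits made by the single most active author."""
    total = sum(commits_by_author.values())
    if total == 0:
        return 0.0
    return max(commits_by_author.values()) / total


def is_bus_factor_one(commits_by_author: dict[str, int], threshold: float = 0.9) -> bool:
    """True when one author accounts for at least `threshold` of the commits."""
    return top_contributor_share(commits_by_author) >= threshold


if __name__ == "__main__":
    print(is_bus_factor_one({"alice": 95, "bob": 5}))
```

Commit counts are a crude proxy: one author may squash work or commit on behalf of others, so treat the result as a prompt to look closer, not a verdict.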
Architecture Patterns Across GitHub Agent Projects
Across the repos we reviewed, we see three dominant architectures for computer use agents in 2026:
| Architecture | How It Works | Strengths | Weaknesses |
|---|---|---|---|
| Screenshot + Vision | Takes screenshots, sends to multimodal LLM, receives click coordinates | Works on any OS, no API access needed | Slow (2-5s per action), expensive, fragile to resolution changes |
| Accessibility API | Reads the OS accessibility tree to get UI element metadata | Fast (under 200ms per action), precise, low cost | OS-specific, not all apps expose accessibility data |
| Hybrid (Accessibility + Vision) | Uses accessibility tree as primary, falls back to vision for gaps | Best of both worlds | More complex codebase, still OS-specific |
Most of the repos that gained traction in early 2026 moved toward the hybrid approach. Pure screenshot agents are too slow and too expensive for real workflows. Pure accessibility agents miss elements that apps don't expose properly (looking at you, Electron).
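The hybrid pattern boils down to "try the cheap lookup first, pay for vision only on a miss." The sketch below shows just that control flow. The Element type and both locator callables are hypothetical stand-ins; it is not taken from any of the repos above, and a real agent would back them with an OS accessibility API and a multimodal model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Element:
    label: str
    x: int
    y: int
    source: str  # "accessibility" or "vision"


# A locator takes a UI label and returns the element, or None on a miss.
Locator = Callable[[str], Optional[Element]]


def locate(label: str, accessibility: Locator, vision: Locator) -> Optional[Element]:
    """Query the accessibility tree first; fall back to vision on a miss."""
    found = accessibility(label)
    if found is not None:
        return found
    # Only reached when the app did not expose the element.
    return vision(label)
```

Keeping both locators behind the same signature means the rest of the agent does not need to know which path found the element, though recording the source is handy for debugging and cost tracking.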
Getting Started: Cloning and Running Your First Agent
Here is the fastest path from "I found a repo" to "it's running on my machine" for the top projects.
Fazm (macOS)
git clone https://github.com/m13v/fazm.git
cd fazm
cargo build --release
# Grant Accessibility permission when prompted
./target/release/fazm
Requirements: macOS 14+, Rust toolchain, an Anthropic or OpenAI API key in your environment.
Browser Use (Cross-platform)
git clone https://github.com/browser-use/browser-use.git
cd browser-use
pip install -e .
playwright install chromium
python examples/simple.py
Requirements: Python 3.11+, a supported LLM API key.
UFO (Windows)
git clone https://github.com/microsoft/UFO.git
cd UFO
pip install -r requirements.txt
python -m ufo --task "Open Notepad and type hello"
Requirements: Windows 11, Python 3.10+, GPT-4V API access.
Common Pitfalls When Choosing a GitHub Agent Repo
- Picking by star count alone. Stars measure marketing, not quality. Some of the best agent repos have under 5,000 stars because their maintainers spend time on code instead of Twitter threads.
- Ignoring the license. AGPL-3.0 is a strong copyleft license: derivative works must also be released under AGPL-3.0, and if you run a modified version that users interact with over a network, you must offer those users the source. This is a dealbreaker for many commercial use cases. MIT and Apache-2.0 are permissive. Check before you build on top of a project.
- Forking instead of contributing. If you need a small change, open a PR instead of maintaining a private fork. Forks drift from upstream fast, and you lose access to bug fixes and new features.
- Not checking hardware requirements. Some vision-based agents need a GPU for local model inference. Others need 16GB+ RAM to run the vision model locally. Read the README's requirements section before investing time in setup.
- Assuming cross-platform means equal quality. A repo that says "works on macOS, Windows, and Linux" usually works well on one platform and barely on the others. Check the issue tracker for your specific OS.
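For the license pitfall in particular, a small lookup can flag repos that need a closer read before you build on them. This is a rough sketch, not legal advice: the category names are our own, and the sets cover only a handful of common SPDX-style identifiers.

```python
# Category names and set contents are illustrative assumptions.
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause"}
NETWORK_COPYLEFT = {"AGPL-3.0", "AGPL-3.0-only", "AGPL-3.0-or-later"}


def license_risk(license_id: str) -> str:
    """Bucket a license identifier for a first-pass review."""
    if license_id in PERMISSIVE:
        return "permissive"
    if license_id in NETWORK_COPYLEFT:
        return "network-copyleft"
    # Anything unrecognized should be read by a human.
    return "review"


if __name__ == "__main__":
    for name in ("MIT", "AGPL-3.0", "Custom"):
        print(name, license_risk(name))
```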
Warning
Several popular agent repos bundle API keys or tokens in their default configuration. Before running any agent, review the .env.example file and ensure you are not accidentally exposing your own credentials. Never commit API keys to a public fork.
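One way to act on this warning is to scan an env-style file for sensitive-looking variables that already have real values. The sketch below is a heuristic under stated assumptions: the name patterns and placeholder list are ours, it only understands simple NAME=value lines, and it is no substitute for a dedicated secret scanner.

```python
import re

# NAME=value lines, with an optional leading "export".
ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$")
# Assumption: variable names containing these words hold credentials.
SENSITIVE = re.compile(r"KEY|TOKEN|SECRET|PASSWORD", re.IGNORECASE)
# Assumption: these values are harmless placeholders.
PLACEHOLDERS = {"", "changeme", "your-api-key-here"}


def filled_secrets(env_text: str) -> list[str]:
    """Return names of sensitive-looking variables that have a real value."""
    names = []
    for line in env_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comments
        match = ASSIGNMENT.match(line)
        if not match:
            continue
        name = match.group(1)
        value = match.group(2).strip().strip("\"'")
        if SENSITIVE.search(name) and value.lower() not in PLACEHOLDERS:
            names.append(name)
    return names
```

Run it over .env.example before you trust a repo's defaults, and over your own .env-style files before you push a fork. Any name it returns deserves a second look.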
What to Watch for in the Second Half of 2026
The computer use agent space on GitHub is moving fast. Three trends to track:
- MCP (Model Context Protocol) adoption. Anthropic's protocol for tool use is becoming the standard way agents interact with external services. Repos that adopt MCP will be easier to extend and compose with other tools.
- Local model support. As smaller vision models (Qwen-VL, LLaVA) improve, more repos are adding support for local inference. This eliminates API costs and latency for simple tasks.
- OS vendor integration. Apple's on-device AI features in macOS and Microsoft's Copilot runtime are changing what's possible without third-party agents. Watch for repos that integrate with these native capabilities rather than fighting against them.
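To make the MCP trend concrete: MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with the tools/call method. The sketch below only builds that request as a string. The tool name click_element and its arguments are hypothetical, and a real client would use an MCP SDK and handle initialization and transport rather than hand-rolling messages.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


if __name__ == "__main__":
    # "click_element" is a made-up tool name used for illustration only.
    print(tool_call_request(1, "click_element", {"label": "Save"}))
```

Because every tool sits behind the same message shape, an agent that speaks MCP can pick up new capabilities by connecting to another server instead of adding bespoke integration code.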
Wrapping Up
The best open source computer use agent for you depends on your OS, your use case, and how much you trust a project's maintainers to keep shipping. GitHub gives you all the signals you need: commit history, issue tracker, CI status, license, and contributor graph. Use them instead of relying on demo videos or star counts.
Fazm is an open source macOS AI agent that uses accessibility APIs for fast, reliable desktop automation. Open source on GitHub.