Codex-Like Functionality with Local Ollama - Qwen 3 32B Is the Sweet Spot
You do not need a cloud subscription to get coding agent capabilities. Running Qwen 3 32B locally through Ollama on an M-series Mac gives you surprisingly competent code generation, tool calling, and multi-step reasoning - all on your own hardware.
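Once the model is pulled (`ollama pull qwen3:32b`), the local server exposes a plain HTTP API. A minimal sketch in Python, assuming Ollama's default port and the `qwen3:32b` model tag (check `ollama list` for the exact tag on your machine):

```python
# Minimal chat call against a local Ollama server.
# Assumptions: Ollama is running (`ollama serve`) on the default port,
# and the model tag is `qwen3:32b` - adjust to whatever `ollama list` shows.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"

def build_payload(prompt: str, model: str = "qwen3:32b") -> dict:
    """Request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one JSON response instead of a token stream
    }

def chat(prompt: str) -> str:
    """Send a single-turn prompt and return the model's reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example (requires a running Ollama server):
# print(chat("Write a Python function that reverses a string."))
```

Setting `"stream": False` trades responsiveness for simplicity; for interactive use you would stream tokens instead.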
Why 32B Is the Sweet Spot for M-Series
The 32B parameter count hits a specific hardware balance on Apple Silicon. At the roughly 4-bit quantization Ollama ships by default, the weights occupy around 18-20GB, so on an M2 Pro or M3 Pro with 32GB of unified memory the model loads with enough headroom for a comfortable context window. Inference runs at 10-15 tokens per second - not blazing fast, but fast enough for interactive use.
Go smaller (7B-14B) and you lose the reasoning quality that makes coding agents useful. The model cannot hold complex codebases in context or reason about multi-file changes. Go larger (70B+) and you need 64GB+ RAM, inference drops below 5 tokens per second, and the experience becomes frustrating.
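The sizing argument above can be checked with back-of-envelope arithmetic. A rough sketch, where the 4.5 bits-per-weight figure (quantized weights plus scales) and the 2GB overhead for KV cache and runtime are assumptions, not measurements:

```python
# Back-of-envelope memory math for running quantized LLMs locally.
# Numbers are approximations; real memory use varies with context
# length and quantization format (something like Q4_K_M assumed here).

def model_memory_gb(params_billion: float, bits_per_weight: float = 4.5,
                    overhead_gb: float = 2.0) -> float:
    """Rough resident memory: quantized weights plus fixed overhead."""
    weights_gb = params_billion * 1e9 * (bits_per_weight / 8) / 1e9
    return weights_gb + overhead_gb

for size in (7, 14, 32, 70):
    print(f"{size}B -> ~{model_memory_gb(size):.0f} GB")
```

By this estimate a 32B model fits a 32GB machine with room to spare, while a 70B model does not - which is exactly the gap the paragraph above describes.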
32B is where the model is smart enough to be useful and the hardware is fast enough to be practical.
What You Actually Get
With the right setup, local Qwen 3 32B handles code generation from descriptions, bug identification and fixes, test writing, refactoring, and basic multi-step coding tasks. It does not match Claude Opus or GPT-4 on complex architectural decisions, but for the 80% of coding tasks that are straightforward, it works.
The key is pairing it with good tool definitions. Give the model access to file reading, file writing, and command execution through MCP servers, and it behaves like a coding agent - not just a chatbot.
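A sketch of what those tool definitions can look like, using the OpenAI-style function schema that Ollama's `/api/chat` endpoint accepts in its `tools` field. The MCP wiring is omitted here; plain local functions stand in for MCP servers, and the tool names are illustrative, not a fixed convention:

```python
# Tool definitions for a local coding agent, plus a dispatcher that
# executes the tool calls the model returns. In a real agent these
# would be backed by MCP servers; here they are direct local calls.
import pathlib
import subprocess

TOOLS = [
    {"type": "function", "function": {
        "name": "read_file",
        "description": "Read a text file and return its contents.",
        "parameters": {"type": "object",
                       "properties": {"path": {"type": "string"}},
                       "required": ["path"]}}},
    {"type": "function", "function": {
        "name": "write_file",
        "description": "Write text to a file, creating it if needed.",
        "parameters": {"type": "object",
                       "properties": {"path": {"type": "string"},
                                      "content": {"type": "string"}},
                       "required": ["path", "content"]}}},
    {"type": "function", "function": {
        "name": "run_command",
        "description": "Run a shell command and return its output.",
        "parameters": {"type": "object",
                       "properties": {"command": {"type": "string"}},
                       "required": ["command"]}}},
]

def dispatch(name: str, args: dict) -> str:
    """Execute one tool call from the model's tool_calls response."""
    if name == "read_file":
        return pathlib.Path(args["path"]).read_text()
    if name == "write_file":
        pathlib.Path(args["path"]).write_text(args["content"])
        return "ok"
    if name == "run_command":
        out = subprocess.run(args["command"], shell=True,
                             capture_output=True, text=True)
        return out.stdout + out.stderr
    return f"unknown tool: {name}"
```

The agent loop is then: send the conversation plus `TOOLS`, execute each returned tool call through `dispatch`, append the results as tool messages, and repeat until the model answers in plain text.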
The Privacy and Cost Advantage
Every line of code stays on your machine. No API calls, no usage limits, no monthly bills. For teams working with proprietary codebases or regulated industries, this is not a nice-to-have. It is a compliance requirement.
The initial hardware investment (a well-specced Mac) pays for itself in two to three months of saved API costs if you are a heavy user.
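That break-even claim is easy to sanity-check. A sketch with illustrative figures - both the hardware price and the monthly API spend are assumptions, not quoted prices; plug in your own:

```python
# Hedged break-even arithmetic for the hardware-vs-API-cost claim.
# Both numbers below are illustrative assumptions.
mac_cost = 2500.0          # assumed price of a 32GB M-series Mac (USD)
monthly_api_spend = 900.0  # assumed heavy-user API bill (USD/month)

breakeven_months = mac_cost / monthly_api_spend
print(f"Break-even after ~{breakeven_months:.1f} months")
```

Under these assumptions the machine pays for itself in under three months; lighter API usage stretches the payback period accordingly.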
Fazm is an open-source macOS AI agent, available on GitHub.