Architecture-Spec-Driven AI Coding: CLAUDE.md, Guardrails, and Production Survival
The vibeArchitecture movement started from a real observation: AI can write code fast, but without architectural guardrails, it writes code that works locally and dies in production. The gap between a demo and a deployable system is mostly an architecture gap. This guide covers how to write architecture specs that make AI coding reliable, the CLAUDE.md pattern for encoding project knowledge, and specific strategies for native apps where the local-to-production gap is widest.
1. The Local-to-Production Gap
AI coding tools like Claude Code, Cursor, and Copilot have solved the "can AI write code?" question. The answer is yes, and often quite well. But they have exposed a deeper problem: AI writes code that satisfies the immediate prompt without considering the broader system context.
Common manifestations of the gap:
- Works on happy path, fails on edge cases - AI generates the straightforward implementation but misses error handling, timeout logic, retry mechanisms, and graceful degradation.
- Ignores existing patterns - The codebase uses a specific state management pattern, logging convention, or error handling approach. AI introduces a different pattern because it was not told about the existing one.
- Creates security vulnerabilities - AI prioritizes making code work over making it secure. SQL injection, unvalidated input, exposed credentials, and insecure defaults are common in AI-generated code.
- Performance blindness - AI does not consider N+1 queries, unnecessary re-renders, memory leaks, or O(n^2) algorithms unless specifically prompted about performance.
- Deployment incompatibilities - Code works in the development environment but fails in production due to different OS versions, missing dependencies, permission restrictions, or network configurations.
The solution is not to stop using AI coding tools. It is to provide the AI with the architectural context it needs to make good decisions. This is where spec-driven development comes in.
2. The CLAUDE.md Pattern: Encoding Project Knowledge
CLAUDE.md is a convention where you place a markdown file at the root of your project that Claude Code (and other tools) reads automatically before starting any task. It serves as the "tribal knowledge" document - everything a new team member (or AI agent) needs to know about the project.
An effective CLAUDE.md contains:
- Project overview - What the project does, its architecture (monolith, microservices, monorepo), and the primary tech stack.
- Conventions and patterns - Naming conventions, directory structure, state management patterns, error handling approaches, logging standards.
- Do not do this - Explicit anti-patterns. "Never use var, always use const/let." "Never query the database directly from route handlers." "Never store secrets in code."
- Build and test commands - How to build, test, lint, and deploy. AI agents need to verify their own work.
- External dependencies - APIs, services, and third-party tools the project depends on, including version constraints.
- Known issues and workarounds - Bugs, limitations, and temporary hacks that an AI might try to "fix" without understanding the context.
Cursor uses a similar concept with .cursorrules. Copilot uses .github/copilot-instructions.md. The principle is the same across tools: give the AI the project context before it starts writing code.
The CLAUDE.md does not need to be perfect. Start with 20-30 lines covering the most important conventions. Add to it every time the AI makes a mistake that a better spec would have prevented. Over weeks, it becomes a comprehensive guide that dramatically improves AI output quality.
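A minimal starting point might look like the sketch below. Every project name, path, and command in it is illustrative - substitute your own stack and conventions:

```markdown
# CLAUDE.md

## Overview
Monorepo: TypeScript API (/apps/api) + React web client (/apps/web).

## Conventions
- Errors: throw AppError subclasses; never return raw error strings
- State: Zustand stores in /src/stores; no Redux
- Logging: use the logger in /src/lib/log.ts, never console.log

## Do not
- Never store secrets in code; read them from environment variables
- Never query the database directly from route handlers

## Commands
- Build: pnpm build | Test: pnpm test | Lint: pnpm lint
```

Even a stub this small prevents entire categories of mistakes, because the AI reads it before every task.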
3. Writing Effective Architecture Specs
Beyond CLAUDE.md, specific architecture specs for new features or components significantly improve AI coding output. A good spec answers three questions:
- What are we building? - Clear description of the feature, including user-facing behavior, data flow, and integration points.
- What are the constraints? - Performance requirements, security requirements, compatibility requirements, and any technical debt to work around.
- How should it be structured? - Module boundaries, interface definitions, data models, and error handling strategy.
A spec for an AI coding agent should be more explicit than one for a human developer. Humans infer conventions from context. AI agents follow instructions literally. If you want the AI to use a specific error handling pattern, show the pattern. If you want it to avoid a specific library, say so explicitly.
Example spec structure:
```markdown
## Feature: User notification preferences

### Requirements
- Users can configure notification channels (email, push, slack)
- Each channel has independent on/off toggle and frequency setting
- Changes are persisted immediately (optimistic UI)
- Must work offline (queue changes, sync when online)

### Architecture
- New NotificationPreferences component in /src/components/settings/
- State managed via existing useSettings hook (NOT local state)
- API calls through existing /api/v2/preferences endpoint
- Use existing Toast component for feedback, not alert()

### Constraints
- Do not add new dependencies
- Must support iOS 16+ (no iOS 17-only APIs)
- Max 2 API calls per save operation
- Error states must be recoverable without page reload

### Anti-patterns to avoid
- Do not use useEffect for API calls (use useSWR mutation)
- Do not create new types - extend existing PreferenceTypes
- Do not bypass the API layer with direct database access
```
Teams that write specs before asking AI to code report 40-60% fewer iterations needed to reach production-quality code. The time spent writing the spec is recovered many times over.
4. Guardrails: Constraining AI Without Crippling It
Guardrails are automated checks that run during or after AI code generation to catch common issues. They operate at multiple levels:
- Pre-generation guardrails - Check that a spec or CLAUDE.md exists before the AI starts coding. Claude Code hooks can enforce this.
- In-flight guardrails - Check each file edit as it happens. Auto-run linting, type checking, and security scanning after every modification. Block changes to protected files (config, migrations, secrets).
- Post-generation guardrails - Run the full test suite, check for security vulnerabilities, verify the build still succeeds, and validate that no unintended files were modified.
Practical guardrail implementations:
- Claude Code hooks that run ESLint/Prettier after every file edit
- Pre-commit hooks that check for secrets, large files, and forbidden patterns
- Type checking (TypeScript strict mode, mypy, etc.) that the AI must satisfy
- Test coverage requirements - new code must include tests
- File permission checks - prevent edits to infrastructure, CI/CD, or deployment configs unless explicitly authorized
The goal is to make it easy for AI to write correct code and hard for it to introduce problems. Good guardrails do not slow the AI down - they redirect it toward better solutions faster than trial and error.
5. Special Considerations for Native Apps
The local-to-production gap is widest for native applications (macOS, iOS, Android). Web apps have relatively uniform deployment environments. Native apps face:
- OS version fragmentation - Your app needs to work on macOS 13, 14, and 15. AI defaults to using the latest APIs unless constrained.
- Entitlement and permission complexity - Accessibility, camera, microphone, location, contacts - each requires specific entitlements and runtime permission requests that AI often forgets.
- Code signing requirements - The code itself might be perfect but the app fails because of signing configuration issues.
- App Store guidelines - Apple's review guidelines impose constraints that AI does not know about unless told. Using private APIs, improper data collection, or insufficient privacy disclosures will cause rejection.
- Performance expectations - Native apps are expected to be fast. AI-generated SwiftUI views that trigger excessive re-renders or hold strong references causing memory leaks will be noticed by users.
Your architecture spec for native apps should explicitly include:
- Minimum deployment target (e.g., macOS 13.0, iOS 16.0)
- Required entitlements and their justifications
- Memory budget and performance constraints
- Thread safety requirements (main thread for UI, background for compute)
- Data persistence strategy (UserDefaults, Core Data, SwiftData, files)
Desktop agents like Fazm can help verify native app behavior by actually running the app and interacting with it through accessibility APIs - checking that buttons work, screens load, and the UI responds correctly. This automated verification catches issues that code review alone would miss.
6. Evolving Specs Over Time
Architecture specs are living documents. They should evolve based on:
- AI mistakes - Every time the AI produces code that violates a convention or introduces a bug, add a rule to prevent it. "Do not use setTimeout for debouncing - use the existing useDebouncedValue hook."
- New patterns - As the codebase evolves, new patterns emerge. Document them as they solidify so AI uses them consistently.
- Deprecated approaches - Mark old patterns as deprecated in the spec so AI does not copy them from existing code.
- Performance learnings - After profiling reveals bottlenecks, add performance notes. "This query pattern hits N+1. Always use .includes() for related data."
A healthy spec grows by 5-10 lines per week for an actively developed project. If it stops growing, either the project has stabilized or the team has stopped learning from AI output. Review the spec monthly to remove outdated entries and reorganize for clarity.
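In practice, rules from the categories above accumulate as short dated entries in the spec. The dates below are invented for illustration; the rules themselves echo the examples earlier in this section:

```markdown
## Rules added from AI mistakes
- 2025-01-14: Do not use setTimeout for debouncing; use useDebouncedValue.
- 2025-01-21: Queries for related data must use .includes() to avoid N+1.
- 2025-02-03: DEPRECATED - old retry helper; do not copy it from legacy code.
```

Dating entries makes the monthly review easier: stale rules stand out and can be pruned or consolidated.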
7. Team Patterns for Spec-Driven AI Coding
For teams, spec-driven AI coding requires coordination:
- Shared CLAUDE.md ownership - Anyone can add to the CLAUDE.md, but changes should be reviewed like code changes. A bad rule in the spec affects every AI-generated change.
- Spec-first PRs - For significant features, submit the architecture spec as the first PR. Review and approve the spec before any code is written (by AI or human). This front-loads the design discussion.
- AI output review process - AI-generated code should be reviewed with the spec open side-by-side. The review checks whether the code follows the spec, not just whether it works.
- Spec retrospectives - After each sprint, ask: "What did the AI get wrong that better specs would have prevented?" Use the answers to improve the specs.
- Per-module specs - For large projects, supplement the root CLAUDE.md with module-specific specs in subdirectories. The payment module has different constraints than the notification module.
The organizations getting the most value from AI coding are the ones that invest in specification quality. The AI model is a constant - everyone has access to the same models. The differentiator is the quality of the instructions you give it.
Spec-Driven AI That Controls Your Whole Desktop
Fazm reads your project specs and operates across code editor, browser, and native apps - keeping your architecture constraints intact.
Try Fazm Free