$25 Per PR Review Is Wild - Run Claude Code on the Diff Yourself
Anthropic launched their official Code Review feature on March 9, 2026. It deploys five parallel AI agents on every pull request, posts inline comments, and catches bugs before your team sees the PR. The pricing: $15 to $25 per review, scaling with PR size and complexity. A large PR over 1000 lines can push well above that ceiling.
That adds up fast. A team shipping 20 PRs a week runs about 85 reviews a month - $1,300 to $2,100 at list price - and if every push to a PR triggers a re-review, $2,000 to $10,000 a month is realistic before the feature leaves research preview.
The irony is that you can build the same core capability yourself with Claude Code and a custom skill in about an hour - and pay pennies per review instead of dollars.
The DIY Approach
The simplest version is three lines:
git diff main...HEAD > /tmp/pr-diff.txt
claude -p "Review this diff for bugs, security issues, and style violations. Be specific about line numbers." < /tmp/pr-diff.txt
That is the core of what the $25 service does. You pipe the diff into Claude with a review prompt and get back structured feedback. A typical PR diff runs 5,000 to 20,000 tokens, which costs roughly $0.02 to $0.08 at current API pricing.
The official tool adds multi-agent parallelism and GitHub integration, but for most teams those are nice-to-haves, not requirements.
Building a Custom Review Skill
A more structured approach uses a Claude Code skill - a markdown file that defines your team's specific review criteria. The skill lives at .claude/skills/review/SKILL.md, and Claude Code loads it whenever you ask for a review.
Here is a minimal version:
---
name: review
description: Review the current branch's diff for security, correctness, and performance issues
---
# PR Review Skill
Run `git diff main...HEAD` and review the output for:
## Security (flag as HIGH)
- Hardcoded secrets, API keys, passwords
- SQL queries built from string concatenation
- Input passed to shell commands without sanitization
- Path traversal: file paths built from user input
- Dependencies added without pinned versions
## Correctness (flag as MEDIUM)
- Error cases that are not handled
- Off-by-one errors in loops and slices
- Race conditions in concurrent code
- Null/undefined access without guards
## Performance (flag as LOW unless severe)
- N+1 database queries
- Missing indexes on filtered columns
- Unnecessary re-renders or recomputation
## Output format
For each issue: file path, line number, severity, description, suggested fix.
Summary at the end: total issues by severity.
You invoke it by asking for a review; Claude Code loads the skill when the request matches its description:
claude -p "Review the current branch for issues"
The output is structured, specific, and calibrated to your codebase. Because you write the checklist, the agent flags what matters to your team - not a generic rubric.
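Because the skill pins down an output format, the review is also machine-readable. A small parser - a sketch assuming each finding line carries one of the HIGH/MEDIUM/LOW tags the skill specifies - can tally findings, for example to fail CI above a severity threshold:

```python
# Sketch: tally review findings by severity, assuming each finding line
# contains one of the severity tags the skill's output format specifies.
import re
from collections import Counter

SEVERITIES = ("HIGH", "MEDIUM", "LOW")

def count_findings(review: str) -> Counter:
    """Count lines carrying a HIGH/MEDIUM/LOW severity tag."""
    counts = Counter()
    for line in review.splitlines():
        for sev in SEVERITIES:
            if re.search(rf"\b{sev}\b", line):
                counts[sev] += 1
                break  # at most one severity per finding line
    return counts

sample = "src/db.py:42 HIGH SQL built by concatenation\nsrc/ui.tsx:7 LOW extra re-render"
print(dict(count_findings(sample)))  # → {'HIGH': 1, 'LOW': 1}
```

A gate like `if count_findings(review)["HIGH"]: sys.exit(1)` then turns the review into a blocking check.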
Real Token Costs
Anthropic's review data shows large PRs over 1000 lines contain issues 84% of the time with an average of 7.5 findings. Small PRs under 50 lines have issues 31% of the time with 0.5 findings on average. Less than 1% of findings from the official tool were reported as incorrect.
Running the same analysis yourself:
| PR size | Approx tokens | API cost (Claude Sonnet) |
|---|---|---|
| Small (under 50 lines) | 3-5K | ~$0.01 |
| Medium (200 lines) | 10-15K | ~$0.04 |
| Large (1000+ lines) | 40-60K | ~$0.15 |
Even if you run three passes - one for security, one for logic, one for style - you stay under $0.50 for a large PR. Compare that to $25.
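The three passes can be driven from one small script that shells out to the claude CLI with a focused prompt per pass. The pass names and prompt wording below are illustrative, not prescribed:

```python
# Sketch: run three focused review passes over the same diff by piping it
# to `claude -p`, one focused prompt per pass. Prompts are illustrative.
import subprocess

PASSES = {
    "security": "Review this diff for security issues only: secrets, injection, unsafe input handling.",
    "logic": "Review this diff for correctness bugs only: unhandled errors, off-by-one, races.",
    "style": "Review this diff for style and maintainability issues only.",
}

def focused_prompt(focus: str) -> str:
    """Prompt for a single pass, with line-number guidance appended."""
    return PASSES[focus] + " Be specific about file paths and line numbers."

def run_passes(diff: str) -> dict[str, str]:
    """Run one claude invocation per pass; returns pass name -> review text."""
    results = {}
    for focus in PASSES:
        proc = subprocess.run(
            ["claude", "-p", focused_prompt(focus)],
            input=diff, capture_output=True, text=True, check=True,
        )
        results[focus] = proc.stdout
    return results
```

Each pass sees the whole diff, so the total cost is roughly three single-pass reviews.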
Integrating With GitHub Actions
The practical version runs automatically on every PR:
# .github/workflows/ai-review.yml
name: AI PR Review

on:
  pull_request:
    types: [opened, synchronize]

permissions:
  pull-requests: write  # lets the workflow post the review comment

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so origin/main...HEAD resolves

      - name: Generate diff
        run: git diff origin/main...HEAD > /tmp/pr-diff.txt

      - name: Run AI review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          pip install anthropic
          python scripts/review_pr.py /tmp/pr-diff.txt > /tmp/review-output.txt

      - name: Post review comment
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const review = fs.readFileSync('/tmp/review-output.txt', 'utf8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: review
            });
The Python script is 30 lines calling the Anthropic API with your diff. The GitHub Action posts the output as a PR comment. Total setup time: one afternoon.
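A sketch of that script, assuming the anthropic Python SDK; the model id and prompt wording are illustrative choices, not taken from the official tool:

```python
# scripts/review_pr.py - minimal sketch of the review script.
# Assumes ANTHROPIC_API_KEY is set in the environment; the model id and
# prompt below are illustrative, not the official tool's internals.
import sys

REVIEW_PROMPT = (
    "Review this diff for bugs, security issues, and style violations. "
    "For each issue give file path, line number, severity (HIGH/MEDIUM/LOW), "
    "a description, and a suggested fix. End with issue counts by severity."
)

def build_messages(diff: str) -> list[dict]:
    """Package the review instructions and the diff into a messages payload."""
    return [{"role": "user", "content": f"{REVIEW_PROMPT}\n\nDiff:\n{diff}"}]

def main() -> None:
    import anthropic  # imported lazily so build_messages works without the SDK

    diff = open(sys.argv[1]).read()
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env
    response = client.messages.create(
        model="claude-sonnet-4-5",  # assumed model id; pin whatever you use
        max_tokens=4096,
        messages=build_messages(diff),
    )
    print(response.content[0].text)

if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```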
When Paying Makes Sense
The official Anthropic Code Review tool does add things that take real engineering work to replicate:
- Inline comments at specific diff lines, not just a summary comment
- Agent specialization - five parallel agents with different review focuses
- GitHub PR integration that understands the full PR context, not just the diff
- Audit trails for compliance-heavy environments
For teams that need SOC 2 audit trails or want tight GitHub integration without building it, the managed service has legitimate value. For teams shipping daily who want fast, cheap feedback without subscription overhead, the DIY path costs 99% less.
The Broader Pattern
The pattern here extends beyond code review. Whenever a SaaS tool is essentially "LLM plus a prompt plus a wrapper," you can build your own version with Claude Code skills. The skill system exists precisely for this: reusable, shareable AI workflows that cost API tokens instead of monthly subscriptions.
The $25 PR review tool is useful. It is also a reminder that the underlying capability - running a language model on a diff with a good prompt - is now a commodity. The value is in the integration and the UI, not the model.
Fazm is an open source macOS AI agent, available on GitHub.