CLIProxy: Turn AI CLI Subscriptions into OpenAI-Compatible APIs

Matthew Diakonov · 10 min read


CLIProxyAPI (the Go SDK package is called cliproxy) is an open source proxy server that wraps CLI-based AI tools into standard OpenAI/Gemini/Claude-compatible API endpoints. If you have a ChatGPT Plus subscription, a Claude Pro/Max plan, or Gemini CLI access, cliproxy lets you use those subscriptions as API endpoints for any client that speaks the OpenAI protocol.

Why CLIProxy Exists

Most AI subscriptions give you generous usage through their native apps and CLIs, but the API is metered separately and often expensive. CLIProxy bridges that gap: it sits between your application and the CLI tool, translating API requests into CLI invocations and streaming the responses back in OpenAI-compatible format.

This means you can point Cursor, Continue, Open Interpreter, or any OpenAI SDK client at your local cliproxy instance and use your existing subscription instead of burning API credits.

[Architecture diagram: your app talks via the OpenAI SDK to CLIProxyAPI on port 8317, which handles OAuth tokens, load balancing, streaming, and hot reload, and forwards requests over OAuth to Claude Code, ChatGPT Codex, and Gemini CLI.]

Supported Providers

CLIProxyAPI currently wraps these CLI tools:

| Provider | CLI Tool | Models Exposed | Auth Method |
|---|---|---|---|
| OpenAI | ChatGPT Codex | GPT-5, GPT-4.1 | OAuth (ChatGPT Plus/Pro) |
| Anthropic | Claude Code | Claude Opus, Sonnet | OAuth (Claude Pro/Max) |
| Google | Gemini CLI | Gemini 2.5 Pro | Google OAuth |
| Alibaba | Qwen Code | Qwen 3 | OAuth |
| iFlow | iFlow CLI | Various | OAuth |
| Antigravity | Antigravity CLI | Various | OAuth |

The project also has a "Plus" variant (CLIProxyAPIPlus) that adds support for third-party providers like OpenRouter, giving you access to models beyond what the native CLIs offer.

Installation

On macOS with Homebrew:

brew tap router-for-me/tap
brew install cliproxyapi

On Windows with winget:

winget install -e --id LuisPater.CLIProxyAPI

Or download binaries directly from the GitHub releases page. The project is written in Go, so you get a single static binary with no runtime dependencies.

Authentication Setup

Before cliproxy can forward requests, you need to authenticate with at least one provider. Each provider has its own login command:

# Claude Code (Anthropic)
cliproxyapi --claude-login

# ChatGPT Codex (OpenAI)
cliproxyapi --codex-login

# Gemini CLI (Google)
cliproxyapi --gemini-login

Each command opens a browser for OAuth. After you log in, the token is saved to ~/.cli-proxy-api/ as a JSON file. You can authenticate multiple accounts for the same provider to enable load balancing.
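
The token files are plain JSON, so you can inspect what landed in the auth directory. A minimal sketch in Python (the file naming scheme and the `expires_at` field are assumptions about cliproxy's token format, not documented guarantees; check your own files):

```python
import json
from pathlib import Path

def list_token_files(auth_dir: str) -> list[dict]:
    """Return basic info for each OAuth token JSON file in the auth directory."""
    entries = []
    for path in sorted(Path(auth_dir).expanduser().glob("*.json")):
        data = json.loads(path.read_text())
        entries.append({
            "file": path.name,
            # "expires_at" is an assumed field name; adjust to your token files.
            "expires_at": data.get("expires_at"),
        })
    return entries

if __name__ == "__main__":
    for entry in list_token_files("~/.cli-proxy-api/"):
        print(entry)
```

One JSON file per authenticated account is what makes multi-account load balancing possible: each login simply drops another file into this directory.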

Warning

OAuth tokens expire. Claude tokens last about 7 days, Gemini tokens about 1 hour (with auto-refresh). If requests start failing with 401 errors, re-run the login command for the affected provider.

Configuration

Create a config.yaml in ~/.cli-proxy-api/:

port: 8317
auth-dir: ~/.cli-proxy-api/
remote-management: false

auth:
  providers: []
  # Leave empty to skip API key validation.
  # Add entries to require an API key on incoming requests:
  # providers:
  #   - api-key: "sk-my-secret-key"

load-balancing:
  strategy: round-robin
  # Options: round-robin, least-connections, random

logging:
  level: info
  format: json

The auth-dir points to the directory where your OAuth tokens live. Cliproxy watches this directory for changes and hot-reloads tokens without restarting the service. You can add new account tokens while the server is running.
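
The three `load-balancing` strategies map to simple selection rules. A toy sketch of that selection logic (not cliproxy's actual implementation; the real server tracks live connection counts per upstream account):

```python
import itertools
import random

class AccountPool:
    """Toy model of the three load-balancing strategies from config.yaml."""

    def __init__(self, accounts):
        self.accounts = list(accounts)
        self.connections = {a: 0 for a in self.accounts}  # in-flight requests
        self._rr = itertools.cycle(self.accounts)

    def pick(self, strategy: str):
        if strategy == "round-robin":
            return next(self._rr)  # rotate through accounts in fixed order
        if strategy == "least-connections":
            return min(self.accounts, key=self.connections.__getitem__)
        if strategy == "random":
            return random.choice(self.accounts)
        raise ValueError(f"unknown strategy: {strategy}")

pool = AccountPool(["account-1", "account-2"])
print([pool.pick("round-robin") for _ in range(4)])
# → ['account-1', 'account-2', 'account-1', 'account-2']
```

Round-robin is a sensible default when all accounts have the same plan; least-connections helps when one account is consistently slower.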

Starting the Service

# Run directly
cliproxyapi --config ~/.cli-proxy-api/config.yaml

# Or as a Homebrew service (macOS)
brew services start cliproxyapi

Once running, cliproxy listens on http://localhost:8317. Point any OpenAI-compatible client at this endpoint:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8317/v1",
    api_key="unused"  # or your configured API key
)

response = client.chat.completions.create(
    model="claude-code",  # or "chatgpt-codex", "gemini-cli"
    messages=[{"role": "user", "content": "Hello from cliproxy"}],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

Multi-Account Load Balancing

One of cliproxy's strongest features is multi-account load balancing. If you have two Claude Max accounts (or two Google accounts with Gemini access), you can authenticate both and cliproxy will round-robin requests between them.

To add a second account, run the login command again with a different account:

# First account
cliproxyapi --claude-login
# Log in with account-1@example.com

# Second account
cliproxyapi --claude-login
# Log in with account-2@example.com

Each login creates a separate token file in ~/.cli-proxy-api/. The server detects new token files automatically and starts routing to the new account within seconds.

If one account hits a rate limit or returns an error, cliproxy automatically fails over to the next available account. When the failed account recovers, it rejoins the rotation.
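
That failover behavior can be sketched as a router that skips accounts currently marked unhealthy and lets them rejoin after a cooldown (a simplified model; cliproxy's real health-check and cooldown logic is internal to the server):

```python
import time

class FailoverRouter:
    """Skip accounts that recently failed; let them rejoin after a cooldown."""

    def __init__(self, accounts, cooldown_s: float = 60.0):
        self.accounts = list(accounts)
        self.cooldown_s = cooldown_s
        self.failed_at: dict[str, float] = {}
        self._next = 0

    def _healthy(self, account: str) -> bool:
        failed = self.failed_at.get(account)
        return failed is None or time.monotonic() - failed >= self.cooldown_s

    def mark_failed(self, account: str) -> None:
        """Called when an account returns a rate-limit or auth error."""
        self.failed_at[account] = time.monotonic()

    def pick(self) -> str:
        # Walk the rotation, skipping unhealthy accounts.
        for _ in range(len(self.accounts)):
            account = self.accounts[self._next % len(self.accounts)]
            self._next += 1
            if self._healthy(account):
                return account
        raise RuntimeError("all accounts are rate limited or failing")
```

The key property is that a failed account is never removed, only skipped: once the cooldown elapses, `pick()` starts returning it again without any manual intervention.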

CLIProxy vs. Alternatives

| Feature | CLIProxyAPI (cliproxy) | LiteLLM Proxy | llm-proxy | Direct API |
|---|---|---|---|---|
| Uses existing subscriptions | Yes (OAuth) | No (API keys) | No (API keys) | No (API keys) |
| Multi-account load balancing | Yes | Yes | Yes | No |
| Auto failover | Yes | Yes | Partial | No |
| Hot reload config | Yes | Restart required | Restart required | N/A |
| Web management UI | Yes (separate project) | Yes (built-in) | No | N/A |
| Supported providers | 6+ CLI tools | 100+ API providers | OpenAI, Anthropic | 1 per SDK |
| Cost to run | $0 (uses subscriptions) | API costs | API costs | API costs |
| Open source | Yes (Go, MIT) | Yes (Python) | Yes (Go) | N/A |

The key differentiator is the OAuth-based approach. LiteLLM and similar proxies are API key routers. They need you to have API access (and pay per token). Cliproxy wraps the CLI tools themselves, using your subscription's OAuth tokens, so the marginal cost of each request is effectively zero beyond your subscription fee.

The cliproxy Go SDK

For developers building on top of CLIProxyAPI programmatically, the cliproxy Go package (published at github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy) provides the core service implementation. It handles:

  • Service lifecycle management (start, stop, health checks)
  • Authentication token handling and auto-refresh
  • File watching for config and token hot-reload
  • Provider integration through a unified interface
package main

import (
	"log"

	"github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy"
)

func main() {
	svc, err := cliproxy.NewService(cliproxy.Config{
		Port:    8317,
		AuthDir: "~/.cli-proxy-api/",
	})
	if err != nil {
		log.Fatal(err)
	}
	defer svc.Stop() // clean shutdown when main returns
	svc.Start()
}

This is useful if you want to embed cliproxy inside a larger application rather than running it as a standalone binary.

Management Center Web UI

The CLI Proxy API Management Center is a companion web UI that talks to cliproxy's management API (/v0/management). It provides:

  • Real-time log streaming
  • Configuration editing (no manual YAML)
  • OAuth provider status and token management
  • Usage analytics per account and model
  • Container management for Docker deployments

The management UI is a separate project and does not proxy traffic itself. It connects to an already-running cliproxy instance.

Common Pitfalls

  • Token expiration silently breaks requests: OAuth tokens expire. Claude tokens last roughly a week, Gemini tokens refresh automatically but can fail if the refresh token is revoked. If you see 401 or "unauthorized" errors in cliproxy logs, re-authenticate. Set up a cron job or monitoring alert to catch expired tokens early.

  • Port conflicts with other dev tools: The default port 8317 can conflict with other local services. If cliproxy fails to bind, check lsof -i :8317 and either kill the conflicting process or change cliproxy's port in config.yaml.

  • Confusing model names across providers: Each provider exposes models under different names. When you send "model": "claude-code" to cliproxy, it routes to the Claude provider. Sending "model": "gpt-5" routes to the ChatGPT Codex provider. Check cliproxy's logs or the management UI to see the exact model strings each provider accepts.

  • Rate limits are per-account, not per-instance: If you run two cliproxy instances pointing at the same token directory, both instances share the same OAuth accounts. Requests from both instances count against the same rate limits. Either use separate accounts per instance or run a single instance with multiple accounts.
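
For the first pitfall, a small check you could run from cron might look like this (the `expires_at` field name and ISO-8601 timestamp format are assumptions about cliproxy's token files, so verify against what your files actually contain):

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

def expiring_tokens(auth_dir: str, within: timedelta) -> list[str]:
    """Return token files whose (assumed) 'expires_at' falls inside the window."""
    soon = []
    now = datetime.now(timezone.utc)
    for path in sorted(Path(auth_dir).expanduser().glob("*.json")):
        raw = json.loads(path.read_text()).get("expires_at")
        if raw is None:
            continue  # field missing: can't tell, skip this file
        expires = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if expires - now < within:
            soon.append(path.name)
    return soon

if __name__ == "__main__":
    stale = expiring_tokens("~/.cli-proxy-api/", timedelta(days=1))
    if stale:
        print("re-authenticate:", ", ".join(stale))
```

Wire the output into whatever alerting you already have; catching a token the day before it expires beats debugging 401s in production.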

Tip

Run cliproxyapi --version to check your installed version. The project moves fast; v6 introduced the Go SDK package and multi-account load balancing. If you are on v5 or earlier, upgrade to get these features.

Quick Start Checklist

  1. Install cliproxy: brew install cliproxyapi (macOS) or winget install LuisPater.CLIProxyAPI (Windows)
  2. Authenticate at least one provider: cliproxyapi --claude-login
  3. Create ~/.cli-proxy-api/config.yaml with your preferred port and settings
  4. Start the service: cliproxyapi --config ~/.cli-proxy-api/config.yaml
  5. Point your client at http://localhost:8317/v1
  6. Verify with: curl http://localhost:8317/v1/models
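
The same verification works from Python with only the standard library. A small sketch, assuming the proxy returns the usual OpenAI-style model list:

```python
import json
import urllib.request

def list_models(base_url: str = "http://localhost:8317") -> list[str]:
    """Fetch /v1/models and return the model IDs the proxy currently exposes."""
    with urllib.request.urlopen(f"{base_url}/v1/models") as resp:
        payload = json.load(resp)
    # OpenAI-style list response: {"object": "list", "data": [{"id": ...}, ...]}
    return [model["id"] for model in payload.get("data", [])]

# With the proxy running: print(list_models())
```

If the list comes back empty, no provider is authenticated yet; re-check step 2.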

Wrapping Up

CLIProxy (CLIProxyAPI) solves a real pain point: you are paying for an AI subscription but cannot use it programmatically. By wrapping CLI tools as OpenAI-compatible endpoints, it turns any subscription into a local API server with load balancing and failover built in. For anyone running AI agents, coding assistants, or automation pipelines, it eliminates the separate API bill.
