Building Custom MCP Tools to Connect Claude Code to Production Systems

Matthew Diakonov

Claude Code out of the box can read files, run shell commands, and search code. That covers a lot of ground. But real workflow automation means connecting it to your actual systems - your database, your deployment pipeline, your monitoring stack, your internal APIs. Custom MCP tools are how you do that.

A 2025 CodeRabbit analysis of 470 open-source PRs found that structured AI development using Claude Code with MCP servers and project-scoped configuration produced 1.7x fewer defects and 2.74x fewer security vulnerabilities compared to ad-hoc usage. The infrastructure investment is not just quality-of-life - it measurably improves output quality.

What MCP Tools Actually Are

MCP (Model Context Protocol) tools are callable functions exposed by small servers. You define each tool's name, parameters, and description. Claude discovers the available tools at session start and calls them by name when they are relevant to the task.

Instead of telling Claude "run this psql command to check the database," you build a tool called query_production_db that handles connection pooling, read-only enforcement, and result formatting. Claude calls the tool by name and gets structured data back. No raw psql, no manual connection strings, no risk of accidentally calling a write query.

A minimal tool server in TypeScript:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { Pool } from "pg";

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  // Only disable certificate verification if your provider forces it;
  // it leaves the connection open to man-in-the-middle attacks.
  ssl: { rejectUnauthorized: false }
});

const server = new McpServer({ name: "production-tools", version: "1.0.0" });

server.tool(
  "query_production_db",
  "Run a read-only SELECT query against the production database. Returns rows as JSON. Use for debugging, data investigation, and answering questions about production data. Do not use for writes - use the admin panel for mutations.",
  {
    query: z
      .string()
      .describe("A valid SQL SELECT statement. Must not contain INSERT, UPDATE, DELETE, DROP, or TRUNCATE."),
    limit: z
      .number()
      .optional()
      .describe("Maximum rows to return. Defaults to 100. Maximum 1000.")
  },
  async ({ query, limit = 100 }) => {
    // Enforce read-only at the server level, not via LLM instructions.
    // Rejecting embedded semicolons blocks stacked statements like
    // "SELECT 1; DROP TABLE users".
    const stripped = query.trim().replace(/;+$/, "");
    const normalized = stripped.toUpperCase();
    if (!normalized.startsWith("SELECT") || normalized.includes(";")) {
      return {
        content: [{ type: "text", text: "Error: only a single SELECT query is allowed" }],
        isError: true
      };
    }

    const safeQuery = `${stripped} LIMIT ${Math.min(Math.max(limit, 1), 1000)}`;
    const result = await pool.query(safeQuery);

    return {
      content: [{
        type: "text",
        text: JSON.stringify({ rows: result.rows, rowCount: result.rowCount }, null, 2)
      }]
    };
  }
);

const transport = new StdioServerTransport();
await server.connect(transport);

Register it in your project's .mcp.json (or user-level Claude Code config) so the server launches on startup:

{
  "mcpServers": {
    "production-tools": {
      "command": "node",
      "args": ["/path/to/production-tools/index.js"],
      "env": {
        "DATABASE_URL": "${PROD_DB_URL}"
      }
    }
  }
}

Tools Worth Building First

Start with the tools that eliminate the most copy-paste in your daily work:

Deployment status - Check what version is deployed where without opening three dashboards. A single tool that returns current version, deploy timestamp, and health check status for each environment.

Log search - Query structured logs with a natural language-friendly interface. Accept a time range and a search string; return formatted log entries. Much faster than constructing Datadog or Loki queries manually.

Feature flag management - Toggle flags and inspect current state without navigating your feature flag UI. One command: "disable the new checkout flow for staging."

Database queries - Read-only access to production data for debugging. The example above covers this. Add useful helpers: "show me recent errors for user X" as a named query template.

Incident creation - File PagerDuty or OpsGenie incidents with proper severity and routing without navigating the UI. Accept a description and severity; return the incident URL.
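
As a sketch of the first item, here is the formatting half of a hypothetical get_deploy_status tool. The record shape and field names are illustrative, not a real API - the handler would fetch these records from your deploy pipeline and return the summary as text content:

```typescript
interface DeployStatus {
  env: string;
  version: string;
  deployedAt: string; // ISO timestamp from your deploy API
  healthy: boolean;
}

// Collapse per-environment records into the single summary the tool returns.
function formatDeployStatus(statuses: DeployStatus[]): string {
  return statuses
    .map(s =>
      `${s.env}: ${s.version} (deployed ${s.deployedAt}) ` +
      (s.healthy ? "healthy" : "FAILING HEALTH CHECK")
    )
    .join("\n");
}
```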

Safety Boundaries

The critical design choice is what each tool can and cannot do. Safety boundaries belong in the tool server, not in the LLM's judgment.

Read-only database access is safe by default - enforce it at the server with query parsing, not by trusting the model to avoid write queries.
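
A sketch of enforcement stricter than a startsWith check. The keyword list is illustrative and will false-positive on string literals containing those words; running a real SQL parser over the statement is the robust option:

```typescript
// Write keywords are rejected anywhere in the statement, not just at the start.
const FORBIDDEN = /\b(INSERT|UPDATE|DELETE|DROP|TRUNCATE|ALTER|CREATE|GRANT)\b/i;

function isReadOnly(query: string): boolean {
  const stmt = query.trim().replace(/;+\s*$/, ""); // tolerate a trailing semicolon
  if (stmt.includes(";")) return false;            // reject stacked statements
  if (!/^SELECT\b/i.test(stmt)) return false;      // must be a SELECT
  return !FORBIDDEN.test(stmt);                    // no write keywords anywhere
}
```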

Write operations need explicit confirmation signals. A deploy tool should require a specific confirmation parameter before executing, not just accept a boolean:

server.tool(
  "deploy_to_staging",
  "Deploy a tagged release to the staging environment. Requires explicit confirmation.",
  {
    tag: z.string().describe("Git tag to deploy, e.g. v1.4.2"),
    confirm: z
      .string()
      .describe("Must be exactly 'DEPLOY' to proceed. This prevents accidental deploys.")
  },
  async ({ tag, confirm }) => {
    if (confirm !== "DEPLOY") {
      return {
        content: [{ type: "text", text: "Aborted: confirm must be exactly 'DEPLOY'" }],
        isError: true
      };
    }
    // ... deploy logic
  }
);

Production write access - database mutations, production deploys, user data changes - should require human approval flows, not just LLM confirmation. Build the approval step into the tool, not into the prompt.
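
One way to sketch that approval step. This version is in-memory and illustrative; a real implementation would persist requests and notify an approver through Slack, a ticket queue, or your admin panel:

```typescript
// A write tool files a pending request instead of acting immediately;
// a human approves it through a separate channel before anything runs.
type PendingAction = { id: string; action: string; approved: boolean };

const pendingActions = new Map<string, PendingAction>();
let counter = 0;

// Called by the MCP tool: records the request and returns its ID.
function requestApproval(action: string): string {
  const id = `req-${++counter}`;
  pendingActions.set(id, { id, action, approved: false });
  return id;
}

// Called from your approval UI or chat bot, never by the model.
function approve(id: string): void {
  const req = pendingActions.get(id);
  if (req) req.approved = true;
}

// Called by a follow-up tool invocation; refuses until a human approved.
function executeApproved(id: string): string {
  const req = pendingActions.get(id);
  if (!req) return "Error: unknown request";
  if (!req.approved) return "Error: awaiting human approval";
  return `Executing: ${req.action}`;
}
```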

The Tool Search Pattern for Large Tool Sets

Once you have more than 20-30 tools, loading all definitions upfront consumes significant context. The tool search pattern - building a search_tools meta-tool that discovers tools on demand - preserves context while keeping your full tool library accessible.

A 2025 benchmark found that on-demand tool search left 191,300 tokens of context free for actual work, versus 122,800 with upfront loading - an 85% reduction in tool-definition overhead - while maintaining access to the full tool library.
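
A minimal sketch of the pattern, with an illustrative registry and naive keyword ranking: only search_tools is exposed upfront, and its handler returns matching definitions that Claude then loads on demand.

```typescript
interface ToolDef {
  name: string;
  description: string;
}

// The full library lives server-side; none of it is in context until searched.
const toolLibrary: ToolDef[] = [
  { name: "query_production_db", description: "Run a read-only SELECT query against production" },
  { name: "deploy_to_staging", description: "Deploy a tagged release to staging" },
  { name: "search_logs", description: "Search structured logs by time range and string" },
  { name: "create_incident", description: "File a PagerDuty incident with severity and routing" }
];

// The search_tools handler: keyword match over names and descriptions.
function searchTools(query: string): ToolDef[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return toolLibrary.filter(tool => {
    const haystack = `${tool.name} ${tool.description}`.toLowerCase();
    return terms.some(term => haystack.includes(term));
  });
}
```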

The Compound Effect

Once your MCP tools are wired up, Claude Code becomes a natural language interface to your entire infrastructure. "How many users signed up today?" becomes a direct database query. "What is deployed on staging?" becomes a tool call. "Create an incident for the checkout service latency spike" becomes a single command that routes to the right oncall.

Each tool you build eliminates one manual workflow. The tools compose - debugging a production issue goes from "open dashboard, copy log query, open database client, run query, open PagerDuty, file incident" to "investigate the checkout latency spike and file an incident if it is real."

