If you’ve spent any time building Claude agents that call external tools, you’ve hit the same wall: every integration is a custom snowflake. Database connector wired one way, web scraper wired another, internal API doing something completely different. MCP servers for Claude — part of Anthropic’s Model Context Protocol — exist to solve exactly this problem. One standard protocol, consistent tooling, and an agent surface that actually scales. This article walks you through building production-grade MCP server integrations from actual implementation experience, not the getting-started docs.
What MCP Actually Is (And What the Docs Understate)
The Model Context Protocol is an open standard Anthropic introduced in late 2024 for connecting LLMs to external data sources and tools. The mental model is simple: your Claude agent becomes an MCP client, and anything it needs to interact with — a database, a file system, a REST API, a code execution environment — is exposed as an MCP server.
What the docs understate is why this matters architecturally. Before MCP, you were essentially hand-rolling a tool-calling layer every time. You’d define JSON schemas for tools, write dispatch logic, handle errors, and re-invent the same patterns across every project. MCP standardizes the transport (JSON-RPC 2.0 over stdio or HTTP/SSE), the capability negotiation, and the tool/resource/prompt primitives. That standardization has real compounding value when you’re managing five agents instead of one.
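On the wire, that standardization is concrete: every tool invocation is the same JSON-RPC 2.0 frame no matter which server implements it. A `tools/call` request looks roughly like this (the tool name and arguments here are invented for illustration):

```python
import json

# The frame shape is what MCP puts on the wire for a tool invocation;
# the specific tool and SQL are made up for this example.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_database",
        "arguments": {"sql": "SELECT COUNT(*) FROM orders"},
    },
}
print(json.dumps(request))
```

The server's reply is a matching JSON-RPC result carrying content blocks, which is why every server you write returns lists of typed content rather than bare strings.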
The three core primitives you’ll work with:
- Tools — functions the model can call (think: run_query, send_email, fetch_url)
- Resources — data the server exposes for context (think: file contents, database schemas)
- Prompts — reusable prompt templates the server can serve to the client
For most production integrations, you’ll lean heavily on Tools and occasionally Resources. Prompts are useful but often end up back in your application layer anyway.
Setting Up Your First MCP Server for Claude
The reference SDK is TypeScript-first, but there’s a solid Python SDK that’s increasingly production-ready. I’ll use Python here because most Claude integration work I see in production is Python.
```bash
pip install mcp anthropic
```
Here’s a minimal but complete MCP server that exposes two tools — a database query tool and a simple HTTP fetch tool. This is close to what a real integration looks like before you add auth and error handling:
```python
import asyncio
import sqlite3

import httpx
from mcp import types
from mcp.server import Server
from mcp.server.stdio import stdio_server

# Initialize the MCP server with a name your client will see
app = Server("data-tools")


@app.list_tools()
async def list_tools() -> list[types.Tool]:
    return [
        types.Tool(
            name="query_database",
            description="Run a read-only SQL query against the local SQLite database",
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {
                        "type": "string",
                        "description": "The SQL SELECT statement to execute",
                    }
                },
                "required": ["sql"],
            },
        ),
        types.Tool(
            name="fetch_url",
            description="Fetch the text content of a URL (max 50KB returned)",
            inputSchema={
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "The URL to fetch"}
                },
                "required": ["url"],
            },
        ),
    ]


@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    if name == "query_database":
        # Enforce read-only with an allowlist: never let the model run writes
        # through this tool. Requiring SELECT is stricter than blocklisting
        # INSERT/UPDATE/DELETE, which misses ALTER, PRAGMA, and friends.
        sql = arguments["sql"].strip()
        if not sql.upper().startswith("SELECT"):
            return [types.TextContent(type="text", text="Error: only SELECT queries are permitted")]
        conn = sqlite3.connect("app.db")
        try:
            cursor = conn.execute(sql)
            rows = cursor.fetchall()
            cols = [d[0] for d in cursor.description]
        finally:
            conn.close()
        # Return as readable text; Claude handles this better than raw JSON blobs
        result = "\n".join(", ".join(str(v) for v in row) for row in rows[:100])
        return [types.TextContent(type="text", text=f"Columns: {cols}\n\n{result}")]

    elif name == "fetch_url":
        async with httpx.AsyncClient(follow_redirects=True) as http:
            resp = await http.get(arguments["url"], timeout=10)
        # Truncate to avoid blowing your context window
        text = resp.text[:50_000]
        return [types.TextContent(type="text", text=text)]

    return [types.TextContent(type="text", text=f"Unknown tool: {name}")]


async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())


if __name__ == "__main__":
    asyncio.run(main())
```
A few things worth calling out in this code: the read-only enforcement on the database tool is something you must build yourself — MCP has no concept of permission levels baked in at the protocol layer. You own your own security model. Also, the 50KB truncation on fetch_url is deliberate; letting Claude receive a full multi-megabyte HTML page will crater your context window and your bill simultaneously.
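For SQLite specifically, you can make the read-only guarantee structural rather than string-based: open the database through a read-only URI so the engine itself rejects writes, regardless of what slips past keyword filtering. A small sketch:

```python
import sqlite3

def open_readonly(path: str) -> sqlite3.Connection:
    """Open a SQLite database such that the engine itself rejects all writes."""
    # mode=ro is enforced by SQLite, independent of any SQL string inspection
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)
```

Swap this in for the plain `sqlite3.connect("app.db")` in the tool handler and any write statement raises `sqlite3.OperationalError` at execution time, belt-and-suspenders with the SELECT check.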
Connecting Claude to Your MCP Server
With the server written, here’s how you wire it to Claude. The `mcp` package ships a client alongside the server primitives: you use it to spawn the server as a subprocess, discover its tools, and bridge them into the standard Messages API tool-calling loop. This is the pattern for stdio servers, which is the most common setup for local tools:

```python
import asyncio

from anthropic import Anthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

client = Anthropic()

async def run_agent_with_mcp():
    # The stdio client manages the server subprocess for you
    server_params = StdioServerParameters(
        command="python",
        args=["my_mcp_server.py"],
        env=None,  # None inherits the current environment; add secrets there
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()  # MCP handshake + capability negotiation

            # Translate the server's tool definitions into Messages API format
            tools = [
                {"name": t.name, "description": t.description, "input_schema": t.inputSchema}
                for t in (await session.list_tools()).tools
            ]
            messages = [{
                "role": "user",
                "content": "How many orders do we have in the database from last week? "
                           "Also fetch the content from https://example.com and summarize it.",
            }]
            while True:
                response = client.messages.create(
                    model="claude-opus-4-5", max_tokens=4096,
                    tools=tools, messages=messages,
                )
                if response.stop_reason != "tool_use":
                    print(response.content[0].text)
                    break
                # Dispatch each requested tool call to the MCP server
                messages.append({"role": "assistant", "content": response.content})
                results = []
                for block in response.content:
                    if block.type == "tool_use":
                        outcome = await session.call_tool(block.name, block.input)
                        results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": outcome.content[0].text,
                        })
                messages.append({"role": "user", "content": results})

asyncio.run(run_agent_with_mcp())
```

`ClientSession` handles the handshake and capability negotiation for you. The tool dispatch loop is yours, which means retries on tool failure, timeouts on slow tools, and logging of what the model called and why are also yours; the loop above also has no iteration cap. You’ll want to wrap this in your own observability layer before going to production.
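A thin wrapper around the `mcp` client's `ClientSession.call_tool` covers the basics. This is a sketch, not a full observability stack; the timeout and retry counts are assumptions to tune:

```python
import asyncio
import logging
import time

log = logging.getLogger("mcp.tools")

async def call_tool_observed(session, name, arguments, timeout=15.0, retries=1):
    """Add timeout, retry, and logging around an MCP tool call.

    `session` is assumed to be an initialized mcp ClientSession (or anything
    with an async call_tool(name, arguments) method).
    """
    for attempt in range(retries + 1):
        start = time.monotonic()
        try:
            result = await asyncio.wait_for(session.call_tool(name, arguments), timeout)
            log.info("tool=%s args=%s ok in %.2fs", name, arguments, time.monotonic() - start)
            return result
        except asyncio.TimeoutError:
            log.warning("tool=%s timed out (attempt %d)", name, attempt + 1)
        except Exception:
            log.exception("tool=%s failed (attempt %d)", name, attempt + 1)
    raise RuntimeError(f"tool {name} failed after {retries + 1} attempts")
```

Replace the bare `session.call_tool(...)` in the dispatch loop with this and you get a log line per call for free, which is also the raw material for cost monitoring.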
Production Patterns That Actually Hold Up
HTTP/SSE Transport for Multi-Client Scenarios
Stdio works fine for single-agent setups. The moment you have multiple agents that need the same tools, or you want to run your MCP server as a persistent service, switch to HTTP with Server-Sent Events transport. The MCP Python SDK supports this directly:
```python
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.routing import Mount, Route

# Replace the stdio_server context manager with SSE transport
sse = SseServerTransport("/messages/")

async def handle_sse(request):
    async with sse.connect_sse(request.scope, request.receive, request._send) as streams:
        await app.run(streams[0], streams[1], app.create_initialization_options())

starlette_app = Starlette(routes=[
    Route("/sse", endpoint=handle_sse),
    # Clients open the SSE stream at /sse but POST their messages here;
    # without this mount the transport is only half wired up
    Mount("/messages/", app=sse.handle_post_message),
])

# Run with: uvicorn server:starlette_app --port 8000
```
This lets you run one MCP server and point multiple Claude instances at it. Each tool call adds roughly one model round-trip of latency plus the token cost of the tool result. At current Anthropic API pricing, a Haiku-based agent doing five tool calls per run spends approximately $0.001–0.003 per run in tool-result tokens alone, depending on payload sizes. Keep your tool responses terse.
Tool Schema Design is Where Most People Get It Wrong
Claude is genuinely good at interpreting tool schemas, but it will hallucinate parameter values if your descriptions are ambiguous. A few hard-won rules:
- Be explicit about format requirements in descriptions: “ISO 8601 date string, e.g. 2024-01-15” beats “a date”
- Use enums for any parameter that has a fixed set of valid values — this dramatically cuts malformed calls
- Include a realistic example in the description for any parameter that’s non-obvious
- Return errors as text content, not exceptions — throwing an unhandled exception from a tool crashes the loop; returning an error string lets the model decide what to do next
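Applying those rules, a tool definition might look like the following (the tool itself is hypothetical, shown as the plain dict that both MCP and the Messages API schemas boil down to):

```python
# A hypothetical tool definition applying the rules above: explicit formats,
# an enum for the fixed-choice parameter, and a concrete example in each description
SALES_REPORT_TOOL = {
    "name": "get_sales_report",
    "description": "Fetch an aggregated sales report for a date range",
    "inputSchema": {
        "type": "object",
        "properties": {
            "start_date": {
                "type": "string",
                "description": "ISO 8601 date string, e.g. 2024-01-15",
            },
            "granularity": {
                "type": "string",
                "enum": ["daily", "weekly", "monthly"],
                "description": "Bucket size for aggregation, e.g. 'weekly'",
            },
        },
        "required": ["start_date", "granularity"],
    },
}
```

The enum on `granularity` is doing real work here: without it, expect the model to occasionally invent values like "per-week" and force a retry.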
Authentication Without Leaking Credentials Into Prompts
Your MCP server process inherits environment variables from its parent. That’s actually the right place for credentials — not in tool schemas, not passed as parameters from Claude. Your server init looks like this:
```python
import os

# At server startup, not inside the tool handler
DB_URL = os.environ["DATABASE_URL"]       # Set in your deployment environment
API_KEY = os.environ["EXTERNAL_API_KEY"]
# These never appear in any prompt or tool schema; Claude doesn't need to see them
```
This is especially important because tool call arguments and results go back through the Anthropic API. Credentials in that stream are credentials in Anthropic’s logs.
What Breaks in Production (Honest Assessment)
MCP is still early. The Python SDK is on a fast release cycle and has had breaking changes between minor versions — pin your dependencies with mcp==X.Y.Z and don’t let Dependabot auto-merge without testing. The TypeScript SDK is more stable but brings its own deployment complexity if your stack is Python.
The subprocess model for stdio servers is fragile at scale. If the subprocess crashes, the MCP client doesn’t always recover gracefully. Add a process supervisor (systemd, supervisord, or a simple restart loop) and health-check endpoints if you’re running this in production.
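A restart loop really can be that simple. A sketch, where the backoff constants and the `max_restarts` escape hatch are assumptions for illustration:

```python
import subprocess
import time

def supervise(cmd, max_restarts=None):
    """Re-run cmd whenever it exits nonzero; a clean exit (0) stops the loop.

    max_restarts=None means restart forever, which is what you want in
    production; the parameter mainly exists to make this testable.
    """
    restarts = 0
    while max_restarts is None or restarts <= max_restarts:
        code = subprocess.call(cmd)
        if code == 0:
            return code  # clean shutdown: don't resurrect
        restarts += 1
        time.sleep(min(2 ** restarts, 30))  # exponential backoff, capped at 30s
    return code
```

For anything beyond a prototype, prefer systemd or supervisord anyway: they give you journal logging and restart-rate limiting that this loop does not.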
Tool call loops can get expensive fast. Claude will sometimes call the same tool repeatedly with slightly different parameters when it’s uncertain — this is model behavior, not an MCP issue, but MCP makes it easier to build loops that don’t have hard stop conditions. Always set a max_iterations guard in your agent loop and log every tool call for cost monitoring.
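The guard itself is trivial to add. A sketch with the agent internals abstracted behind a hypothetical `step` callable, so only the stop condition is in view:

```python
MAX_TOOL_ITERATIONS = 8  # assumption: tune to your workload

def run_guarded(step):
    """Drive one model-call-plus-tool-dispatch round per iteration, with a hard cap.

    `step` is a hypothetical callable returning (done, tool_calls) for one
    round; the cap and the call log are the point, not the agent internals.
    """
    history = []
    for _ in range(MAX_TOOL_ITERATIONS):
        done, tool_calls = step()
        history.extend(tool_calls)  # record every tool call for cost monitoring
        if done:
            return history
    raise RuntimeError(f"exceeded {MAX_TOOL_ITERATIONS} tool iterations")
```

Raising on the cap, rather than silently returning, is deliberate: a run that hits the ceiling is a bug or a runaway, and you want it in your error tracking.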
Finally: the MCP spec doesn’t define authorization between clients and servers. If you’re exposing an MCP server over HTTP, you’re responsible for adding auth (API keys, JWT, mutual TLS — pick your poison). The SSE transport has no built-in auth story.
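Until the spec grows an auth story, a small middleware in front of the Starlette app is enough for API-key auth. This sketch is plain ASGI so it carries no extra dependencies; the `x-api-key` header name is an assumption, pick your own convention:

```python
import hmac

def key_is_valid(provided, expected):
    # Constant-time comparison avoids leaking key prefixes via timing
    return provided is not None and hmac.compare_digest(provided, expected)

class ApiKeyMiddleware:
    """Minimal ASGI middleware: reject HTTP requests without the right x-api-key."""

    def __init__(self, app, expected_key):
        self.app = app
        self.expected_key = expected_key

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            headers = {k.decode(): v.decode() for k, v in scope.get("headers", [])}
            if not key_is_valid(headers.get("x-api-key"), self.expected_key):
                await send({"type": "http.response.start", "status": 401, "headers": []})
                await send({"type": "http.response.body", "body": b"unauthorized"})
                return
        await self.app(scope, receive, send)
```

Wrap the app at startup, e.g. `starlette_app = ApiKeyMiddleware(starlette_app, os.environ["MCP_API_KEY"])`, keeping the key itself in the environment per the credentials section above.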
When to Use MCP Servers vs. Direct Tool Calling
Not every Claude tool integration needs MCP. If you have one agent, one set of tools, and no plans to share that infrastructure, raw tool-calling with manually-defined JSON schemas is simpler and has one fewer moving part. MCP earns its complexity when:
- Multiple agents or services need the same tool implementations
- You want to use community-built MCP servers (there are already hundreds for common services like GitHub, Slack, Postgres)
- You’re building a platform where users or customers can connect their own tools
- You want to test tools independently of your agent logic — the server/client separation is genuinely useful here
Bottom line by reader type:
- Solo founder building one Claude workflow: Skip MCP for now, use direct tool calling, revisit when you have three or more agents sharing tools
- Team building a Claude-powered product: Invest in MCP servers early — the standardization pays off as headcount and agent count both grow
- Building an AI platform or marketplace: MCP is essentially required — it’s the interface your users will expect for tool connections
- Enterprise with existing internal APIs: MCP over HTTP/SSE is a clean way to expose approved internal tools to Claude without giving it direct database or system access
The ecosystem of MCP servers for Claude is expanding quickly, and the community server library means you may not have to write the server at all for common integrations. Check the official MCP server registry before building from scratch: GitHub, filesystem, Postgres, Slack, and a dozen other integrations already exist and are maintained. Where you’ll still write your own is anything custom: internal APIs, proprietary databases, or domain-specific tools your business relies on. That’s where this implementation knowledge actually earns its keep.
Editorial note: API pricing, model capabilities, and tool features change frequently — always verify current details on the vendor’s website before building in production. Code examples are tested at time of writing; pin your dependency versions to avoid breaking changes. Some links in this article may be affiliate links — we may earn a commission if you sign up, at no extra cost to you.