Ship an MCP Server in Python That Exposes Your Internal API to LLMs — DevClubHouse
Ship an MCP Server in Python That Exposes Your Internal API to LLMs
Wrap a corporate REST API in three typed tools using FastMCP, inspect them locally, and connect them to Claude Desktop—without ever exposing credentials to the model.
Mariana Souza, Senior Editor · Jun 13, 2026 · 8 min read
What You'll Build
A Python MCP server using FastMCP that wraps a corporate REST API as three structured tools—search_customers, get_order, and create_support_ticket. Any MCP-compatible client (Claude Desktop, Cursor, custom agents) can call your API with full type safety, without the model ever seeing credentials or constructing raw URLs.
Prerequisites
Python 3.10+ (the minimum version the mcp package supports; built-in generic types like list[dict] already work from 3.9)
pip or uv for package management
Node.js 18+ — mcp dev invokes npx @modelcontextprotocol/inspector under the hood
Latest Claude Desktop (for end-to-end testing; optional if using only the inspector)
A REST API with a bearer token — a mock URL works fine to follow along
Comfortable with async/await Python
1. Set Up the Project
mkdir mcp-internal-api && cd mcp-internal-api
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install "mcp[cli]" httpx python-dotenv
mcp[cli] installs the mcp CLI used for local inspection. httpx handles async HTTP to your backend.
Create .env for local credentials, and add it to .gitignore now:
API_BASE_URL=https://api.corp.example.com
API_KEY=sk-your-real-token-here
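A missing or empty variable is easiest to diagnose at startup rather than on the first tool call. The following is a minimal sketch of fail-fast settings loading, using only the standard library. The names `require_env`, `load_settings`, and `MissingConfigError` are hypothetical helpers for illustration, not part of FastMCP or python-dotenv.

```python
import os


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def require_env(name: str, env=None) -> str:
    # `env` defaults to the real process environment; a plain dict can be
    # passed instead, which keeps the helper easy to test.
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise MissingConfigError(f"{name} is not set; add it to .env")
    return value


def load_settings(env=None) -> dict[str, str]:
    # Strip a trailing slash so f"{base_url}/customers" never yields "//".
    base_url = require_env("API_BASE_URL", env).rstrip("/")
    api_key = require_env("API_KEY", env)
    return {"base_url": base_url, "api_key": api_key}
```

Called after `load_dotenv()`, a helper like this would turn a bare `KeyError` into a message that names the missing variable.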
2. Write the Server
Create server.py:
import os

import httpx
from dotenv import load_dotenv
from mcp.server.fastmcp import FastMCP

load_dotenv()

mcp = FastMCP("internal-api")

# Read once at startup; a missing variable fails immediately with KeyError.
_BASE = os.environ["API_BASE_URL"]
_KEY = os.environ["API_KEY"]


def _auth_headers() -> dict[str, str]:
    # The token is attached server-side and never appears in tool arguments.
    return {"Authorization": f"Bearer {_KEY}", "Accept": "application/json"}


@mcp.tool()
async def search_customers(query: str, limit: int = 10) -> list[dict]:
    """Search customers by name or email. Returns a list of customer records."""
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"{_BASE}/customers",
            headers=_auth_headers(),
            params={"q": query, "limit": limit},
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()


@mcp.tool()
async def get_order(order_id: str) -> dict:
    """Fetch a single order by its ID."""
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"{_BASE}/orders/{order_id}",
            headers=_auth_headers(),
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()


@mcp.tool()
async def create_support_ticket(
    customer_id: str,
    subject: str,
    body: str,
    priority: str = "normal",
) -> dict:
    """Open a support ticket for a customer.

    Args:
        customer_id: The customer's UUID.
        subject: One-line summary (max 120 chars).
        body: Full description of the issue.
        priority: 'low', 'normal', or 'high'.
    """
    # Reject bad input before any network call is made.
    if priority not in {"low", "normal", "high"}:
        raise ValueError(f"priority must be low/normal/high, got '{priority}'")

    async with httpx.AsyncClient() as client:
        r = await client.post(
            f"{_BASE}/tickets",
            headers=_auth_headers(),
            json={
                "customer_id": customer_id,
                "subject": subject,
                "body": body,
                "priority": priority,
            },
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()


if __name__ == "__main__":
    mcp.run()
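To get a feel for what the model receives from those signatures, here is a rough, standard-library approximation of turning a function signature into a JSON Schema for its inputs. This is not FastMCP's actual implementation, which is more thorough; `sketch_input_schema` and `set_priority` are hypothetical names used only for illustration.

```python
import inspect
from typing import Literal, get_args, get_origin, get_type_hints

# Only a handful of scalar types are mapped in this sketch.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_input_schema(func) -> dict:
    """Build an approximate JSON Schema for a function's parameters."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        annotation = hints.get(name, str)
        if get_origin(annotation) is Literal:
            # Literal["a", "b"] becomes an enum of allowed strings.
            prop = {"type": "string", "enum": list(get_args(annotation))}
        else:
            prop = {"type": _JSON_TYPES.get(annotation, "object")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


async def search_customers(query: str, limit: int = 10) -> list[dict]:
    """Signature copied from server.py; the body is irrelevant here."""
    return []


def set_priority(priority: Literal["low", "normal", "high"] = "normal") -> None:
    """Hypothetical variant showing how a Literal annotation maps to an enum."""
```

In this sketch `query` ends up required and `limit` carries its default of 10. The `Literal` variant suggests one way the allowed priorities could be advertised in the schema itself instead of only in the docstring.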
Why each decision matters:
| Detail | Reason |
| --- | --- |
| Type annotations | FastMCP auto-generates JSON Schema from them, so the LLM receives exact parameter types, not free-form text |
| Docstrings | Become the tool description the model reads before calling; write them like an API spec |
| raise_for_status() + ValueError | Exceptions surface to the LLM as structured tool errors rather than crashing the server process |
| Credentials in env vars | Never passed as tool arguments, never echoed in responses, never in source control |
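Because exception text is what reaches the model, it is worth making sure a token can never ride along in an error message. The sketch below shows one defensive option under that assumption: scrub the secret from any text before it is returned. `redact_secret` and `safe_error_message` are hypothetical helpers, not something FastMCP or httpx provides.

```python
def redact_secret(text: str, secret: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every occurrence of `secret` in `text` with a placeholder."""
    if not secret:
        # An empty secret would match everywhere; leave the text untouched.
        return text
    return text.replace(secret, placeholder)


def safe_error_message(exc: Exception, secret: str) -> str:
    """Format an exception for the model with the secret scrubbed out."""
    return f"{type(exc).__name__}: {redact_secret(str(exc), secret)}"
```

A tool could catch an upstream exception and raise a new `RuntimeError(safe_error_message(exc, _KEY))`, so the model still sees what went wrong without any chance of seeing the token.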
mcp.run() defaults to the stdio transport, which is what Claude Desktop and most local clients expect: the client spawns your server as a subprocess and talks JSON-RPC over stdin/stdout.
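To make the stdio exchange concrete, here is a small sketch of how a single tool call could be framed as one newline-terminated JSON-RPC message. It only illustrates the message shape. A real session starts with an initialize handshake and is handled entirely by the client and the SDK, and the helper names `encode_tool_call` and `decode_line` are hypothetical.

```python
import json


def encode_tool_call(request_id: int, tool: str, arguments: dict) -> bytes:
    """Frame a tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return (json.dumps(message) + "\n").encode("utf-8")


def decode_line(line: bytes) -> dict:
    """Parse one line read from the peer back into a message dict."""
    return json.loads(line.decode("utf-8"))
```

One practical consequence of this transport: stdout belongs to the protocol, so stray `print()` calls in server.py can corrupt the stream. Send debug output to stderr or a log file instead.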
3. Inspect Locally with mcp dev
Before touching any LLM, validate the wiring in a browser UI:
mcp dev server.py
This starts your server and opens the MCP Inspector (the URL is printed in your terminal). Navigate to Tools — you'll see all three tools with auto-generated input forms matching your Python signatures. Call search_customers with query = "alice" and confirm a JSON response or a typed upstream error.
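If no backend is reachable yet, a throwaway local stub can stand in for it while you click through the Inspector. The sketch below uses only the standard library and serves a made-up /customers endpoint. The sample records, the `q` and `limit` handling, and the names `filter_customers`, `StubHandler`, and `make_stub` are all assumptions for illustration, not the behavior of any real internal API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Invented sample data; replace with whatever shape your real API returns.
FAKE_CUSTOMERS = [
    {"id": "c-1", "name": "Alice Example", "email": "alice@example.com"},
    {"id": "c-2", "name": "Bob Example", "email": "bob@example.com"},
]


def filter_customers(query: str, limit: int) -> list[dict]:
    """Case-insensitive substring match on name or email, capped at `limit`."""
    needle = query.lower()
    hits = [
        c for c in FAKE_CUSTOMERS
        if needle in c["name"].lower() or needle in c["email"].lower()
    ]
    return hits[:limit]


class StubHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/customers":
            return self._send(404, {"error": "not found"})
        # Mimic bearer auth so a missing header is caught early.
        if not self.headers.get("Authorization", "").startswith("Bearer "):
            return self._send(401, {"error": "missing bearer token"})
        params = parse_qs(url.query)
        query = params.get("q", [""])[0]
        limit = int(params.get("limit", ["10"])[0])
        self._send(200, filter_customers(query, limit))

    def _send(self, status: int, payload) -> None:
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args) -> None:
        pass  # keep the terminal quiet


def make_stub(port: int = 8765) -> HTTPServer:
    """Bind the stub to localhost; call .serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), StubHandler)
```

Running `make_stub().serve_forever()` in a second terminal and setting API_BASE_URL=http://127.0.0.1:8765 would let search_customers return the sample record for "alice", while get_order would still surface a 404 as a tool error.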
Tip: Set API_BASE_URL=https://httpbin.org temporarily to exercise the async/auth plumbing without a live internal API. You'll get a 404 back, which correctly surfaces as an httpx.HTTPStatusError tool...