Building production MCP servers in Python with FastMCP
Tufail Khan walks through shipping a production-grade MCP server in Python using FastMCP, covering streamable HTTP transport, OAuth 2.1 auth, and deployment patterns on Lambda and ECS Fargate.
Python backend engineers can use this guide to ship MCP-compliant internal AI assistants today, with concrete patterns for auth, transport, and deployment that avoid the common pitfalls of over-exposing APIs or using subprocess-based transports in production.
Tufail Khan's post frames MCP as the dominant interoperability standard for AI agents, noting that MCP SDKs reached 97 million monthly downloads as of March 2026 and that every major agent framework — Claude, Cursor, OpenAI Agents SDK, and Microsoft Agent Framework — now speaks MCP natively. The article introduces FastMCP as the Python framework of choice for building MCP servers, describing MCP's three primitives: tools (callable functions), resources (URI-addressable data), and prompts (parameterized templates). A complete example CRM server is shown, using Pydantic models for type-safe inputs and outputs, `async`/`httpx` for upstream API calls, and docstrings that double as agent-readable tool descriptions.
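The three primitives can be illustrated with a minimal sketch. This is not the post's CRM server: the customer data, function names, and the `crm://` URI scheme are hypothetical, plain dicts stand in for the Pydantic models the post uses, and an in-memory lookup stands in for the `httpx` call, so the functions run without any third-party package. Only `build_server()` needs FastMCP, and it imports it lazily.

```python
import asyncio
import json

# Hypothetical in-memory stand-in for the upstream CRM API.
_FAKE_CRM = {"c-1": {"id": "c-1", "name": "Acme Corp", "tier": "enterprise"}}


async def get_customer(customer_id: str) -> dict[str, str]:
    """Look up one CRM customer by ID. Read-only; fails if the ID is unknown."""
    await asyncio.sleep(0)  # placeholder for an awaited upstream HTTP call
    if customer_id not in _FAKE_CRM:
        raise ValueError(f"unknown customer_id: {customer_id}")
    return _FAKE_CRM[customer_id]


def customer_resource(customer_id: str) -> str:
    """Return the customer record as JSON text (URI-addressable data)."""
    return json.dumps(_FAKE_CRM[customer_id])


def follow_up_prompt(customer_name: str) -> str:
    """Parameterized template an agent can request by name."""
    return f"Draft a short follow-up email to {customer_name}."


def build_server():
    """Register the three primitives. Requires the third-party fastmcp package."""
    from fastmcp import FastMCP  # imported lazily so the functions above stay testable

    mcp = FastMCP("crm")
    mcp.tool()(get_customer)  # tool: callable function, docstring is the description
    mcp.resource("crm://customers/{customer_id}")(customer_resource)  # resource
    mcp.prompt()(follow_up_prompt)  # prompt
    return mcp
```

Keeping the functions free of framework imports means they can be unit-tested directly, with registration as a separate, thin step.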
On transport, the post argues strongly against the `stdio` subprocess model popularized by 2024 tutorials, which is suited only for desktop tools like Claude Desktop. The 2025 MCP spec's streamable HTTP transport — enabled in FastMCP with a single `transport="streamable-http"` argument — allows servers to run as long-lived HTTP services that scale horizontally behind a load balancer and can be shared across teams. For auth, the 2025 spec standardized OAuth 2.1, and FastMCP ships `OAuth2Middleware` that integrates with existing identity providers such as Auth0, Okta, Cognito, and Clerk, enforcing scopes per tool.
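As a rough sketch of the two ideas in this paragraph, the snippet below pairs a per-tool scope check with a streamable HTTP entry point. The scope names, tool names, and the `is_authorized` helper are hypothetical. The check is written by hand rather than with `OAuth2Middleware`, whose exact configuration the summary does not describe. The host and port values are likewise assumptions.

```python
# Hypothetical mapping from tool name to the OAuth scopes it requires.
TOOL_SCOPES: dict[str, frozenset[str]] = {
    "get_customer": frozenset({"crm:read"}),
    "update_customer": frozenset({"crm:read", "crm:write"}),
}


def is_authorized(tool_name: str, granted_scopes: str) -> bool:
    """Check a space-delimited OAuth scope string against one tool's requirements."""
    required = TOOL_SCOPES.get(tool_name)
    if required is None:
        return False  # deny tools that have no declared scopes
    return required <= set(granted_scopes.split())


def serve() -> None:
    """Run as a long-lived HTTP service. Requires the third-party fastmcp package."""
    from fastmcp import FastMCP

    mcp = FastMCP("crm")
    # The single argument the post refers to; host and port are assumed values.
    mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)


if __name__ == "__main__":
    serve()
```

Denying unknown tools by default keeps a newly added tool from being callable before anyone has decided which scopes it needs.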
For deployment, the post outlines two patterns: Lambda + API Gateway (using `mangum` or FastMCP's ASGI adapter, with cold starts of roughly 300–500ms) for low-traffic servers handling fewer than 10 agent calls per hour, and ECS Fargate behind an ALB (with ElastiCache for session continuity, estimated at roughly $30/month for a small always-on service) for predictable, higher-traffic workloads. On tool design, the post warns against exposing entire internal APIs and recommends curating 5–15 tools per server, each with a single clear job, typed I/O, honest docstrings, and idempotent behavior to handle agent retries gracefully.
Key facts
- MCP SDKs reached 97 million monthly downloads as of March 2026.
- Claude, Cursor, OpenAI Agents SDK, and Microsoft Agent Framework all support MCP natively.
- FastMCP enables streamable HTTP transport (finalized in the 2025 MCP spec) with a single `transport="streamable-http"` argument.
- The 2025 MCP spec standardized OAuth 2.1; FastMCP ships `OAuth2Middleware` compatible with Auth0, Okta, Cognito, and Clerk.