MCP (Model Context Protocol) in 2026: the complete guide for engineering teams
Table of contents
- Key takeaways
- What MCP is and why it became the standard
- Architecture: clients, servers, and orchestrator
- Generic vs custom servers
- Explicit policies and authentication outside the model
- Multi-server composition with tool prefixes
- Real antipatterns from 2025–2026
  - Dangerous tools without explicit policy
  - Custom servers without contract tests
  - Shared memory between servers that should be isolated
  - Retry loops without a breaker
- How to choose your first custom MCP server
- Deep dive
- Conclusion
MCP, short for Model Context Protocol, is the open standard that in 2026 connects any model (Claude, GPT, Gemini) to any tool or data source without a proprietary contract in the middle. This article covers what an engineering team needs to adopt it well: how the client-server architecture fits, which servers to use and which to build, how to keep credentials outside the model, how to compose multiple servers with tool prefixes, and which antipatterns are already biting in production. See also: MCP as multi-vendor standard: patterns already mature.
Key takeaways
- MCP is now the de facto standard for connecting models to tools: proposed by Anthropic in November 2024 and adopted through 2025–2026 by OpenAI, Google, Cursor, Claude Code, and VS Code.
- The architecture has three roles: LLM client, MCP server that exposes tools, resources, and prompts, and orchestrator that applies policy and prefixes.
- Combine community servers (server-filesystem, server-postgres, server-github) for generic capabilities with your own servers for domain logic. Never mix the two.
- Credentials live in the server, not in the prompt or in the model: environment variables when starting the server, or Authorization headers in the descriptor.
- Multi-server composition uses prefixes (fs:read_file, db:query) to avoid name collisions; the orchestrator routes, the model picks a tool.
- The documented antipatterns all point in the same direction: granting the model more capability than the policy justifies.
What MCP is and why it became the standard
MCP[1] is an open protocol that standardises how an application hosting a model — an editor, an agent, a chat assistant — discovers and calls external tools, reads resources, and triggers predefined prompts. Before MCP, every vendor had its own mechanism: proprietary function-calling at OpenAI, tool use at Anthropic, plugins in ChatGPT, extensions in Cursor. Each integration had to be rewritten three times over, and any change broke contracts in four places.
Anthropic proposed MCP in November 2024 with a simple idea: if the model and the tool speak a common JSON-RPC-based protocol, the integrator writes the server once and any compatible client consumes it. The official “USB-C for AI applications” analogy is a bit reductive but captures the right intuition: standardising the connector frees the ecosystem.
What has happened over the following eighteen months justifies treating it as a standard. During 2025 and 2026, OpenAI, Google, Cursor, Claude Code, Visual Studio Code, MCPJam, and hundreds of community tools have added native MCP support, per the public directory at modelcontextprotocol.io[1]. The result is an ecosystem where an MCP server written for Claude works in ChatGPT, Cursor, and VS Code without changes; a GitHub or Postgres MCP server is available to every client at once.
The operational consequence is what matters: integrations stop being assets tied to a specific provider and become portable assets owned by the team. If you switch models — because a better one shipped, because pricing went up, because latency doesn’t fit — your MCP servers keep working unchanged. That real portability is why in 2026 every new agent architecture starts with MCP, not with a proprietary mechanism.
Architecture: clients, servers, and orchestrator
The mental model is three roles. The LLM client is the application running the model and orchestrating the conversation: Claude Desktop, Cursor, an app using the Anthropic SDK. The MCP server is a process (or HTTP endpoint) that exposes capabilities in three forms: tools (actions the model can invoke), resources (data the model can read), and prompts (predefined templates). The orchestrator lives inside the client and is responsible for discovering servers, applying policy, presenting tools to the model with prefixes, and translating the model’s calls into JSON-RPC invocations over the right transport.
Three transports matter in 2026: stdio for local servers launched as a subprocess, HTTP (with or without SSE) for remote servers, and in-process SDK for tools defined in the same binary as the client. The choice depends on who owns the process lifecycle. Generic community servers tend to be stdio (npx/uvx launches the binary at client startup). Remote services — your SaaS CRM's MCP, an integration with a vendor — are HTTP. Custom tools tightly coupled to the product mount in-process to avoid IPC overhead.
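The in-process case is worth seeing once, since it is the least documented of the three. A minimal sketch with the Claude Agent SDK's createSdkMcpServer and tool helpers, assuming the current TypeScript SDK surface; the pricing domain and lookupPrice are hypothetical:

import { createSdkMcpServer, tool } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

// Hypothetical domain function, stubbed so the example is self-contained
const lookupPrice = (sku: string, quantity: number) => ({ sku, quantity, total: quantity * 9.99 });

const pricing = createSdkMcpServer({
  name: "pricing",
  version: "1.0.0",
  tools: [
    tool(
      "quote",
      "Compute a price quote for a SKU",
      { sku: z.string(), quantity: z.number().int().positive() },
      async ({ sku, quantity }) => ({
        content: [{ type: "text", text: JSON.stringify(lookupPrice(sku, quantity)) }]
      })
    )
  ]
});

// Mounted like any other server, with no subprocess and no IPC:
// query({ prompt, options: { mcpServers: { pricing } } })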
A minimal .mcp.json descriptor illustrates the simplest piece:
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/srv/data"],
      "env": {
        "FS_READ_ONLY": "true"
      }
    }
  }
}
The client reads this file at startup, launches the npx subprocess, exchanges the MCP handshake, discovers the available tools (read_file, list_directory, etc.), and presents them to the model with the mcp__filesystem__* prefix. The header diagram shows the multi-server version of this architecture: three servers connected to the orchestrator, each with its own isolated credentials and policy.
The standard’s observability piece is also settled: since late 2025 the OpenTelemetry GenAI semantic conventions[2] have included MCP-specific attributes (mcp.method.name, mcp.protocol.version, mcp.session.id) that any server or client can emit as spans. In practice this means MCP-call traces cross-correlate in Grafana or Honeycomb with the rest of the agent’s trace without ad-hoc instrumentation. Anyone who lived through the era of instrumenting proprietary function-calling knows what this is worth.
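At the call site that looks roughly like the following sketch, using @opentelemetry/api; callToolOverTransport is a hypothetical helper standing in for your client's transport layer, and the protocol version string is illustrative:

import { trace } from "@opentelemetry/api";

// Hypothetical transport helper; replace with your client's actual dispatch
declare function callToolOverTransport(method: string, params: unknown): Promise<unknown>;

const tracer = trace.getTracer("mcp-client");

async function tracedCall(sessionId: string, method: string, params: unknown) {
  return tracer.startActiveSpan(`mcp ${method}`, async (span) => {
    // The semconv attributes quoted above, emitted on every MCP call
    span.setAttribute("mcp.method.name", method);
    span.setAttribute("mcp.protocol.version", "2025-06-18"); // illustrative value
    span.setAttribute("mcp.session.id", sessionId);
    try {
      return await callToolOverTransport(method, params);
    } finally {
      span.end();
    }
  });
}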
Generic vs custom servers
The pattern that has won in 2026 is a double stack: community generic servers for horizontal capabilities, custom servers for the vertical logic of your product. Mixing the two — patching a community server to add a CRM endpoint of yours — is the mistake that produces the most silent breakages.
The community servers almost everyone ends up using are a well-known handful. npx -y @modelcontextprotocol/server-filesystem /path gives filesystem access with read_file, write_file, list_directory. npx -y @modelcontextprotocol/server-github exposes issues, PRs, and searches, authenticated with GITHUB_TOKEN. npx -y @modelcontextprotocol/server-postgres "$DATABASE_URL" provides a parameterisable query tool against the database. uvx mcp-server-git launches the Python equivalent. There are dozens more in the official directory and the list grows every month.
Custom servers play a different role. If your product has a domain — a CRM, a PIM, a logistics platform — and you want the agent to act on it, the MCP server is where that logic lives. You build it with the official SDK (TypeScript or Python are the most mature), version it alongside the product, and put contract tests on it like any internal API. The big difference with a community server is that you control the binary: credential rotation, versioning, and change policy are yours.
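A minimal sketch of such a server with the official TypeScript SDK, one tool over stdio; crmApi is a hypothetical client for your domain API:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical domain client; in reality this wraps your product's internal API
declare const crmApi: { createOpportunity(account: string, amount: number): Promise<string> };

const server = new McpServer({ name: "crm", version: "1.0.0" });

server.tool(
  "create_opportunity",
  { account: z.string(), amount: z.number().positive() },
  async ({ account, amount }) => {
    const id = await crmApi.createOpportunity(account, amount);
    return { content: [{ type: "text", text: `created opportunity ${id}` }] };
  }
);

// stdio transport: the client launches this binary and owns its lifecycle
await server.connect(new StdioServerTransport());

Note the shape: the tool schema is zod, the handler returns MCP content blocks, and the credentials crmApi needs never appear in any prompt.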
The operational rule for deciding what goes on each side is fairly clean. If the capability exists in the community list and fits reasonably, use it. You save code, maintenance, and most importantly the temptation to push domain logic into a generic piece. Any product-specific capability — anything an external client wouldn’t use as-is — goes in your own server from day one. There’s no middle ground that survives six months.
Explicit policies and authentication outside the model
The hardest part to get right and the most important: credentials never enter the model’s context. MCP’s architecture is designed so the model requests operations by name and parameters and the server executes with its own local credentials. If this breaks — if at any point an API token appears in the agent’s prompt or in a model message — you’ve already lost the prompt-injection battle.
The canonical pattern, documented in the Anthropic SDK reference[3], is to pass credentials as environment variables when starting the server (stdio transport) or as HTTP headers in the server descriptor (HTTP/SSE transport). In code, with the Claude Agent SDK, MCP server registration is a first-class API via the mcpServers option of query(), and control over which tools the model can call is via allowedTools, not via instructions in the prompt. The Python SDK exposes exactly the same shape. This matters: security decisions are managed in code, outside natural language, where they’re auditable.
Above authentication sits policy, and there are three levels worth separating. Policy is specified in the agent layer, not in the model. Each MCP server declares which tools it exposes; the agent decides:
- which tools can be invoked and with what parameters (allowedTools: ["mcp__filesystem__read_file"] is very different from ["mcp__filesystem__*"])
- which operations require human confirmation before executing: any write to important data, any command execution, any call that moves money (a confirmation-hook sketch follows this list)
- which operations are never executed autonomously and must go through a manual or supervised path (deletes, production deploys, operations on regulated systems)
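The middle level can live in code rather than in prose. A sketch of a confirmation gate with the Claude Agent SDK's canUseTool callback; askHuman, the prompt text, and the tool names are hypothetical:

import { query } from "@anthropic-ai/claude-agent-sdk";

// Hypothetical approval channel: a CLI prompt, a Slack button, a ticket queue
declare function askHuman(message: string): Promise<boolean>;

const needsConfirmation = ["mcp__fs__write_file", "mcp__db__execute"]; // illustrative names

for await (const message of query({
  prompt: "Reconcile yesterday's invoices", // illustrative task
  options: {
    canUseTool: async (toolName, input) => {
      if (needsConfirmation.includes(toolName)) {
        const approved = await askHuman(`Run ${toolName} with ${JSON.stringify(input)}?`);
        if (!approved) return { behavior: "deny", message: "rejected by operator" };
      }
      return { behavior: "allow", updatedInput: input };
    }
  }
})) {
  // consume the message stream as usual
}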
For remote servers, MCP supports OAuth 2.1 since the March 2025 spec revision, and the SDKs accept access tokens obtained by your application’s OAuth flow passed as Authorization: Bearer ${TOKEN} headers in the descriptor. Rotation works without redeploying the agent: change the token in the secret and the next server start picks up the new one.
Multi-server composition with tool prefixes
An agent that talks to a single MCP server is the exception. The norm in 2026 is half a dozen: files, database, Git repo, domain CRM, web search, calendar. So that the model can choose among them without two same-named tools clashing, the orchestrator presents each tool with a server prefix. The Anthropic SDK standardises the format mcp__<server-name>__<tool-name>:
import { query } from "@anthropic-ai/claude-agent-sdk";

const options = {
  mcpServers: {
    // stdio servers: the client launches and owns each subprocess
    fs: { command: "npx", args: ["-y", "@modelcontextprotocol/server-filesystem", "/srv/data"] },
    db: { command: "npx", args: ["-y", "@modelcontextprotocol/server-postgres", process.env.DATABASE_URL] },
    // remote HTTP server: the credential lives in the descriptor, never in the prompt
    web: { type: "http", url: "https://search.example.com/mcp", headers: { Authorization: `Bearer ${process.env.SEARCH_TOKEN}` } }
  },
  // explicit whitelist: any tool not listed here is invisible to the model
  allowedTools: [
    "mcp__fs__read_file",
    "mcp__fs__list_directory",
    "mcp__db__query",
    "mcp__web__fetch"
  ]
};

// Illustrative prompt; each prefixed tool call is routed to its server
for await (const message of query({ prompt: "Summarise open tickets under /srv/data", options })) {
  // consume streamed messages
}
The model sees four tools with unique names. If it calls mcp__db__query, the orchestrator resolves to “server db, method query” and routes the call to the right subprocess, with its credentials and its policy. The model never picks which server an operation goes to; it picks a specific tool and the orchestrator handles the rest.
When the tool catalogue grows, two practical bottlenecks appear. The first is context: every tool definition costs tokens, and with fifty tools registered the system prompt becomes uncomfortable. The Anthropic SDK includes tool search, which keeps definitions out of context and loads only the relevant ones each turn, enabled by default since 2025. The second bottleneck is design: if three servers expose search, it’s worth renaming them in the policy layer (docs:search, crm:search, web:search) so the model can reason about which to use without trial and error.
Real antipatterns from 2025–2026
After eighteen months of deployments, four mistakes keep recurring, and they all share one root: granting the model more capability than the policy justifies.
Dangerous tools without explicit policy
Exposing delete_file, rm -rf via bash, db.execute with free-form SQL, or git push --force without human confirmation is the most expensive mistake. The model isn’t malicious; it just predicts tokens, and if a destructive call with plausible arguments is among the tokens it predicts, it will fire it. The mitigation is dull but effective: tool whitelists, human confirmation for any significant write, and, for truly irreversible operations (production deletes, transfers), a channel that doesn’t go through the agent.
Custom servers without contract tests
Your MCP server is yet another API; if you change a response shape or rename an argument without bumping the version, the agent breaks silently on the next start. The mature 2026 practice is CI with a probe MCP client that enumerates tools, calls each one with canonical arguments, and verifies the response shape. Snapshot tests on the tool listing catch renames and removals immediately.
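A sketch of what that probe looks like with the official TypeScript SDK's client; assertMatchesSnapshot stands in for your test framework's snapshot assertion, and the tool name echoes the hypothetical CRM server above:

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Hypothetical snapshot helper; use your test framework's equivalent
declare function assertMatchesSnapshot(value: unknown): void;

const client = new Client({ name: "contract-probe", version: "1.0.0" });
await client.connect(new StdioClientTransport({ command: "node", args: ["dist/server.js"] }));

// Snapshot the tool catalogue: renames and removals fail the diff immediately
const { tools } = await client.listTools();
assertMatchesSnapshot(tools.map(t => ({ name: t.name, inputSchema: t.inputSchema })));

// Call each tool with canonical arguments and verify the response shape
const result = await client.callTool({ name: "create_opportunity", arguments: { account: "ACME", amount: 100 } });
if (!Array.isArray(result.content)) throw new Error("contract broken: content must be an array");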
Shared memory between servers that should be isolated
If two MCP servers read and write to the same bucket or the same auxiliary table, you’ve created a side channel through which one server can poison the context another reads. Seen in production more than once: a web-scraping MCP saving unsanitised content into a table a search MCP later indexes, opening prompt injection from the open web. Each server with its own storage.
Retry loops without a breaker
The client retries an MCP call that fails, the server retries its backend, the agent retries the whole turn when it sees a generic error. Three composed retry levels saturate rate limits and produce surprise costs. Exponential backoff at each level, a breaker that opens after N consecutive failures, and OTel telemetry on error rates are the standard mitigation.
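A sketch of the two mitigations composed at one level, in TypeScript; the thresholds and callMcpTool are illustrative:

// Breaker: opens after N consecutive failures and rejects calls during a cooldown
class Breaker {
  private failures = 0;
  private openUntil = 0;
  constructor(private maxFailures = 5, private cooldownMs = 30_000) {}

  async run<T>(fn: () => Promise<T>): Promise<T> {
    if (Date.now() < this.openUntil) throw new Error("breaker open");
    try {
      const result = await fn();
      this.failures = 0; // success closes the breaker again
      return result;
    } catch (err) {
      if (++this.failures >= this.maxFailures) this.openUntil = Date.now() + this.cooldownMs;
      throw err;
    }
  }
}

// Exponential backoff: 500ms, 1s, 2s between attempts
async function withBackoff<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await fn();
    } catch (err) {
      if (i + 1 >= attempts) throw err;
      await new Promise((r) => setTimeout(r, 2 ** i * 500));
    }
  }
}

// Hypothetical dispatch into the MCP client
declare function callMcpTool(name: string, args: unknown): Promise<unknown>;

// The breaker wraps the retries, so a dead server stops the loop early
// instead of three retry layers compounding against it
const breaker = new Breaker();
await breaker.run(() => withBackoff(() => callMcpTool("mcp__db__query", { sql: "select 1" })));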
How to choose your first custom MCP server
If you’re about to write your first domain MCP server, three questions filter the decision well.
Is there a community server that covers the case reasonably? If the answer is yes — and often it is for generic cases like Postgres, GitHub, or filesystem access — start there. You save the SDK learning curve and the maintenance. If not, or the community server doesn’t fit without patches, move on.
Is the logic domain-specific or generic? If the tool does something product-specific — create an opportunity in your CRM, kick off a build in your pipeline, generate an invoice with your numbering — it’s domain logic, and it goes in a custom server from day one. If it’s generic — read files, query a DB, HTTP fetch — a community server usually exists. The clear signal: could an external client use this tool exactly the same way you do? If yes, generic. If no, custom.
Does it need sensitive credentials you control? If yes — your cloud tokens, your database keys, mTLS certificates — the custom server gives you control over rotation, the binary, and the version policy. A community server touching those credentials is unnecessary supply-chain risk.
Once you’ve decided custom, the pragmatic starting pattern is two or three tools, stdio transport, policy with human confirmation for any write. Iterate from there. Contract tests from the first commit. OTel telemetry with the GenAI MCP attributes from the start. Document the tool catalogue as a public API even if only your own agent consumes it.
Deep dive
- Claude Code vs Cursor vs Copilot in 2026: a benchmarked comparison
- Building a productive agent with the Anthropic SDK, step by step
- Multi-agent systems: LangGraph vs CrewAI vs Autogen in 2026
- Agent observability with OpenTelemetry GenAI semconv in 2026
Conclusion
MCP is the default choice in 2026 for wiring models to tools, and teams following the patterns — community servers for the generic, custom for the domain, credentials outside the model, explicit policy, and prefix-based composition — will end up with stable, portable agents across providers. The standard has matured, the SDKs are solid, observability is in the OpenTelemetry spec, and the antipatterns are well documented. There’s no reasonable excuse to start a new architecture with a proprietary mechanism.
For adjacent topics: direct browser control as an alternative or complement to MCP in Claude’s Computer Use, and the bridge to real-time voice in the multi-vendor voice-agents comparison.
Follow us on jacar.es for more on MCP, agents in production, GenAI observability, and tooling architecture.