DEFENSE MEDIUM NEW

MCP needs a trust handshake: attested tool-server admission

A May 22, 2026 arXiv paper proposes mcp-attested — a backward-compatible MCP extension that gates tool dispatch on signed clearance, deny-by-default allowlists, and tamper-evident audit logs.

2026-05-29 // 6 min affects: mcp, enclawed

What is this?

On May 22, 2026, Alfredo Metere posted Attested Tool-Server Admission: A Security Extension to the Model Context Protocol on arXiv (2605.24248, cs.CR). The paper grew out of a concrete operational need: letting the Enclawed agent connect to Google’s externally operated MCP servers (Gmail, Calendar, Drive) without having to choose between blanket trust and blanket refusal. It generalises that need into a clean, backward-compatible addition to the protocol.

The argument starts from a structural observation about MCP as shipped today: the protocol standardises how an LLM host and a tool server exchange messages, but it has no native notion of which servers a host may use, at what sensitivity, or which of a server’s tools are in bounds. A host reads a server’s self-declared tool list and dispatches calls. That is exactly the trust model an unmediated third-party connection inherits, and it is the gap that makes accredited deployments today essentially impossible.

The paper proposes three small additions, expressed in normative RFC 2119 form so they can be adopted as an MCP addendum rather than reinvented. It arrives two days after NSA AISC’s Model Context Protocol: Security Design Considerations for AI-Driven Automation — both documents reach the same conclusion from different ends of the stack.

How it works

mcp-attested adds three layered checks to the MCP handshake. An unextended host that does not know about the well-known document ignores it and behaves exactly as today, so the design is strictly additive.

Mechanism                    Where it lives                       What it gates
---------------------------  -----------------------------------  --------------------------
Signed clearance assertion   Well-known URI on the server         Server admission
Per-server tool allowlist    Host configuration, deny-by-default  Tool dispatch (per-tool)
Flavor-gated enforcement     Host runtime mode                    Warn vs. hard-deny

Signed clearance assertion. Each MCP server publishes a small, offline-signed document at a well-known URI. The host verifies the assertion against a pinned trust root before dispatching any tool call. The server is no longer admitted on the strength of “it implements MCP”; it is admitted because a trust root the host has pre-decided to honour has vouched for it at a given clearance level.
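The verify-before-dispatch step can be sketched as follows. This is a minimal illustration, not the paper's wire format or verification algorithm: the field names (`server`, `clearance`), the key, and both function names are hypothetical, and HMAC-SHA256 stands in for the offline asymmetric signature only so the sketch runs on the Python standard library. A real host would pin a public key and verify a public-key signature instead.

```python
import hashlib
import hmac
import json

# Hypothetical pinned trust root. HMAC-SHA256 is a stdlib stand-in for the
# asymmetric signature the paper describes; a real host would pin a public key.
PINNED_ROOT_KEY = b"example-trust-root-secret"


def sign_assertion(assertion: dict, key: bytes) -> str:
    """Offline signing step (stand-in): MAC over a canonical JSON encoding."""
    payload = json.dumps(assertion, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def admit_server(assertion: dict, signature: str, key: bytes = PINNED_ROOT_KEY) -> bool:
    """Return True only if the assertion verifies against the pinned root."""
    expected = sign_assertion(assertion, key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

The host would call `admit_server` once per server, before any tool call is dispatched, and treat a `False` result as "not admitted" regardless of what the server declares about itself.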

Deny-by-default per-server tool allowlist. Admitting a server is not the same as trusting its every tool. The host configures, per admitted server, the explicit subset of tools it will dispatch to. Anything outside the allowlist is denied without ever reaching the model’s tool-selection step. That removes reliance on the slowest, most expensive defence layer (the model itself reasoning about whether to call a tool), because the dispatch path never consults it for a disallowed tool.
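A deny-by-default allowlist of this kind might look like the sketch below. The server identifier, tool names, and function names are all hypothetical and only illustrate the shape: filter the server's declared tools before the model sees them, then check again at dispatch time.

```python
# Hypothetical host configuration: server id -> tools the host will dispatch to.
ALLOWLIST: dict[str, frozenset[str]] = {
    "gmail": frozenset({"search_messages", "read_message"}),
}


def visible_tools(server_id: str, declared_tools: list[str]) -> list[str]:
    """Tools offered to the model: the declared list filtered by the allowlist.

    A server with no allowlist entry contributes no tools at all.
    """
    allowed = ALLOWLIST.get(server_id, frozenset())
    return [tool for tool in declared_tools if tool in allowed]


def may_dispatch(server_id: str, tool: str) -> bool:
    """Second check on the dispatch path, independent of what the model chose."""
    return tool in ALLOWLIST.get(server_id, frozenset())
```

Checking twice is deliberate in this sketch: the filter keeps disallowed tools out of the model's view, and the dispatch-time check still holds if a tool name reaches the dispatcher some other way.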

Flavor-gated enforcement mode. The same checks run in either a warning flavor (log and pass) or an enforcing flavor (log and deny). A regulated deployment can ship the enforcing flavor with no operational fall-back; a developer-mode deployment can stay in warning while a team works out allowlists. Every decision — admit, deny, warn — is written to a tamper-evident audit log so the dispatch decision is reviewable after the fact.
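The two flavors and the audit trail can be sketched together. Everything here is an assumption made for illustration: the `Flavor`, `AuditLog`, and `decide` names, the event fields, and the use of a SHA-256 hash chain as the tamper-evidence mechanism, which is one common construction and not necessarily the one the paper specifies.

```python
import hashlib
import json
from enum import Enum


class Flavor(Enum):
    WARNING = "warning"      # log and pass
    ENFORCING = "enforcing"  # log and deny


class AuditLog:
    """Append-only log in which each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: list[dict] = []

    @staticmethod
    def _digest(prev: str, event: dict) -> str:
        body = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev + body).encode()).hexdigest()

    def append(self, event: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append({"event": event, "prev": prev, "hash": self._digest(prev, event)})

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != self._digest(prev, entry["event"]):
                return False
            prev = entry["hash"]
        return True


def decide(check_passed: bool, flavor: Flavor, log: AuditLog, detail: dict) -> bool:
    """Return True if the call may proceed; every outcome is logged."""
    if check_passed:
        outcome = "admit"
    elif flavor is Flavor.WARNING:
        outcome = "warn"
    else:
        outcome = "deny"
    log.append({**detail, "outcome": outcome})
    return outcome != "deny"
```

A chain like this only detects tampering if the latest hash is also recorded somewhere the host cannot rewrite, so a deployment would need to anchor it externally.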

The paper accompanies the design with a wire format, a verification algorithm, a security analysis, and an LLM-driven adversarial evaluation. A separate February 2026 line of work, MCPShield (arXiv:2602.14281), takes a complementary cognition-layer approach — metadata-guided probing before invocation, isolated projection during invocation, periodic reasoning after — and is most usefully read alongside this paper, not against it.

Why it matters

The MCP threat model that the industry has been muddling through for eighteen months treats the host as the trust authority and the model as the gatekeeper. Both choices are wrong in different ways. The host has no protocol-level handle on who a server is. The model is not a security boundary — it is a probabilistic next-token predictor that can be talked out of refusing.

The MCP-specific consequences of leaving this gap unaddressed are by now well documented. NSA AISC’s CSI of May 20, 2026 catalogues eight weakness classes, including capability-spoofing servers and unauthenticated tool registration. Public reports — the Invariant Labs WhatsApp MCP and GitHub MCP findings, the spring 2026 wave of MCP backend CVEs — have shown that a malicious or compromised server can turn routine tool dispatches into exfiltration or filesystem corruption. None of those incidents required a clever prompt; they required only that the host take the server’s word.

What makes mcp-attested interesting is that it pulls the trust decision back out of the model entirely. The model never gets to choose to dispatch to an unattested server, because the host’s dispatch path refuses before the model’s tool selection ever runs. That is the same shape as certificate validation in the TLS handshake: the application code does not get to “consider” connecting to a server with an invalid certificate.

The price is a small amount of new operational work: managing a pinned trust root, keeping per-server allowlists, distributing signed clearance documents. The paper’s claim, which feels right after reading about the recent MCP CVE wave, is that this is the cost of being able to accredit an MCP deployment at all.

Defenses

Four things are worth taking from the paper even if your stack will not adopt mcp-attested verbatim.

Pin a trust root for MCP servers and refuse the rest. Even without a formal clearance schema, host runtimes can ship with a list of fingerprints they will dispatch to, with everything else producing a hard error rather than a silent ignore.
Make per-server tool allowlists the default, not an opt-in. Treat “this server exposes a tool I have not enumerated” as a deployment bug, not a usage event. The set of tools a host will actually dispatch to should be explicit and version-controlled.
Separate warning from enforcing mode and ship audit logs from day one. Even a development MCP host should write each admission and dispatch decision to a tamper-evident log. Too many production incidents have to be reconstructed without logs, because none were being written at the time.
Read this paper next to the NSA AISC CSI and MCPShield, not in isolation. The three together cover the protocol layer (Metere), the governance layer (NSA), and the runtime cognition layer (MCPShield). No single one is sufficient.
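For stacks that will not adopt mcp-attested verbatim, the first two recommendations can be approximated with very little code. The sketch below is an assumption-laden illustration: the fingerprint is taken as the SHA-256 of whatever key bytes the server presents, and `PINNED_FINGERPRINTS`, `UnpinnedServerError`, `require_pinned`, and `unenumerated_tools` are hypothetical names.

```python
import hashlib


class UnpinnedServerError(Exception):
    """Raised instead of silently ignoring a server the host has not pinned."""


# Hypothetical pinned set; in practice this would live in version control.
PINNED_FINGERPRINTS = {hashlib.sha256(b"example-server-public-key").hexdigest()}


def require_pinned(server_key_bytes: bytes) -> str:
    """Return the fingerprint if it is pinned, otherwise fail hard."""
    fingerprint = hashlib.sha256(server_key_bytes).hexdigest()
    if fingerprint not in PINNED_FINGERPRINTS:
        raise UnpinnedServerError(f"server key {fingerprint[:16]} is not pinned")
    return fingerprint


def unenumerated_tools(declared: list[str], allowlisted: set[str]) -> list[str]:
    """Tools the server exposes that the deployment never enumerated.

    A non-empty result is treated as a deployment bug to fix in configuration.
    """
    return sorted(set(declared) - allowlisted)
```

Running `unenumerated_tools` in CI against each server's declared tool list is one way to surface drift before it reaches a running host.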

Status

Item                                  Reference                        Date        Notes
------------------------------------  -------------------------------  ----------  -------------------------------------------------------------------
Attested tool-server admission paper  arXiv:2605.24248                 2026-05-22  RFC 2119 wire format, signed clearance, allowlist, enforcement mode
Reference implementation              `mcp-attested` (cited in paper)  2026-05     Shipped in the `enclawed-oss` and `enclaved` distributions per §1
NSA AISC MCP CSI                      nsa.gov, U/OO/6030316-26         2026-05-20  Eight weakness classes, defensive baseline
MCPShield                             arXiv:2602.14281                 2026-02     Complementary cognition-layer defence

MCP is not going to lose its growth curve over the next twelve months. What it can lose is the convention that “the server said so” is good enough to dispatch a tool — and Metere’s paper is the first concrete proposal that lets a host say no before the model is even asked.