INDIRECT INJECTION MEDIUM NEW

Message-object injection: the serialization gap in AI assistants

Imperva showed (June 10, 2026) that contacts, vCards and location pins get flattened inline into an AI assistant's prompt with no untrusted-content boundary — a structural injection vector, patched in OpenClaw 2026.4.23.

2026-06-21 // 6 min read // affects: openclaw, gemini-3.1-pro

What is this?

On June 10, 2026, Imperva Threat Research (Yohann Sillam) published an analysis of a prompt-injection vector that does not live in a document or a web page, but in the structured messaging objects a personal AI assistant routinely handles: shared contacts, vCards (.vcf), and geolocation pins. The work targeted OpenClaw, a popular self-hosted agent that wires an LLM to file systems, shells and messaging platforms such as WhatsApp and Telegram. The Hacker News reported it the next day alongside parallel findings from Varonis.

The injected instruction was invisible to the victim, crossed the trust boundary into the authenticated user context, and — in Imperva’s lab against Gemini 3.1 Pro (preview) — got the agent to download and run a script from a researcher-controlled server. Imperva disclosed responsibly; OpenClaw shipped a fix in version 2026.4.23.

How it works

The flaw is in the plumbing, not the model. When OpenClaw passes web content to the LLM, it wraps that content in an untrusted-content marker. When it passes a message object — a contact, a vCard, a location label — it flattens the object straight into the prompt text inline, with no boundary marking it as untrusted.
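The difference between the two paths can be sketched in a few lines. This is an illustration under stated assumptions, not OpenClaw's code: the marker strings, the `Contact` type and every function name below are hypothetical, chosen only to show one path fencing its input while the other splices it in.

```python
from dataclasses import dataclass

# Hypothetical marker strings; the real wrapper format is not given in the source.
UNTRUSTED_OPEN = "<<UNTRUSTED_CONTENT>>"
UNTRUSTED_CLOSE = "<<END_UNTRUSTED_CONTENT>>"


@dataclass
class Contact:
    name: str
    number: str


def wrap_web_content(text: str) -> str:
    """Web path: fetched content is fenced off before it reaches the model."""
    return f"{UNTRUSTED_OPEN}\n{text}\n{UNTRUSTED_CLOSE}"


def flatten_contact_inline(contact: Contact) -> str:
    """Message-object path (pre-fix pattern): fields go straight into prompt text."""
    return f"<contact: {contact.name}, {contact.number}>"


def build_prompt(user_text: str, attachment: str) -> str:
    """Assemble the text the model sees for one incoming message."""
    return f"User message: {user_text}\n{attachment}"


web_prompt = build_prompt("summarize this page", wrap_web_content("page text"))
contact_prompt = build_prompt(
    "save this contact",
    flatten_contact_inline(Contact("Alice Example", "+10000000000")),
)

print(web_prompt)
print(contact_prompt)
```

In this sketch the web prompt carries an explicit boundary, while the contact prompt contains nothing that separates sender-controlled fields from the rest of the conversation.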

Only some fields travel to the model, and that is what the technique abuses. A shared contact sends just its name field, serialized along the lines of <contact: name, number>. Angle brackets are perfectly legal characters inside a name, so the model has no reliable way to tell where the genuine name ends and injected text begins. On screen, the contact name is truncated, so the victim does not see the trailing payload either. The same logic applies to a vCard’s full-name (FN) field, which WhatsApp supports natively, and to the label on a shared location pin.
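A small sketch makes the delimiter ambiguity concrete. It assumes the `<contact: name, number>` shape described above and uses an inert placeholder in place of any instruction; the regular expression, the truncation width and the function names are all hypothetical.

```python
import re


def flatten_contact_inline(name: str, number: str) -> str:
    """Hypothetical pre-fix serialization, following the shape described above."""
    return f"<contact: {name}, {number}>"


CONTACT_RE = re.compile(r"<contact: (.*?), (.*?)>")


def split_at_first_close(serialized: str):
    """Read the object as a delimiter-trusting reader would:
    the object ends at the first '>' that follows the number."""
    match = CONTACT_RE.match(serialized)
    if match is None:
        return None
    name, number = match.groups()
    # Anything after the match looks like ordinary prompt text.
    return name, number, serialized[match.end():]


def display_name(name: str, width: int = 16) -> str:
    """Hypothetical chat-UI truncation: long names are cut and given an ellipsis."""
    return name if len(name) <= width else name[: width - 3] + "..."


benign = flatten_contact_inline("Alice Example", "+10000000000")

# The name field itself contains the delimiters; the bracketed text is an inert placeholder.
crafted_name = "Alice Example, +19999999999> [PLACEHOLDER: sender-controlled text]"
crafted = flatten_contact_inline(crafted_name, "+10000000000")

benign_parse = split_at_first_close(benign)
crafted_parse = split_at_first_close(crafted)
shown = display_name(crafted_name)

print(benign_parse)
print(crafted_parse)
print(shown)
```

For the crafted name, the reader sees a complete, plausible contact followed by free text outside the object, and the truncated display shows only the innocuous prefix. Escaping the brackets would narrow this particular gap, but the fix the article describes moves the fields out of the prompt body altogether.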

Tellingly, Imperva reported that a plain image carrying hidden instructions failed — that attack has been publicized so often that models are now trained to resist it. The message-object route worked precisely because models have seen far fewer examples of it. No payload is reproduced here; the structural point is enough.

The deeper issue: there is no standard for how messaging objects are serialized before reaching an LLM. Tool integration has MCP; web fetches have untrusted-content wrappers; rich message objects have neither. Each assistant flattens them its own ad-hoc way, and Imperva observed the same flattening pattern in other personal assistants — so this is not OpenClaw-specific.

Why it matters

Personal AI assistants are not chatbots; they are authenticated executors with access to files, shells and connected accounts. An injection that reaches one inherits that access. Two properties make the message-object vector worse than a typical indirect injection. First, invisibility on both ends: the model cannot tell the payload from the genuine field, and the human sees only a truncated name. Second, virality plus persistence: a single piece of shared content (a contact card forwarded thousands of times), combined with an agent’s default-on memory, can quietly seed compromise in every assistant that ingests it without sandboxed execution.

This is the lethal trifecta in concrete form — access to private data, exposure to untrusted content, and an outbound channel — delivered through a serialization seam most defenders never modeled. It belongs to the same family as indirect prompt injection in the wild, but through a channel that input-sanitization rules aimed at documents and URLs simply do not cover.

Defenses

  • Mark every untrusted field as untrusted. OpenClaw’s fix is the template: move contact names, vCard fields and location labels out of the inline prompt body into a structured untrusted-metadata channel, so the model receives them tagged, not blended into instructions. Apply this to all message-object fields, not just the ones in a current advisory.
  • Enumerate every serialization seam. Audit each path by which a structured object (contact, calendar invite, vCard, location, rich link) is flattened into the prompt. Each one is an injection channel until proven otherwise.
  • Keep execution sandboxed and least-privileged. Code execution should be off or sandboxed by default; scope skills and connectors to the minimum the task needs. The injection’s blast radius is exactly the agent’s standing authority.
  • Gate outbound and high-risk actions. Require human approval for first-time sends to unfamiliar destinations and for credential or money-moving actions, so a hijacked agent cannot complete an exfiltration leg unattended.
  • Treat memory as attack surface. With persistence on by default, a single poisoned object can leave durable instructions. Bound what untrusted content may write to long-term memory.
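Two of the defenses above, the structured untrusted-metadata channel and the approval gate, can be sketched as follows. This is a minimal illustration under assumptions rather than OpenClaw's actual fix: the part layout, the `trusted` flag, the action names and both function names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Contact:
    name: str
    number: str


def build_model_input(user_text: str, contact: Contact) -> List[Dict[str, object]]:
    """Keep trusted text and sender-controlled fields in separate, tagged parts."""
    return [
        {"trusted": True, "kind": "user_text", "content": user_text},
        {
            "trusted": False,
            "kind": "untrusted_metadata",
            # JSON encoding escapes quotes and backslashes, so a field value
            # cannot terminate its own string and spill into the structure.
            "content": json.dumps({"type": "contact", "fields": asdict(contact)}),
        },
    ]


# Hypothetical action names for the gate below.
ALWAYS_GATED = {"read_credentials", "transfer_funds"}


def requires_approval(
    action: str, destination: Optional[str], known_destinations: Set[str]
) -> bool:
    """Gate credential and money-moving actions, and sends to unfamiliar destinations."""
    if action in ALWAYS_GATED:
        return True
    if action == "send" and destination not in known_destinations:
        return True
    return False


parts = build_model_input(
    "save this contact",
    Contact('Alice "A" Example> [PLACEHOLDER]', "+10000000000"),
)
decoded = json.loads(parts[1]["content"])

print(parts[1]["content"])
print(requires_approval("send", "unknown@example.com", {"me@example.com"}))
```

Tagging alone does not make the model ignore what is in the field; it gives the model and any downstream policy a reliable boundary to act on, which is why the gate and the sandboxing defenses still matter.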

Status

Item            Detail
Disclosed       2026-06-10 (Imperva Threat Research)
Vector          Injection via message-object fields (contact name, vCard FN, location label) flattened inline
Tested against  OpenClaw + Gemini 3.1 Pro (preview)
Fix             OpenClaw 2026.4.23: untrusted fields moved to a structured metadata channel
Scope           Imperva observed the same flattening pattern in other personal assistants

The patch closes OpenClaw’s instance. The class — rich message objects serialized into prompts without an untrusted boundary — stays open until assistants adopt a consistent way to carry structured, untrusted fields to the model. Until then, every new object type an assistant learns to ingest is a new injection channel to audit.
