AGENTS MEDIUM NEW

Stored prompt injection: when an injection outlives the session

A June 2026 arXiv paper reframes prompt injection as a stored, cross-session problem: once adversarial text lands in an agent's persistent state, it can steer executions long after the attacker is gone.

2026-06-20 // 6 min // affects: llm-agents, agent-memory-systems, mcp-agents

What is this?

The paper “What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems”, posted to arXiv in June 2026, makes a single sharp point: most prompt-injection research studies a single context window, but modern agents keep state. Memories, files, scratchpads, checkpoints and tool-visible artifacts persist between runs. Once attacker text reaches one of those stores, the injection does not end when the conversation ends: it can be re-read into a future execution context, for the same user or a different one.

The authors borrow the vocabulary of web security on purpose. Classic prompt injection behaves like reflected XSS: the malicious input is processed once, in the turn that delivered it. Cross-session stored prompt injection behaves like stored XSS: the payload is written to persistent state and re-served later. The distinction matters because the defenses, the detection windows and the blast radius are all different.

This is not a brand-new attack: the paper formalizes a pattern that practitioners have already been reporting. Palo Alto’s Unit 42 documented indirect injection poisoning an agent’s long-term memory, and Christian Schneider’s write-up lays out the broader argument that agentic deployment makes injection worse, not better. The June paper’s contribution is the system-level framing.

How it works

The mechanism is structural, not a clever string. An agent reads untrusted content during a task — a web page, a document, a tool result, another agent’s message. Embedded in that content is an instruction. Instead of (or in addition to) acting immediately, the agent writes a residue of that content into persistent state: a memory entry, a note file, a summarized history, a saved plan.

On a later run, the agent’s orchestration prompt is assembled partly from that persistent state. The agent treats its own memory and files as authoritative context rather than as untrusted input — so the planted instruction is reincorporated and can influence behavior, for example by steering the agent to quietly forward conversation history or take an action the new user never asked for.

Session 1 (attacker present)
  untrusted content ──▶ agent ──▶ writes residue to PERSISTENT STATE
                                   (memory / file / summary / checkpoint)

   ... attacker leaves, time passes, monitoring resets ...

Session N (victim present)
  PERSISTENT STATE ──▶ orchestration prompt ──▶ agent acts on planted instruction
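
The write-then-reuse loop in the diagram can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the paper: NaiveMemory, run_session and the prompt layout are hypothetical names invented here, and PLANTED is an inert placeholder rather than a working payload.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Inert placeholder standing in for attacker text; deliberately not a payload.
PLANTED = "[PLANTED-INSTRUCTION]"


@dataclass
class NaiveMemory:
    """Hypothetical persistent store that records no provenance."""
    notes: list[str] = field(default_factory=list)


def run_session(memory: NaiveMemory, task: str, untrusted: str | None = None) -> str:
    """Build the prompt from persisted notes, then save a residue of what was read."""
    parts = ["SYSTEM: you are a helpful agent", *memory.notes]
    if untrusted is not None:
        parts.append(f"CONTENT: {untrusted}")
        # The flaw: text derived from untrusted content is written to reused state.
        memory.notes.append(f"NOTE: {untrusted}")
    parts.append(f"USER: {task}")
    return "\n".join(parts)


memory = NaiveMemory()
run_session(memory, "summarize this page", untrusted=f"page text {PLANTED}")  # session 1
later = run_session(memory, "what is on my calendar?")                        # session N
print(PLANTED in later)  # True: the later prompt carries the planted text
```

The second call receives no untrusted content at all, yet its prompt still contains the placeholder, because the note written in session 1 is read back as ordinary context.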

The paper highlights several persistence channels that make this durable. State that is checkpointed and later resumed lets an injection lie dormant across a temporal gap, defeating monitors that expect an effect to manifest immediately. Append-only history (for instance reducer functions that only add to a shared message log) can make an early-session injection effectively permanent. We do not reproduce a working payload here — the lesson does not require one. The point is that any write path from untrusted content into reused state is a candidate channel.
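
To make the append-only point concrete, the sketch below compares two hypothetical reducers over a message log. Neither is taken from the paper or from any specific framework; the message shape (a dict with "id" and "text") and the convention that text=None retracts an entry are assumptions made for this example.

```python
def append_only(log, update):
    """Reducer that can only grow the log; nothing is ever removed."""
    return log + update


def replaceable(log, update):
    """Reducer keyed by message id; an update whose text is None retracts that entry."""
    merged = {m["id"]: m for m in log}
    for m in update:
        if m["text"] is None:
            merged.pop(m["id"], None)
        else:
            merged[m["id"]] = m
    return list(merged.values())


log = [
    {"id": 1, "text": "user: hello"},
    {"id": 2, "text": "tool: [PLANTED-INSTRUCTION]"},  # inert placeholder
]
retraction = [{"id": 2, "text": None}]

grown = append_only(log, retraction)   # the planted entry is still in the log
pruned = replaceable(log, retraction)  # the planted entry has been removed
print(len(grown), len(pruned))         # 3 1
```

With the append-only reducer, a cleanup step can only add a third message; the planted entry stays in every later prompt built from the log. A reducer that supports retraction at least gives a remediation path once the entry is identified.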

Why it matters

Single-session defenses miss this class by construction. A guardrail that inspects the current turn, or a sandbox that resets per task, can be perfectly effective and still let a stored injection through, because the malicious instruction arrives from the agent’s own trusted state on a clean turn. A complementary May 2026 paper argues bluntly that AI agents may always fall for prompt injections; persistence raises the stakes of that pessimism, because the cost of a single successful injection is no longer bounded by one conversation.

The cross-user dimension is the part defenders underestimate. If persistent state is shared — a team memory store, a common knowledge base, a multi-tenant agent — an injection planted by one user can surface in another user’s session. That turns a per-session annoyance into something closer to a persistence foothold, with a detection window measured in days rather than seconds.

Defenses

No single control removes this class; the goal is to break the write-then-reuse loop and to treat persistent state as untrusted.

Treat memory and files as untrusted input, not authoritative context. The core error is the agent trusting its own state. Re-validate persisted content on read with the same scrutiny applied to fresh external content, rather than assuming “it’s in my memory, so it’s true.”
Separate trusted from untrusted at the storage layer. Tag or partition state by provenance. Content derived from untrusted sources should be quarantined and never injected directly into orchestration or planning prompts without sanitization.
Make writes to persistent state explicit and reviewable. Gate memory writes behind policy: what can be written, by which task, from which source. Log every write with provenance so a later investigation can trace a behavior back to the session that planted it.
Scope and expire state aggressively. Prefer per-user, per-task isolation over shared mutable memory. Apply TTLs and forgetting so dormant content cannot wait indefinitely for a resume. Avoid append-only designs for anything that feeds back into prompts.
Monitor for delayed and cross-session effects, not just immediate ones. Detection that only checks the turn an input arrives in will miss checkpoint-and-resume. Watch for anomalous reads of memory, unexpected egress, and behavior that diverges from the current user’s request.
Constrain exfiltration paths from the agent context. Most stored-injection payoffs end in an outbound action — a message, a tool call, a fetch. Least-privilege tooling and egress monitoring shrink what a reactivated instruction can accomplish.
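
Several of these controls (provenance tagging, gated and logged writes, per-user scoping, TTLs) can be combined in one small store. The Python sketch below is a minimal illustration, not a reference implementation: ProvenanceStore, Entry, the source labels and both policy sets are hypothetical, and what a caller does with quarantined text (sanitize it, summarize it, or drop it) is left open.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field

# Assumed policy values, chosen only for this example.
WRITABLE_SOURCES = {"user", "operator", "web", "tool"}
TRUSTED_SOURCES = {"user", "operator"}


@dataclass
class Entry:
    text: str
    source: str       # provenance of the content
    owner: str        # user the entry is scoped to
    session_id: str   # session that performed the write
    created_at: float


@dataclass
class ProvenanceStore:
    ttl_seconds: float
    entries: list[Entry] = field(default_factory=list)
    audit_log: list[tuple] = field(default_factory=list)

    def write(self, text: str, *, source: str, owner: str,
              session_id: str, now: float | None = None) -> bool:
        """Gate the write by source and log every attempt with its provenance."""
        now = time.time() if now is None else now
        allowed = source in WRITABLE_SOURCES
        self.audit_log.append((now, session_id, owner, source, allowed))
        if allowed:
            self.entries.append(Entry(text, source, owner, session_id, now))
        return allowed

    def read(self, *, owner: str, now: float | None = None) -> tuple[list[str], list[str]]:
        """Return (trusted, quarantined) texts for one owner, skipping expired entries."""
        now = time.time() if now is None else now
        live = [e for e in self.entries
                if e.owner == owner and now - e.created_at <= self.ttl_seconds]
        trusted = [e.text for e in live if e.source in TRUSTED_SOURCES]
        quarantined = [e.text for e in live if e.source not in TRUSTED_SOURCES]
        return trusted, quarantined


store = ProvenanceStore(ttl_seconds=3600)
store.write("prefers metric units", source="user", owner="alice", session_id="s1", now=0)
store.write("page said: [PLANTED-INSTRUCTION]", source="web", owner="alice", session_id="s1", now=0)
trusted, quarantined = store.read(owner="alice", now=60)
```

Only the trusted list is a candidate for the orchestration prompt; the quarantined list is returned separately so it cannot reach planning context by default. Reads for another owner, or after the TTL has passed, return nothing, and the audit log ties each write attempt to the session that made it.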

Status

Aspect	Reflected (classic) prompt injection	Cross-session stored prompt injection
Lifetime	One context window	Persists in memory / files / checkpoints
Trigger	The turn that delivers it	A later run that re-reads the state
Victim	The current session	Future sessions, possibly other users
Detection window	Immediate	Delayed (days), survives monitoring resets
Primary control	Input/output guardrails	Provenance-aware state, write gating, isolation

The takeaway from the June 2026 paper is not a new exploit but a shift in threat model: in a stateful agent, the durable artifact is the vulnerability. A single injection that reaches persistent state buys an attacker time and reach that a single-turn injection never could, and defenses designed only for the current context window will keep missing it.