Prompt injection is unsolved — so contain it at machine speed
At Infosecurity Europe 2026, OWASP's Ariel Fogel called prompt injection an unresolved architectural problem and argued defenders must shift from prevention to runtime containment that runs as fast as the agent.
What is this?
On June 8, 2026, Infosecurity Magazine reported remarks made at Infosecurity Europe 2026 by Ariel Fogel, an AI security researcher in the office of the CTO at Pillar Security and a contributor to OWASP’s GenAI Security Project. His message was blunt: prompt injection “remains an unresolved problem” at the architectural level, and the gap is widening as agents gain tools and the ability to act.
This is not a new vulnerability disclosure. It is a state-of-the-field assessment from inside the body that writes the OWASP Top 10 for LLM and agentic applications — and it comes in the same week OWASP unveiled its Agentic Research Council (announced June 4) to close the gap between research and deployable mitigations. We cover it because the practical takeaway has shifted: defenders should stop expecting a fix and start designing for containment.
How it works
The root cause is structural, not a bug to be patched. An LLM processes its input as a single token sequence, and there is no reliable mechanism inside the model to enforce a privilege boundary between the system prompt, the user’s query, and content an agent retrieves from the outside world. Trusted instructions and untrusted data arrive in the same channel, so any text the model reads can compete to become an instruction. This is the same architectural fact behind Simon Willison’s Lethal Trifecta — private data, exposure to untrusted content, and an exfiltration path — and behind the broader argument that agent security is a systems problem, not a model problem.
What changes with agents is the consequence. Fogel’s point is that a successful injection no longer just produces a bad answer; when the executor is an agent with tool access, it can trigger a chain of real-world actions — an escalation from bad output to active compromise.
The part worth dwelling on is how human-era controls fail in this setting. Fogel described two patterns observed in real attacks:
Control (human-era) How it fails against an agent
---------------------- ----------------------------------------------------
Allow-list of commands The commands the agent needs are already approved,
so the allow-list *streamlines* the exploit instead
of blocking it.
Sandbox boundary The agent's own output redefines the boundary —
effectively rewriting the containment meant to
stop it.
Manual review Attacks unfold in minutes; human review cycles are
too slow to be in the loop for every action.
Heuristics such as the Lethal Trifecta and Meta’s Agents Rule of Two (an agent should satisfy at most two of the three trifecta properties without human approval) help shrink the blast radius, but Fogel cautioned they are not complete defenses — published research already shows attacks that succeed with only two of the properties present.
Why it matters
Most organizations, Fogel noted, are deploying agents faster than they can govern them. That gap matters because the failure mode is no longer cosmetic. The same speed and scale that make agents useful also collapse the time-to-impact of an injection, and the controls many teams are relying on — allow-lists, sandboxes, periodic review — were designed for human operators and can be turned into accelerants. Treating prompt injection as a solvable input-validation problem leads to over-trusting agents that should be tightly constrained.
Defenses
The recommended posture moves from prevention-only thinking to constraining what an injected agent can do, with controls that operate at the agent’s speed:
- Assume injection will succeed. Design the blast radius first: scope each agent’s tools, data, and outbound network access to the minimum the task needs.
- Budget the trifecta. Apply the Rule of Two — require a human checkpoint before any session combines private data, untrusted content, and an exfiltration channel — while knowing two-property attacks exist.
- Monitor at machine speed. Use live behavioral monitoring of tool calls with real-time containment and hard stop mechanisms, rather than after-the-fact log review.
- Tighten identity and sessions. Issue ephemeral, narrowly scoped credentials and add cryptographic attestation so every agent action is traceable and time-limited.
- Join safety and security response. Build incident playbooks that cover machine-speed, multi-agent scenarios, with human-on-the-loop oversight rather than per-action human-in-the-loop approval that cannot keep pace.
Status
| Item | Detail |
|---|---|
| Source | Ariel Fogel (Pillar Security / OWASP), Infosecurity Europe 2026 |
| Reported | June 8, 2026 (Infosecurity Magazine) |
| Nature | Architectural limitation — no patch; mitigation is containment |
| Related | OWASP Agentic Research Council launched June 4, 2026 |
This is an assessment, not an exploit: there is no payload to publish and no single vendor to patch. The durable lesson is that, until models and runtimes can enforce firm privilege separation between instructions and data, prompt injection is a property of the environment defenders must contain — not a bug they can wait out.