Indirect prompt injection in the wild: three April 2026 studies converge
Google, Forcepoint and CISPA independently measured indirect prompt injection across the open web in April 2026. The picture: 15K+ validated payloads, 32% growth, organized templates.
What is this?
Three independent measurement studies published in late April 2026 confirm something the security community had long suspected but never quantified: indirect prompt injection (IPI) is no longer a laboratory curiosity. Adversaries are actively seeding the open web with instructions aimed at LLM-powered agents — and the practice is growing fast enough to register in continental-scale crawls.
The three reports landed within five days of each other:
- Google Security Team (April 23, 2026) — a scan of 2–3 billion crawled pages per month across blogs, forums and comment sections, comparing CommonCrawl snapshots from November 2025 and February 2026.
- Forcepoint X-Labs (April 24, 2026) — active threat hunting on public web infrastructure, with telemetry flagging payloads triggered on patterns such as
Ignore previous instructionsandIf you are an LLM. - Khodayari, Zhang, Acharya and Pellegrino at CISPA Helmholtz Center (arXiv:2604.27202, April 29, 2026) — an academic empirical study of 1.2B URLs across 24.8M hosts, identifying 15.3K validated injection instances on 11.7K pages.
The convergence matters more than any single finding: two corporate red teams and one academic group, with different methodologies, all see the same trend.
How it works
Indirect prompt injection is the original Greshake et al. 2023 attack class — embed instructions in content the model will later ingest as data, and watch the agent follow them. What changed in 2026 is the scale and the realism of the carriers.
The CISPA paper finds that 54 prompt templates account for roughly 95% of all detected instances. That is the signature of organized tooling, not isolated experimentation. Forcepoint independently observes “shared injection templates across multiple domains” and notes one widely distributed payload that appears to function as a test probe — a way to fingerprint which AI products fetch and obey untrusted text before deploying higher-impact payloads.
Visibility is the other striking finding. About 70% of the CISPA-validated payloads sit in non-rendered HTML — headers, comments, metadata. Of the rest, 87% are visually hidden through one of three techniques, with counts roughly:
# CISPA dataset (15.3K validated payloads, April 2026)
non-rendered HTML (headers/comments/metadata) ≈ 70%
visible-but-hidden via rendering:
color/contrast manipulation 2,397
occlusion (overlays) 1,860
viewport-based hiding (off-screen) 1,802
The visible carriers split into four broad objectives:
- Reputation manipulation (~1.5K instances) — instructions telling search-LLM pipelines to promote a product, force a citation, or downrank a competitor.
- Data-protection / anti-scraping directives (~4K) — site owners asking models to refuse summarisation or to omit content.
- AI-bot identification probes (~3K) — payloads asking the agent to reveal its model name and version.
- Disruptive / destructive payloads — Google documented examples instructing the agent to delete files on the user’s machine; Forcepoint isolated payloads embedding a complete PayPal transaction or routing AI-mediated payments through a Stripe donation link via a “persuasion amplifier” keyword (
ultrathink).
Practical effectiveness remains moderate. CISPA’s 5,200 controlled experiments across 13 models and four webpage representations show compliance peaking at 8% on plain-text inputs for smaller models, and dropping to 0.2–1.1% when the structural cues of the original HTML are preserved. Non-negligible, in other words, but far from universal.
Why it matters
The 32% growth figure (Google, November 2025 → February 2026) is the headline data point. Three other implications matter for anyone shipping LLM features.
First, the threat surface scales with agent privilege, not with model intelligence. Forcepoint’s framing is exact: “A browser AI that can only summarize is low-risk. An agentic AI that can send emails, execute terminal commands or process payments becomes a high-impact target.” The same payload that produces a funny chatbot reply against a passive reader becomes a wire transfer against an unconstrained agent.
Second, the attacker economy is consolidating. Recurring templates and a test-probe payload point at tooling and reconnaissance rather than one-off pranks. The 2023 era of Ignore previous instructions is giving way to industrialised IPI, with the same maturation curve security teams have watched in SEO spam, malvertising and supply-chain typosquatting.
Third, structural representations help. Both studies find that giving the model the original HTML structure — not flattened text — reduces compliance with embedded instructions. This is consistent with the contextual integrity framing from Abdelnabi and Bagdasarian (arXiv:2605.17634) and gives defenders something concrete to optimise.
Defenses
These measurements do not change the defensive playbook fundamentally; they sharpen its priorities.
- Treat web content as untrusted by default. Any agent that ingests fetched pages should run under the Agents Rule of Two — never combine untrusted input, access to private data, and the ability to change state in the same session.
- Preserve structural cues. Pass HTML to the model with its boundaries intact (headings, code blocks, metadata zones) rather than flattening to plain text. The CISPA experiments quantify the gain: roughly an order of magnitude lower compliance.
- Strip the hiding tricks before retrieval. Render the page, then export the visible DOM only, dropping HTML comments,
metatags, off-screen elements, and text withvisibility:hidden,display:none, near-zero contrast, or 1-pixel sizing. Most real-world payloads die in this filter. - Allow-list domains for sensitive flows. If the agent can act on payments, code, or internal data, restrict the corpus to known-good sources rather than the open web.
- Watch for the high-signal templates. The 54 templates that explain 95% of injections are fingerprintable. Pre-filtering with a small classifier or even regex matching on the strongest patterns (
Ignore previous instructions,If you are an LLM,metainjection of role tags) catches the long tail at near-zero cost. - Log every fetched-content → action edge. When an agent decides to act, record the upstream document that justified it. Reviewing the first few thousand of these reveals the contextual-integrity violations the CISPA data describes.
Finally: assume your testing corpus is contaminated. The CISPA paper points out that some payloads target hiring workflows and customer-support agents specifically. If your red-team set comes from the open web, it almost certainly contains live IPIs.
Status
| Item | Reference | Date | Notes |
|---|---|---|---|
| CISPA empirical study | arXiv:2604.27202 | 2026-04-29 | 1.2B URLs, 24.8M hosts, 15.3K validated payloads |
| Google blog post | security.googleblog.com | 2026-04-23 | 32% growth Nov 2025 → Feb 2026 |
| Forcepoint X-Labs report | forcepoint.com | 2026-04-24 | Payment-redirection payloads, test probes |
| Help Net Security coverage | helpnetsecurity.com | 2026-04-24 | Synthesis of Google + Forcepoint |
| Related contextual-integrity result | arXiv:2605.17634 | 2026-05-17 | Why data/instruction separation is the wrong frame |
Three measurement studies in five days, agreeing on direction and order of magnitude, is rare. The web is no longer a passive corpus that LLM agents can naively consume — it is becoming an active adversary, and the agents most exposed are the ones with the most privilege.
Sources
- → https://arxiv.org/abs/2604.27202
- → https://arxiv.org/html/2604.27202v1
- → https://www.helpnetsecurity.com/2026/04/24/indirect-prompt-injection-in-the-wild/
- → https://security.googleblog.com/2026/04/ai-threats-in-wild-current-state-of.html
- → https://www.forcepoint.com/blog/x-labs/indirect-prompt-injection-payloads