PROMPT INJECTION MEDIUM NEW

Web chatbot plugins: how insecure widgets amplify prompt injection

An IEEE S&P 2026 study of 17 chatbot plugins on 10,000+ sites found forgeable conversation histories (3-8x stronger injections) and web-scraping tools that mix trusted and untrusted content.

2026-06-11 // 6 min affects: llm-chatbot-plugins, commercial-llm-apis, customer-service-chatbots

What is this?

Most prompt-injection research targets cutting-edge assistants — copilots, coding agents, RAG pipelines. But the most common LLM application on the web is far simpler: the customer-service chatbot bubble in the corner of a retail or SaaS site. A paper titled “When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins,” submitted to arXiv on 8 November 2025 and accepted to IEEE S&P 2026, is the first large-scale look at that surface. Authored by Yigitcan Kaya, Anton Landerer, Stijn Pletinckx, Michelle Zimmermann, Christopher Kruegel and Giovanni Vigna (UC Santa Barbara), it studies 17 third-party chatbot plugins deployed on more than 10,000 public websites and finds that the plumbing around the LLM — not the model itself — is where the security goes wrong.

How it works

These plugins act as intermediaries between a non-expert site builder and a commercial LLM API. The study documents two structural weaknesses.

The first is conversation-history integrity. In a normal chat, each request to the LLM resends the prior turns so the model has context. The researchers found that 8 plugins (used by roughly 8,000 of the studied sites) trust the conversation history sent by the browser without verifying it server-side. An attacker who controls their own session can edit that payload before it leaves the browser — forging earlier assistant replies, and even fake system messages that the model treats as authoritative. With a fabricated history asserting that the assistant already agreed to drop its rules, a direct prompt injection becomes far more effective: the paper measured a 3 to 8x increase in the success of eliciting unintended behavior such as code generation.

The second is untrusted content mixing. 15 of the 17 plugins offer tools — web-scraping in particular — to enrich the chatbot’s context with site content. But they make no distinction between the trusted content a site owner controls (product descriptions, policies) and untrusted third-party content (customer reviews, Q&A, user comments). Anything scraped lands in the prompt with equal authority, which is the textbook setup for indirect prompt injection: a malicious review can carry instructions the chatbot then follows. The authors found ~13% of e-commerce sites studied had already wired their chatbots to third-party content, exposing the surface in the wild before any attacker even shows up.

Why it matters

The takeaway is that LLM safeguards do not survive bad integration. The underlying commercial models ship with alignment and refusal training, but an insecure plugin hands attackers the levers — forged history, unsegmented context — that route around those defenses. Because the same handful of plugins are reused across thousands of sites, a single insecure pattern scales into a long tail of vulnerable deployments run by owners who never wrote a line of the integration. This is the unglamorous middle of the market: not a flagship agent, but the chatbot most ordinary users actually touch.

Defenses

Enforce conversation-history integrity on the server. Never trust message history (and especially system or assistant roles) replayed from the client. Reconstruct or authenticate the session server-side, sign or store the canonical transcript, and reject client-supplied system messages outright. This single control removes the 3-8x amplification the paper measured.

Separate trusted from untrusted content in the prompt. Treat scraped reviews, comments and any third-party text as data, not instructions. Fence it with clear delimiters, label its provenance, and — where the plugin allows — apply spotlighting or an input filter before it reaches the model. Owner-controlled content and visitor-controlled content must not share the same authority level. This maps directly to OWASP LLM01: Prompt Injection guidance.

Constrain the blast radius. Give the chatbot the least capability it needs: no tool calls, code execution, or sensitive data access it doesn’t require for support. Add output filtering for high-risk responses (code, links, commands), and monitor for anomalous tool invocations rather than relying on the model to refuse.

If you operate a site with one of these widgets, audit whether your chatbot scrapes user-generated content, and ask your plugin vendor how conversation history is validated. The fix is in the integration layer, which is exactly where most site owners assume the vendor has it covered.

Status

Item	Detail
Source	arXiv:2511.05797, “When AI Meets the Web,” submitted 2025-11-08; at IEEE S&P 2026
Scope	17 third-party chatbot plugins across 10,000+ public websites
Finding 1	8 plugins (~8,000 sites) fail conversation-history integrity → 3-8x stronger direct injection
Finding 2	15 plugins mix trusted/untrusted content via tools → indirect injection; ~13% of e-commerce sites already exposed
Root cause	Insecure integration practices, not the underlying LLM
Action	Server-side history integrity; trusted/untrusted content separation; least-capability chatbots