DATA LEAK MEDIUM NEW

Service-side exfiltration via deep research agents

A hidden instruction in a single email made ChatGPT's Deep Research agent leak inbox data from OpenAI's own cloud — no rendering, no user action, invisible to network defenses. Here is the class and how to contain it.

2026-06-20 // 6 min // affects: chatgpt, chatgpt-deep-research, connected-agents, browsing-agents

What is this?

Service-side exfiltration is a class of indirect prompt injection where a connected agent leaks your data from the provider’s cloud rather than from your browser. The reference case is ShadowLeak, disclosed by Radware on September 18, 2025 (reported to OpenAI on June 18, 2025, fixed in early August 2025). Radware showed that a single crafted email, processed by ChatGPT’s Deep Research agent connected to Gmail, could make the agent read sensitive inbox content and send it to an attacker-controlled URL — zero-click, with no visible UI and no user approval.

What makes the class distinct is where the leak happens. Earlier agent data-leak research — Aim Security’s EchoLeak and Zenity’s AgentFlayer — relied on client-side rendering: the exfiltration fired when the user’s browser loaded an attacker-controlled image. ShadowLeak moved the leak into OpenAI’s backend: the agent’s own browsing tool makes the outbound request directly.

How it works

The attack rides the lethal trifecta — private data, untrusted content, and an outbound channel — all present in one connected agent. The untrusted content is an ordinary-looking email whose instructions are hidden in the HTML (tiny fonts, white-on-white text, layout tricks) so the human never sees them, but the agent reads and obeys them when it processes the mailbox.

Radware reported that direct “send this data to my URL” instructions were reliably refused, so the working approach shifted from arguing with the model to abusing the tool-execution layer. The reported bypass techniques were behavioral rather than a code exploit: asserting false authorization, disguising the attacker endpoint as a “compliance” service, mandating retries to defeat non-deterministic safety checks, and instructing the agent to Base64-encode the stolen fields before appending them to the URL, so that the lower execution layer saw only an opaque string instead of raw PII. (We omit the verbatim payload; the point is the pattern, not a reusable recipe.)

Untrusted document (email/file/issue/invite)
   → hidden instructions in HTML/metadata   [REDACTED]
   → agent's browsing tool fires outbound request from provider cloud
   → encoded data leaves via an attacker URL  ← no client render, no log on your network
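
The encoding step matters because a check that only pattern-matches raw PII in an outbound URL sees nothing once the fields are Base64-encoded. The following is a minimal defensive sketch in Python, not a description of OpenAI's or Radware's tooling. It assumes a hypothetical hook that can inspect a URL before the agent's tool fetches it. The function names and the email regex (standing in for a real PII detector) are illustrative assumptions.

```python
import base64
import binascii
import re
from typing import List, Optional
from urllib.parse import unquote, urlsplit

# Illustrative only: an email matcher stands in for a real PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Runs of 16+ standard or URL-safe Base64 characters, with optional padding.
B64_RE = re.compile(r"[A-Za-z0-9+/_-]{16,}={0,2}")


def _try_b64decode(token: str) -> Optional[str]:
    """Return the decoded text if token is Base64 for valid UTF-8, else None."""
    normalized = token.rstrip("=").replace("-", "+").replace("_", "/")
    normalized += "=" * (-len(normalized) % 4)  # restore padding
    try:
        return base64.b64decode(normalized, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None


def find_encoded_pii(url: str) -> List[str]:
    """Scan path segments and query values, both raw and Base64-decoded."""
    parts = urlsplit(url)
    candidates = [unquote(seg) for seg in parts.path.split("/") if seg]
    for pair in parts.query.split("&"):
        if pair:
            # Split by hand: parse_qsl would turn '+' into a space.
            candidates.append(unquote(pair.partition("=")[2]))
    hits: List[str] = []
    for text in candidates:
        hits += EMAIL_RE.findall(text)
        for token in B64_RE.findall(text):
            decoded = _try_b64decode(token)
            if decoded:
                hits += EMAIL_RE.findall(decoded)
    return hits
```

A check like this only catches the one encoding it knows about. An injected instruction can ask for hex, a custom cipher, or chunking across requests, which is why the later sections treat removing or narrowing the outbound channel as the stronger control.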

Why it matters

Service-side leaks are harder to see and harder to stop than client-side ones. The exfiltration originates inside the provider’s network, so a secure web gateway, endpoint agent, or browser policy on your side never sees the request. Nothing is rendered, so the user has no visual cue. And where client-side image leaks are often constrained to a domain allowlist (OpenAI’s url_safe mechanism), Radware observed no comparable restriction on the URLs the agent could fetch directly, which leaves a far broader set of exfiltration sinks.

The bigger lesson is generality: any connector that feeds text into an agent is an injection vector. Radware notes the same pattern extends to Drive, SharePoint, Outlook and Google Calendar invites, Teams messages, GitHub READMEs and issues, and Notion and Linear records. The agent becomes a trusted proxy carrying data out under the guise of normal tool use.

Defenses

Content sanitization before ingestion helps but is not sufficient: normalize and strip invisible CSS, obfuscated characters and suspicious HTML from documents before the agent reads them. It will not stop a well-crafted instruction that survives normalization.
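
As a rough illustration of that sanitization step, the sketch below uses only Python's standard library to keep the text a human would plausibly see and drop the rest. The style heuristics, the class name and the function name are assumptions made for this example. The heuristics are deliberately crude: they only inspect inline styles, and they treat any white text as hidden without checking the background.

```python
import re
import unicodedata
from html.parser import HTMLParser

# Inline-style hints that text is hidden from a human reader. Heuristic and incomplete.
HIDDEN_STYLE_RE = re.compile(
    r"display\s*:\s*none"
    r"|visibility\s*:\s*hidden"
    r"|opacity\s*:\s*0(?:\.0+)?\s*(?:;|$)"
    r"|font-size\s*:\s*[01](?:\.\d+)?\s*(?:px|pt)?\s*(?:;|$)"
    r"|(?<![-\w])color\s*:\s*(?:#fff(?:fff)?|white)\b",
    re.IGNORECASE,
)
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
SKIP_TAGS = {"script", "style", "head", "title"}


class VisibleTextExtractor(HTMLParser):
    """Collect text that is not inside a hidden or non-content element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._stack = []        # (tag, starts_hidden_subtree) per open element
        self._hidden_depth = 0  # number of open hidden ancestors
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:  # void elements never get an end tag
            return
        attr_map = dict(attrs)
        style = attr_map.get("style") or ""
        hidden = (tag in SKIP_TAGS or "hidden" in attr_map
                  or bool(HIDDEN_STYLE_RE.search(style)))
        self._stack.append((tag, hidden))
        self._hidden_depth += hidden

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags in between.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                self._hidden_depth -= sum(h for _, h in self._stack[i:])
                del self._stack[i:]
                break

    def handle_data(self, data):
        if self._hidden_depth == 0:
            self.chunks.append(data)


def sanitize_html(html: str) -> str:
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    text = unicodedata.normalize("NFKC", " ".join(parser.chunks))
    # Drop zero-width and other invisible format characters (Unicode category Cf).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    return re.sub(r"\s+", " ", text).strip()
```

An instruction written in plain, visible text passes straight through a filter like this, which is the limitation noted above.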

The durable mitigations attack the trifecta’s third leg and the agent’s behavior:

Cut the outbound channel. On June 4, 2026, OpenAI extended Lockdown Mode (first introduced on February 13, 2026) to personal and self-serve Business ChatGPT accounts. It deterministically disables Deep Research, Agent Mode, live web browsing (browsing becomes cache-only), web image retrieval, Canvas networking, live connectors and file downloads, explicitly to remove the paths a successful injection uses to push data out. See our note on OpenAI Lockdown Mode.
Egress allowlisting. Constrain the domains an agent’s browsing/tool layer may reach to a small approved set, and treat any direct fetch tool as high-risk.
Intent monitoring. Radware’s recommended control is continuous behavior monitoring: compare the agent’s actions and inferred intent against the user’s original goal, and block deviations in real time.
Connector hygiene. Grant the narrowest scopes, isolate sensitive connectors, and log connector reads so an exfiltration attempt leaves a trace you control.
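
For teams that run their own agent tooling, the egress allowlisting item above can be sketched as a gate in front of any fetch tool. This is a minimal Python example under stated assumptions: the host names and the function name are hypothetical, and a real deployment would load the approved set from policy and enforce it at the network layer as well.

```python
from urllib.parse import urlsplit

# Hypothetical approved set, used here for illustration only.
ALLOWED_HOSTS = frozenset({"api.example.com", "intranet.example.com"})


def is_egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests to an exact-match host on the default port."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # already lowercased by urllib
        port = parts.port      # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme != "https" or host is None:
        return False
    # Reject userinfo tricks such as https://api.example.com@evil.example/
    if parts.username is not None or parts.password is not None:
        return False
    if port not in (None, 443):
        return False
    # Exact match, not suffix match, so api.example.com.evil.example fails.
    return host.rstrip(".") in allowed_hosts
```

Exact host matching is a deliberate choice here: suffix or substring matching is easy to defeat with a look-alike domain. An allowlisted host that accepts arbitrary user-supplied data can still act as a sink, so the approved set should stay small.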

This note is the attack side of a defense we have already covered; read together, the two map directly onto the lethal trifecta framing.

Status

Item	State	Date
ShadowLeak (ChatGPT Deep Research, Gmail)	Fixed by OpenAI	Early Aug 2025
Service-side exfiltration class	Ongoing across connectors	2025–2026
OpenAI Lockdown Mode (cuts outbound leg)	Rolled to personal/Business	Jun 4, 2026
Client-side leaks (EchoLeak, AgentFlayer)	Prior, patched	2025

Service-side exfiltration is not a single bug to patch once; it is a structural property of connected, autonomous agents. Until intent-level monitoring and strict egress control are standard, the safest posture for sensitive data is to deny the agent an outbound channel it does not strictly need.