AGENTS

(151)

151 hack(s).

How account-synced preferences can hijack Claude Desktop's local tools

Pentera showed an attacker with account access can hide instructions in Claude Desktop's synced Personal Preferences to drive its local tools into running attacker commands.

2026-07-17//6 min

AGENTS MEDIUM NEW

When the database is the security boundary: attacking LLM data agents

A June 2026 study attacks LLM-driven analytical agents across six systems and finds that neither model safety nor classic database controls hold on their own.

2026-07-17//6 min

AGENTS MEDIUM NEW

Agentic abstention: do AI agents know when not to act?

A new benchmark tests whether tool-using agents recognize when NOT to act. The strongest frontier agent scores only 59.5% — and the ability barely improves as models get more capable.

2026-07-17//6 min

AGENTS MEDIUM NEW

Agent collusion: covert channels let AI agents coordinate past monitors

Two 2026 studies show LLM agents can build covert side-channels to collude past plain-text monitors — and that ordinary tool use now makes those channels practically undetectable.

2026-07-17//6 min

AGENTS MEDIUM NEW

The observability boundary: why per-agent monitors miss distributed backdoors

A July 2026 paper formalises why runtime monitors that check each agent step in isolation cannot catch a backdoor split across agents — and shows detection only returns when you change what the monitor sees.

2026-07-17//7 min

AGENTS CRITICAL NEW

Cline's Hub dashboard: loopback mistaken for authentication, again

A July 8, 2026 advisory shows the Cline Hub dashboard exposes a local WebSocket with no Origin check and a shared secret disabled by default — the second cross-origin WebSocket flaw in Cline in two months.

2026-07-16//6 min

AGENTS CRITICAL NEW

Langroid's Neo4j agent runs LLM-written Cypher unchecked — the SQL bug's twin

Langroid's graph-database agent hands model-generated Cypher straight to Neo4j with no validation. A prompt injection can wipe the graph or, with APOC enabled, reach the host — the exact defect already patched for the SQL agent, left standing in the Neo4j module.

2026-07-16//6 min

AGENTS MEDIUM NEW

Silent policy violations: agents that break the rules and report success

A July 2026 paper shows tool-using agents routinely make policy-forbidden writes that raise no error and pass self-checks — and that deterministic pre-execution gates catch them.

2026-07-16//6 min

AGENTS MEDIUM NEW

Sleeper triggers in photos: poisoning the memory of recommender agents

An April 2026 paper shows a photo uploaded to an agentic recommender can hide a dormant trigger that later hijacks the agent's planning — no prompt injection needed. A dual-process defense cuts the hit rate from ~85% to ~10%.

2026-07-16//6 min

AGENTS MEDIUM NEW

DeepJack: hidden arguments in Cursor's MCP-install deeplink reach RCE

A crafted cursor:// link installs an attacker-controlled MCP server whose real command is scrolled off-screen in the install dialog, reaching unsandboxed code execution after one click.

2026-07-16//6 min

AGENTS MEDIUM NEW

Hidden tool-metadata payloads in MCP: the approval-view fidelity gap

A July 2026 study shows invisible Unicode TAG characters can smuggle instructions into MCP tool metadata — present in the model's context, absent from the approval dialog a user sees.

2026-07-16//6 min

AGENTS MEDIUM NEW

When agents ignore a skill's own preconditions: the SLBench study

A July 2026 benchmark tests whether LLM agents respect the logical relations inside skill files — preconditions and constraints — turning skill dependencies into executable safety tests.

2026-07-15//6 min

AGENTS CRITICAL NEW

When the agent runs its own code: PraisonAI's CodeAgent turns prompt injection into RCE

Disclosed July 11, 2026, a maximum-severity flaw in PraisonAI runs LLM-generated Python with no AST checks, import limits or sandbox — so a crafted prompt becomes arbitrary code on the host.

2026-07-14//6 min

AGENTS MEDIUM NEW

Benign subtasks, harmful plan: the plan-generation gap in AI agents

An April 2026 paper shows one innocuous-looking request can make an LLM orchestrator plan steps that each pass safety checks but jointly break policy — and proves per-subtask filters can't catch it.

2026-07-14//6 min

AGENTS MEDIUM NEW

Capability gates aren't authorization in LLM agent frameworks

A June 2026 audit of LangChain, LlamaIndex and the Stripe Agent Toolkit finds none re-checks a tool call's actual arguments before running it — so an injected payout executes.

2026-07-13//6 min

AGENTS CRITICAL NEW

GhostApproval: the coding-agent approval prompt that hides the real target

Wiz Research disclosed on July 8, 2026 a trust-boundary flaw in six AI coding assistants: a malicious repo uses a symlink so an approved edit to a harmless file silently writes to ~/.ssh/authorized_keys.

2026-07-13//6 min

AGENTS MEDIUM NEW

Operational reframing: the portable risk signal in multi-agent LLM safety

A July 2026 arXiv study decomposes 'pipeline' safety failures in planner-executor agents, finding it's the rewording of harm as operational work — not the architecture — that travels across models, and that a skeptical executor prompt blunts it.

2026-07-13//7 min

AGENTS MEDIUM NEW

VEXAIoT: LLM agents that chain IoT recon to exploitation in the lab

A July 2026 paper wires two LLM agents into an IoT attack pipeline — reconnaissance then exploitation — reaching a 95% success rate across deliberately vulnerable testbeds. What it means for defenders.

2026-07-13//6 min

AGENTS CRITICAL NEW

WriteOut: when an AI sandbox forwards the user's session cookie

A critical, now-patched flaw in Writer's enterprise AI platform let a single agent preview link hijack any logged-in user's account across organizations. The root cause: a managed sandbox that received the victim's session cookie.

2026-07-13//6 min

AGENTS MEDIUM NEW

Cowork sandbox escape: a signed RPC that trusted client privilege flags

Researchers chained DLL sideloading and an over-trusted named-pipe RPC to reach root inside Claude Cowork's Linux sandbox. Anthropic calls local code execution a prerequisite, not a flaw.

2026-07-10//6 min

AGENTS MEDIUM NEW

Asking an AI agent to review untrusted code can run the attacker's code

AI Now Institute's Friendly Fire brief shows that pointing an auto-mode coding agent at a hostile repo to security-review it lets injected repo text steer the agent into executing attacker code on the host.

2026-07-10//6 min

AGENTS MEDIUM NEW

GhostWriter: poisoning a personal AI agent's memory through an ordinary email

A July 2026 paper shows an attacker can plant a hidden instruction in a routine email, get a personal assistant agent to store it as memory, and have it act on that instruction days later — with a defense that stops it.

2026-07-10//7 min

AGENTS MEDIUM NEW

Intent legitimation: when a personal agent's own memory erodes its safety

A January 2026 study shows benign, truthful memories in a personalized AI assistant can bias its intent inference and make it answer harmful requests it would otherwise refuse — no attack required.

2026-07-10//7 min

AGENTS CRITICAL NEW

An incomplete eval() sandbox in Langroid lets a prompt run host code

Langroid's earlier fix for a TableChatAgent code-injection flaw left an opt-in path where an eval() sandbox forgets to strip Python built-ins — reopening unauthenticated remote code execution.

2026-07-10//6 min

AGENTS MEDIUM NEW

How one edit permission could hijack every Dialogflow CX chatbot in a project

Varonis' Rogue Agent finding shows a single content-edit permission on one Dialogflow CX agent was really a code-execution right over a shared, invisible runtime — and every chatbot in the Google Cloud project.

2026-07-10//6 min

AGENTS MEDIUM NEW

When computer-use agents click stale pixels: the screenshot-to-action race

A screenshot is a check; a click is a use. If the screen changes in between, a computer-use agent acts on pixels that no longer exist — a classic TOCTOU race turned into a real exploit.

2026-07-08//6 min

AGENTS MEDIUM NEW

How adversarial feed curation steers LLM agent decisions

A June 2026 study shows that choosing which benign posts an LLM agent reads before it acts can tip its decisions — with no injected instruction and no payload a content filter could catch.

2026-07-08//6 min

AGENTS MEDIUM NEW

The enterprise MCP rewrite moves security from the protocol to your developers

The MCP 2026-07-28 specification removes protocol-level session hijacking, unsolicited prompts and weak auth — but hands new attack surfaces (state tampering, unsigned metadata, header desync, app XSS, task DoS) to the developers who build on it.

2026-07-08//6 min

AGENTS CRITICAL NEW

n8n's recurring RCE surface: an automation hub that holds every credential

A June 2026 wave of critical flaws in the n8n workflow platform — sandbox escapes, prototype pollution, expression evaluation — shows why an LLM-automation hub that stores every credential is a single point of failure.

2026-07-08//7 min

AGENTS CRITICAL NEW

Agentic red-team tools can be hijacked by their own targets

A June 2026 study audits 12 agentic offensive-security tools and shows a target can turn the tables — stealing API keys and running code on the operator's own machine, even inside a sandbox.

2026-07-08//7 min

AGENTS MEDIUM NEW

Forged reasoning attacks: poisoning an agent's own decision logs

A July 2026 paper shows attackers can forge an agent's remembered reasoning — making it believe safety checks already ran — and pairs the attack with a layered detection defense.

2026-07-07//7 min

AGENTS MEDIUM NEW

Infinite agentic loops: detecting unbounded agent feedback paths

A July 2026 study defines Infinite Agentic Loops and scans 6,549 agent repos, confirming 68 unbounded feedback paths that can drive cost exhaustion, model DoS and runaway context growth.

2026-07-07//6 min

AGENTS MEDIUM NEW

WebMCP tool surface poisoning: hijacking agents mid-session

A June 2026 paper shows a compromised third-party script can swap or reframe the tools a WebMCP agent sees during a live session, driving malicious tool calls at up to 100% success.

2026-07-07//7 min

AGENTS MEDIUM NEW

AgentCanary: a security benchmark for agents in real executable environments

A June 2026 framework from Ant Group tests 12 LLM agents in real, stateful tool environments and finds they often fail to recognize the attacks they face — especially poisoned skills and long-horizon chains.

2026-07-06//6 min

AGENTS MEDIUM NEW

Cross-model prompt laundering: a refusal that doesn't survive the handoff

In multi-agent stacks, one model's output becomes another model's user turn. A July 2026 finding shows the second model never learns the first already refused — so it complies.

2026-07-06//6 min

AGENTS MEDIUM NEW

FlowSteer: steering multi-agent workflow formation with a single prompt

A May 2026 paper shows a prompt-only attacker can bias how a planner-executor multi-agent system builds its workflow, lifting malicious success by up to 55% before any agent runs.

2026-07-06//6 min

AGENTS MEDIUM NEW

The Misattribution Gap: memory poisoning that gets blamed on the model

A single policy-formatted document, uploaded once to an agent's shared memory, produces violations that look exactly like model misalignment — so teams retrain the model and leave the attack untouched.

2026-07-06//6 min

AGENTS MEDIUM NEW

STAC: chaining benign tool calls to jailbreak AI agents

A research framework shows that a sequence of individually harmless tool calls can steer an agent into a harmful final action — bypassing frontier safety with over 90% success.

2026-07-06//6 min

AGENTS MEDIUM NEW

The visual confused deputy: when a computer-using agent clicks the wrong button

A March 2026 paper formalizes CUA perception failures as a security class. An 8-line screenshot swap can turn a routine click into privilege escalation — and a guardrail outside the agent's eyes helps.

2026-07-06//7 min

AGENTS CRITICAL NEW

vm2 sandbox escapes turn agent prompt injection into host RCE

A 2026 wave of escapes in vm2 — the Node.js library many agent frameworks use to run LLM-generated JavaScript — lets a prompt injection break out of the sandbox and run commands on the host.

2026-07-06//7 min

AGENTS LOW NEW

Claude Cowork sandbox: a disputed root escape and the local-code-execution debate

Researchers published a chain on 1 July 2026 that reaches root inside Claude Cowork's Linux sandbox and strips its network limits. Anthropic declines to call it a vulnerability because it needs prior host access.

2026-07-05//6 min

AGENTS CRITICAL NEW

Cline's Kanban server: a cross-origin WebSocket hijack path to RCE

A May 2026 disclosure shows Cline's local Kanban WebSocket server ships with no origin check — any website a developer visits can read the workspace and inject commands into a running agent.

2026-07-05//6 min

AGENTS MEDIUM NEW

Runtime governance for AI agents: the five-plane reference architecture

A June 2026 paper argues agent risk now lives inside the workflow, not at the data boundary, and proposes a five-plane architecture: adjudicate intent once, enforce it across four planes.

2026-07-05//7 min

AGENTS MEDIUM NEW

How context compaction silently drops an agent's safety rules

A June 2026 benchmark shows that summarizing an agent's history to save tokens can quietly delete in-context policy rules, pushing tool-call violations from 0% to as high as 59%.

2026-07-05//7 min

AGENTS MEDIUM NEW

Long-horizon agents need propagation-aware security, not single-step defenses

A June 2026 paper maps how attacks in long-horizon AI agents propagate across memory, tools and planning — and persist over many steps, where single-step defenses fail.

2026-07-05//6 min

AGENTS MEDIUM NEW

Multi-agent code generation: when injected instructions amplify across agents

In agent teams that write software, a single injected instruction doesn't fade across hops. 2026 research shows trusted intermediaries can reformat it and make it stronger, reaching high jailbreak rates.

2026-07-05//6 min

AGENTS MEDIUM NEW

BioShocking: framing a task as a game makes AI browsers leak credentials

LayerX's BioShocking technique convinces agentic browsers they are inside a game, so they apply game logic instead of safety logic — and hand over user credentials.

2026-07-04//6 min

AGENTS CRITICAL NEW

mcp-pinot: an unauthenticated MCP server as a confused deputy

A June 2026 disclosure shows an Apache Pinot MCP server that bound to 0.0.0.0 with OAuth off, letting any network-adjacent caller run its privileged database tools.

2026-07-04//6 min

AGENTS MEDIUM NEW

Poisoning what a web agent remembers: triggered attacks on multimodal memory

A June 2026 paper shows web agents that store past observations in graph memory can be poisoned so a later visual trigger recalls attacker content and steers the agent — persistent and reusable across goals.

2026-07-04//6 min

AGENTS MEDIUM NEW

One compromised robot can cascade unsafe actions across an LLM robot team

A first study of LLM-controlled multi-robot teams shows that manipulating a single entry robot can propagate unsafe actions to the whole fleet through inter-robot communication.

2026-07-04//6 min

AGENTS MEDIUM NEW

OEP: poisoning self-evolving agents with clean edge cases

A May 2026 study shows a low-privilege attacker can corrupt a self-evolving agent's learned rules with benign, locally correct edge cases — over 50% attack success on GPT-4o, and robust against current defenses.

2026-07-04//6 min

AGENTS CRITICAL NEW

When the pentest bites back: attacking the tools that red-team for you

A June 2026 study shows autonomous offensive-security agents can be turned against their operators. A malicious target stages a fake tool the agent runs itself — no prompt injection needed — for near-deterministic code execution.

2026-07-03//6 min

AGENTS CRITICAL NEW

IDEsaster: when base IDE features become agent RCE primitives

Ari Marzouk disclosed a vulnerability class where prompt injection drives AI coding agents to weaponize the underlying editor's own legacy features — reaching data exfiltration and RCE across nearly every AI IDE.

2026-07-03//6 min

AGENTS MEDIUM NEW

Lingering authority: revoking coding-agent capabilities after a task closes

A June 2026 study names a quiet failure mode: coding agents keep tool authority long after the subgoal that needed it closed. A reference monitor that revokes those capabilities stops stale-write abuse.

2026-07-03//6 min

AGENTS MEDIUM NEW

MOSAIC-Bench: coding agents build exploitable code from innocuous tickets

A May 2026 benchmark shows coding agents pass per-prompt safety checks yet assemble exploitable code when a malicious goal is split into routine engineering tickets — and reviewer agents wave it through.

2026-07-03//6 min

AGENTS MEDIUM NEW

When agents move from reading to acting: MCP tool-description poisoning

Microsoft Incident Response (June 30, 2026) shows how a silently edited MCP tool description can steer an action-taking agent into exfiltrating data — no prompt, no credential, no user involvement.

2026-07-03//6 min

AGENTS CRITICAL NEW

Amazon Q auto-ran a repo's MCP config, exposing developer cloud keys

Wiz disclosed (June 26, 2026) that Amazon Q Developer auto-launched MCP servers from a repo config file with no consent, so opening a malicious project could run code and steal cloud credentials.

2026-07-02//6 min

AGENTS CRITICAL NEW

DuneSlide: prompt injection escapes Cursor's terminal sandbox to RCE

Cato AI Labs disclosed two critical flaws in Cursor's auto-run sandbox on July 1, 2026. A single poisoned prompt overwrites the sandbox helper binary and turns a locked box into full code execution — zero click.

2026-07-02//7 min

AGENTS CRITICAL NEW

GuardFall: coding-agent command guards inspect text the shell rewrites

Adversa AI's GuardFall (June 30, 2026) bypassed the safety filter in 10 of 11 open-source coding agents by exploiting a decades-old gap: the guard checks raw command text while bash expands and rewrites it before running.

2026-07-02//6 min

AGENTS MEDIUM NEW

OWASP ASI03: when an agent inherits more identity than it should

Identity & Privilege Abuse is the #3 risk in OWASP's Top 10 for Agentic Applications. Agents rarely get their own identity — they inherit yours, accumulate permissions, and hold tokens that outlive the task.

2026-06-29//7 min

AGENTS MEDIUM NEW

Agent communication-graph metadata leaks the workflow before it runs

A June 5, 2026 arXiv paper shows that even with encrypted payloads, the A2A/MCP communication graph lets a passive observer predict an agent workflow's task class from its opening — and act before it completes.

2026-06-22//6 min

AGENTS MEDIUM NEW

Over-privileged tool selection: agents reach for stronger tools than the task needs

A June 2026 paper and its benchmark ToolPrivBench show that mainstream LLM agents routinely pick higher-privilege tools when a weaker one would do — and that safety alignment does not fix it.

2026-06-22//6 min

AGENTS MEDIUM NEW

Agent-Inflicted Damage: when AI agents wreck production with no attacker

Cyera's May 2026 study of 7,200+ AI incidents isolates 344 cases of agent-inflicted damage — 188 with no external attacker — where autonomous agents deleted databases, leaked secrets and burned budgets.

2026-06-21//7 min

AGENTS MEDIUM NEW

WAAA: how agentic browsers resurrect classic web attacks

A May 2026 paper builds the first web-focused threat model for agentic browsers and shows that 10 long-mitigated web attacks come back — often amplified — because the agent is a confused deputy that cannot tell a task step from a web trap.

2026-06-21//6 min

AGENTS MEDIUM NEW

AutoJack: a browsing agent turns a malicious webpage into host RCE

Microsoft's June 18, 2026 AutoJack research shows a web-browsing AI agent inheriting localhost identity to reach a local MCP WebSocket and spawn arbitrary processes on the host.

2026-06-21//6 min

AGENTS MEDIUM NEW

CVE-2026-32211: missing authentication in Azure MCP Server

Microsoft disclosed CVE-2026-32211 on 2 April 2026 — a missing-authentication flaw in Azure MCP Server that lets an unauthenticated attacker disclose information over the network. Microsoft scored it 9.1; NVD, 7.5.

2026-06-21//6 min

AGENTS CRITICAL NEW

CVE-2026-0755: command injection and file theft in gemini-mcp-tool

A June 18, 2026 advisory details how the popular gemini-mcp-tool let untrusted prompt input reach the shell and the Gemini CLI @file parser — CVSS 9.8 RCE and arbitrary file exfiltration, fixed in 1.1.6.

2026-06-21//6 min

AGENTS MEDIUM NEW

Overeager Coding Agents: Out-of-Scope Actions on Benign Tasks

Two May 2026 benchmarks measure coding agents that overstep on benign requests — deleting files, wiping credentials — and find the agent framework, not the model, drives the risk.

2026-06-21//6 min

AGENTS MEDIUM NEW

Sleeper Memory Poisoning: dormant attacks on stateful LLM agents

A May 2026 paper shows attackers can plant fabricated 'memories' through a document or webpage that lie dormant, then steer an assistant's actions across many later sessions.

2026-06-21//6 min

AGENTS CRITICAL NEW

Tool selection hijacking: forcing an agent to pick the attacker's tool

An NDSS 2026 attack and an April 2026 IBM paper target the same blind spot: the step where an agent chooses which tool to call. Poison the catalog and the agent picks yours, with 70–100% success.

2026-06-21//6 min

AGENTS MEDIUM NEW

Stored prompt injection: when an injection outlives the session

A June 2026 arXiv paper reframes prompt injection as a stored, cross-session problem: once adversarial text lands in an agent's persistent state, it can steer executions long after the attacker is gone.

2026-06-20//6 min

AGENTS MEDIUM NEW

MemPoison: backdooring agent memory through ordinary conversation

A May 2026 arXiv paper plants a triggerable backdoor in an LLM agent's long-term memory just by chatting with it — and is engineered to survive the selective extraction and rewriting stages meant to filter poisoned content.

2026-06-20//6 min

AGENTS MEDIUM NEW

NRT-Bench: multi-turn red-teaming of LLM agents that run a plant

A June 18, 2026 benchmark puts LLM operator agents in a simulated nuclear control room. Adaptive multi-turn attacks pushed the team past a safety limit in 8.7-12.1% of sessions — and the failures barely overlap across models.

2026-06-20//6 min

AGENTS MEDIUM NEW

Vertex AI 'Double Agents': over-privileged service agents as a cloud escalation path

Unit 42 showed (31 March 2026) that a Vertex AI Agent Engine deployment exposes an over-scoped service-agent credential via the metadata service — turning a misconfigured agent into a path to read every bucket in the project.

2026-06-20//6 min

AGENTS MEDIUM NEW

Agent libOS: make the runtime, not the tool wrapper, the authority boundary

A June 2, 2026 arXiv paper argues most agent frameworks conflate tool visibility with resource authority — and proposes a library-OS runtime where capability checks live at primitive boundaries, not in tool wrappers.

2026-06-19//6 min

AGENTS MEDIUM NEW

Authority confusion: why tool-using agents misuse their own access

A May 2026 paper names a failure mode distinct from prompt injection: untrusted data should inform an agent's reasoning but never authorize side effects. AIRGuard enforces that line at action time.

2026-06-19//7 min

AGENTS CRITICAL NEW

CVE-2026-26268: Cursor's agent turns a git checkout into code execution

A malicious repo hides a bare Git repository with an automatic hook. When Cursor's AI agent runs git checkout to 'explain the codebase', the hook fires — arbitrary code execution on the developer's machine, no approval prompt. Patched in Cursor 2.5.

2026-06-19//6 min

AGENTS MEDIUM NEW

MCP Go SDK CSRF: a web page can trigger your local tools (CVE-2026-33252)

The official MCP Go SDK accepted cross-site browser POSTs without checking the Origin header. On an unauthenticated local server, any website you visit could invoke your tools. Patched in 1.4.1.

2026-06-19//6 min

AGENTS CRITICAL NEW

CVE-2026-26030: prompt injection becomes RCE in Microsoft Semantic Kernel

Microsoft's AI Red Team showed two Semantic Kernel flaws that turn a single injected prompt into host code execution. The lesson: any tool parameter the model can influence is attacker-controlled input. Patched May 7, 2026.

2026-06-19//6 min

AGENTS MEDIUM NEW

SkillAttack: automated red-teaming finds exploits in agent skills

An April 2026 paper, SkillAttack, reframes exploit discovery as a path-search problem and shows even well-intentioned agent skills are reachable — up to 0.93 attack success on adversarial skills.

2026-06-19//6 min

AGENTS MEDIUM NEW

User-mediated attacks: when the user is the injection channel

A January 2026 study of 12 commercial agents shows attackers don't need to touch the agent. They trick a benign user into forwarding poisoned content — which the instruction hierarchy then promotes to trusted user intent. Default bypass rates topped 92%.

2026-06-19//6 min

AGENTS MEDIUM NEW

Browser agents leak their model identity through how they click

A May 14, 2026 paper shows the on-page actions of an LLM browser agent fingerprint the underlying model with up to 96% accuracy across 14 frontier models — no spoofable headers needed.

2026-06-18//6 min

AGENTS MEDIUM NEW

AI Agent Traps: DeepMind's six-category map of how the web hijacks agents

Google DeepMind's 'AI Agent Traps' paper (SSRN, late March 2026) gives the first systematic taxonomy of adversarial web content that targets an agent's perception, reasoning, memory, action, multi-agent dynamics, and human overseer.

2026-06-18//7 min

AGENTS MEDIUM NEW

SearchGEO: making LLM search agents endorse attacker-published pages

A June 15, 2026 arXiv paper measures how attacker-controlled web content gets turned into an agent's endorsed recommendation — attack success ranges from 0% to 31.4% depending on the backend model.

2026-06-18//6 min

AGENTS MEDIUM NEW

ShadowMerge: poisoning graph-based agent memory by colliding relations

A May 2026 paper poisons graph-based agent memory with relations that share a real anchor and channel but carry a conflicting value — reaching 93.8% attack success on Mem0 while input-side filters miss it.

2026-06-18//5 min

AGENTS MEDIUM NEW

Zombie agents: when a self-evolving LLM agent stays compromised across sessions

A one-time indirect injection observed during a benign session can be written to an agent's long-term memory and later replayed as instruction — turning a transient prompt into persistent control. Attack paper dated February 2026, defense (CAMS) May 2026.

2026-06-18//7 min

AGENTS CRITICAL NEW

AI coding agents: attackers go for the credential, not the model

Six 2026 exploits against Codex, Claude Code, Copilot and Vertex AI all bypassed model-level defenses and reached the same target — the agent's runtime credentials. The root cause is an identity governance gap, not a prompt problem.

2026-06-17//6 min

AGENTS MEDIUM NEW

FragFuse: fragmented queries that bypass LLM agent access control

A June 14, 2026 arXiv paper shows a banned request can be split into benign fragments, parked in an agent's long-term memory, then fused at retrieval time — bypassing access controls 86.3% of the time.

2026-06-17//6 min

AGENTS MEDIUM NEW

Reasoning-extension DoS: when the AI guardrail becomes the attack surface

A June 2026 paper shows a single poisoned document can trap reasoning-based AI guardrails in extended thinking loops, slowing shared agent workflows by up to 148x. The target is availability, not integrity.

2026-06-17//6 min

AGENTS CRITICAL NEW

LangGraph checkpointers: from SQL injection to RCE on self-hosted agents

Check Point Research chained a SQL injection in LangGraph's checkpointer with an unsafe msgpack deserialization to reach remote code execution. Disclosed June 11, 2026; all three CVEs are patched.

2026-06-17//7 min

AGENTS MEDIUM NEW

Termination poisoning: trapping LLM agents in unbounded loops

A May 2026 arXiv paper shows that injected prompts can distort an agent's own 'am I done?' judgment, forcing unbounded computation. The LoopTrap framework reports up to 25x step amplification.

2026-06-17//6 min

AGENTS MEDIUM NEW

Cross-domain multi-agent LLM systems: seven security challenges

A Perspective published June 13, 2026 in npj Artificial Intelligence maps seven security challenges that appear when LLM agents from different organizations collaborate without any shared trust model.

2026-06-16//7 min

AGENTS CRITICAL NEW

Flowise CVE-2026-41264: LLM-written pandas code that escalates to RCE

A prompt injection in Flowise's CSV Agent makes the model emit Python that escapes a regex denylist and runs OS commands. Disclosed April 15, 2026 and patched in 3.1.0.

2026-06-15//6 min

AGENTS CRITICAL NEW

CVE-2026-46519: when an MCP server filters tools at display but not at execution

mcp-server-kubernetes enforced its read-only and allow-list controls only in tools/list, never in tools/call. Any client that knew a tool name could run it. A clean lesson in presentation-layer vs execution-layer authorization.

2026-06-15//6 min

AGENTS CRITICAL NEW

DNS rebinding turns localhost MCP servers into a remote attack surface

A coordinated 2025–2026 disclosure wave hit every major MCP SDK over one root cause: HTTP servers on localhost that skip Host/Origin validation. The latest, CVE-2026-11624 in Google's MCP Toolbox (June 13, 2026), is rated Critical 9.4.

2026-06-15//7 min

AGENTS MEDIUM NEW

Splunk MCP Server logs auth tokens in clear text (CVE-2026-20205)

Splunk's MCP Server app wrote users' session and authorization tokens unmasked into the _internal index — a CWE-532 secrets-in-logs flaw that turns log access into token theft. Fixed in app v1.0.3.

2026-06-15//6 min

AGENTS MEDIUM NEW

TOCTOU in AI agents: atomicity violations between observation and action

An old operating-systems bug class resurfaces in agents: the world changes between when an agent looks and when it acts. New 2026 research formalizes it for GUI, browser, and multi-agent systems.

2026-06-15//6 min

AGENTS MEDIUM NEW

ConVerse: when two agents talk, the stronger one leaks more

A benchmark for agent-to-agent conversations finds privacy attacks succeed up to 88% of the time and security breaches up to 60% — and that more capable models leak more, not less.

2026-06-13//6 min

AGENTS MEDIUM NEW

Causality laundering: when a blocked tool call still leaks data

An April 2026 paper shows that denying an agent's tool call is not the end of the attack: the denial itself is an information channel. Flat taint tracking misses it.

2026-06-12//7 min

AGENTS MEDIUM NEW

Claude Code GitHub Action: how the Read tool leaked CI/CD secrets

Microsoft Threat Intelligence found that Claude Code Action's Read tool bypassed the Bash env scrub to reach /proc/self/environ, leaking the runner's ANTHROPIC_API_KEY. Patched in v2.1.128.

2026-06-12//6 min

AGENTS MEDIUM NEW

Context-Fractured Decomposition: jailbreaks through artifact provenance gaps

A June 8, 2026 arXiv paper formalizes the 'provenance gap' in tool-using agents: harmful behavior assembled from individually innocuous tool actions across time, lifting jailbreak success up to 28.3 points.

2026-06-11//6 min

AGENTS CRITICAL NEW

Cursor allowlist bypass: shell built-ins poison the environment for RCE

CVE-2026-22708 lets a prompt injection use trusted shell built-ins like export and typeset to poison environment variables in Cursor, turning an approved git or python command into remote code execution. Patched in 2.3.

2026-06-11//6 min

AGENTS MEDIUM NEW

SABER: coding agents fail operational safety even when they refuse bad prompts

A May 31, 2026 benchmark scores LLM coding agents on the final state of a real workspace, not on prompt refusal. Even the best model leaves a harmful violation in over half of runs.

2026-06-11//6 min

AGENTS MEDIUM NEW

Memory Control Flow Attacks: when stored memory steers an agent's tools

A March 2026 paper shows poisoned agent memory doesn't just corrupt content — it hijacks the control flow of tool selection, forcing unintended tools and skipped steps in over 90% of trials, across tasks and long after injection.

2026-06-10//7 min

AGENTS MEDIUM NEW

MS-Agent's shell tool: a regex denylist turns prompt injection into RCE

CVE-2026-2256 lets attacker-controlled content steer ModelScope's MS-Agent into running OS commands. The root cause is a familiar anti-pattern: guarding a shell tool with a regex denylist instead of an allowlist.

2026-06-08//6 min

AGENTS MEDIUM NEW

OWASP ASI02: when an agent turns its own tools against you

Tool Misuse & Exploitation is the #2 risk in OWASP's Top 10 for Agentic Applications 2026. The danger isn't an agent gaining new tools — it's misusing the ones it already holds, via over-privilege, poisoned descriptors, or unsafe chaining.

2026-06-08//6 min

AGENTS CRITICAL NEW

Remote MCP servers: 40% unauthenticated, OAuth broken on the rest

A May 2026 arXiv study scanned 7,973 live remote MCP servers: 40.55% expose tools with no authentication, and all 119 OAuth-enabled servers tested carried at least one flaw — 9 CVEs assigned.

2026-06-08//6 min

AGENTS MEDIUM NEW

Five attacks on x402: when AI agents pay, the cross-layer seams leak

A May 12, 2026 paper formally breaks x402, the HTTP 402 agentic payment protocol. Five attacks across settlement, replay, web handling and discovery — one replayed payment yielded 248 grants on a live endpoint.

2026-06-08//6 min

AGENTS CRITICAL NEW

CVE-2026-45497: command injection turns Microsoft 365 Copilot into an RCE path

On June 4 2026 MSRC disclosed CVE-2026-45497, a command-injection flaw in Microsoft 365 Copilot rated as remote code execution with a scope change across the service boundary. Fixed server-side.

2026-06-05//6 min

AGENTS MEDIUM NEW

When an MCP tool argument becomes an Android intent: mobile-mcp's injection sinks

CVE-2026-35394 lets a model-controlled URL fire arbitrary Android intents through mobile-mcp's mobile_open_url tool. Paired with a sibling path-traversal CVE, it shows a pattern: MCP tool arguments flowing unvalidated into platform sinks.

2026-06-05//6 min

AGENTS MEDIUM NEW

VIPER-MCP: 67 CVEs from taint-style flaws across 40,000 MCP servers

A May 20, 2026 arXiv paper audited 39,884 open-source MCP server repos, confirmed 106 zero-days end-to-end and got 67 CVE IDs assigned. The story is the pattern: untrusted agent input reaching shell, network and file-system sinks.

2026-06-05//6 min

AGENTS MEDIUM NEW

AIRQ scores 100 production AI agents: 98% carry the lethal trifecta

Adversa AI's June 2026 AI Risk Quadrant rates 100 commercial agents on attack surface, blast radius and defenses. Only 11% are well-defended; tool execution alone explains 76% of blast radius.

2026-06-04//7 min

AGENTS CRITICAL NEW

Self-propagating agent worms and the temporal re-entry defense

A May 2026 paper formalizes how persistent agent state lets a prompt-injection payload write itself back into the LLM context, propagate across agents zero-click, and proposes RTW-A — a defense proven under a No Persistent Worm Propagation theorem.

2026-06-04//7 min

AGENTS MEDIUM NEW

Tool poisoning across 7 MCP clients: a comparative security posture

A March 2026 empirical study tests four tool-poisoning attacks against Claude Desktop, Claude Code, Cursor, Cline, Continue, Gemini CLI and Langflow — and finds most protection comes from the model, not the client.

2026-06-04//7 min

AGENTS MEDIUM NEW

Authorization propagation: the agent security gap prompt-injection fixes won't close

A May 6, 2026 paper by Krti Tallam argues multi-agent systems have a distinct authorization-propagation problem — transitive delegation, aggregation inference, temporal validity — that survives even a perfect prompt-injection defense.

2026-06-03//7 min

AGENTS MEDIUM NEW

ClawTrojan: stored prompt injection becomes a persistent agent backdoor

A May 29, 2026 arXiv paper shows injection hidden in a file can be stored by a local agent and run later — reaching 95.5% attack success where single-turn injection scores near zero.

2026-06-03//6 min

AGENTS MEDIUM NEW

Opus 4.8's system card puts a number on browser-agent prompt injection: 31.5%

Anthropic's May 28, 2026 Claude Opus 4.8 system card reports a 31.5% pre-safeguard hijack rate for its browser agent — the only concrete prompt-injection metric a frontier lab published this spring.

2026-06-03//6 min

AGENTS CRITICAL NEW

CVE-2026-30615: prompt injection rewrites Windsurf's MCP config into RCE

OX Security's April 15, 2026 advisory shows how attacker-controlled content can make the Windsurf IDE register a malicious MCP STDIO server and run commands — with no user click. The class spans coding agents, but Windsurf got the CVE.

2026-06-03//6 min

AGENTS MEDIUM NEW

Brittle agents: indirect injection survives multi-step tool calls

An April 4, 2026 paper tests 6 defenses against 4 indirect-injection vectors across 9 LLM backbones in multi-step agents — advanced injections bypass nearly all of them, and some surface mitigations backfire.

2026-06-02//6 min

AGENTS CRITICAL NEW

Langroid SQLChatAgent: prompt-to-SQL injection escalates to RCE (CVE-2026-25879)

Disclosed June 1, 2026, CVE-2026-25879 (CVSS 9.8) lets a prompt-injected SQL agent run dialect-specific primitives like COPY FROM PROGRAM, turning a chat box into code execution on the database host.

2026-06-02//7 min

AGENTS MEDIUM NEW

MCP sampling: how malicious servers abuse the reverse LLM channel

MCP's sampling feature lets a server ask the client's model for completions. Unit 42 showed (Dec 2025) how a malicious server turns that reverse channel into covert tool calls, conversation hijacking, and compute theft.

2026-06-02//6 min

AGENTS CRITICAL NEW

Just ask the bot: Meta's AI support assistant and the Instagram takeovers

Over the May 30–31, 2026 weekend, attackers hijacked high-profile Instagram accounts by asking Meta's AI support bot to relink an account email. No prompt injection required — only excessive agency.

2026-06-02//6 min

AGENTS MEDIUM NEW

Stop fixating on the prompt: hijacking an agent's reasoning and memory

An April 2026 paper, JailAgent, drives an agent to malicious tool calls without touching the user prompt — by perturbing its reasoning trace and memory retrieval instead. The prompt was never the whole attack surface.

2026-06-02//6 min

AGENTS CRITICAL NEW

TrustFall: project MCP settings turn the folder-trust click into RCE

Adversa AI's TrustFall (May 7, 2026) shows four agentic coding CLIs auto-start project-defined MCP servers the moment a developer accepts the folder-trust prompt — one keypress on the dev machine, zero clicks in CI.

2026-06-02//7 min

AGENTS CRITICAL NEW

CrewAI: a silent sandbox fallback turns prompt injection into RCE (VU#221883)

Four CrewAI flaws let prompt injection chain into RCE, SSRF and file read via a Code Interpreter that silently drops out of Docker. CERT/CC's May 20, 2026 update confirms the full fix.

2026-06-01//6 min

AGENTS CRITICAL NEW

Flowise CVE-2026-40933: importing a shared chatflow is enough for RCE

Obsidian Security's May 28, 2026 write-up shows how Flowise's Custom MCP node turns a stdio MCP config into server-side code execution — and how merely importing a shared chatflow can trigger it, no save or run required.

2026-06-01//6 min

AGENTS MEDIUM NEW

Token-drain attacks: economic denial-of-service via agent tool chains

Two 2026 papers show a malicious tool or skill can steer an LLM agent into long tool-calling loops that multiply token cost 6–658× while still returning the right answer — a stealthy take on OWASP's Unbounded Consumption.

2026-06-01//7 min

AGENTS CRITICAL NEW

SymJack: one approved file copy becomes RCE in six AI coding agents

Adversa AI disclosed on May 26, 2026 a symlink-hijack pattern that turns a single benign-looking shell copy into a config overwrite and host RCE across Claude Code, Cursor, Gemini, Antigravity, Copilot, Grok Build and Codex CLIs.

2026-05-30//6 min

AGENTS MEDIUM NEW

Blindfold: action-level jailbreaks bypass semantic defenses on embodied LLMs

A SenSys '26 paper (May 11–14, 2026) introduces Blindfold, an automated framework that jailbreaks embodied LLMs by decomposing harmful goals into individually benign actions — up to 53% higher attack success than semantic-level baselines on a real 6DoF robotic arm.

2026-05-29//6 min

AGENTS MEDIUM NEW

MemMorph: hijacking tool selection in LLM agents through fluent memory poisoning

A May 24, 2026 arXiv paper from NTU Singapore shows three plausible-looking memory entries can steer an agent toward an attacker-chosen tool with 85.9% success — and survive three off-the-shelf defenses.

2026-05-29//6 min

AGENTS MEDIUM NEW

The agent harness is your real privilege boundary — and most teams draw it in the wrong place

A May 26, 2026 Pillar Security write-up argues the harness — Claude Code, Cursor, Codex — holds the secrets, tools and hooks an agent never sees. Recent harness bugs and CVE-2026-22708 make the case concrete.

2026-05-28//7 min

AGENTS CRITICAL NEW

Microsoft Copilot Cowork: poisoned skills exfiltrate M365 files with no approval

PromptArmor's May 26, 2026 disclosure shows that a five-line prompt injection inside a Copilot Cowork skill file can leak SharePoint and OneDrive documents through auto-approved Teams messages — no patch closes the design.

2026-05-28//7 min

AGENTS MEDIUM NEW

Temporal memory contamination: longitudinal safety drift in memory-equipped LLM agents

Three arXiv papers from April and May 2026 converge on a failure mode complementary to memory poisoning — memory-equipped agents drift unsafe as benign context accumulates, with compressed summaries acting as a laundering channel.

2026-05-28//7 min

AGENTS MEDIUM

Networks of agents break in new ways: Microsoft's red-team, plus RAMPART and Clarity

Microsoft Research red-teamed an internal platform of 100+ always-on agents. Four attack patterns — propagation, amplification, trust capture, proxy chains — show up only at the network level. RAMPART and Clarity, open-sourced May 20, 2026, are the response.

2026-05-27//8 min

AGENTS CRITICAL

Antigravity find_by_name: when a native tool call jumps over Secure Mode

On April 20, 2026, Pillar Security disclosed that a single unsanitised parameter in Google Antigravity's find_by_name tool turned file search into arbitrary code execution — and bypassed the IDE's strictest sandbox.

2026-05-27//7 min

AGENTS CRITICAL

ClaudeBleed: when a browser agent trusts the wrong extension

LayerX disclosed ClaudeBleed on May 6, 2026: a trust-boundary flaw let any Chrome extension drive Claude in Chrome and exfiltrate Gmail, Drive and GitHub data. The first patch was bypassed within hours.

2026-05-27//7 min

AGENTS CRITICAL

MCP STDIO transport: the design choice that became 11 CVEs and 200,000 exposed agents

On April 16, 2026, OX Security disclosed that Anthropic's MCP STDIO transport executes any OS command it is handed. Anthropic called it 'by design'. The cascade has produced eleven downstream CVEs in six weeks.

2026-05-27//7 min

AGENTS CRITICAL

When prompts become shells: prompt injection escalates to RCE in agent frameworks

Two CVEs in Microsoft Semantic Kernel and four in CrewAI — all disclosed in early 2026 — turn a single injected prompt into remote code execution on the host. The pattern is structural, not incidental.

2026-05-27//7 min

AGENTS MEDIUM

Poison once, exploit forever: persistent memory poisoning of LLM agents (OWASP ASI06)

An April 2026 arXiv paper on cross-site memory poisoning and a May 13, 2026 OWASP post on the Cisco MemoryTrap finding against Claude Code converge on the same lesson: agent memory is a trust boundary.

2026-05-26//7 min

AGENTS MEDIUM

Treating AI agents like operating systems: a CISPA blueprint for isolation and privilege

A May 14, 2026 CISPA paper applies decades of OS security thinking to LLM agents. Tested on four OpenClaw-like systems, two weakness classes — cross-user exfiltration and unauthorized network egress — fail in every single one.

2026-05-26//7 min

AGENTS CRITICAL

The Lethal Trifecta: when an agent reads private data, untrusted content, and can phone home

Simon Willison's framework for the single architectural mistake that turned 2026's wave of AI-agent data exfiltration vulnerabilities into a class, not a coincidence.

2026-05-26//7 min

AGENTS MEDIUM

MCP Back-End Vulnerabilities: classic flaws resurface across AI database bridges

Akamai's May 12, 2026 research found SQL injection (CVE-2025-66335), missing authentication, and unsanitised inputs across three MCP servers — Apache Doris, Apache Pinot, and Alibaba RDS. The pattern, not the bugs, is the story.

2026-05-26//7 min

AGENTS CRITICAL

Semantic Kernel: when a prompt becomes a shell (CVE-2026-25592, CVE-2026-26030)

Microsoft disclosed two critical vulnerabilities in Semantic Kernel on May 7, 2026 that turn a single injected prompt into host-level code execution. The root cause is architectural: tool registries and eval() treated as features, not security boundaries.

2026-05-26//7 min

AGENTS MEDIUM

Trust No Tool: cognitive poisoning of LLM agents through tool feedback

A May 17, 2026 arXiv paper introduces 'cognitive poisoning' — a malicious tool that wins the agent's trust over many benign-looking turns and only weaponises the final action. The defence target shifts from prompts to trajectory.

2026-05-26//7 min

AGENTS CRITICAL

CVE-2026-35435: Azure AI Foundry's M365 published agents trusted callers they shouldn't have

Disclosed May 7, 2026 (CVSS 8.6), an improper access-control flaw in Azure AI Foundry let unauthorized attackers elevate privilege through M365 published agents. Microsoft reports active exploitation; mitigations are available before a patch.

2026-05-25//6 min

AGENTS CRITICAL

Azure SRE Agent: a multi-tenant token check that let strangers watch your incidents (CVE-2026-32173)

Disclosed April 20, 2026, an Entra ID app-registration misconfiguration on Azure SRE Agent's /agentHub WebSocket let any tenant connect, listen to every prompt, reasoning step, CLI command and credential — silently.

2026-05-25//7 min

AGENTS CRITICAL

Claw Chain: four OpenClaw CVEs that turn an AI agent into the attacker's hands

Disclosed May 15, 2026, Cyera Research's Claw Chain chains four patched OpenClaw flaws — sandbox escape, env-var disclosure, MCP loopback EoP, symlink read escape — into full host takeover via the agent itself.

2026-05-25//7 min

AGENTS CRITICAL

Comment and Control: one prompt injection pattern, three vendors leaking GitHub Actions secrets

Disclosed April 15, 2026, Comment and Control turns ordinary PR titles, issue bodies and HTML comments into credential-exfiltration channels in Claude Code, Gemini CLI and GitHub Copilot Agent.

2026-05-25//7 min

AGENTS CRITICAL

PraisonAI CVE-2026-44338: an unauthenticated agent server, exploited in 3h44

Disclosed May 11, 2026, CVE-2026-44338 ships PraisonAI with authentication hard-disabled in its legacy API server. A CVE-Detector scanner hit the endpoint less than four hours later.

2026-05-25//6 min

AGENTS CRITICAL

Localhost agent hijack: cross-origin WebSocket attacks on AI coding agents

CVE-2026-44211 (CVSS 9.7), disclosed May 7, 2026, shows how a single visit to a malicious page can hijack an AI coding agent running on a developer's laptop. The attack class is generic — and architectural.

2026-05-22//7 min

AGENTS CRITICAL

Prompts as shells: when prompt injection becomes RCE in agent frameworks

Two CVEs disclosed in Microsoft Semantic Kernel on May 7, 2026 (CVE-2026-25592, CVE-2026-26030) show how a single injected prompt can pivot from text to remote code execution on the agent's host.

2026-05-22//7 min