AGENTS CRITICAL

Semantic Kernel: when a prompt becomes a shell (CVE-2026-25592, CVE-2026-26030)

Microsoft disclosed two critical vulnerabilities in Semantic Kernel on May 7, 2026 that turn a single injected prompt into host-level code execution. The root cause is architectural: tool registries and eval() treated as features, not security boundaries.

2026-05-26 // 7 min // affects: semantic-kernel <1.39.4 (Python), semantic-kernel <1.71.0 (.NET)

What is this?

On May 7, 2026, Microsoft Security Response Center published advisories for two critical vulnerabilities in Semantic Kernel, the company’s flagship agent framework. CVE-2026-25592 (CVSS 10.0) affects the .NET SDK before 1.71.0; CVE-2026-26030 (CVSS 9.8) affects the Python SDK before 1.39.4. Both turn a single injected prompt into host-level remote code execution.

Microsoft’s accompanying research post, “When prompts become shells”, demonstrates launching calc.exe on the agent host with no browser exploit, no malicious attachment, and no memory-corruption bug — only a crafted natural-language input.

How it works

CVE-2026-26030 (Python, InMemoryVectorStore). When a Semantic Kernel application uses the default in-memory vector store with the Search Plugin, the user-supplied filter expression is compiled into a Python lambda and run through eval(). Any string the model can place in that filter — directly, or via retrieved document content — is therefore Python code on the host:

# Simplified vulnerable pattern (do NOT run)
filter_expr = user_or_model_supplied_string
fn = eval(f"lambda item: {filter_expr}")  # the supplied string is evaluated as Python
results = [it for it in store if fn(it)]
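To make the consequence concrete, here is a minimal, harmless sketch of the same pattern. The function `vulnerable_search` and the sample records are hypothetical stand-ins, not Semantic Kernel code; the injected string only mutates the in-memory records, but it shows that the "filter" is free to do anything a Python expression can do.

```python
# Hypothetical stand-in for the vulnerable filter path (illustration only).
def vulnerable_search(store, filter_expr):
    # The filter text is spliced into source code and evaluated.
    fn = eval(f"lambda item: {filter_expr}")
    return [it for it in store if fn(it)]


store = [{"id": 1, "lang": "en"}, {"id": 2, "lang": "de"}]

# Intended use: a plain predicate over one record.
matches = vulnerable_search(store, "item['lang'] == 'en'")

# Injected use: dict.update() returns None, so "None or True" is True.
# Every record matches, and every record has been modified as a side effect.
injected = "item.update(tampered=True) or True"
leaked = vulnerable_search(store, injected)
```

After the second call, both records are returned and both carry the `tampered` key. A real payload would replace the harmless `update` call with arbitrary code.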

CVE-2026-25592 (.NET, SessionsPythonPlugin). The plugin that executes model-generated code in Azure Container Apps dynamic sessions also exposed an internal helper, DownloadFileAsync, decorated with [KernelFunction]. That attribute makes a .NET method visible to the LLM as a callable tool. With no path validation, the model could create a payload inside the sandbox, then ask the framework itself to write that payload to an arbitrary location on the host running the agent — bypassing the sandbox boundary entirely.
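The missing path validation can be illustrated with a short sketch. It is written in Python for consistency with the example above, although the affected plugin is .NET; the helper name `resolve_inside` and the idea of a single allowed base directory are assumptions made for illustration, not a description of Microsoft's actual fix.

```python
from pathlib import Path


def resolve_inside(base_dir: str, requested: str) -> Path:
    """Return the resolved target path, or raise if it escapes base_dir."""
    base = Path(base_dir).resolve()
    # Joining with an absolute path discards base, so this also catches
    # requests such as "/etc/passwd", not only "../" traversal.
    target = (base / requested).resolve()
    if not target.is_relative_to(base):  # requires Python 3.9+
        raise PermissionError(
            f"refusing to write outside {base}: {requested!r}"
        )
    return target
```

A tool that writes files on the host would call a check like this on every model-supplied path before opening anything, and would treat a `PermissionError` as a signal worth logging.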

Both bugs share one architectural mistake: treating the tool registry ([KernelFunction]) and the filter language (eval) as ergonomic conveniences rather than as security-critical surfaces.

Why it matters

Prompt injection has long been described as a “content security” problem. These CVEs collapse that framing. Once an LLM is wired to tools and given untrusted text — a retrieved document, a web page, a user message — every exposed function becomes part of the attack surface and every string interpreter becomes a code path.

The pattern is not specific to Semantic Kernel. Microsoft’s research and concurrent advisories from third parties show the same class of bugs in other agent frameworks throughout 2026 (CrewAI, LangFlow, LiteLLM, GPT Researcher), often via MCP STDIO command injection or eval/exec in plugin systems. Any team building agents with tools, RAG, or code execution should assume this class applies to them until proven otherwise.

Defenses

  • Patch immediately. Upgrade to semantic-kernel >= 1.39.4 (Python) and >= 1.71.0 (.NET). Pin the version in your lockfile and verify in CI.
  • Audit the tool registry. List every method exposed to the model ([KernelFunction], @kernel_function, tool decorators in other frameworks). Remove anything that crosses a trust boundary — file I/O on the host, network egress, secret access — unless that exposure is intentional and reviewed.
  • Ban eval/exec on model-influenced inputs. Replace lambda-based filters with a parsed AST or a domain-specific language with an explicit allowlist of operators. Filter expressions are user input, not code.
  • Sandbox the tool host, not just the code interpreter. Azure Container Apps sandboxed the Python session; it did not sandbox the .NET host that owned the file-system helper. The trust boundary must wrap the caller of the tool, not only the tool’s payload.
  • Apply the lethal-trifecta lens. As Simon Willison frames it, an agent that combines (1) access to private data, (2) exposure to untrusted content, and (3) the ability to act externally is exploitable by default. In each environment, remove at least one of the three where possible.
  • Log tool calls with arguments. Detect anomalous parameters (paths outside expected directories, suspicious code in filter strings) before they execute.
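The "ban eval/exec" item above can be sketched as a small allowlist evaluator built on Python's `ast` module. This is a sketch under stated assumptions, not the patched Semantic Kernel implementation: the function name `compile_filter`, the filter syntax (bare field names compared against literals, joined with `and`/`or`), and the choice of operators are all hypothetical.

```python
import ast
import operator

# The only comparison operators a filter may use.
_COMPARE = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
}


def _validate(node):
    """Reject any syntax outside the allowlist before anything is evaluated."""
    if isinstance(node, ast.BoolOp):  # "and" / "or"
        for value in node.values:
            _validate(value)
    elif (isinstance(node, ast.Compare) and len(node.ops) == 1
          and type(node.ops[0]) in _COMPARE):
        _validate(node.left)
        _validate(node.comparators[0])
    elif isinstance(node, ast.Name):  # treated as a field name
        pass
    elif (isinstance(node, ast.Constant)
          and isinstance(node.value, (str, int, float, bool))):
        pass
    else:
        raise ValueError(f"disallowed syntax: {type(node).__name__}")


def _evaluate(node, item):
    if isinstance(node, ast.BoolOp):
        results = (_evaluate(v, item) for v in node.values)
        return all(results) if isinstance(node.op, ast.And) else any(results)
    if isinstance(node, ast.Compare):
        compare = _COMPARE[type(node.ops[0])]
        return compare(_evaluate(node.left, item),
                       _evaluate(node.comparators[0], item))
    if isinstance(node, ast.Name):
        return item[node.id]  # unknown fields raise KeyError
    return node.value  # ast.Constant, guaranteed by _validate


def compile_filter(expr: str):
    """Parse a filter string into a predicate without ever calling eval()."""
    try:
        body = ast.parse(expr, mode="eval").body
    except SyntaxError as exc:
        raise ValueError(f"unparseable filter: {expr!r}") from exc
    _validate(body)
    return lambda item: bool(_evaluate(body, item))
```

With this shape, `compile_filter("lang == 'en' and year >= 2024")` yields an ordinary predicate, while any string containing a call, attribute access, subscript, or lambda is rejected at parse time, before a single record is touched.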

Status

Component | Affected | Patched in | CVSS
semantic-kernel (Python, InMemoryVectorStore) | < 1.39.4 | 1.39.4 | 9.8
semantic-kernel (.NET, SessionsPythonPlugin) | < 1.71.0 | 1.71.0 | 10.0
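For the Python SDK, the threshold listed above could be enforced in CI with a check along these lines. This is a minimal sketch: the helper names are hypothetical, the version parser is deliberately simple (it compares only the first three numeric components, so a pre-release such as 1.39.4rc1 would be treated as 1.39.4), and a production check might prefer a full version-parsing library.

```python
import re
from importlib import metadata

MIN_PATCHED = (1, 39, 4)  # Python SDK threshold from the advisory


def parse_version(text: str) -> tuple:
    """Turn '1.39.4' into (1, 39, 4), using leading digits of each part."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def is_patched(version_text: str, minimum: tuple = MIN_PATCHED) -> bool:
    # Tuple comparison is numeric, so 1.9.9 correctly sorts below 1.39.4.
    return parse_version(version_text) >= minimum


def check_installed(dist_name: str = "semantic-kernel"):
    """Return True/False for the installed version, or None if not installed."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return is_patched(installed)
```

A CI step would fail the build when `check_installed()` returns `False`, which catches a lockfile that was never updated or a transitive dependency that pins an older release.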

Key dates: disclosure and patches published by Microsoft on May 7, 2026. CVE entries created the same week; GitHub Security Advisory GHSA-xjw9-4gw8-4rqx and the NVD record for CVE-2026-26030 are the primary references.

Sources