SUPPLY CHAIN MEDIUM NEW

trust_remote_code=False isn't a boundary: vLLM's recurring model-load RCE

CVE-2026-27893 (disclosed March 27, 2026) is vLLM's third trust_remote_code bypass. Two model files hardcode trust_remote_code=True, silently overriding an operator's opt-out and enabling RCE from a malicious model repo.

2026-06-05 // 6 min affects: vllm, vllm-0.10.1-to-0.17.x, nemotron-vl, kimi-k25

In brief: vLLM’s --trust-remote-code=False is supposed to stop a model repository from running arbitrary Python on your inference host. CVE-2026-27893, disclosed March 27, 2026 (CVSS 8.8), is the third time that boundary has been bypassed, this time because two model files hardcode trust_remote_code=True. It affects vLLM 0.10.1 through 0.17.x; the fix landed in 0.18.0. The lesson is not one bug but a pattern: per-file opt-in is not a trust boundary.

What is this?

CVE-2026-27893 is a protection-mechanism failure (CWE-693) in vLLM, the inference and serving engine behind a large share of production LLM deployments. Operators can pass --trust-remote-code=False to refuse running custom Python shipped inside a model repository. The GitHub Security Advisory, published March 27, 2026, shows that two model implementation files ignore that setting and pass a literal trust_remote_code=True to Hugging Face Transformers — so the remote code runs anyway.

What makes this worth covering is not the single CVE but its lineage. The advisory itself names two predecessors: CVE-2025-66448 (December 1, 2025, the auto_map config-loading path) and CVE-2026-22807 (the broader auto_map startup path). Each was patched; each time the same trust boundary fell again through a different code path. CVE-2026-27893 is the third in that sequence.

How it works

The operator’s choice is propagated correctly through vLLM’s config hierarchy as self.config.model_config.trust_remote_code. The two vulnerable call sites simply don’t read it. Per the advisory, the offending lines are:

# vllm/model_executor/models/nemotron_vl.py (vision encoder load)
AutoModel.from_config(config, trust_remote_code=True)

# vllm/model_executor/models/kimi_k25.py (image processor load)
cached_get_image_processor(model_name, trust_remote_code=True)

Because the literal True overrides the global setting, Hugging Face Transformers downloads and executes Python from the referenced repository at model-load time, with the privileges of the vLLM process. The trigger path:

  1. An attacker publishes a model repo targeting the Nemotron-VL or Kimi-K25 architecture.
  2. An operator loads it, believing --trust-remote-code=False protects them.
  3. vLLM dispatches into one of these two files.
  4. The hardcoded True wins.
  5. The repo’s code runs.

The bypass is silent: no warning or log entry signals that the operator’s setting was overridden. CVSS rates it 8.8 (AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H): network-reachable, no attacker privileges, but the operator must initiate the load (UI:R).
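The override pattern can be illustrated with a toy model that needs neither vLLM nor Transformers. This is a minimal sketch, not vLLM code: ModelConfig, load_component, vulnerable_call_site and fixed_call_site are hypothetical names invented for the illustration. It shows only why a literal at the call site defeats a correctly propagated setting.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Hypothetical stand-in for the operator-facing setting.
    trust_remote_code: bool = False


def load_component(name: str, trust_remote_code: bool) -> str:
    """Hypothetical loader: reports whether repo-supplied code would run."""
    if trust_remote_code:
        return f"{name}: remote code EXECUTED"
    return f"{name}: remote code refused"


def vulnerable_call_site(cfg: ModelConfig) -> str:
    # Bug pattern: the literal True ignores cfg entirely.
    return load_component("vision_encoder", trust_remote_code=True)


def fixed_call_site(cfg: ModelConfig) -> str:
    # Fix pattern: propagate the operator's choice to the loader.
    return load_component("vision_encoder", trust_remote_code=cfg.trust_remote_code)


cfg = ModelConfig(trust_remote_code=False)  # operator opted out
print(vulnerable_call_site(cfg))
print(fixed_call_site(cfg))
```

The operator-side configuration is identical in both calls; only the call site differs, which is why the bypass cannot be seen from the command line.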

Why it matters

The recurrence is the story. vLLM delegates model-specific behaviour to individual files in model_executor/models/, and each file must independently honour the global trust_remote_code flag. There is no central chokepoint that prevents a file from hardcoding True. That means every new model implementation is a fresh opportunity to reintroduce the same class of bug, which is exactly what happened three times: in config.py, in the auto_map startup path, and now in two model files.
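To make "central chokepoint" concrete, here is one possible shape for it, sketched as a decorator that every loader would pass through. Everything in it is an assumption made for illustration (enforce_trust_setting, RemoteCodeBlocked and load_image_processor are hypothetical names); it is not how vLLM 0.18.0 fixed the issue, which per the advisory was to propagate the config value at both call sites.

```python
import functools


class RemoteCodeBlocked(RuntimeError):
    """Raised when a call site asks for more trust than the operator granted."""


def enforce_trust_setting(operator_allows: bool):
    """Wrap a loader so no caller can enable trust_remote_code on its own."""
    def decorate(loader):
        @functools.wraps(loader)
        def wrapper(*args, **kwargs):
            # Only the keyword form is checked; a real guard would also
            # need to cover positional arguments.
            if kwargs.get("trust_remote_code", False) and not operator_allows:
                raise RemoteCodeBlocked(
                    f"{loader.__name__} requested trust_remote_code=True "
                    "but the operator setting is False"
                )
            return loader(*args, **kwargs)
        return wrapper
    return decorate


@enforce_trust_setting(operator_allows=False)
def load_image_processor(model_name, trust_remote_code=False):
    # Hypothetical loader body; returns its inputs so the effect is visible.
    return (model_name, trust_remote_code)
```

The guard raises rather than quietly downgrading the argument to False. Since the advisory stresses that the original bypass produced no warning or log entry, failing loudly seems the safer default for a sketch like this.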

For defenders, the practical danger is a false sense of safety. Teams that adopted --trust-remote-code=False as a control may be running models through an affected path with no indication the control is inert. And the broader lesson generalises well beyond vLLM: any framework that scatters enforcement of a security-critical setting across many independent files, without centralised mediation, is structurally prone to this failure. As of disclosure there was no public proof-of-concept and no evidence of in-the-wild exploitation, but the affected versions span a year of releases (0.10.1–0.17.x).
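A quick way to tell whether a host falls in the affected range is to compare the installed version against the bounds given in the advisory. The sketch below uses only the standard library; the helper names (parse_release, is_affected, installed_vllm_affected) are made up for this example, and the parser is deliberately crude: it reads the leading X.Y.Z and ignores suffixes, so a pre-release such as 0.18.0rc1 would be treated as 0.18.0.

```python
import re
from importlib import metadata

FIRST_AFFECTED = (0, 10, 1)  # per the advisory: 0.10.1 through 0.17.x
FIRST_FIXED = (0, 18, 0)     # per the advisory: fixed in 0.18.0


def parse_release(version: str) -> tuple:
    """Extract the leading numeric X.Y[.Z], ignoring suffixes like +cu121."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_affected(version: str) -> bool:
    """True if the version lies in the half-open range [0.10.1, 0.18.0)."""
    return FIRST_AFFECTED <= parse_release(version) < FIRST_FIXED


def installed_vllm_affected():
    """Return True/False for the installed vllm, or None if it is absent."""
    try:
        return is_affected(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_vllm_affected() on each inference host (or inside each container image) gives a tri-state answer that is easy to aggregate across a fleet.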

Defenses

  1. Upgrade to vLLM 0.18.0 or later. The fix in PR #36192 replaces the hardcoded True with self.config.model_config.trust_remote_code at both call sites. Check your version with pip show vllm.
  2. Don’t treat trust_remote_code=False as your only boundary. Run inference in a minimal container with restricted egress, isolate the serving tier from sensitive data stores, and segment it to limit lateral movement if the process is compromised.
  3. Verify model provenance. Restrict loading to trusted, signed publishers; checksum or sign model artifacts before loading rather than relying on a runtime flag — particularly for Nemotron-VL and Kimi-K25 architecture models.
  4. Scan your own builds for the pattern. Because this class has recurred, grep model_executor/models/*.py for trust_remote_code=True literals (and similar from_pretrained / from_config calls that don’t propagate config) in any custom or forked vLLM. Wire that check into CI so a future model file can’t reintroduce it.
  5. Watch for the silent override. In environments where models are pre-cached, an outbound .py fetch from huggingface.co during load — or an unexpected child process under a vLLM worker — is a useful hunting signal that remote code ran when it shouldn’t have.
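The scan in item 4 can go one step beyond grep by walking the syntax tree, which avoids false hits in comments and strings. What follows is a sketch under stated assumptions: the function names are hypothetical, it needs Python 3.9+ for ast.unparse, and it only catches a literal True passed by keyword. A True routed through a variable or a **kwargs dict would slip past it.

```python
import ast
from pathlib import Path


def find_hardcoded_trust(source: str, filename: str = "<string>") -> list:
    """Return sorted (line, callee) pairs for calls passing trust_remote_code=True."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg == "trust_remote_code"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append((node.lineno, ast.unparse(node.func)))
    return sorted(hits)


def scan_tree(root: str) -> list:
    """Scan every .py file under root and return human-readable findings."""
    findings = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        for line, callee in find_hardcoded_trust(text, str(path)):
            findings.append(f"{path}:{line}: {callee}(..., trust_remote_code=True)")
    return findings


def main(argv: list) -> int:
    """Exit code for CI: 1 if any literal was found, else 0."""
    findings = scan_tree(argv[1] if len(argv) > 1 else ".")
    for finding in findings:
        print(finding)
    return 1 if findings else 0
```

In a CI job this could be invoked from a small entry point that calls sys.exit(main(sys.argv)) with the path to model_executor/models/ as the argument, so that any new literal fails the build.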

Status

| Item | Reference | Date | Notes |
| --- | --- | --- | --- |
| CVE-2026-27893 advisory | GitHub (GHSA-7972-pg2x-xr59) | 2026-03-27 | Hardcoded trust_remote_code=True, CVSS 8.8, CWE-693 |
| Affected versions | vLLM | 0.10.1 → 0.17.x | Nemotron-VL and Kimi-K25 load paths |
| Patched version | vLLM 0.18.0 | | Fix PR #36192 |
| CVE-2025-66448 | GitHub (GHSA-8fr4-5q9j-m8gm) | 2025-12-01 | First bypass, auto_map in config path |
| CVE-2026-22807 | GitHub advisory | 2026 | Second bypass, broader auto_map startup path |

The right framing isn’t “patch one CVE.” It’s that trust_remote_code in vLLM has been a leaky boundary three times over, and a serving stack that relies on it as a hard control should add isolation and provenance checks that don’t depend on a single per-file flag being set correctly.
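One provenance check that does not depend on any runtime flag is to verify a model directory against a pinned manifest of SHA-256 digests before the serving process ever sees it. The sketch below assumes a manifest shaped as a dict of relative path to hex digest; that format and the function names are illustrative choices, not an established standard. Files that are present but not pinned are reported too, since an unexpected .py file is precisely the artifact of interest here.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large weight files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_dir(model_dir: str, pinned: dict) -> list:
    """Return a list of problems; an empty list means the directory matches."""
    root = Path(model_dir)
    present = {
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    }
    problems = []
    # Anything on disk that was never pinned is suspicious by default.
    for rel in sorted(present - set(pinned)):
        problems.append(f"unexpected file: {rel}")
    for rel, expected in sorted(pinned.items()):
        if rel not in present:
            problems.append(f"missing file: {rel}")
        elif sha256_file(root / rel) != expected:
            problems.append(f"hash mismatch: {rel}")
    return problems
```

A deployment gate could refuse to start the server whenever verify_model_dir returns a non-empty list, with the manifest itself stored and reviewed separately from the model files.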

Sources