
ChromaToast: a pre-auth RCE in the ChromaDB vector database

HiddenLayer's May 18, 2026 disclosure (CVE-2026-45829, CVSS 10.0) shows ChromaDB's Python server loads an attacker's HuggingFace model and runs its code before it ever checks authentication.

2026-06-12 // 6 min // affects: chromadb

What is this?

On May 18, 2026, HiddenLayer researcher Esteban Tonglet published "ChromaToast: Served Pre-Auth", describing CVE-2026-45829, a pre-authentication remote code execution flaw in ChromaDB. ChromaDB is one of the most widely deployed open-source vector databases, with roughly 13 million pip downloads a month, 27,500 GitHub stars, and documented production use at Mintlify, Weights & Biases, Factory AI, Capital One and UnitedHealthcare. The bug carries a CVSS 4.0 base score of 10.0, the maximum, and is classified as CWE-94 (code injection). It was introduced in version 1.0.0 and, at disclosure, remained unpatched through 1.5.8.

The short version: any unauthenticated attacker who can reach the Python server over HTTP can run arbitrary code inside the database process. ChromaDB sits at the centre of retrieval-augmented generation (RAG) pipelines, so a compromise reaches far beyond one server.

How it works

ChromaDB lets a client choose which embedding model a collection uses, passing the model name and parameters in the collection-creation request. The Python FastAPI server fetches and loads that model directly from HuggingFace. One of the parameters a client can set is trust_remote_code, a standard HuggingFace flag that, when true, tells the library to download and execute Python module files shipped inside the model repository. ChromaDB validates only that kwargs are primitive types, and a boolean is a primitive, so trust_remote_code: true flows untouched into AutoModel.from_pretrained(). Whoever controls the referenced repository controls what runs on the host.
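To make the validation gap concrete, here is a minimal sketch contrasting a type-only check with an allowlist of parameter names. It is not ChromaDB's code: the function names and the ALLOWED_KWARGS entries are hypothetical, chosen only to illustrate why a boolean flag passes a primitive-type check.

```python
from typing import Any, Mapping

PRIMITIVES = (str, int, float, bool)

# Hypothetical allowlist; the key names are illustrative assumptions,
# not taken from ChromaDB or from the advisory.
ALLOWED_KWARGS = {"device", "normalize_embeddings"}


def primitive_only_check(kwargs: Mapping[str, Any]) -> bool:
    """Type-only validation: any key is accepted if its value is a primitive."""
    return all(isinstance(value, PRIMITIVES) for value in kwargs.values())


def allowlist_check(kwargs: Mapping[str, Any]) -> bool:
    """Stricter validation: the key must be known AND the value primitive."""
    return all(
        key in ALLOWED_KWARGS and isinstance(value, PRIMITIVES)
        for key, value in kwargs.items()
    )


request_kwargs = {"device": "cpu", "trust_remote_code": True}
print(primitive_only_check(request_kwargs))  # True: the flag slips through
print(allowlist_check(request_kwargs))       # False: unknown key is refused
```

An allowlist fails closed: a parameter nobody has reviewed is rejected by default, whereas a type check accepts anything that looks like ordinary data.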

The second half is a timing defect. The vulnerable endpoint (POST /api/v2/tenants/{tenant}/databases/{db}/collections) is labelled as authenticated, but the server instantiates the embedding function — downloading and executing the model — before it runs the authentication check. By the time the credential check fires and the request is rejected with a 500, the attacker’s code has already executed. The equivalent V1 endpoint cannot be disabled, so blocking one path does not close the hole. HiddenLayer summarises the root cause as two compounding failures: the server trusts a client-supplied model identifier without restriction, and acts on that trust before authenticating the caller — a textbook confused-deputy problem where the trust boundary was placed at the wrong point.
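The ordering problem can be sketched with stand-in functions that record when each step runs. Nothing here is ChromaDB's implementation: load_embedding_function, authenticate and both handlers are hypothetical stubs, and the model load is replaced by a harmless log entry.

```python
import hmac
from typing import List, Optional

events: List[str] = []  # records the order in which steps run


def load_embedding_function(model_name: str) -> None:
    """Stand-in for the side-effecting model download and load."""
    events.append(f"load:{model_name}")


def authenticate(token: Optional[str], expected: str) -> bool:
    """Hypothetical credential check using a constant-time comparison."""
    events.append("auth")
    if token is None:
        return False
    return hmac.compare_digest(token.encode(), expected.encode())


def create_collection_load_first(model_name: str, token: Optional[str], expected: str) -> str:
    """Ordering described above: the side effect runs before the check."""
    load_embedding_function(model_name)
    if not authenticate(token, expected):
        return "rejected"
    return "created"


def create_collection_auth_first(model_name: str, token: Optional[str], expected: str) -> str:
    """Corrected ordering: nothing client-controlled runs until auth passes."""
    if not authenticate(token, expected):
        return "rejected"
    load_embedding_function(model_name)
    return "created"


events.clear()
print(create_collection_load_first("example/model", None, "s3cret"), events)
# rejected ['load:example/model', 'auth']

events.clear()
print(create_collection_auth_first("example/model", None, "s3cret"), events)
# rejected ['auth']
```

Both handlers return the same rejection to an unauthenticated caller. The difference is only visible in the event log, which is why a rejected request is not evidence that nothing ran.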

Why it matters

A successful trigger gives the attacker the privileges of the ChromaDB process and everything it can read: environment variables, API keys, mounted secrets, and the full embedding store on disk. Exposure is broad. HiddenLayer’s Shodan survey found that 73% of internet-reachable ChromaDB instances run a version in the vulnerable range. Earlier UpGuard research catalogued more than 1,170 publicly accessible deployments, many holding real production data, because ChromaDB ships with authentication disabled by default.

The data-layer angle is what makes this worse than a single host takeover. A vector store concentrates proprietary embeddings and RAG knowledge bases. An attacker with access can poison embeddings so downstream LLM applications retrieve attacker-controlled content as authoritative context, and research on embedding inversion shows stored vectors can leak substantial portions of their source text — turning an infrastructure compromise into a quiet document-exfiltration and AI-integrity incident.

Defenses

Because no fix existed for the Python server at disclosure, mitigation is mostly architectural. Prefer the Rust-based deployment path (chroma run and the Docker images), which does not carry this flaw. If the Python FastAPI server must stay, restrict its network reachability to known, trusted application hosts and terminate all external access at a reverse proxy that authenticates before any request reaches ChromaDB. Enable ChromaDB’s built-in authentication and enforce TLS, but do not rely on the built-in check alone, since the vulnerable path runs before it. Apply least privilege so ingestion accounts cannot read everything, and ship API logs to a SIEM to catch anomalous reads or collection changes.

More broadly, treat vector databases as production data infrastructure, with the same isolation, access review and vulnerability management you already apply to relational databases. Scan model artifacts before they reach any runtime, since loading a model from an untrusted registry is equivalent to running untrusted code.
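The "authenticate before any request reaches ChromaDB" pattern can be sketched as a small gate that wraps a backend. This is an illustration only: it uses the generic WSGI interface from the Python standard library conventions, while ChromaDB's Python server is a FastAPI application, so the gate is not a drop-in fix. In practice the same check would live in a reverse proxy. The backend function and the token value are hypothetical.

```python
import hmac
from typing import Callable, Iterable, List, Tuple

StartResponse = Callable[[str, List[Tuple[str, str]]], None]


def require_bearer_token(app: Callable, expected_token: str) -> Callable:
    """Wrap a WSGI app so unauthenticated requests never reach it."""

    def gate(environ: dict, start_response: StartResponse) -> Iterable[bytes]:
        header = environ.get("HTTP_AUTHORIZATION", "")
        scheme, _, token = header.partition(" ")
        authorized = scheme.lower() == "bearer" and hmac.compare_digest(
            token.encode(), expected_token.encode()
        )
        if not authorized:
            # Reject here; the wrapped app is never invoked.
            start_response(
                "401 Unauthorized",
                [("Content-Type", "text/plain"), ("WWW-Authenticate", "Bearer")],
            )
            return [b"unauthorized\n"]
        return app(environ, start_response)

    return gate


def backend(environ: dict, start_response: StartResponse) -> Iterable[bytes]:
    """Hypothetical stand-in for the protected service."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


protected = require_bearer_token(backend, expected_token="example-token")
```

The design point is that the credential check sits outside the vulnerable process, so a flaw in the backend's own ordering cannot be reached by a caller who fails the gate.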

Status

CVE-2026-45829 affects ChromaDB 1.0.0 through 1.5.8 (Python FastAPI server only; the Rust server is unaffected). HiddenLayer reports it first contacted Chroma on February 17, 2026, with follow-ups on February 24, March 5 and April 16, and published on May 18, 2026, a window of roughly 90 days, after receiving no response. Before relying on an upgrade alone, check ChromaDB’s changelog and GitHub advisory to confirm whether any release after 1.5.8 includes the fix, and apply the network and authentication mitigations regardless.
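For inventory purposes, the reported range can be checked with a short helper. This is a sketch under stated assumptions: the bounds come from the range given above, the function names are hypothetical, and the version parsing is deliberately simple (it reads the first three numeric components and ignores any pre-release suffix). A version outside the range is not proof of a fix, and the installed package version alone does not say whether the Python or the Rust server is in use.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

# Bounds taken from the reported affected range, 1.0.0 through 1.5.8.
FIRST_AFFECTED = (1, 0, 0)
LAST_REPORTED_AFFECTED = (1, 5, 8)


def parse_version(text: str) -> Tuple[int, int, int]:
    """Read the leading MAJOR.MINOR.PATCH numbers; suffixes are ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def in_reported_range(text: str) -> bool:
    """True if the version falls inside the range reported as affected."""
    return FIRST_AFFECTED <= parse_version(text) <= LAST_REPORTED_AFFECTED


def installed_chromadb_version() -> Optional[str]:
    """Return the installed chromadb package version, or None if absent."""
    try:
        return version("chromadb")
    except PackageNotFoundError:
        return None


print(in_reported_range("1.5.8"))  # True
print(in_reported_range("0.6.3"))  # False
```

Calling in_reported_range(installed_chromadb_version()) on a host, after checking for None, gives a quick first signal; the changelog and advisory remain the authority on whether a later release is fixed.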

Sources