RESEARCH MEDIUM NEW

DrainCode: energy-and-cost DoS via RAG corpus poisoning in code generation

A January 2026 attack, DrainCode, poisons a code-RAG corpus so retrieved snippets coerce the model into longer-but-still-correct output — inflating latency ~85% and energy ~49%. The target is availability and cost, not integrity.

2026-06-22 // 6 min affects: rag-systems, code-generation-llms, code-rag, ai-coding-assistants

What is this?

Almost all published work on poisoning a retrieval-augmented generation (RAG) corpus targets integrity: making the model give a wrong answer, recommend an attacker’s product, or leak data. “DRAINCODE: Stealthy Energy Consumption Attacks on Retrieval-Augmented Code Generation via Context Poisoning” (arXiv:2601.20615), by Yanlin Wang and colleagues at Sun Yat-sen University, Huawei Cloud and Nanyang Technological University, attacks a different property: availability and cost. It poisons the retrieval corpus of a code-generation system so that the answer stays functionally correct, but the model spends far more tokens, latency and energy producing it.

The framing is timely. A broad survey published a few months later, “Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions” (arXiv:2604.08304, April 2026), explicitly names availability (“denial and refusal abuse”) as one of the recognised RAG attack classes, alongside disinformation and exfiltration. DrainCode is a concrete instance of that under-studied class, aimed specifically at the code-assistant setting where RAG is now standard.

How it works

A RAG code assistant retrieves snippets from a corpus and places them in the model’s context before generation. DrainCode poisons that corpus with snippets that are syntactically valid and semantically inert — they look like ordinary helper code — but are crafted to push the model toward verbose output instead of stopping early.

The paper describes three conceptual ingredients, none of which require knowing the victim’s queries in advance:

  • A hypothetical-query construction step generates plausible questions from a retrieved snippet, so the poison is query-agnostic — it does not depend on predicting what users will ask (a limitation of earlier RAG attacks).
  • A gradient-guided mutation process optimises the inert trigger against two objectives at once: an end-of-sequence (EOS) term that suppresses early termination so the model keeps generating, and a KL-divergence constraint that keeps the output distribution close to the clean case so the generated code stays correct.
  • An efficiency layer (multi-position mutation plus a reuse buffer) makes the corpus poisoning converge several times faster than prior energy attacks.

The result is “coerced verbosity”: longer-but-correct code. The authors report a 3×–10× increase in output length, ~85% more latency, and ~49% more energy, while preserving 95–99% functional accuracy across the models tested. Crucially, because the code still passes tests and the trigger text is not obviously malformed, the attack evaded both classifier-based and perplexity-based filters in their evaluation. No poisoned trigger strings are reproduced here; the mechanism is summarised at a conceptual level only.

Why it matters

This is a denial-of-service and denial-of-wallet vector hiding inside correctness. Most monitoring for RAG poisoning watches for wrong, harmful, or off-policy outputs. An attack that leaves the answer right but triples the cost slips past those checks. At the scale of an IDE assistant that fires on every keystroke or save, a sustained verbosity tax translates directly into GPU contention, higher inference bills, slower responses for co-running users, and — for self-hosted fleets — real energy and carbon overhead. OWASP’s “Unbounded Consumption” risk for LLM applications captures exactly this failure mode.
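The denial-of-wallet arithmetic can be made concrete with a back-of-envelope sketch. Everything below is a hypothetical illustration: the function name, the fleet size, the per-token price and the fraction of requests that retrieve a poisoned snippet are assumptions, not figures from the paper. Only the 3× and 10× length multipliers come from the reported range.

```python
def extra_daily_cost(requests_per_day, baseline_output_tokens,
                     price_per_1k_output_tokens, poisoned_fraction,
                     length_multiplier):
    """Extra output-token spend per day caused by inflated generations.

    Assumes cost scales linearly with output tokens and that only the
    requests which retrieve a poisoned snippet are inflated.
    """
    baseline_cost = baseline_output_tokens / 1000 * price_per_1k_output_tokens
    poisoned_requests = requests_per_day * poisoned_fraction
    return poisoned_requests * baseline_cost * (length_multiplier - 1)


# Hypothetical fleet: 1M requests/day, 200 output tokens per answer,
# $0.01 per 1k output tokens, 5% of requests pull in a poisoned snippet.
for multiplier in (3, 10):
    cost = extra_daily_cost(1_000_000, 200, 0.01, 0.05, multiplier)
    print(f"{multiplier}x output length: about ${cost:,.0f} extra per day")
```

Under these assumed numbers the overhead is roughly $200 to $900 per day from a poison that touches only one request in twenty, and it grows linearly with traffic and with the share of queries that retrieve the snippet.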

The corpus is the weak point. Code-RAG systems routinely index large, semi-trusted bodies of code — internal monorepos, dependency snapshots, scraped public repositories — that few teams curate as a security-sensitive asset. A single poisoned snippet that survives retrieval can impose the verbosity tax on every query that pulls it in, with no user interaction and no integrity violation to trip an alarm.

Defenses

Because DrainCode is built to defeat content classifiers and perplexity filters, defence has to move to the pipeline and the budget, not just the text.

Cap the blast radius at generation time. Enforce per-request token and time budgets and a sensible max_tokens, and treat sustained output-length inflation as a first-class anomaly signal rather than a quality nuisance. Track the distribution of output length, latency and energy per query class; DrainCode’s whole signature is a generation that runs abnormally long for an ordinary answer. Apply rate limits and cost ceilings per user and per key, consistent with OWASP’s Unbounded Consumption guidance.
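One way to treat output length as an anomaly signal is a rolling per-class baseline with a robust outlier test. The sketch below is a minimal illustration, not a method from the paper: the class name `OutputLengthMonitor`, the window size, the median/MAD test and every threshold are assumptions chosen for readability.

```python
from collections import defaultdict, deque
from statistics import median


class OutputLengthMonitor:
    """Flags generations that are far longer than the recent norm for a query class."""

    def __init__(self, window=500, min_samples=30, threshold=6.0, hard_cap=2048):
        self.min_samples = min_samples  # baseline size needed before judging
        self.threshold = threshold      # allowed distance from median, in MADs
        self.hard_cap = hard_cap        # hitting the token budget is always flagged
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, query_class, output_tokens):
        """Record one generation and return True if it looks anomalously long."""
        past = self.history[query_class]
        anomalous = output_tokens >= self.hard_cap
        if not anomalous and len(past) >= self.min_samples:
            med = median(past)
            # Median absolute deviation; fall back to 1.0 to avoid dividing by zero.
            mad = median(abs(x - med) for x in past) or 1.0
            anomalous = (output_tokens - med) / mad > self.threshold
        if not anomalous:
            # Keep flagged samples out of the baseline so a sustained
            # attack cannot slowly redefine what "normal" looks like.
            past.append(output_tokens)
        return anomalous


monitor = OutputLengthMonitor(min_samples=10)
for tokens in [100, 110, 120] * 10:
    monitor.observe("completion", tokens)
print(monitor.observe("completion", 900))  # far above the learned baseline
```

Excluding flagged samples from the baseline is a deliberate trade-off: it resists gradual poisoning of the baseline, but a legitimate shift in workload will keep alerting until someone resets or widens the window.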

Harden the corpus as a security asset. The RAG taxonomy stresses knowledge-base integrity, provenance and remediation: curate and sign indexed sources, restrict who can write to the retrieval store, and keep enough provenance metadata to trace and remove a poisoned snippet once it is detected. Treat all retrieved code as untrusted input, and prefer trusted, reviewed internal sources over opportunistically scraped corpora.
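A minimal sketch of signing plus provenance, assuming an in-memory store and an HMAC key held only by the trusted ingestion job. `SignedCorpus`, `sign_snippet` and the entry layout are hypothetical names for illustration; a real deployment would keep the tags alongside its vector index and manage the key in a secrets store.

```python
import hashlib
import hmac


def sign_snippet(key: bytes, source: str, code: str) -> str:
    """HMAC-SHA256 tag that binds a snippet's content to its declared source."""
    message = source.encode("utf-8") + b"\x00" + code.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


class SignedCorpus:
    """Toy retrieval store that only serves snippets with a valid tag."""

    def __init__(self, key: bytes):
        self._key = key
        self._entries = {}  # snippet_id -> {"source", "code", "tag"}

    def add(self, snippet_id: str, source: str, code: str) -> None:
        """Trusted ingestion path: sign the snippet as it is indexed."""
        tag = sign_snippet(self._key, source, code)
        self._entries[snippet_id] = {"source": source, "code": code, "tag": tag}

    def verified_code(self, snippet_id: str):
        """Return the code if its tag still matches, otherwise None."""
        entry = self._entries.get(snippet_id)
        if entry is None:
            return None
        expected = sign_snippet(self._key, entry["source"], entry["code"])
        if not hmac.compare_digest(entry["tag"], expected):
            return None
        return entry["code"]

    def purge_source(self, source: str):
        """Remediation: drop every snippet from one source, return the removed ids."""
        doomed = [sid for sid, e in self._entries.items() if e["source"] == source]
        for sid in doomed:
            del self._entries[sid]
        return doomed
```

A tag only shows that a snippet entered through the signed ingestion path and has not been altered since. It says nothing about whether the content is benign, which is why the provenance field matters: once a poisoned snippet is found, `purge_source` removes everything that arrived from the same place.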

Watch the generator’s termination behaviour. Monitoring EOS/termination patterns and output entropy per retrieved-context set can surface the EOS-suppression signature even when the final code is correct. Finally, evaluate availability explicitly: red-team your RAG pipeline for resource exhaustion, not only for wrong answers, and measure tokens, latency and energy end to end through the real retriever rather than against the generator in isolation.
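Termination monitoring can be approximated by attributing each generation's finish reason to the snippets that were in its context. The sketch below assumes the serving layer reports a finish reason and that the value "length" means the token cap was hit rather than a natural stop; that convention, the `TerminationTracker` name and the thresholds are all assumptions for illustration. It covers only the termination signal, not output entropy.

```python
from collections import defaultdict


class TerminationTracker:
    """Tracks, per retrieved snippet, how often generations hit the token cap."""

    def __init__(self):
        self.total = defaultdict(int)
        self.truncated = defaultdict(int)

    def record(self, retrieved_ids, finish_reason):
        """Attribute one generation's outcome to every snippet in its context."""
        for snippet_id in set(retrieved_ids):
            self.total[snippet_id] += 1
            if finish_reason == "length":  # assumed marker for "cap reached"
                self.truncated[snippet_id] += 1

    def suspects(self, min_uses=20, min_rate=0.2, ratio=3.0):
        """Snippets whose truncation rate is well above the corpus-wide rate."""
        all_uses = sum(self.total.values())
        if all_uses == 0:
            return []
        global_rate = sum(self.truncated.values()) / all_uses
        cutoff = max(min_rate, ratio * global_rate)
        return sorted(
            sid for sid, uses in self.total.items()
            if uses >= min_uses and self.truncated[sid] / uses >= cutoff
        )


tracker = TerminationTracker()
for _ in range(100):
    tracker.record(["helper_a", "helper_b"], "stop")
for _ in range(27):
    tracker.record(["helper_a", "odd_snippet"], "length")
for _ in range(3):
    tracker.record(["helper_a", "odd_snippet"], "stop")
print(tracker.suspects())
```

Because benign snippets that happen to be co-retrieved with a poisoned one inherit some of its truncations, the comparison is against a multiple of the corpus-wide rate rather than a fixed number. A flagged id is a lead for review, not proof of poisoning.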

Status

Item | Detail
Primary paper | DrainCode, arXiv:2601.20615, January 2026 (Sun Yat-sen U., Huawei Cloud, NTU)
Companion survey | Securing RAG taxonomy, arXiv:2604.08304, April 2026
Target property | Availability / cost (energy, latency, tokens), not integrity
Reported impact | ~85% latency, ~49% energy, 3×–10× output length; 95–99% functional accuracy retained
Stealth | Evaded classifier- and perplexity-based defenses in evaluation
Best mitigations | Token/time budgets + output-length anomaly detection + corpus integrity/provenance

Sources