CRCP: RAG corpus poisoning that survives chunking and reranking
A June 9, 2026 arXiv paper shows many corpus-poisoning attacks quietly fail after reranking — and proposes CRCP, a chunk-aware variant built to survive realistic multi-stage RAG pipelines. The lesson is about how you evaluate, not just how you defend.
What is this?
On June 9, 2026, Xi Nie, Hongwei Li, Shenghao Wu, Mingxuan Li, Jiachen Li and Wenbo Jiang published “When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines” (arXiv:2606.11265, cs.CR). The paper makes a deceptively simple point: a lot of published RAG corpus-poisoning attacks look strong in the lab and then fall apart in a realistic pipeline.
Corpus poisoning is the family of attacks where an adversary injects crafted documents into the knowledge base a Retrieval-Augmented Generation system draws from, so that those documents get retrieved and steer the model’s answer. Earlier work, for example the GASLITE attack on dense retrievers (CCS ’25), showed this is feasible even at negligible poisoning rates. The new paper asks the follow-up question almost everyone skipped: does it still work once the documents pass through chunking, dense retrieval, reranking, and grounded generation, the way production systems actually run?
How it works
The authors’ finding is that many existing attacks substantially degrade after reranking, even when the poisoned passage scored high relevance at the dense-retrieval stage. They name the cause retrieval granularity mismatch.
Two stages break the attacker’s assumptions, chunking and reranking. Traced through the full pipeline:
| Pipeline stage | What it does | Effect on a document-level payload |
|---|---|---|
| Chunking | splits docs into passages | adversarial signal gets fragmented |
| Dense retrieval | embeds + ranks chunks | poisoned chunk may still rank high |
| Reranking | re-scores top candidates | favors locally coherent answers |
| Generation | answers from survivors | payload often never makes the cut |
Most prior attacks optimize a whole document to sit close to a query in embedding space. But the pipeline never sees that whole document — it sees the pieces left after chunking, and a cross-encoder reranker that prefers a passage which locally reads like an answer over one that merely has high global similarity. The adversarial signal, spread across the original document, is cut apart and discarded.
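The fragmentation effect can be illustrated with a toy, assuming a simple fixed-size word chunker and using term coverage as a crude stand-in for embedding similarity. The names `chunk_words`, `term_coverage`, `best_chunk_coverage`, and the marker terms are hypothetical and do not come from the paper; the document is neutral filler, not a payload.

```python
"""Toy illustration: a signal spread over a whole document is diluted by chunking."""


def chunk_words(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into windows of `size` words, stepping by `size - overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), size - overlap):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


def term_coverage(passage: str, terms: set[str]) -> float:
    """Fraction of `terms` present in `passage` (assumed proxy for similarity)."""
    return len(terms & set(passage.lower().split())) / len(terms)


def best_chunk_coverage(text: str, terms: set[str], size: int) -> float:
    """Coverage of the single best chunk, which is all a chunk-level scorer sees."""
    return max(term_coverage(c, terms) for c in chunk_words(text, size))


# Four marker terms spaced ten words apart in otherwise neutral filler.
SIGNAL_TERMS = {"alpha", "bravo", "charlie", "delta"}
DOCUMENT = " ".join(
    f"{term} " + " ".join(["filler"] * 9)
    for term in ("alpha", "bravo", "charlie", "delta")
)

print("whole document:", term_coverage(DOCUMENT, SIGNAL_TERMS))
for size in (40, 20, 10):
    print(f"chunk size {size}:", best_chunk_coverage(DOCUMENT, SIGNAL_TERMS, size))
```

With this input the best chunk covers all four terms at size 40, two at size 20, and one at size 10, which is the sense in which a document-level signal depends on where the boundaries fall.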
The paper’s contribution, CRCP (Chunk-aware and Rerank-Consistent Poisoning), is a method that optimizes against this reality rather than around it. CRCP jointly targets retrieval relevance, reranker consistency, and chunk-boundary robustness, and it explicitly models chunking transformations during optimization so the resulting passages are locally self-contained: each surviving chunk still carries the adversarial intent regardless of where the boundaries fall. In the reported experiments, which span standard RAG benchmarks with multiple retrievers and rerankers, existing methods are highly sensitive to chunk size and reranking strategy, while CRCP holds up far better. (No payloads or optimization code are reproduced here; this is a summary of a published method.)
Why it matters
There are two distinct takeaways, and the second is the important one.
The narrow takeaway: a reranker is not a reliable defense against corpus poisoning. It raises the bar against naive document-level attacks, which is genuinely useful, but a chunk-aware attack can be built to clear that bar.
The broader takeaway is methodological, and it cuts both ways. If you evaluate an attack in a simplified retrieval setting, you will overstate it. If you evaluate a defense the same way, you will understate the threat and ship a control that looks effective only because you tested it against attacks that ignored your pipeline. This is the same trap the prompt-injection field fell into with static test strings: defenses that pass against yesterday’s attacks tell you very little. The authors frame RAG poisoning as a multi-stage retrieval consistency problem, not a retrieval-only one — meaning the relevant attack surface is the whole chunk-retrieve-rerank-generate chain.
This is benchmark research, not an in-the-wild incident, and the attack still requires getting content into your corpus in the first place, which is the real precondition for every poisoning attack.
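One way to act on the evaluation point above is to report, per stage, how often planted red-team test passages are still present, instead of stopping at the retrieval score. The sketch below assumes a pipeline exposed as two callables; `Chunk`, `survival_by_stage`, the toy retriever and reranker, and the "canary" source are all hypothetical stand-ins, and a generation stage would need a third counter that is omitted here.

```python
"""Sketch: report where planted test passages drop out of a RAG pipeline."""
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    source_id: str
    text: str


Retriever = Callable[[str, int], list[Chunk]]
Reranker = Callable[[str, Sequence[Chunk], int], list[Chunk]]


def survival_by_stage(
    queries: Sequence[str],
    retrieve: Retriever,
    rerank: Reranker,
    planted_sources: set[str],
    k: int = 10,
    n: int = 3,
) -> dict[str, float]:
    """Fraction of queries for which a planted chunk survives each stage."""
    if not queries:
        raise ValueError("need at least one query")
    hits = {"retrieved": 0, "reranked": 0}
    for query in queries:
        candidates = retrieve(query, k)
        if any(c.source_id in planted_sources for c in candidates):
            hits["retrieved"] += 1
        kept = rerank(query, candidates, n)
        if any(c.source_id in planted_sources for c in kept):
            hits["reranked"] += 1
    return {stage: count / len(queries) for stage, count in hits.items()}


# --- Toy stand-ins; swap in the production retriever and reranker. ---
CORPUS = [
    Chunk("c1", "wiki", "The reset procedure requires holding the power button"),
    Chunk("c2", "canary", "procedure power reset button keyword list"),
    Chunk("c3", "manual", "To start the reset procedure hold the power button"),
    Chunk("c4", "wiki", "Battery care tips for long term storage"),
]


def toy_retrieve(query: str, k: int) -> list[Chunk]:
    """Bag-of-words overlap, standing in for dense retrieval."""
    terms = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda c: -len(terms & set(c.text.lower().split())))
    return ranked[:k]


def toy_rerank(query: str, candidates: Sequence[Chunk], n: int) -> list[Chunk]:
    """Prefers chunks containing the query as a phrase (crude cross-encoder proxy)."""
    return sorted(candidates, key=lambda c: query.lower() not in c.text.lower())[:n]


QUERIES = ["reset procedure", "power button"]
print(survival_by_stage(QUERIES, toy_retrieve, toy_rerank, {"canary"}, k=3, n=2))
```

In this toy the keyword-stuffed canary is retrieved for both queries and kept for neither after reranking, so a retrieval-only evaluation would overstate it; the same harness run with the real components shows which stage a given control actually depends on.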
Defenses
- Control who can write to the corpus. Poisoning needs an injection path: open web crawls, user-uploaded documents, scraped wikis, shared drives. Treat ingestion as an untrusted boundary, with provenance tracking, source allow-lists, and review for any corpus the model will treat as ground truth.
- Don’t treat reranking as a security control. Use it for quality, but assume a determined attacker can produce chunk-level passages that survive it. Layer defenses instead of relying on one stage.
- Filter at chunk granularity. Because the attack lives in chunks, so should detection. Chunk-wise perplexity and anomaly filtering, as in the RAGuard defense (arXiv:2510.25025), catches statistically odd passages before generation. Pair it with provenance scoring of the source document.
- Combine sparse and dense retrieval. Hybrid BM25 + vector retrieval breaks attacks tuned purely against an embedding space, since a passage optimized for embedding proximity rarely also wins on lexical match. It is not a complete fix, but it removes an entire class of dense-only optimizations.
- Evaluate against your real pipeline. This is the paper’s core lesson applied defensively: red-team poisoning through your actual chunk size, retriever, and reranker, with adaptive attackers, not a single-stage toy setup. A control validated only in a simplified setting is unproven in production.
- Inspect what was retrieved. Log the passages and source documents that fed each answer. When an output looks manipulated, retrieval-level provenance is what lets you find the poisoned record.
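The chunk-granularity filter in the list above can be sketched with a deliberately small model. This is not RAGuard's implementation: it assumes a unigram model with add-one smoothing fitted on trusted reference text, where a real deployment would more likely use a language model's perplexity. `UnigramModel`, `flag_anomalous_chunks`, and the threshold of 20.0 are hypothetical and would need calibrating on your own corpus.

```python
"""Sketch: flag chunks whose perplexity under a trusted-text model is unusually high."""
import math
from collections import Counter


class UnigramModel:
    """Add-one-smoothed unigram model (a toy stand-in for an LM scorer)."""

    def __init__(self, reference_text: str) -> None:
        tokens = reference_text.lower().split()
        self.counts = Counter(tokens)
        self.total = len(tokens)
        self.vocab = len(self.counts) + 1  # one extra slot for unseen tokens

    def perplexity(self, text: str) -> float:
        tokens = text.lower().split()
        if not tokens:
            return float("inf")  # treat empty chunks as anomalous
        log_prob = sum(
            math.log((self.counts[t] + 1) / (self.total + self.vocab))
            for t in tokens
        )
        return math.exp(-log_prob / len(tokens))


def flag_anomalous_chunks(
    chunks: list[str], model: UnigramModel, threshold: float
) -> list[str]:
    """Return the chunks whose perplexity exceeds `threshold`."""
    return [c for c in chunks if model.perplexity(c) > threshold]


REFERENCE = (
    "the device stores the settings in flash memory "
    "and the reset procedure clears the settings"
)
MODEL = UnigramModel(REFERENCE)
CHUNKS = [
    "the reset procedure clears the settings",
    "zxq vbn qwe rty uio plm",  # out-of-distribution token soup
]

for chunk in CHUNKS:
    print(f"{MODEL.perplexity(chunk):6.2f}  {chunk}")
print("flagged:", flag_anomalous_chunks(CHUNKS, MODEL, threshold=20.0))
```

A filter like this only catches passages that look statistically odd; a fluent, locally coherent passage would score normally, which is why the list pairs it with provenance scoring.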
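For the hybrid-retrieval item, the source does not name a fusion method; one common choice is reciprocal rank fusion (RRF), sketched here as a swapped-in illustration. It assumes you already have one ranked list of chunk IDs from the dense retriever and one from BM25. The chunk IDs are hypothetical, and `k=60` is the conventional default constant, not a tuned value.

```python
"""Sketch: reciprocal rank fusion over a dense ranking and a sparse ranking."""


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists; each list adds 1 / (k + rank) to an ID's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# A chunk that tops the dense ranking but is absent from the lexical ranking.
DENSE = ["dense_only", "chunk_a", "chunk_b"]
SPARSE = ["chunk_a", "chunk_b", "chunk_c"]

print(reciprocal_rank_fusion([DENSE, SPARSE]))
```

Here the chunk that only the dense retriever liked falls from first to third, behind the two chunks both retrievers agree on. That is a demotion, not a removal, consistent with the caveat that hybrid retrieval is not a complete fix.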
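The logging item can be as simple as one JSON line per answer. The sketch below assumes each retrieved passage is available as a dict with `chunk_id`, `source_id`, and `score`; those field names, the answer IDs, and the helper functions are hypothetical, and an in-memory list stands in for a real log sink.

```python
"""Sketch: log retrieval provenance per answer, then trace a suspect source."""
import json
from datetime import datetime, timezone


def retrieval_record(answer_id: str, query: str, passages: list[dict]) -> str:
    """Serialize which chunks and source documents fed one answer."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "answer_id": answer_id,
        "query": query,
        "passages": [
            {"chunk_id": p["chunk_id"], "source_id": p["source_id"], "score": p["score"]}
            for p in passages
        ],
    })


def answers_using_source(log_lines: list[str], source_id: str) -> list[str]:
    """Return the IDs of answers that drew on any chunk from `source_id`."""
    hits = []
    for line in log_lines:
        record = json.loads(line)
        if any(p["source_id"] == source_id for p in record["passages"]):
            hits.append(record["answer_id"])
    return hits


LOG = [
    retrieval_record("ans-1", "how do I reset the device", [
        {"chunk_id": "c1", "source_id": "wiki/reset", "score": 0.91},
    ]),
    retrieval_record("ans-2", "is the firmware update safe", [
        {"chunk_id": "c7", "source_id": "upload-17", "score": 0.88},
        {"chunk_id": "c2", "source_id": "wiki/firmware", "score": 0.80},
    ]),
]

print(answers_using_source(LOG, "upload-17"))
```

Once a source document is suspected, the same lookup lists every answer it influenced, which is the retrieval-level provenance the item describes.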
Status
| Item | Reference | Date | Notes |
|---|---|---|---|
| CRCP paper | arXiv:2606.11265 | 2026-06-09 | Chunk-aware, rerank-consistent corpus poisoning |
| Dense-retrieval poisoning | GASLITE, arXiv:2412.20953 | CCS ’25 | Feasible at ≤0.0001% poisoning rate |
| Defense reference | RAGuard, arXiv:2510.25025 | 2025-10 | Chunk-wise perplexity filtering |
| Real-world status | — | — | Benchmark research; no in-the-wild incident reported |
The headline is not “rerankers are useless.” It is that RAG security has to be measured end to end — chunking, retrieval, reranking, generation — because an attacker optimizing for your real pipeline will find the parts your single-stage evaluation never tested.