RAG pipelines face knowledge base poisoning via prompt injection
Author Cor E warns that RAG pipelines introduce a new prompt injection attack surface — knowledge base poisoning — where malicious instructions buried in uploaded documents can hijack LLM behavior when retrieved, and proposes scanning at both ingestion and query time as defenses.
Score breakdown
Teams building RAG pipelines should add chunk-level scanning at both document ingestion and query time to prevent malicious documents from silently hijacking LLM behavior in production.
- 01Knowledge base poisoning embeds prompt injection instructions inside uploaded documents, which execute silently when retrieved by a RAG query.
- 02The attack bypasses standard input validation because malicious content enters through the document ingestion pipeline, not user input fields.
- 03Query-time defense: each retrieved chunk is scanned via a scrub API before prompt assembly; blocked chunks are dropped, adding per-query latency.
Cor E's article argues that RAG (Retrieval-Augmented Generation) pipelines represent the next major frontier for prompt injection attacks, distinct from the direct user-input injection that most teams already guard against. In a RAG system, a user's query triggers a vector database search, and the retrieved document chunks are injected directly into the LLM's context window — meaning any malicious instructions embedded in those chunks will be read and potentially followed by the model. The article illustrates this with a concrete example: an attacker uploads a PDF containing hidden instructions to redirect users to a phishing login page at a malicious URL. The document looks benign in the library but becomes a live attack the moment it is retrieved.
The article proposes two defensive insertion points in the pipeline.
The article proposes two defensive insertion points in the pipeline. The first is query-time scanning: before assembling the prompt, each retrieved chunk is passed through a scrub API endpoint; chunks flagged as clean or flagged are included, while blocked chunks are silently dropped. The tradeoff is added latency — one API call per chunk per query. The second, preferred approach is ingestion-time scanning: when a document is uploaded, its chunks are scanned in batch (up to 100 chunks per request, processed in parallel) before any embeddings are created or stored, keeping the knowledge base clean at the source. The article recommends using both layers where possible, with ingestion-time scanning as the primary defense and query-time scanning as a backstop for legacy content or externally sourced data.
The post notes that RAG is increasingly being adopted in regulated industries such as healthcare, legal, and finance, where a poisoned knowledge base constitutes not just a product bug but a compliance incident. The described scanning capabilities are attributed to a product called Sentinel, described as an AI firewall for LLM applications.
Key facts
- 01Knowledge base poisoning embeds prompt injection instructions inside uploaded documents, which execute silently when retrieved by a RAG query.
- 02The attack bypasses standard input validation because malicious content enters through the document ingestion pipeline, not user input fields.
- 03Query-time defense: each retrieved chunk is scanned via a scrub API before prompt assembly; blocked chunks are dropped, adding per-query latency.
- 04Ingestion-time defense: document chunks are batch-scanned (up to 100 chunks per request, in parallel) before embedding and storage.