RAG data poisoning via documents in ChatGPT

RAG (Retrieval-Augmented Generation) poisoning occurs when a malicious or manipulated document is uploaded to influence an AI system’s responses. In a RAG framework, the AI retrieves external information from uploaded sources to augment its answers, combining retrieved data with its generative capabilities. By injecting false, biased, or harmful content into these documents, an attacker can disrupt the AI’s output, causing it to generate misleading or damaging information. This vulnerability exploits the system’s reliance on external sources without rigorous validation. Preventing RAG poisoning requires robust safeguards, such as content sanitization, authenticity checks, and anomaly detection, to ensure the integrity of uploaded materials and maintain trustworthy AI outputs.