RAG (Retrieval-Augmented Generation)
Citation status
Last checked 2026-05-21
What is RAG?
RAG is a two-stage AI architecture introduced in 2020 by Lewis et al. (Facebook AI Research). In stage 1, a retriever fetches passages relevant to the user query from a corpus. In stage 2, a generator (the LLM) produces an answer conditioned on the retrieved passages, typically citing them inline. Most modern AI search engines, including Perplexity, Bing Chat, Google AI Overviews, and Claude search, are RAG systems with proprietary refinements.
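The two stages can be sketched in a few lines of Python. This is a minimal illustration, not how any production engine works: the word-overlap scorer stands in for a real retriever, and `CORPUS`, `retrieve`, `build_prompt`, `answer`, and `echo_llm` are hypothetical names invented for the example. No real LLM is called; the `llm` argument is any function that takes a prompt string.

```python
import re

# Toy corpus; in a real system this would be an indexed document store.
CORPUS = [
    "RAG was introduced in 2020 by Lewis et al.",
    "A retriever fetches relevant passages from a corpus.",
    "Fine-tuning changes model weights.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Stage 1: rank passages by word overlap with the query (toy scorer)."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Build the stage 2 input: the query plus numbered, citable passages."""
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below and cite them as [n].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )

def answer(query: str, corpus: list[str], llm) -> str:
    """Retrieve, then generate: the model only sees what stage 1 returned."""
    return llm(build_prompt(query, retrieve(query, corpus)))

def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption): returns the prompt as-is."""
    return prompt

passages = retrieve("When was RAG introduced?", CORPUS)
prompt = answer("When was RAG introduced?", CORPUS, echo_llm)
```

The point of the sketch is the data flow: the generator never sees the corpus, only the passages the retriever selected, which is why retrieval decides what can be cited.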
Status in 2026
Foundational. Nearly every production AI search engine uses some flavor of RAG. The variations matter (naive RAG, hybrid retrieval, agentic RAG, hierarchical RAG, self-RAG), but they all share the two-stage retrieve-then-generate core. Understanding RAG is a prerequisite for understanding why structured content tends to get cited and unstructured content often does not.
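As an illustration of one variation, hybrid retrieval merges the results of a keyword ranker and a vector ranker. The sketch below uses reciprocal rank fusion, one common way to combine ranked lists. It is an assumption that any given engine fuses results this way, and the document IDs and the two input rankings are made up for the example.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: each document scores sum(1 / (k + rank))."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical outputs of a keyword ranker and a vector ranker.
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
```

Documents that appear near the top of both lists (here `doc_b` and `doc_a`) rise above documents that only one ranker found.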
How it relates to other concepts
- Underlies sub-document retrieval — RAG operates at the passage level.
- Companion to agentic retrieval — RAG is the substrate; agentic retrieval is the orchestration that decides when and how to invoke RAG.
- Mechanically dependent on vector embeddings for the semantic-match component of retrieval.
- Direct technical mechanism for GEO (Generative Engine Optimization) success: only retrieved passages can inform the generated answer, and only those passages can be cited.
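The semantic-match step mentioned above can be sketched as cosine similarity between a query vector and passage vectors. The three-dimensional vectors and passage IDs below are invented for illustration. A real system would get its vectors from an embedding model, with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical passage-level embeddings, keyed by page and section.
PASSAGE_VECTORS = {
    "intro#what-is-rag": [0.9, 0.1, 0.0],
    "intro#history": [0.2, 0.8, 0.1],
    "faq#fine-tuning": [0.1, 0.2, 0.9],
}

def top_passages(query_vector: list[float],
                 passage_vectors: dict[str, list[float]],
                 k: int = 2) -> list[str]:
    """Return the IDs of the k passages most similar to the query."""
    ranked = sorted(
        passage_vectors,
        key=lambda pid: cosine_similarity(query_vector, passage_vectors[pid]),
        reverse=True,
    )
    return ranked[:k]

query_vector = [0.8, 0.3, 0.0]  # made-up embedding of a user query
best = top_passages(query_vector, PASSAGE_VECTORS)
```

Note that the units being ranked are passages, not whole pages, which is the sense in which RAG operates at the passage level.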
FAQ
- Is RAG the same as a search engine?
- A RAG system contains a search engine (the retrieval stage), but adds a generation stage. Traditional search returns ranked links; RAG returns a synthesized answer grounded in those links, often with inline citations to the retrieved sources.
- What is the difference between RAG and fine-tuning?
- Fine-tuning modifies the language model's weights to change its behavior or encode new knowledge. RAG injects external content at runtime without changing the model. Many production systems combine both: fine-tuning for behavior, RAG for fresh and citable knowledge.
- How do I optimize content for RAG-based AI engines?
- Structure content so retrieval succeeds (clear headings, schema markup, semantic clarity) and so generation cites you (passage-level optimization, statistical density, attributable claims, recent dates). These are the same techniques that make up GEO.
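To see why clear headings help retrieval, consider how a pipeline might split a page into passages before indexing. The sketch below assumes markdown-style `#` headings and a simple heading-based splitter. Real engines do not publish their chunking rules, so `split_by_headings` and the sample page are hypothetical.

```python
def split_by_headings(text: str) -> list[dict[str, str]]:
    """Split text into passages, one per heading, keeping the heading as a label."""
    passages: list[dict[str, str]] = []
    heading, lines = "(no heading)", []
    for line in text.splitlines():
        if line.startswith("#"):
            # A new heading closes the passage collected so far.
            if lines:
                passages.append({"heading": heading, "body": " ".join(lines)})
            heading, lines = line.lstrip("#").strip(), []
        elif line.strip():
            lines.append(line.strip())
    if lines:
        passages.append({"heading": heading, "body": " ".join(lines)})
    return passages

# Made-up sample page.
page = """## What is RAG?
RAG retrieves passages, then generates an answer.

## RAG vs fine-tuning
Fine-tuning changes weights.
RAG injects content at runtime.
"""

chunks = split_by_headings(page)
```

Under this assumption, each passage carries a descriptive label and a self-contained body, so a retriever can match a question to one section rather than to the whole page.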