RAG
What is it?
Retrieval-Augmented Generation (RAG) is a technique that gives an LLM an “open-book” exam rather than a closed-book one. Instead of relying solely on what the model memorized during training (which might be outdated or incorrect), RAG retrieves relevant information from an external knowledge base, such as a private document store, and feeds it to the model alongside the question.
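To make the “open-book” idea concrete, here is a minimal sketch (in Python) of the prompt the model ends up seeing. The question, the retrieved passage, and the prompt wording are all invented for illustration, not taken from any particular RAG framework.

```python
# Illustrative only: the question, passage, and prompt wording are invented.
question = "What is our paid-leave allowance for 2024?"

# Closed-book: the model must rely on whatever it memorized during training.
closed_book_prompt = question

# Open-book (RAG): a retrieved passage is placed alongside the question.
retrieved_passage = "HR policy v3 (2024): full-time employees receive 25 days of paid leave."
open_book_prompt = (
    f"Context:\n{retrieved_passage}\n\n"
    f"Question: {question}\n"
    "Answer using only the context above."
)
print(open_book_prompt)
```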
Why is it Important?
- Accuracy & Hallucination Reduction: Grounding the model in retrieved facts makes it less likely to invent answers.
- Freshness: You can update the external knowledge base instantly, without the cost of retraining the model.
- Privacy & Authority: It lets general-purpose models (like GPT-4) answer questions about proprietary, private data without that data being baked into the model’s weights.
Technical View
RAG is a three-step pipeline: Retrieve (find relevant documents), Augment (add them to the prompt context), and Generate (produce the answer).
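Below is a minimal, self-contained sketch of that pipeline in Python. The document store, the keyword-overlap retriever, and the llm_generate() stub are hypothetical stand-ins; a real system would typically use embeddings with a vector database for Retrieve and an actual LLM API call for Generate.

```python
# A toy RAG pipeline: Retrieve -> Augment -> Generate.
# DOCUMENTS, the keyword-overlap scoring, and llm_generate() are all
# illustrative placeholders, not a real retriever or LLM.

DOCUMENTS = [
    "The 2024 employee handbook allows 25 days of paid leave per year.",
    "Expense reports must be submitted within 30 days of purchase.",
    "The office is closed on national holidays.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Retrieve: rank documents by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def augment(question: str, context: list[str]) -> str:
    """Augment: place the retrieved passages into the prompt alongside the question."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )

def llm_generate(prompt: str) -> str:
    """Generate: placeholder for a call to an actual LLM API."""
    return f"[LLM answer grounded in a prompt of {len(prompt)} characters]"

question = "How many days of paid leave do employees get?"
answer = llm_generate(augment(question, retrieve(question, DOCUMENTS)))
print(answer)
```

The key design point is that the generated answer is constrained by whatever Retrieve returns, so retrieval quality (how documents are chunked, embedded, and ranked) usually matters as much as the choice of model.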