05. Advanced: Self-RAG with LangGraph
The Problem with Traditional RAG
Standard RAG is a straight pipeline: Retrieve → Generate. But what happens when:
- Retrieved documents are irrelevant to the question?
- The generated answer hallucinates beyond the documents?
- The answer is technically correct but not useful?
Traditional RAG is blind—it can't detect or correct these failures.
What is Self-RAG?
Self-RAG (Self-Reflective RAG) adds feedback loops that enable the system to:
- Grade Documents: Is the retrieved content actually relevant?
- Grade Generation: Is the answer grounded in the documents?
- Correct: If something is wrong, re-retrieve or regenerate
Key Insight: Self-RAG transforms RAG from a pipeline into a control loop that can detect and correct its own failures.
Architecture with LangGraph
Self-RAG is naturally expressed as a state machine with conditional edges: retrieve → grade documents → generate (or fall back to web search) → grade generation → finish (or regenerate).
Implementation
Step 1: Define State
Track all the information needed for grading and looping:
```python
from typing import TypedDict, List

class GraphState(TypedDict):
    question: str
    documents: List[str]
    generation: str
    relevance: str       # "yes" or "no"
    hallucination: str   # "yes" or "no"
    useful: str          # "yes" or "no"
```
Step 2: Build the Nodes
```python
def retrieve(state: GraphState):
    """Retrieve documents from the vector store."""
    print("---RETRIEVE---")
    question = state["question"]
    # In production: use a real vector store
    docs = vector_store.similarity_search(question, k=3)
    return {"documents": [doc.page_content for doc in docs]}
```
Step 3: Conditional Edges
The decision logic that controls the flow:
```python
def decide_to_generate(state: GraphState) -> str:
    """Route based on document relevance."""
    if state["relevance"] == "yes":
        print("---DOCS RELEVANT: Generate---")
        return "generate"
    else:
        print("---DOCS NOT RELEVANT: Search Web---")
        return "web_search"

def decide_to_finish(state: GraphState) -> str:
    """Route based on hallucination check."""
    if state["hallucination"] == "no":
        print("---ANSWER GROUNDED: Finish---")
        return "finish"
    else:
        print("---HALLUCINATED: Regenerate---")
        return "regenerate"
```
Step 4: Build the Graph
```python
from langgraph.graph import StateGraph, END

workflow = StateGraph(GraphState)

# Add nodes
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("generate", generate)
workflow.add_node("grade_generation", grade_generation)
workflow.add_node("web_search", web_search)

# Set entry point
workflow.set_entry_point("retrieve")

# Add edges
workflow.add_edge("retrieve", "grade_documents")
workflow.add_edge("web_search", "grade_documents")
workflow.add_edge("generate", "grade_generation")

# Conditional edges
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "generate": "generate",
        "web_search": "web_search",
    },
)
workflow.add_conditional_edges(
    "grade_generation",
    decide_to_finish,
    {
        "finish": END,
        "regenerate": "generate",
    },
)

# Compile
app = workflow.compile()
```
Step 5: Run the Agent
```python
inputs = {"question": "What are AI Agents?"}
for output in app.stream(inputs):
    for key, value in output.items():
        print(f"Finished: {key}")

# Output:
# ---RETRIEVE---
# Finished: retrieve
# ---GRADE DOCUMENTS---
# ---DOCS RELEVANT: Generate---
# Finished: grade_documents
# ---GENERATE---
# Finished: generate
# ---GRADE GENERATION---
# ---ANSWER GROUNDED: Finish---
# Finished: grade_generation
```
Self-RAG Grading Criteria
| Check | Question | If Failed |
|---|---|---|
| Relevance | Is the document about the topic? | Re-retrieve or web search |
| Grounding | Is the answer supported by docs? | Regenerate |
| Usefulness | Does the answer address the question? | Regenerate with different prompt |
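The first two checks in this table correspond to the `grade_documents` and `grade_generation` nodes wired into the graph above but not shown in full. A minimal, LLM-free sketch of what they compute: `keyword_judge` is a toy stand-in for the real grader (in the notebook, an LLM with structured output returning a binary "yes"/"no"), and the dict-based state mirrors `GraphState`:

```python
from typing import Callable

def keyword_judge(question: str, text: str) -> str:
    """Toy stand-in for an LLM grader: 'yes' if any significant
    question word (4+ chars, punctuation stripped) appears in the text."""
    words = {w.strip(".,?!").lower() for w in question.split()}
    words = {w for w in words if len(w) >= 4}
    return "yes" if any(w in text.lower() for w in words) else "no"

def grade_documents(state: dict, judge: Callable = keyword_judge) -> dict:
    """Relevance check: keep only the documents the judge accepts."""
    print("---GRADE DOCUMENTS---")
    relevant = [d for d in state["documents"]
                if judge(state["question"], d) == "yes"]
    return {
        "documents": relevant,
        "relevance": "yes" if relevant else "no",
    }

def grade_generation(state: dict, judge: Callable = keyword_judge) -> dict:
    """Grounding check: is the generation supported by any document?"""
    print("---GRADE GENERATION---")
    grounded = any(judge(state["generation"], d) == "yes"
                   for d in state["documents"])
    return {"hallucination": "no" if grounded else "yes"}
```

Swapping `keyword_judge` for an LLM call changes nothing about the graph wiring, which is the point: the control flow only sees "yes"/"no".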
Comparison: Traditional RAG vs Self-RAG
| Aspect | Traditional RAG | Self-RAG |
|---|---|---|
| Architecture | Pipeline | Control loop |
| Error Handling | None | Automatic correction |
| Hallucination | Undetected | Detected and fixed |
| Latency | Lower (single pass) | Higher (multiple passes) |
| Reliability | Variable | More consistent |
| Cost | Lower | Higher (more LLM calls) |
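The pipeline-vs-control-loop distinction can be made concrete without LangGraph. A minimal sketch with stub components (all hypothetical) and a retry cap, which is how the latency/cost trade-off is usually bounded in practice:

```python
MAX_RETRIES = 2  # cap the loop: more reliability, but bounded latency/cost

def self_rag_loop(question, retrieve, generate, is_grounded):
    """Control loop: regenerate until the answer is grounded
    in the documents or the retry budget is exhausted."""
    docs = retrieve(question)
    for attempt in range(1 + MAX_RETRIES):
        answer = generate(question, docs)
        if is_grounded(answer, docs):
            return answer, attempt
    return answer, attempt  # best effort after all retries

# Toy components: the first generation "hallucinates", the second is grounded.
calls = {"n": 0}
def toy_generate(q, docs):
    calls["n"] += 1
    return "made-up claim" if calls["n"] == 1 else docs[0]

answer, attempts = self_rag_loop(
    "What are AI agents?",
    retrieve=lambda q: ["AI agents are autonomous programs."],
    generate=toy_generate,
    is_grounded=lambda a, d: a in d,
)
print(answer, attempts)  # grounded answer, reached after one regeneration
```

A traditional pipeline is this function with `MAX_RETRIES = 0` and no grounding check: it would have returned the hallucinated first draft.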
Advanced: CRAG (Corrective RAG)
CRAG extends Self-RAG with more sophisticated correction strategies: instead of a binary relevance decision, a retrieval evaluator classifies the retrieved documents as correct, incorrect, or ambiguous, and each class triggers a different action (refine the documents, fall back to web search, or combine both).
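That three-way decision can be sketched as a confidence-based router. The threshold values here are illustrative assumptions, not taken from the paper:

```python
def crag_action(confidence: float,
                upper: float = 0.7, lower: float = 0.3) -> str:
    """CRAG-style routing: classify retrieval quality and
    pick a correction strategy. Thresholds are illustrative."""
    if confidence >= upper:
        return "refine"        # correct: keep docs, strip irrelevant parts
    if confidence <= lower:
        return "web_search"    # incorrect: discard docs, search the web
    return "refine_and_web_search"  # ambiguous: combine both sources

print(crag_action(0.9))  # refine
print(crag_action(0.1))  # web_search
print(crag_action(0.5))  # refine_and_web_search
```

In a LangGraph implementation this would simply replace the binary `decide_to_generate` router with a three-way conditional edge.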
Hands-on Practice
In the notebook, you will:
1. Define State Schema - Create a TypedDict for tracking all grading information
2. Build Grading Nodes - Implement document and generation graders with structured output
3. Add Conditional Logic - Create decision functions for routing
4. Visualize the Graph - Use LangGraph's built-in visualization
5. Test with Edge Cases - Try queries that trigger different paths
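Because the routing functions from Step 3 are plain functions of the state dict, edge-case testing can start with direct unit tests before running the full graph (the routers are redefined here so the snippet is self-contained):

```python
def decide_to_generate(state) -> str:
    return "generate" if state["relevance"] == "yes" else "web_search"

def decide_to_finish(state) -> str:
    return "finish" if state["hallucination"] == "no" else "regenerate"

# Each path through the graph corresponds to one branch here.
assert decide_to_generate({"relevance": "yes"}) == "generate"
assert decide_to_generate({"relevance": "no"}) == "web_search"
assert decide_to_finish({"hallucination": "no"}) == "finish"
assert decide_to_finish({"hallucination": "yes"}) == "regenerate"
print("all routing paths covered")
```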
Key Takeaways
- Feedback loops enable self-correction - Traditional RAG fails silently; Self-RAG detects and fixes
- Structured output for grading - Use Pydantic models for reliable yes/no decisions
- LangGraph makes it simple - Conditional edges naturally express routing logic
- Trade-off: latency vs reliability - More checks mean more LLM calls but fewer errors
References & Further Reading
Academic Papers
- "Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection" - Asai et al., 2023 - arXiv:2310.11511 - Foundation for self-reflective retrieval
- "Corrective Retrieval Augmented Generation" - Yan et al., 2024 - arXiv:2401.15884 - Advanced correction strategies
- "Active Retrieval Augmented Generation" (FLARE) - Jiang et al., 2023 - arXiv:2305.06983 - Forward-looking active retrieval: when to retrieve, not just what
Related Tutorials
- LangGraph Adaptive RAG tutorial
- LangGraph CRAG tutorial
Next Steps
Congratulations on completing Week 2! Head to the Weekend Project to build a Self-Correcting Coder that applies these reflection patterns to code generation.