
05. Advanced: Self-RAG with LangGraph

The Problem with Traditional RAG

Standard RAG is a straight pipeline: Retrieve → Generate. But what happens when:

  • Retrieved documents are irrelevant to the question?
  • The generated answer hallucinates beyond the documents?
  • The answer is technically correct but not useful?

Traditional RAG is blind—it can't detect or correct these failures.

What is Self-RAG?

Self-RAG (Self-Reflective RAG) adds feedback loops that enable the system to:

  1. Grade Documents: Is the retrieved content actually relevant?
  2. Grade Generation: Is the answer grounded in the documents?
  3. Correct: If something is wrong, re-retrieve or regenerate

Key Insight: Self-RAG transforms RAG from a pipeline into a control loop that can detect and correct its own failures.

Architecture with LangGraph

Self-RAG is naturally expressed as a state machine with conditional edges: retrieve, grade the documents, then either generate or fall back to web search and re-grade; after generation, grade the answer and either finish or regenerate. This is exactly the loop the graph below implements.

Implementation

Step 1: Define State

Track all the information needed for grading and looping:

from typing import TypedDict, List
 
class GraphState(TypedDict):
    question: str
    documents: List[str]
    generation: str
    relevance: str      # "yes" or "no"
    hallucination: str  # "yes" or "no"
    useful: str         # "yes" or "no"

Step 2: Build the Nodes

def retrieve(state: GraphState):
    """Retrieve documents from vector store"""
    print("---RETRIEVE---")
    question = state["question"]
 
    # In production: Use a real vector store
    docs = vector_store.similarity_search(question, k=3)
 
    return {"documents": [doc.page_content for doc in docs]}

Step 3: Conditional Edges

The decision logic that controls the flow:

def decide_to_generate(state: GraphState) -> str:
    """Route based on document relevance"""
    if state["relevance"] == "yes":
        print("---DOCS RELEVANT: Generate---")
        return "generate"
    else:
        print("---DOCS NOT RELEVANT: Search Web---")
        return "web_search"
 
def decide_to_finish(state: GraphState) -> str:
    """Route based on hallucination check"""
    if state["hallucination"] == "no":
        print("---ANSWER GROUNDED: Finish---")
        return "finish"
    else:
        print("---HALLUCINATED: Regenerate---")
        return "regenerate"

Step 4: Build the Graph

from langgraph.graph import StateGraph, END
 
workflow = StateGraph(GraphState)
 
# Add nodes
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("generate", generate)
workflow.add_node("grade_generation", grade_generation)
workflow.add_node("web_search", web_search)
 
# Set entry point
workflow.set_entry_point("retrieve")
 
# Add edges
workflow.add_edge("retrieve", "grade_documents")
workflow.add_edge("web_search", "grade_documents")
workflow.add_edge("generate", "grade_generation")
 
# Conditional edges
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "generate": "generate",
        "web_search": "web_search"
    }
)
 
workflow.add_conditional_edges(
    "grade_generation",
    decide_to_finish,
    {
        "finish": END,
        "regenerate": "generate"
    }
)
 
# Compile
app = workflow.compile()

Step 5: Run the Agent

inputs = {"question": "What are AI Agents?"}
 
for output in app.stream(inputs):
    for key, value in output.items():
        print(f"Finished: {key}")
 
# Output:
# ---RETRIEVE---
# Finished: retrieve
# ---GRADE DOCUMENTS---
# ---DOCS RELEVANT: Generate---
# Finished: grade_documents
# ---GENERATE---
# Finished: generate
# ---GRADE GENERATION---
# ---ANSWER GROUNDED: Finish---
# Finished: grade_generation
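Streaming shows which nodes ran; to get just the final answer, the compiled graph can also be invoked directly (a small addition, not shown in the original output):

# Run the graph to completion and read the final state
final_state = app.invoke({"question": "What are AI Agents?"})
print(final_state["generation"])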

Self-RAG Grading Criteria

| Check      | Question                              | If Failed                        |
|------------|---------------------------------------|----------------------------------|
| Relevance  | Is the document about the topic?      | Re-retrieve or web search        |
| Grounding  | Is the answer supported by docs?      | Regenerate                       |
| Usefulness | Does the answer address the question? | Regenerate with different prompt |

Comparison: Traditional RAG vs Self-RAG

| Aspect         | Traditional RAG     | Self-RAG                 |
|----------------|---------------------|--------------------------|
| Architecture   | Pipeline            | Control loop             |
| Error Handling | None                | Automatic correction     |
| Hallucination  | Undetected          | Detected and fixed       |
| Latency        | Lower (single pass) | Higher (multiple passes) |
| Reliability    | Variable            | More consistent          |
| Cost           | Lower               | Higher (more LLM calls)  |

Advanced: CRAG (Corrective RAG)

CRAG extends Self-RAG with more sophisticated correction strategies: rather than a binary relevance grade, the retrieval is scored as correct, ambiguous, or incorrect, and the graph responds by using the documents as-is, refining them and adding web results, or discarding them in favor of web search.
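As a sketch of that idea on top of the graph above (the three-way confidence grade is an extra state field assumed for this example, not part of GraphState):

def decide_correction(state) -> str:
    """CRAG-style routing on a three-way retrieval confidence grade"""
    confidence = state.get("confidence", "ambiguous")  # assumed extra state field
    if confidence == "correct":
        return "generate"          # documents are good: answer from them
    if confidence == "incorrect":
        return "web_search"        # discard documents, search the web instead
    return "refine_and_search"     # ambiguous: refine docs and add web results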

Hands-on Practice

In the notebook, you will:

  1. Define State Schema: create a TypedDict for tracking all grading information
  2. Build Grading Nodes: implement document and generation graders with structured output
  3. Add Conditional Logic: create decision functions for routing
  4. Visualize the Graph: use LangGraph's built-in visualization (see the sketch after this list)
  5. Test with Edge Cases: try queries that trigger different paths
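For the visualization step, the compiled graph can render itself as a Mermaid diagram; a minimal sketch (the PNG variant needs extra dependencies and an environment that can display images):

# Print the Mermaid source for the compiled graph
print(app.get_graph().draw_mermaid())
 
# Or render it inline in a Jupyter notebook
# from IPython.display import Image, display
# display(Image(app.get_graph().draw_mermaid_png()))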

Key Takeaways

  1. Feedback loops enable self-correction - Traditional RAG fails silently; Self-RAG detects and fixes its own errors
  2. Structured output for grading - Use Pydantic models for reliable yes/no decisions
  3. LangGraph makes it simple - Conditional edges naturally express routing logic
  4. Trade-off: latency vs reliability - More checks mean more LLM calls but fewer errors

References & Further Reading

Academic Papers

Related Tutorials

Next Steps

Congratulations on completing Week 2! Head to the Weekend Project to build a Self-Correcting Coder that applies these reflection patterns to code generation.