
05. Advanced: Self-RAG with LangGraph

The Problem with Traditional RAG

Standard RAG is a straight pipeline: Retrieve → Generate. But what happens when:

  • Retrieved documents are irrelevant to the question?
  • The generated answer hallucinates beyond the documents?
  • The answer is technically correct but not useful?

Traditional RAG is blind—it can't detect or correct these failures.

What is Self-RAG?

Self-RAG (Self-Reflective RAG) adds feedback loops that enable the system to:

  1. Grade Documents: Is the retrieved content actually relevant?
  2. Grade Generation: Is the answer grounded in the documents?
  3. Correct: If something is wrong, re-retrieve or regenerate

Key Insight: Self-RAG transforms RAG from a pipeline into a control loop that can detect and correct its own failures.

Architecture with LangGraph

Self-RAG is naturally expressed as a state machine with conditional edges: retrieve, grade the documents, then either generate or fall back to web search and re-grade; after generation, grade the answer and either finish or regenerate. This is exactly the loop the graph below implements.

Implementation

Step 1: Define State

Track all the information needed for grading and looping:

from typing import TypedDict, List
 
class GraphState(TypedDict):
    question: str
    documents: List[str]
    generation: str
    relevance: str      # "yes" or "no"
    hallucination: str  # "yes" or "no"
    useful: str         # "yes" or "no"

Step 2: Build the Nodes

def retrieve(state: GraphState):
    """Retrieve documents from vector store"""
    print("---RETRIEVE---")
    question = state["question"]
 
    # In production: Use a real vector store
    docs = vector_store.similarity_search(question, k=3)
 
    return {"documents": [doc.page_content for doc in docs]}

Step 3: Conditional Edges

The decision logic that controls the flow:

def decide_to_generate(state: GraphState) -> str:
    """Route based on document relevance"""
    if state["relevance"] == "yes":
        print("---DOCS RELEVANT: Generate---")
        return "generate"
    else:
        print("---DOCS NOT RELEVANT: Search Web---")
        return "web_search"
 
def decide_to_finish(state: GraphState) -> str:
    """Route based on hallucination check"""
    if state["hallucination"] == "no":
        print("---ANSWER GROUNDED: Finish---")
        return "finish"
    else:
        print("---HALLUCINATED: Regenerate---")
        return "regenerate"

Step 4: Build the Graph

from langgraph.graph import StateGraph, END
 
workflow = StateGraph(GraphState)
 
# Add nodes
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("generate", generate)
workflow.add_node("grade_generation", grade_generation)
workflow.add_node("web_search", web_search)
 
# Set entry point
workflow.set_entry_point("retrieve")
 
# Add edges
workflow.add_edge("retrieve", "grade_documents")
workflow.add_edge("web_search", "grade_documents")
workflow.add_edge("generate", "grade_generation")
 
# Conditional edges
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "generate": "generate",
        "web_search": "web_search"
    }
)
 
workflow.add_conditional_edges(
    "grade_generation",
    decide_to_finish,
    {
        "finish": END,
        "regenerate": "generate"
    }
)
 
# Compile
app = workflow.compile()

Step 5: Run the Agent

inputs = {"question": "What are AI Agents?"}
 
for output in app.stream(inputs):
    for key, value in output.items():
        print(f"Finished: {key}")
 
# Output:
# ---RETRIEVE---
# Finished: retrieve
# ---GRADE DOCUMENTS---
# ---DOCS RELEVANT: Generate---
# Finished: grade_documents
# ---GENERATE---
# Finished: generate
# ---GRADE GENERATION---
# ---ANSWER GROUNDED: Finish---
# Finished: grade_generation
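Streaming shows which nodes ran; to get just the final answer, the compiled graph can also be invoked directly (a small addition, not shown in the original output):

# Run the graph to completion and read the final state
final_state = app.invoke({"question": "What are AI Agents?"})
print(final_state["generation"])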

Self-RAG Grading Criteria

| Check      | Question                              | If Failed                        |
|------------|---------------------------------------|----------------------------------|
| Relevance  | Is the document about the topic?      | Re-retrieve or web search        |
| Grounding  | Is the answer supported by docs?      | Regenerate                       |
| Usefulness | Does the answer address the question? | Regenerate with different prompt |

Comparison: Traditional RAG vs Self-RAG

| Aspect         | Traditional RAG     | Self-RAG                 |
|----------------|---------------------|--------------------------|
| Architecture   | Pipeline            | Control loop             |
| Error Handling | None                | Automatic correction     |
| Hallucination  | Undetected          | Detected and fixed       |
| Latency        | Lower (single pass) | Higher (multiple passes) |
| Reliability    | Variable            | More consistent          |
| Cost           | Lower               | Higher (more LLM calls)  |

Advanced: CRAG (Corrective RAG)

CRAG extends Self-RAG with more sophisticated correction strategies: rather than a binary relevance grade, the retrieval is scored as correct, ambiguous, or incorrect, and the graph responds by using the documents as-is, refining them and adding web results, or discarding them in favor of web search.
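As a sketch of that idea on top of the graph above (the three-way confidence grade is an extra state field assumed for this example, not part of GraphState):

def decide_correction(state) -> str:
    """CRAG-style routing on a three-way retrieval confidence grade"""
    confidence = state.get("confidence", "ambiguous")  # assumed extra state field
    if confidence == "correct":
        return "generate"          # documents are good: answer from them
    if confidence == "incorrect":
        return "web_search"        # discard documents, search the web instead
    return "refine_and_search"     # ambiguous: refine docs and add web results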

Hands-on Practice

In the notebook, you will:

  1. Define State Schema: create a TypedDict for tracking all grading information
  2. Build Grading Nodes: implement document and generation graders with structured output
  3. Add Conditional Logic: create decision functions for routing
  4. Visualize the Graph: use LangGraph's built-in visualization (see the sketch after this list)
  5. Test with Edge Cases: try queries that trigger different paths
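For the visualization step, the compiled graph can render itself as a Mermaid diagram; a minimal sketch (the PNG variant needs extra dependencies and an environment that can display images):

# Print the Mermaid source for the compiled graph
print(app.get_graph().draw_mermaid())
 
# Or render it inline in a Jupyter notebook
# from IPython.display import Image, display
# display(Image(app.get_graph().draw_mermaid_png()))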

Key Takeaways

  1. Feedback loops enable self-correction - Traditional RAG fails silently; Self-RAG detects and fixes its own errors
  2. Structured output for grading - Use Pydantic models for reliable yes/no decisions
  3. LangGraph makes it simple - Conditional edges naturally express routing logic
  4. Trade-off: latency vs reliability - More checks mean more LLM calls but fewer errors

References & Further Reading

Academic Papers

Related Tutorials

Next Steps

Congratulations on completing Week 2! Head to the Weekend Project to build a Self-Correcting Coder that applies these reflection patterns to code generation.