01. Planning Agents & Chain of Thought

Why Planning Matters

The ReAct pattern from Week 1 works well for simple tasks, but complex problems often call for a different approach. When agents improvise step by step, they can:

Get stuck in infinite loops
Lose track of the overall goal
Make myopic decisions that hurt long-term outcomes

Planning agents address this by drafting a complete plan before taking any action, much as people think before they act.

Chain of Thought (CoT) Prompting

The foundation of planning is Chain of Thought: prompting the LLM to reason through intermediate steps before it gives a final answer.

The Magic Phrase

Appending "Let's think step by step" to a prompt often improves accuracy on multi-step reasoning tasks:

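# Placeholder question; any multi-step problem works here
problem = "A train travels 60 km in 1.5 hours. What is its average speed?"

# Standard prompting
prompt_standard = f"Question: {problem}\nAnswer:"

# CoT prompting
prompt_cot = f"Question: {problem}\nLet's think step by step."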

Why CoT Works

Aspect	Standard Prompt	Chain of Thought
Process	Direct answer	Explicit reasoning steps
Accuracy	Prone to errors on complex problems	Typically higher on multi-step problems
Transparency	Black box	Visible reasoning
Error Detection	Hard to identify	Errors visible in reasoning

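Research Insight: Wei et al. (2022) reported that few-shot CoT prompting raised PaLM 540B's accuracy on the GSM8K math word problem benchmark from 17.9% to 56.9%. For the zero-shot phrase "Let's think step by step", Kojima et al. (2022) reported an increase from 17.7% to 78.7% on MultiArith with text-davinci-002.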
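
To make the "Error Detection" row of the table concrete, here is a minimal sketch that scans a reasoning trace for simple arithmetic steps and flags any whose stated result is wrong. The function name `find_arithmetic_errors`, the integer-only pattern, and the sample trace are all assumptions made for illustration; real model output is messier and would need a more forgiving parser.

```python
import re

# Matches simple steps such as "5 - 2 = 3" (integers only; an assumption for this sketch)
STEP_PATTERN = re.compile(r"(-?\d+)\s*([+\-*])\s*(\d+)\s*=\s*(-?\d+)")

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def find_arithmetic_errors(reasoning: str) -> list[str]:
    """Return every 'a op b = c' step in the text whose stated result is wrong."""
    errors = []
    for match in STEP_PATTERN.finditer(reasoning):
        left, op, right, stated = match.groups()
        actual = OPS[op](int(left), int(right))
        if actual != int(stated):
            errors.append(f"{match.group(0)} (should be {actual})")
    return errors


# Hypothetical model output with one deliberate mistake in step 2
trace = """1. Start with 12 cookies
2. Eat 4: 12 - 4 = 9 cookies
3. Bake 6 more: 9 + 6 = 15 cookies"""

print(find_arithmetic_errors(trace))  # ['12 - 4 = 9 (should be 8)']
```

A check like this is only possible because the intermediate steps are written out; with a direct answer there is nothing to inspect.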

CoT in Practice

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

problem = """
John had 5 apples. He gave 2 to Mary and ate 1.
Then he got 3 more from Mike, but dropped 1 on the way home.
How many apples does John have now?
"""

def get_completion(prompt):
    """Send one user prompt and return the model's text reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ],
        temperature=0
    )
    return response.choices[0].message.content

# With CoT
response = get_completion(f"{problem}\nLet's think step by step.")
print(response)
# Example output (exact wording varies by model and run):
# 1. John starts with 5 apples
# 2. Gives 2 to Mary: 5 - 2 = 3 apples
# 3. Eats 1: 3 - 1 = 2 apples
# 4. Gets 3 from Mike: 2 + 3 = 5 apples
# 5. Drops 1: 5 - 1 = 4 apples
# Answer: 4 apples
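
Because a CoT response mixes reasoning with the result, downstream code usually needs to pull the final answer out of the text. The sketch below assumes the response ends with a line of the form "Answer: <number>", as in the example output above; the helper name `extract_final_answer` is hypothetical, and a model will not always follow this format unless the prompt asks for it.

```python
import re


def extract_final_answer(cot_output: str):
    """Return the number on the last 'Answer:' line, or None if there is none."""
    matches = re.findall(r"Answer:\s*(-?\d+(?:\.\d+)?)", cot_output)
    if not matches:
        return None
    value = float(matches[-1])
    # Return whole numbers as int so 4.0 is reported as 4
    return int(value) if value.is_integer() else value


sample_output = """1. John starts with 5 apples
2. Gives 2 to Mary: 5 - 2 = 3 apples
3. Eats 1: 3 - 1 = 2 apples
4. Gets 3 from Mike: 2 + 3 = 5 apples
5. Drops 1: 5 - 1 = 4 apples
Answer: 4 apples"""

print(extract_final_answer(sample_output))  # 4
```

Returning None when no answer line is found lets the caller decide whether to retry the prompt or fall back to another parsing strategy.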

Plan-and-Execute Pattern

Building on CoT, the Plan-and-Execute pattern separates planning from execution.

Architecture Components

Planner - Analyzes the query and creates a structured plan with ordered steps
Executor - Executes each step using available tools
Synthesizer - Combines results from all steps into a final answer
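
The three components can be sketched as three plain functions to show how data flows between them. Everything here is a stand-in chosen for illustration: `plan_query` returns a fixed plan instead of calling an LLM, the `lookup` tool returns placeholder text, and `combine` simply joins the results.

```python
from dataclasses import dataclass


@dataclass
class Step:
    id: int
    tool: str
    args: str


def plan_query(query: str) -> list[Step]:
    """Planner stand-in: returns a fixed plan instead of calling an LLM."""
    return [
        Step(id=1, tool="lookup", args="population of France"),
        Step(id=2, tool="lookup", args="population of Germany"),
    ]


def run_plan(steps: list[Step], tools: dict) -> dict[int, str]:
    """Executor: run each step with its tool and collect results by step id."""
    return {step.id: tools[step.tool](step.args) for step in steps}


def combine(query: str, results: dict[int, str]) -> str:
    """Synthesizer stand-in: joins step results in step order."""
    findings = "; ".join(results[i] for i in sorted(results))
    return f"{query} -> {findings}"


# Placeholder tool: echoes the request rather than fetching real data
tools = {"lookup": lambda q: f"[result for '{q}']"}

query = "Compare the populations of France and Germany"
steps = plan_query(query)
results = run_plan(steps, tools)
print(combine(query, results))
```

Keeping the stages as separate functions means each one can later be swapped for an LLM-backed version without touching the other two.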

Implementing the Planner

Use Pydantic for structured output:

from typing import List

from openai import OpenAI
from pydantic import BaseModel, Field

client = OpenAI()  # reads OPENAI_API_KEY from the environment

class PlanStep(BaseModel):
    id: int = Field(description="Step number (starts from 1)")
    description: str = Field(description="What to do in this step")
    tool: str = Field(description="Tool to use (search or calculate)")
    args: str = Field(description="Arguments for the tool")

class Plan(BaseModel):
    steps: List[PlanStep] = Field(description="Ordered list of steps")

def create_plan(query: str) -> Plan:
    """Ask the model for a plan and parse the reply into a Plan object."""
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Create a step-by-step plan. Available tools: search, calculate."},
            {"role": "user", "content": query}
        ],
        response_format=Plan  # the SDK turns the Pydantic model into a JSON schema
    )
    return completion.choices[0].message.parsed
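
Structured output guarantees the shape of the plan, not its content, so it can help to check a plan before running it. The sketch below is one possible check, assuming steps should be numbered 1, 2, 3 and so on and should only name known tools. `StepStub` and `validate_steps` are hypothetical names; the stub stands in for `PlanStep` so the example runs without Pydantic or an API key.

```python
from dataclasses import dataclass


@dataclass
class StepStub:
    """Stand-in for PlanStep; only the fields the check reads are included."""
    id: int
    tool: str


def validate_steps(steps, allowed_tools: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the plan looks executable."""
    problems = []
    if not steps:
        problems.append("plan has no steps")
    for position, step in enumerate(steps, start=1):
        if step.id != position:
            problems.append(f"step at position {position} has id {step.id}")
        if step.tool not in allowed_tools:
            problems.append(f"step {step.id} uses unknown tool '{step.tool}'")
    return problems


steps = [StepStub(id=1, tool="search"), StepStub(id=3, tool="browse")]
for problem in validate_steps(steps, {"search", "calculate"}):
    print(problem)
```

Since the function only reads `id` and `tool`, it should work unchanged on the `steps` of a parsed `Plan`.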

Implementing the Executor

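def execute_plan(plan: Plan, tools: dict) -> str:
    """Run each plan step with its tool, then combine the results."""
    results = {}

    for step in plan.steps:
        if step.tool not in tools:
            # Record the problem instead of silently skipping the step
            results[step.id] = f"Error: unknown tool '{step.tool}'"
            continue
        result = tools[step.tool](step.args)
        results[step.id] = result
        print(f"Step {step.id}: {result}")

    # Synthesize final answer (synthesize_results is the Synthesizer component, not shown here)
    return synthesize_results(results)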
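
The executor above relies on a `tools` dictionary and a `synthesize_results` function that are not shown. The following is a minimal sketch of what they might look like, assuming mock `search` and `calculate` tools that take a single string argument. The calculator parses the expression with `ast` rather than calling `eval`, and the synthesizer just lists results in order; a real synthesizer would more likely pass them to an LLM.

```python
import ast
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def calculate(expression: str) -> str:
    """Mock calculator tool: supports +, -, *, / and parentheses only."""
    tree = ast.parse(expression, mode="eval")
    return str(_evaluate(tree.body))


def search(query: str) -> str:
    """Mock search tool: returns a placeholder instead of real results."""
    return f"[mock search result for '{query}']"


def synthesize_results(results: dict) -> str:
    """Simplest possible synthesizer: list the step results in step order."""
    return "\n".join(f"Step {step_id}: {results[step_id]}" for step_id in sorted(results))


tools = {"search": search, "calculate": calculate}
print(tools["calculate"]("(2 + 3) * 4"))  # 20
print(synthesize_results({2: "second", 1: "first"}))
```

Anything other than numbers and the four basic operators raises `ValueError`, so plan arguments produced by a model cannot execute arbitrary code through this tool.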

Plan-and-Execute vs ReAct

Feature	ReAct	Plan-and-Execute
Approach	Interleaved thinking and acting	Plan first, then execute
Flexibility	Highly adaptive	Follows predetermined plan
Best For	Exploration, interactive tasks	Multi-step analysis, recipes
Weakness	Can get lost (myopic)	Rigid if plan is wrong
Recovery	Natural adaptation	Requires explicit replanning

Advanced: Replanning

When execution fails, a Replanner can adjust the strategy:

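def run_steps(plan: Plan, tools: dict):
    """Run steps in order; return (results, failed_step), with failed_step None on success."""
    results = {}
    for step in plan.steps:
        try:
            results[step.id] = tools[step.tool](step.args)
        except Exception:  # unknown tool or tool error
            return results, step
    return results, None

def execute_with_replanning(plan: Plan, tools: dict, max_replans: int = 2):
    results = {}
    for attempt in range(max_replans + 1):  # first try plus up to max_replans retries
        results, failed_step = run_steps(plan, tools)

        if failed_step is None:
            return results  # Success!

        if attempt < max_replans:
            # Replan from the failed step (replan is a placeholder for a second planner call)
            plan = replan(plan, failed_step, results)

    return results  # Best effort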
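
The `replan` call above is left undefined. In practice it would typically be another LLM call that sees the failed step and the results so far. As a purely illustrative alternative, the sketch below uses a rule-based replanner that swaps a failed tool for a fallback. The `FALLBACKS` table, the `cached_search` tool, the simplified two-argument `replan` signature, and the `Step` dataclass are all assumptions made so the example runs on its own.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Step:
    id: int
    tool: str
    args: str


# Hypothetical mapping from a tool to the tool to try when it fails
FALLBACKS = {"search": "cached_search"}


def replan(steps: list[Step], failed: Step) -> list[Step]:
    """Return a new step list where the failed step uses its fallback tool."""
    fallback = FALLBACKS.get(failed.tool)
    if fallback is None:
        return steps  # no alternative known; the caller's retry budget ends the loop
    return [replace(s, tool=fallback) if s.id == failed.id else s for s in steps]


def run(steps: list[Step], tools: dict):
    """Run steps in order; return (results, failed_step)."""
    results = {}
    for step in steps:
        try:
            results[step.id] = tools[step.tool](step.args)
        except Exception:
            return results, step
    return results, None


def failing_search(query: str) -> str:
    """Simulates a tool outage."""
    raise TimeoutError("search backend unavailable")


tools = {
    "search": failing_search,
    "cached_search": lambda q: f"[cached result for '{q}']",
}
steps = [Step(1, "search", "weather in Paris")]

results, failed = run(steps, tools)   # first attempt fails at step 1
steps = replan(steps, failed)         # swap in the fallback tool
results, failed = run(steps, tools)   # second attempt succeeds
print(results, failed)
```

A rule table like this is cheap and predictable, but it can only recover from failures someone anticipated; an LLM-based replanner trades that predictability for the ability to restructure the remaining steps.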

Hands-on Practice

In the notebook, you will:

Experience CoT - Compare standard prompting vs. CoT prompting on logic puzzles
Build a Planner - Create a Pydantic-based planner that outputs structured plans
Implement Execution - Execute plans using mock search and calculator tools
Run the Full Pipeline - Combine planning and execution for complex queries

Key Takeaways

CoT is the foundation - "Think step by step" often improves accuracy on multi-step reasoning
Separate concerns - Planning and execution are different cognitive tasks
Structured output - Use Pydantic/JSON Schema for reliable plan formats
Plan recovery - Implement replanning for robust agents

References & Further Reading

Academic Papers

"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" - Wei et al., 2022
- arXiv:2201.11903
- The foundational paper introducing CoT prompting
"Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning" - Wang et al., 2023
- arXiv:2305.04091
- Extends CoT with explicit planning
"ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models" - Xu et al., 2023
- arXiv:2305.18323
- Efficient plan-and-execute architecture
"Least-to-Most Prompting Enables Complex Reasoning in Large Language Models" - Zhou et al., 2023
- arXiv:2205.10625
- Problem decomposition for complex reasoning

Next Steps

Now that you understand planning, head to Reflection Agents to learn how agents can critique and improve their own outputs!
