01. Planning Agents & Chain of Thought
Why Planning Matters
The ReAct pattern from Week 1 works well for simple tasks, but complex problems require a different approach. When agents improvise step-by-step, they can:
- Get stuck in infinite loops
- Lose track of the overall goal
- Make myopic decisions that hurt long-term outcomes
Planning Agents solve this by creating a complete plan before taking action—just like how humans think before they act.
Chain of Thought (CoT) Prompting
The foundation of planning is Chain of Thought—prompting the LLM to "think step by step."
The Magic Phrase
Simply adding "Let's think step by step" dramatically improves reasoning:
# Standard Prompting
prompt_standard = f"Question: {problem}\nAnswer:"
# CoT Prompting
prompt_cot = f"Question: {problem}\nLet's think step by step."Why CoT Works
| Aspect | Standard Prompt | Chain of Thought |
|---|---|---|
| Process | Direct answer | Explicit reasoning steps |
| Accuracy | Prone to errors on complex problems | Higher accuracy |
| Transparency | Black box | Visible reasoning |
| Error Detection | Hard to identify | Errors visible in reasoning |
Research Insight: Wei et al. (2022) showed that CoT prompting raises accuracy on GSM8K math word problems from 17.9% to 56.9% with PaLM 540B.
CoT in Practice
problem = """
John had 5 apples. He gave 2 to Mary and ate 1.
Then he got 3 more from Mike, but dropped 1 on the way home.
How many apples does John have now?
"""
from openai import OpenAI

client = OpenAI()  # Assumes OPENAI_API_KEY is set in the environment

def get_completion(prompt):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ],
        temperature=0
    )
    return response.choices[0].message.content
# With CoT
response = get_completion(f"{problem}\nLet's think step by step.")
# Output:
# 1. John starts with 5 apples
# 2. Gives 2 to Mary: 5 - 2 = 3 apples
# 3. Eats 1: 3 - 1 = 2 apples
# 4. Gets 3 from Mike: 2 + 3 = 5 apples
# 5. Drops 1: 5 - 1 = 4 apples
# Answer: 4 apples

Plan-and-Execute Pattern
Building on CoT, the Plan-and-Execute pattern separates planning from execution:
Architecture Components
Planner
Analyzes the query and creates a structured plan with ordered steps
Executor
Executes each step using available tools
Synthesizer
Combines results from all steps into a final answer
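Of the three components, only the planner and executor are implemented below; the synthesizer is left abstract. Here is a minimal sketch, assuming the executor collects tool outputs in a dict keyed by step id and reusing the get_completion helper from the CoT example above:

def synthesize_results(results: dict) -> str:
    # Format each step's output, then ask the model for a combined final answer
    step_summaries = "\n".join(
        f"Step {step_id}: {output}" for step_id, output in sorted(results.items())
    )
    prompt = (
        "Combine the results of these plan steps into a single final answer "
        "for the user:\n" + step_summaries
    )
    return get_completion(prompt)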
Implementing the Planner
Use Pydantic for structured output:
from pydantic import BaseModel, Field
from typing import List
class PlanStep(BaseModel):
    id: int = Field(description="Step number (starts from 1)")
    description: str = Field(description="What to do in this step")
    tool: str = Field(description="Tool to use (search or calculate)")
    args: str = Field(description="Arguments for the tool")

class Plan(BaseModel):
    steps: List[PlanStep] = Field(description="Ordered list of steps")

def create_plan(query: str) -> Plan:
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Create a step-by-step plan using available tools."},
            {"role": "user", "content": query}
        ],
        response_format=Plan
    )
    return completion.choices[0].message.parsed

Implementing the Executor
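The executor expects a tools dictionary that maps tool names to callables. A minimal sketch with mock implementations (placeholder logic; mock_search and mock_calculate are illustrative assumptions, not real tools):

def mock_search(query: str) -> str:
    # Placeholder: return a canned snippet instead of calling a real search API
    return f"[mock search results for: {query}]"

def mock_calculate(expression: str) -> str:
    # Placeholder: eval is acceptable only for a demo with trusted input
    return str(eval(expression))

tools = {
    "search": mock_search,
    "calculate": mock_calculate,
}

With these in place, the executor simply walks the plan in order: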
def execute_plan(plan: Plan, tools: dict) -> str:
    results = {}
    for step in plan.steps:
        if step.tool in tools:
            result = tools[step.tool](step.args)
            results[step.id] = result
            print(f"Step {step.id}: {result}")
    # Synthesize final answer
    return synthesize_results(results)

Plan-and-Execute vs ReAct
| Feature | ReAct | Plan-and-Execute |
|---|---|---|
| Approach | Interleaved thinking and acting | Plan first, then execute |
| Flexibility | Highly adaptive | Follows predetermined plan |
| Best For | Exploration, interactive tasks | Multi-step analysis, recipes |
| Weakness | Can get lost (myopic) | Rigid if plan is wrong |
| Recovery | Natural adaptation | Requires explicit replanning |
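Putting the pieces together, here is a hedged end-to-end sketch that chains create_plan and execute_plan with the mock tools defined above (the query is only an illustration):

query = "What is the population of France divided by the population of Germany?"

# 1. Plan: the model produces an ordered list of tool calls
plan = create_plan(query)
for step in plan.steps:
    print(f"{step.id}. {step.description} -> {step.tool}({step.args})")

# 2. Execute: run each step, then synthesize a final answer
answer = execute_plan(plan, tools)
print(answer)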
Advanced: Replanning
When execution fails, a Replanner can adjust the strategy:
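The loop below relies on two pieces this lesson does not define: a replan helper and a variant of execute_plan that also returns the id of the failed step (or None on success). A minimal sketch of replan, reusing the structured-output pattern from the planner:

def replan(plan: Plan, failed_step: int, results: dict) -> Plan:
    # Ask the model for a revised plan, given what succeeded and what failed
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Revise the plan so the remaining steps can succeed."},
            {"role": "user", "content": (
                f"Original plan: {plan.model_dump_json()}\n"  # Pydantic v2
                f"Step {failed_step} failed. Results so far: {results}\n"
                "Return an updated plan."
            )},
        ],
        response_format=Plan,
    )
    return completion.choices[0].message.parsed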
def execute_with_replanning(plan: Plan, tools: dict, max_replans: int = 2):
    for attempt in range(max_replans):
        # Assumes an execute_plan variant that returns (results, failed_step_id)
        results, failed_step = execute_plan(plan, tools)
        if failed_step is None:
            return results  # Success!
        # Replan from the failed step
        plan = replan(plan, failed_step, results)
    return results  # Best effort

Hands-on Practice
In the notebook, you will:
Experience CoT
Compare standard prompting vs. CoT prompting on logic puzzles
Build a Planner
Create a Pydantic-based planner that outputs structured plans
Implement Execution
Execute plans using mock search and calculator tools
Run the Full Pipeline
Combine planning and execution for complex queries
Key Takeaways
- CoT is the foundation - "Think step by step" dramatically improves reasoning
- Separate concerns - Planning and execution are different cognitive tasks
- Structured output - Use Pydantic/JSON Schema for reliable plan formats
- Plan recovery - Implement replanning for robust agents
References & Further Reading
Academic Papers
- "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" - Wei et al., 2022
  - arXiv:2201.11903
  - The foundational paper introducing CoT prompting
- "Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning" - Wang et al., 2023
  - arXiv:2305.04091
  - Extends CoT with explicit planning
- "ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models" - Xu et al., 2023
  - arXiv:2305.18323
  - Efficient plan-and-execute architecture
- "Least-to-Most Prompting Enables Complex Reasoning in Large Language Models" - Zhou et al., 2023
  - arXiv:2205.10625
  - Problem decomposition for complex reasoning
Next Steps
Now that you understand planning, head to Reflection Agents to learn how agents can critique and improve their own outputs!