The agentic pattern library.
Eight patterns cover ~95% of real agentic systems. Almost every production agent is a composition of these. Learn them; stop reinventing. Each card has the PM questions you should answer before shipping it.
Every pattern below is annotated with four PM-side quadrants: When to use · PM questions to ask · Failure modes · Composability. Use these to spec a pattern, not just admire the diagram.
Prompt Chaining
Sequential workflow. Multi-stage transformation — each output feeds the next input. Outline → Draft → Polish.
When to use:
- Multi-stage transformation: research → draft → review
- Each step needs a different specialisation or tool set
- Earlier stages gate the logic of later ones

PM questions to ask:
- What exactly passes between stages — full text or a structured schema?
- Where can the chain fail, and what is the fallback for each stage?
- Does each agent need full history or just the prior stage's output?

Failure modes:
- Error amplification — a bad Stage A produces a worse Stage B
- Context loss at handoffs when only partial data is passed
- No recovery path if a mid-chain agent fails silently

Composability:
- Replace any node with an Orchestrator-Workers sub-flow
- Insert Human-in-the-Loop between high-stakes stage transitions
- Add Reflection to any stage's output before passing downstream
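One way to answer the "full text or a structured schema" question is to make each handoff a validated object. A minimal sketch, assuming a hypothetical outline shape (a title plus a list of sections) — not any particular schema library:

```javascript
// Hypothetical gate between chain stages: the draft stage only runs if
// the outline stage produced a title and at least two sections.
function validateOutline(outline) {
  return (
    typeof outline.title === "string" &&
    Array.isArray(outline.sections) &&
    outline.sections.length >= 2
  );
}

const goodOutline = { title: "Q3 report", sections: ["Revenue", "Churn"] };
const badOutline = { title: "Q3 report", sections: [] };

console.log(validateOutline(goodOutline)); // true
console.log(validateOutline(badOutline)); // false
```

A structured handoff also answers the "full history or prior output" question by default: downstream stages receive exactly the fields the schema names, nothing more.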
```javascript
async function chain(input) {
  const outline = await llm({ prompt: outlinePrompt(input) });
  if (!isValid(outline)) return fallback(); // gate between steps
  const draft = await llm({ prompt: draftPrompt(outline) });
  const polished = await llm({ prompt: polishPrompt(draft) });
  return polished;
}
```

Routing
Dispatch workflow. Classify the input first — then route to the right specialist.
When to use:
- High-volume, multi-intent inputs (support, search, triage)
- Different intent types need very different handling logic
- Cost optimisation — cheap classifier, expensive specialists only when warranted

PM questions to ask:
- What are the routing categories — and what is the explicit fallback?
- How confident must the classifier be before routing?
- Who owns adding new categories as the intent space grows?

Failure modes:
- Misclassification sends input to the wrong specialist
- Unrecognised intent falls into a poor or absent fallback handler
- Category space grows stale relative to real user behaviour

Composability:
- Each specialist can be a Pipeline or Orchestrator pattern
- Add Reflection to the classifier for confidence scoring
- Route low-confidence cases directly to Human-in-the-Loop
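The confidence question can be made mechanical rather than left to taste. A minimal sketch, assuming the classifier returns a hypothetical confidence score alongside the intent:

```javascript
// Hypothetical confidence gate in front of the router: below the
// threshold, skip the specialists entirely and send to a human queue.
function dispatch({ kind, confidence }, threshold = 0.8) {
  if (confidence < threshold) return "human_review";
  return kind; // route to the matching specialist
}

console.log(dispatch({ kind: "refund", confidence: 0.95 })); // → "refund"
console.log(dispatch({ kind: "billing", confidence: 0.4 })); // → "human_review"
```

The threshold itself is a product decision: lowering it trades human workload for misrouting risk, so it belongs in the spec, not buried in code.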
```javascript
async function route(query) {
  const { kind } = await classify({ query }); // small/cheap LLM
  switch (kind) {
    case "refund": return refundAgent(query);
    case "technical": return techAgent(query);
    case "billing": return billingAgent(query);
    default: return generalAgent(query); // explicit fallback
  }
}
```

Parallelization / Fan-Out
Concurrent workflow. Multiple agents tackle the same task in parallel — a synthesiser merges the results.
When to use:
- Research, competitive analysis, document review
- Tasks that benefit from multiple independent perspectives
- Speed matters and sub-tasks can run in true parallel

PM questions to ask:
- How does the synthesiser handle contradictions between agents?
- Are agents truly independent, or sharing context that biases results?
- What is the latency and cost implication of N parallel inference calls?

Failure modes:
- Conflicting outputs the synthesiser cannot reconcile meaningfully
- Synthesis hallucination — inventing a consensus that wasn't there
- One slow agent blocking the entire merge step

Composability:
- Fan-out workers can each be Single Agent + Tools internally
- Wrap the synthesiser with Reflection for quality validation
- Use Human-in-the-Loop at the selection step instead of auto-synthesis
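The "one slow agent" failure mode has a standard mitigation: a per-worker deadline. A sketch using Promise.race, with stub workers standing in for real LLM calls:

```javascript
// Per-worker timeout: a worker that misses the deadline resolves to null
// and is dropped before synthesis, so one straggler can't block the merge.
function withTimeout(promise, ms) {
  return Promise.race([
    promise,
    new Promise((resolve) => setTimeout(() => resolve(null), ms)),
  ]);
}

async function fanOut(workers, ms = 50) {
  const results = await Promise.all(workers.map((w) => withTimeout(w(), ms)));
  return results.filter((r) => r !== null); // keep only on-time results
}

// Stub workers standing in for parallel inference calls.
const fast = () => Promise.resolve("fast result");
const slow = () => new Promise((resolve) => setTimeout(() => resolve("slow"), 200));

fanOut([fast, slow]).then((r) => console.log(r)); // only the on-time result survives
```

Whether to drop, retry, or flag the late worker is a product call — dropping silently is the failure mode L-shaped teams discover in production.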
```javascript
// Sectioning: write independent chunks in parallel, then assemble
const sections = await Promise.all(
  chapters.map((c) => llm({ prompt: writeChapter(c) }))
);
return assemble(sections);

// Voting / fan-out: generate variants in parallel, a critic picks the best
const variants = await Promise.all(
  hookTypes.map((h) => llm({ prompt: generate(h, signals) }))
);
return critic.pickBest(variants);
```

Orchestrator–Workers
Hierarchical agent. One planner breaks down the task — many specialists execute it. The shape of the subtasks isn't known in advance.
When to use:
- Complex tasks that can be cleanly decomposed at runtime
- Sub-tasks independent enough to delegate in parallel
- High-quality research, analysis, or multi-file coding projects

PM questions to ask:
- How does the orchestrator decide how to decompose the task?
- How does it know a worker finished — and finished correctly?
- What is the aggregation strategy (merge, rank, synthesise)?

Failure modes:
- Poor task decomposition causes workers to drift off-target
- Workers executing without sufficient context from the original task
- Orchestrator overconfident in worker outputs without validation

Composability:
- Workers can internally use Pipeline or Single Agent patterns
- Wrap worker outputs with Reflection before the orchestrator sees them
- Add Human-in-the-Loop before the orchestrator aggregates final output
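The "finished correctly" question can be partially automated: validate each worker result against its task before aggregation. A minimal sketch, assuming a hypothetical result shape (taskId, status) — real acceptance criteria would be task-specific:

```javascript
// Hypothetical completion check: a worker result counts only if it
// exists, matches its task, and reports a "done" status.
function checkResult(task, result) {
  return result != null && result.taskId === task.id && result.status === "done";
}

// Partition tasks into validated and needs-retry before synthesis.
function collect(tasks, results) {
  const ok = [];
  const retry = [];
  tasks.forEach((task, i) => {
    (checkResult(task, results[i]) ? ok : retry).push(task.id);
  });
  return { ok, retry };
}

const tasks = [{ id: "a" }, { id: "b" }];
const results = [
  { taskId: "a", status: "done" },
  { taskId: "b", status: "error" },
];
console.log(collect(tasks, results)); // ok: ["a"], retry: ["b"]
```

This is the cheap guard against the "orchestrator overconfident in worker outputs" failure mode: nothing reaches the synthesiser unvalidated.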
```javascript
async function orchestrate(goal) {
  const plan = await planner({ goal }); // returns Task[]
  const results = await Promise.all(
    plan.tasks.map((t) => workerAgent(t))
  );
  return synthesizer({ goal, results });
}
```

Evaluator–Optimizer Loop ⭐
Reflection / self-critique. Generate → critique → regenerate. Loop until a quality threshold is met. The powerhouse of agentic quality.
Generator: drafts the artifact.
Evaluator: returns a structured review via tool-calling, e.g.:

```json
[
  { "location": "para 2", "comment": "Too generic", "severity": "high" }
]
```

Optimizer: applies the fixes, produces v2. Loop.
When to use:
- High-stakes outputs: writing, code, legal, medical, financial
- Output quality is hard to specify precisely in advance
- Trust-building phases where errors are expensive to correct after the fact

PM questions to ask:
- What is the quality rubric the critic evaluates against — and who wrote it?
- How many iterations before a forced exit to prevent infinite loops?
- Who defines "good enough" — the agent, a threshold, or a human?

Failure modes:
- Infinite loops when no convergence criterion is defined
- Critic missing the real problem while nitpicking surface-level issues
- Over-refinement producing generic, hedged, over-cautious output

Composability:
- Wrap any pattern's final output step with Reflection
- The Critic can be a separate, smaller specialised model
- Combine with HITL: human reviews only outputs below threshold
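"Good enough" can be a severity threshold rather than a vibe. A minimal sketch, assuming each issue the critic returns carries a severity field:

```javascript
// Hypothetical convergence check: stop iterating once no issue at or
// above the chosen severity remains, instead of chasing every nitpick.
const rank = { low: 0, medium: 1, high: 2 };

function converged(issues, maxSeverity = "medium") {
  return issues.every((i) => rank[i.severity] < rank[maxSeverity]);
}

console.log(converged([{ severity: "low" }])); // true
console.log(converged([{ severity: "high" }])); // false
```

A threshold like this also guards against the over-refinement failure mode: once only low-severity nits remain, the loop exits rather than sanding the output into blandness.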
```javascript
async function evaluatorOptimizer(input, maxRounds = 2) {
  let draft = await generator({ input });
  for (let i = 0; i < maxRounds; i++) {
    const review = await evaluator({ input, draft }); // returns { issues: Issue[] }
    if (review.issues.length === 0) break; // converged: nothing left to fix
    draft = await editor({ input, draft, review });
  }
  return draft;
}
```

ReAct (Reason + Act)
Single-agent tool loop. The agent dynamically chooses tools based on intermediate observations. Thought → Action → Observation, repeat.
When to use:
- Agent must choose tools dynamically based on intermediate results
- Open-ended research, debugging, or exploration tasks
- Inputs vary widely enough that you can't predetermine a tool sequence

PM questions to ask:
- Which tools does the agent actually need — and nothing more?
- What is the maxSteps budget and what happens when it's hit?
- How does the agent know when it's done vs. when to keep going?

Failure modes:
- Tool hallucination (inventing results that weren't returned)
- Infinite tool loops or step-budget exhaustion with no answer
- Context window overflow on long-running ReAct chains

Composability:
- Use as a worker inside Orchestrator-Workers for research subtasks
- Wrap the final answer with Reflection for quality-critical outputs
- Pair with Memory-Augmented to persist observations across sessions
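Context overflow on long chains is usually handled by trimming: keep the system prompt and goal, drop the oldest observations. A minimal sketch over a plain message array — real implementations would count tokens, not items:

```javascript
// Hypothetical context trimming for long ReAct chains: preserve the
// first two messages (system prompt + user goal), keep only the most
// recent tool turns, drop the middle.
function trimMessages(messages, maxItems = 6) {
  if (messages.length <= maxItems) return messages;
  const head = messages.slice(0, 2);
  const tail = messages.slice(messages.length - (maxItems - 2));
  return [...head, ...tail];
}

const msgs = ["sys", "goal", "t1", "t2", "t3", "t4", "t5"];
console.log(trimMessages(msgs, 5)); // keeps "sys", "goal" and the 3 newest turns
```

Dropping middle turns loses information by design — pairing with Memory-Augmented (below in this library) is the usual way to persist what gets trimmed.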
```javascript
async function react(goal, maxSteps = 8) {
  const messages = [systemPrompt, { role: "user", content: goal }];
  for (let i = 0; i < maxSteps; i++) {
    const r = await llm({ messages, tools }); // REASON
    if (r.finish) return r.answer;
    const obs = await runTool(r.tool, r.args); // ACT
    messages.push(r.assistant, { role: "tool", content: obs }); // OBSERVE
  }
  return budgetExhausted(messages); // hit maxSteps with no answer: fail loudly
}
```

Human-in-the-Loop
Control / governance. The agent pauses at a defined checkpoint — a human decides — the flow continues. The pattern that buys you trust.
When to use:
- High-risk or irreversible decisions (publishing, spending, deleting)
- Regulated domains: finance, legal, medical, compliance
- Early trust-building phase before moving to full automation

PM questions to ask:
- Exactly where does the flow pause — what trigger condition?
- What information does the human need at that moment to decide well?
- What is the escalation path if the human is unavailable or unresponsive?

Failure modes:
- Checkpoint fatigue — humans rubber-stamping without reading
- Poorly designed review interface — human lacks context to decide
- Bottleneck breaks real-time or low-latency flow requirements

Composability:
- Insert between any two stages in a Pipeline
- Pair with Reflection to reduce the volume reaching human review
- Route low-confidence Router outputs to HITL automatically
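The escalation-path question needs an answer in code, not just in the spec. A minimal sketch, assuming the review call resolves to null on timeout and a hypothetical safe default:

```javascript
// Hypothetical timeout handling for a review checkpoint: an unanswered
// review resolves to a configurable safe default (reject, not approve),
// and the escalation is flagged for audit.
function resolveReview(decision, { defaultAction = "reject" } = {}) {
  if (decision === null) return { action: defaultAction, escalated: true };
  return { ...decision, escalated: false };
}

console.log(resolveReview({ action: "approve" })); // action: "approve", escalated: false
console.log(resolveReview(null)); // action: "reject", escalated: true
```

Defaulting to reject on silence is the conservative choice for irreversible actions; for low-stakes flows the default might instead be "approve with flag". Either way, the default is a policy decision that belongs in the PM spec.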
```javascript
async function withApproval(stage, payload) {
  const draft = await stage(payload);
  const decision = await humanReview({
    draft,
    context: payload, // give the reviewer enough context to decide well
    deadline: minutes(15),
  });
  if (decision.action === "reject") return null;
  if (decision.action === "edit") return decision.edits;
  return draft; // approved as-is
}
```

Memory-Augmented
Persistent state. The agent maintains context and learns across sessions. The pattern that turns one-shot tools into ongoing assistants.
When to use:
- Persistent assistants with ongoing user or account relationships
- Workflows where prior context is essential (projects, campaigns, accounts)
- Systems designed to improve by learning from interaction history

PM questions to ask:
- What gets stored — and with what retention and deletion policy?
- How is memory retrieved: keyword match, semantic search, recency ranking?
- Who can inspect, correct, or permanently delete memories?

Failure modes:
- Memory poisoning — bad data persists and compounds over time
- Irrelevant retrieval polluting the agent's context window
- Privacy and data-governance risks at scale, especially across users

Composability:
- Add to any pattern to give it session persistence
- Combine with HITL for memory write-approval in v1
- Use Reflection to validate what gets stored before it enters memory
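Retention and deletion can be enforced at read time, not just promised in a policy doc. A minimal sketch, assuming hypothetical createdAt and deleted fields on each stored memory:

```javascript
// Hypothetical retention filter: memories expire after a TTL, and
// anything the user flagged for deletion never reaches retrieval.
function retained(memories, now, ttlDays = 90) {
  const ttlMs = ttlDays * 24 * 60 * 60 * 1000;
  return memories.filter((m) => !m.deleted && now - m.createdAt <= ttlMs);
}

const now = Date.now();
const day = 24 * 60 * 60 * 1000;
const memories = [
  { content: "prefers weekly summaries", createdAt: now - 10 * day, deleted: false },
  { content: "old campaign brief", createdAt: now - 200 * day, deleted: false },
  { content: "redacted by user", createdAt: now - 1 * day, deleted: true },
];
console.log(retained(memories, now).length); // 1
```

A read-time filter like this also gives you a grace period: "deleted" memories can be soft-flagged immediately and hard-purged on a schedule, which is how most data-governance policies are actually implemented.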
```javascript
async function withMemory(userId, input) {
  const memories = await memory.search({
    userId, query: input, k: 5, // retrieve the top-5 relevant memories
  });
  const result = await agent({
    input,
    context: memories,
  });
  await memory.write({
    userId, content: extractFacts(result),
  });
  return result;
}
```