Custom AI Agents

Part 5 — Orchestration

When do you actually need multiple agents — and what does a well-structured system look like?

6 min · Updated June 2026

It is tempting to architect everything as a swarm of specialised agents collaborating. Resist it.

Q5.1 — What is the multi-agent reality check?

Anthropic’s own published analysis is blunt: multi-agent systems use roughly 15× more tokens than a single chat, so they only make economic sense when the task’s value is high enough to justify that cost and the work is genuinely parallelisable or too large for one context window.
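To make the economics concrete, here is a minimal back-of-the-envelope sketch. The function name, the token price, and the task values are hypothetical; only the roughly 15× multiplier comes from the text above, and the check covers token cost alone, not the parallelism or context-window conditions.

```python
def multi_agent_cost_check(task_value_usd: float,
                           single_chat_tokens: int,
                           usd_per_million_tokens: float,
                           token_multiplier: float = 15.0) -> tuple[float, bool]:
    """Estimate multi-agent token cost and compare it with the task's value.

    token_multiplier defaults to the rough 15x figure quoted above;
    every other number is an input you must supply yourself.
    """
    multi_agent_tokens = single_chat_tokens * token_multiplier
    cost_usd = multi_agent_tokens / 1_000_000 * usd_per_million_tokens
    return cost_usd, task_value_usd > cost_usd


# Hypothetical inputs: a 20,000-token chat at $10 per million tokens.
cost, worth_it = multi_agent_cost_check(
    task_value_usd=0.50, single_chat_tokens=20_000, usd_per_million_tokens=10.0
)
print(f"estimated cost ${cost:.2f}, worth it: {worth_it}")
```

With these made-up inputs the multi-agent run costs about $3.00 in tokens, so a task worth fifty cents fails the check while one worth fifty dollars passes it.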

The decision tree:

  1. Can a workflow (predefined steps) solve it? Do that. Cheapest, most reliable, most auditable.
  2. If not, can a single agent with good tools and context management solve it? Do that.
  3. Only if the task is genuinely parallel, exceeds a single context window, or spans many complex tool domains should you reach for multi-agent.

Most teams skip straight to step three. That is the mistake.
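The decision tree above can be sketched as a small function. The `Task` fields and the returned labels are illustrative names, not part of any framework, and the final fallback (staying with a single agent when none of the multi-agent conditions hold) is an assumption drawn from the "only if" wording of step three.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical flags; in practice each is a judgment call, not a measurement.
    solvable_by_fixed_steps: bool
    solvable_by_single_agent: bool
    genuinely_parallel: bool = False
    exceeds_context_window: bool = False
    spans_many_tool_domains: bool = False


def choose_architecture(task: Task) -> str:
    # Step 1: a predefined workflow wins whenever it is enough.
    if task.solvable_by_fixed_steps:
        return "workflow"
    # Step 2: next, a single agent with good tools and context management.
    if task.solvable_by_single_agent:
        return "single-agent"
    # Step 3: multi-agent only if at least one of the three conditions holds.
    if (task.genuinely_parallel
            or task.exceeds_context_window
            or task.spans_many_tool_domains):
        return "multi-agent"
    # Assumption: with no condition met, improve the single agent instead.
    return "single-agent"


print(choose_architecture(Task(False, False, genuinely_parallel=True)))
```

The order of the checks is the point: the multi-agent branch is only reachable after the two cheaper options have been ruled out.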

Decision tree for choosing between a workflow, a single agent, or a multi-agent system based on task complexity, parallelism, and context window size

Q5.2 — What are the five multi-agent patterns used in production?

When you do go multi-agent, the field has converged on five recurring shapes:

  • Supervisor (orchestrator-workers) is the 2026 default. One orchestrator agent owns the overall task and full context; it spins up ephemeral, isolated worker sub-agents for sub-tasks, each of which returns a compressed summary. This works because it combines a single point of coherent control with clean context isolation.
  • Pipeline (sequential) is staged refinement, where each agent’s output feeds the next: research → screen → schedule. Predictable and easy to reason about.
  • Fan-out (parallel) runs independent branches simultaneously and then merges them. The hard requirement is that the branches must be genuinely independent; if they need to coordinate mid-flight, this pattern breaks.
  • Debate has two agents argue a question while a third acts as judge. Surprisingly effective for hard, subjective decisions, and cheap to wire up.
  • Swarm uses peer-to-peer agents with shared state and no fixed hierarchy. Powerful but hard to control. Reserve it for back-office work, almost never for a customer-facing journey.

The five multi-agent patterns used in production: supervisor, pipeline, fan-out, debate, and swarm, with their trade-offs and typical use cases
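As a rough illustration of the supervisor shape described above, the sketch below uses only Python's standard `asyncio` module. The `run_worker` and `supervisor` functions are hypothetical stand-ins: a real worker would call a model and tools where this one merely fills a local scratchpad. The point is the structure: each worker keeps its own private context and hands back only a short summary.

```python
import asyncio


async def run_worker(subtask: str) -> str:
    """Ephemeral worker: its scratchpad never leaves this function."""
    scratchpad = [f"step {i} of {subtask}" for i in range(3)]  # isolated context
    await asyncio.sleep(0)  # placeholder for real model and tool calls
    # Only a compressed summary goes back to the orchestrator.
    return f"{subtask}: done in {len(scratchpad)} steps"


async def supervisor(task: str, subtasks: list[str]) -> dict[str, list[str]]:
    """Orchestrator: owns the overall task and collects worker summaries."""
    # gather() runs the workers concurrently and returns results in input order.
    summaries = await asyncio.gather(*(run_worker(s) for s in subtasks))
    return {task: list(summaries)}


report = asyncio.run(supervisor("market report", ["pricing", "competitors"]))
print(report)
```

Because the workers here share nothing and never talk to each other, the same sketch doubles as a fan-out; a supervisor in practice would also decide which subtasks to create and when to stop.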

Q5.3 — What is the planner / generator / evaluator trio?

A particularly useful specialisation of the supervisor pattern for long-running tasks splits the work across three agents: one that plans, one that does the work, and one that judges the result. Separating the doer from the judge measurably reduces the “graded its own homework” failure, where an agent confidently rates its own bad output as good. That matters enormously for subjective outputs like legal drafting or financial commentary.
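The division of labour might look like the following sketch. All three functions are stubs with made-up logic (the evaluator's acceptance rule in particular is arbitrary); what matters is that the generator never scores its own output and that the evaluator's feedback flows back into the next attempt.

```python
from typing import Optional


def planner(task: str) -> list[str]:
    """Decompose the task into steps (stub: a fixed two-step plan)."""
    return [f"{task}: outline", f"{task}: draft"]


def generator(step: str, feedback: Optional[str] = None) -> str:
    """Do the work for one step, folding in evaluator feedback if present."""
    text = f"output for {step}"
    if feedback:
        text += f" (revised: {feedback})"
    return text


def evaluator(output: str) -> tuple[bool, str]:
    """Judge the output. Hypothetical rule: accept only revised output."""
    if "revised" in output:
        return True, ""
    return False, "add supporting detail"


def run_trio(task: str, max_rounds: int = 3) -> list[str]:
    results = []
    for step in planner(task):
        feedback = None
        for _ in range(max_rounds):
            output = generator(step, feedback)
            accepted, feedback = evaluator(output)
            if accepted:
                break
        # If max_rounds runs out, the last attempt is kept as-is.
        results.append(output)
    return results


print(run_trio("memo"))
```

In a real system the three roles would be separate model calls with separate prompts; keeping the evaluator's prompt and context distinct from the generator's is what gives the separation its value.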

The planner / generator / evaluator trio: three specialised agents where the planner decomposes the task, the generator does the work, and the evaluator judges the result

Q5.4 — What within-agent design patterns are actually used?

Independent of how many agents you have, each agent’s internal behaviour draws on a small set of patterns, most of them from the canonical Anthropic taxonomy:

  • Prompt chaining: break a task into sequential steps.
  • Routing: classify the input, then dispatch to the right handler. This is hugely underused; many problems that get built as agents are really routing problems.
  • Parallelisation: split into independent subtasks, or run the same task several times and vote.
  • Evaluator-optimizer: generate, critique with a separate evaluator, refine in a loop. Essential for high-stakes drafting.
  • ReAct: the baseline reason → act → observe loop.
  • Reflection: add a self-critique step; raises accuracy at the cost of latency.

You compose these. A contract-review agent might route by contract type, chain through extraction → analysis → drafting, and run an evaluator-optimizer loop on the final language. None of this is exotic; it is deliberate composition of simple patterns.
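The contract-review composition just described could be wired together roughly as follows. Every function here is a toy stand-in with invented rules (keyword routing, sentence splitting, a trivial critique); none of it reflects a real contract-analysis method. It only shows how routing, chaining, and an evaluator-optimizer loop slot together.

```python
from typing import Optional


def route(contract_text: str) -> str:
    """Routing: classify the input (toy keyword rules)."""
    lowered = contract_text.lower()
    if "confidential" in lowered:
        return "nda"
    if "lease" in lowered:
        return "lease"
    return "general"


# Prompt chaining: extraction -> analysis -> drafting, each feeding the next.
def extract(text: str) -> list[str]:
    return [clause.strip() for clause in text.split(".") if clause.strip()]


def analyse(clauses: list[str], contract_type: str) -> list[str]:
    return [f"{contract_type}: review '{clause}'" for clause in clauses]


def draft(findings: list[str]) -> str:
    return "; ".join(findings)


# Evaluator-optimizer: a separate critique step drives refinement.
def critique(language: str) -> Optional[str]:
    return None if language.endswith(".") else "end with a full stop"


def refine(language: str, note: str) -> str:
    return language + "."  # toy fix for the only note critique() can raise


def review_contract(text: str, max_rounds: int = 2) -> tuple[str, str]:
    contract_type = route(text)
    language = draft(analyse(extract(text), contract_type))
    for _ in range(max_rounds):
        note = critique(language)
        if note is None:
            break
        language = refine(language, note)
    return contract_type, language


print(review_contract("Keep data confidential. Term is two years"))
```

Each piece stays simple enough to test on its own, which is much of the appeal of composing small patterns rather than building one large agent.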

Within-agent design patterns: routing, sequential chaining, parallelisation, evaluator-optimizer loop, ReAct, and reflection