Vol. I · Issue 01 · Spring 2026 · Builder live · $49 USD · One-time

Multi-Agent Orchestration That Doesn't Collide

Run 18 AI agents in parallel without duplicate work. A practical guide to multi-agent orchestration using a shared Kanban board, atomic claims, and one human gate.

You can run eighteen AI agents at once and get eighteen agents doing the same three things. That is the default outcome when you spawn a swarm and hope they sort it out. The fix is boring and it works: give them one shared board, make them claim work atomically, and let state live somewhere other than the agents themselves. Do that and the agents stop racing each other. They start finishing things.

This is the part of multi-agent orchestration nobody shows you in the demo. The demo shows five terminal windows scrolling. Production shows a SQLite file that survives a crash and a single Telegram message asking you to approve four proposals. Here is how to build the second one.

Why agents collide in the first place

An agent is stateless between turns. It does not know what its sibling started two seconds ago. If you tell three research agents "find the top complaints about X," all three will fetch the same Reddit thread, summarize it three times, and bill you three times for the privilege.

Message queues feel like the answer. They are not, for this. Queues are great at moving work around and bad at being the durable record of what got done. When an agent dies mid-task (and they do), a queue-based setup loses the thread. You restart and you cannot tell what was finished, what was in flight, and what never started.

The pattern that holds up: a Kanban board backed by SQLite as the single source of truth. Every unit of work is a card. Agents claim cards, work in isolation, and move them to done. The board is the bus and the audit log at the same time. As Tonbi put it after running this in anger: "State lives on the board, not in the agent."

The dispatcher is the only shared brain

Here is the architecture that lets you scale past a couple of agents without chaos.

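A single dispatcher loop polls for ready cards. It claims one atomically, then spawns the assigned agent in its own clean workspace. SQLite has no row-level locks; it serializes writers across the whole database, so a claim written as a conditional update inside a write transaction can succeed for only one caller. Two agents can never grab the same card because the second claim finds the card already taken and changes nothing. That one detail is what makes 18-agent parallelism safe. Everything else is decoration.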
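A minimal sketch of such a claim, using Python's standard `sqlite3` module. The `cards` table, its column names, and the `claim_next` helper are assumptions made for illustration, not the schema of any particular tool. SQLite serializes writers for the whole database file rather than locking individual rows, so the sketch takes the write lock with `BEGIN IMMEDIATE` and guards the update with `status = 'ready'`.

```python
import sqlite3

def open_board(path=":memory:"):
    """Open a hypothetical single-table board (schema is an assumption)."""
    # isolation_level=None puts the connection in autocommit mode,
    # so the explicit BEGIN/COMMIT statements below are the only transactions.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cards ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'ready',"
        " claimed_by TEXT)"
    )
    return conn

def claim_next(conn, agent_id):
    """Claim the oldest ready card; return its id, or None if nothing is ready."""
    conn.execute("BEGIN IMMEDIATE")  # acquire the write lock before reading
    try:
        row = conn.execute(
            "SELECT id FROM cards WHERE status = 'ready' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            conn.execute("COMMIT")
            return None
        # The status guard makes the update a no-op if the card was taken.
        cur = conn.execute(
            "UPDATE cards SET status = 'in_progress', claimed_by = ?"
            " WHERE id = ? AND status = 'ready'",
            (agent_id, row[0]),
        )
        conn.execute("COMMIT")
        return row[0] if cur.rowcount == 1 else None
    except Exception:
        conn.execute("ROLLBACK")
        raise

board = open_board()
board.execute("INSERT INTO cards (title) VALUES ('verify sources')")
print(claim_next(board, "agent-a"))  # 1
print(claim_next(board, "agent-b"))  # None
```

With several dispatcher processes sharing one database file, a second caller blocks or errors at `BEGIN IMMEDIATE` until the first commits, then finds no ready card left to take.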

The flow looks like this:

  • Scout agents run on a schedule and drop candidate cards into the board.
  • The orchestrator scores each candidate and either shelves it or promotes it.
  • Qualifying cards fan out to parallel researchers, each on its own card.
  • A card auto-promotes from to-do to ready only when all its parent cards hit done.

That last point is the quiet hero. Parent-task dependencies resolve without polling loops or glue code. You declare "this card depends on those three," and the engine promotes it the moment the parents finish. No cron checking every 30 seconds whether the research is done yet.
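One way the promotion rule could work, sketched with Python's `sqlite3`. The `cards` and `deps` tables and the `mark_done` helper are hypothetical names chosen for this example. The idea is that finishing a card and promoting its unblocked children happen in the same transaction, so nothing has to poll for completion.

```python
import sqlite3

# Hypothetical schema: deps holds one row per (child card, parent card) edge.
SCHEMA = """
CREATE TABLE cards (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE deps (card_id INTEGER NOT NULL, parent_id INTEGER NOT NULL);
"""

def mark_done(conn, card_id):
    """Mark a card done, then promote every to-do card whose parents are all done."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("UPDATE cards SET status = 'done' WHERE id = ?", (card_id,))
        conn.execute(
            """
            UPDATE cards SET status = 'ready'
            WHERE status = 'todo'
              -- assumption: only cards that declare parents are auto-promoted
              AND id IN (SELECT card_id FROM deps)
              AND NOT EXISTS (
                  SELECT 1 FROM deps
                  JOIN cards AS parent ON parent.id = deps.parent_id
                  WHERE deps.card_id = cards.id AND parent.status != 'done'
              )
            """
        )

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO cards (id, status) VALUES (?, ?)",
    [(1, "in_progress"), (2, "in_progress"), (3, "in_progress"), (4, "todo")],
)
conn.executemany(
    "INSERT INTO deps (card_id, parent_id) VALUES (?, ?)",
    [(4, 1), (4, 2), (4, 3)],
)
for parent in (1, 2, 3):
    mark_done(conn, parent)
    child = conn.execute("SELECT status FROM cards WHERE id = 4").fetchone()[0]
    print(parent, child)  # card 4 stays 'todo' until the third parent finishes
```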

# Orchestrator system prompt: the line that actually matters
You must use kanban-create to dispatch all tasks.
Do not process issues directly.

# Without this line, the orchestrator will "helpfully" do the
# whole job in one context window and skip the board entirely.

That guardrail is not optional. Left alone, a capable orchestrator will try to complete the pipeline itself rather than dispatch it, because that is the path of least resistance in a single turn. You have to force the board.

A real pipeline: from complaint to shipped tool

The most useful version of this I have seen is a pain-point discovery pipeline, and it generalizes to almost any "watch the world, decide, act" job.

Two scout agents run hourly. One queries X through an assigned Grok model, which gives it real-time access to live X data. The other works the web (Reddit, GitHub, forums). Both report candidates to the orchestrator, which does three things people usually skip:

1. Deduplicates. The same complaint shows up in five places. It becomes one card.
2. Scores against a rubric. Frequency, pain intensity, solvability, solution gap, strategic fit. Weighted, summed to 0 to 100.
3. Shelves anything under 65. You never see the low-quality stuff. That threshold is the difference between a useful morning and an inbox full of noise.

Cards that clear the bar fan out to three researchers running at once: a source verifier, a context researcher, and an existing-solutions auditor. The orchestrator waits for all three, then routes the synthesized brief down one of two paths: build it (analyst → builder → tester → deliverable) or explain it (researcher → producer → deck).

Put the rubric in its own file so scoring stays consistent across runs:

# scoring-rubric.md
frequency:       20   # how often does this complaint recur?
pain_intensity:  25   # how badly does it hurt the person?
solvability:     20   # can we actually fix it?
solution_gap:    20   # is there no good existing answer?
strategic_fit:   15   # does it match what we do?
# threshold: 65 (tuned by hand after watching real output)

Reference that file from the orchestrator skill. Now every run scores the same way, and when you want stricter output you change one number in one place.
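To make the arithmetic concrete, here is a rough sketch of how the rubric could be applied in code. It assumes each criterion is rated from 0.0 to 1.0 and multiplied by its weight, which is one plausible reading of "weighted, summed"; the source does not specify the rating scale. The function names are invented for this example.

```python
# Same weights as the rubric file; text after '#' is treated as a comment.
RUBRIC_TEXT = """\
frequency:       20   # how often does this complaint recur?
pain_intensity:  25   # how badly does it hurt the person?
solvability:     20   # can we actually fix it?
solution_gap:    20   # is there no good existing answer?
strategic_fit:   15   # does it match what we do?
"""
THRESHOLD = 65  # cards scoring under this are shelved

def parse_rubric(text):
    """Parse 'name: weight' lines into a dict, ignoring comments and blanks."""
    weights = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, value = line.partition(":")
        weights[name.strip()] = int(value)
    return weights

def score(ratings, weights):
    """Weighted sum; ratings maps each criterion to a value in [0, 1] (assumed scale)."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    return sum(weights[name] * ratings[name] for name in weights)

def decide(ratings, weights, threshold=THRESHOLD):
    """Return ('promote' or 'shelve', total). A score equal to the threshold passes."""
    total = score(ratings, weights)
    return ("promote" if total >= threshold else "shelve", total)

weights = parse_rubric(RUBRIC_TEXT)
print(sum(weights.values()))  # 100
```

Because the weights sum to 100, a candidate rated 1.0 on every criterion scores exactly 100, and the threshold of 65 reads directly as a percentage of the maximum.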

Keep exactly one human gate

The temptation with orchestration is to add approval everywhere. Resist it. The whole point is to compress your involvement to the decisions only you can make.

In the pipeline above there is one gate. After scoring and research, the orchestrator sends a single Telegram message: "Four proposals awaiting your approval," each with the pain point, its sources, why build versus explain, and the proposed solution. You reply approve, shelve, or a tweak. Everything before that message and everything after it runs without you.
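The gate can be small in code as well. The sketch below only builds the message text and interprets a reply; it deliberately leaves out the Telegram delivery itself. The `Proposal` fields mirror the four items listed above, while the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    pain_point: str
    sources: list[str]
    path: str       # "build" or "explain"
    solution: str

def format_gate_message(proposals):
    """Bundle every pending proposal into one message body."""
    noun = "proposal" if len(proposals) == 1 else "proposals"
    lines = [f"{len(proposals)} {noun} awaiting your approval"]
    for number, proposal in enumerate(proposals, start=1):
        lines.append(f"{number}. {proposal.pain_point}")
        lines.append(f"   sources: {', '.join(proposal.sources)}")
        lines.append(f"   path: {proposal.path}")
        lines.append(f"   solution: {proposal.solution}")
    return "\n".join(lines)

def parse_reply(text):
    """Map a reply to (decision, note); anything unrecognized is treated as a tweak."""
    cleaned = text.strip()
    word = cleaned.lower()
    if word == "approve":
        return ("approve", "")
    if word in ("shelve", "shelf"):
        return ("shelve", "")
    return ("tweak", cleaned)
```

Treating any free-text reply as a tweak keeps the gate to a single round trip: the note goes back onto the card and the pipeline continues without a second prompt.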

"There's one human gate. Everything else happens autonomously. I'm not touching any of it." That is the bar. If you find yourself babysitting the agents, your state model is wrong, not your prompt.

Self-healing beats perfect prompting

Agents produce broken output. A producer agent once wrote slides that referenced a temporary workspace path, the kind of thing that vanishes on the next run. The orchestrator caught it, marked the card as recovery-needed, and re-ran delivery to a persistent directory. No human involved.

You get that behavior for free when the board is durable. Because the card carries the state, a watchdog can inspect what an agent produced, decide it is wrong, and re-spawn the work. Dead tasks get reclaimed. Crashed agents get replaced. The board remembers what the agent forgot.
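As an illustration of the reclaim step only, here is a hedged sketch of a watchdog pass. It assumes a card counts as dead once its claim is older than a timeout, which is one simple heuristic; the source does not say how dead tasks are detected. The `claimed_at` column and the `reclaim_stale` helper are invented for this example.

```python
import sqlite3
import time

# Hypothetical schema: claimed_at is a Unix timestamp written at claim time.
CARDS_SCHEMA = """
CREATE TABLE cards (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL,
    claimed_by TEXT,
    claimed_at REAL
);
"""

def reclaim_stale(conn, timeout_s, now=None):
    """Return in-progress cards with claims older than timeout_s to 'ready'.

    Returns the number of cards reclaimed.
    """
    now = time.time() if now is None else now
    with conn:  # single transaction so a crash cannot leave a half-reset card
        cur = conn.execute(
            "UPDATE cards"
            " SET status = 'ready', claimed_by = NULL, claimed_at = NULL"
            " WHERE status = 'in_progress' AND claimed_at < ?",
            (now - timeout_s,),
        )
    return cur.rowcount
```

Run on a schedule, a pass like this puts abandoned cards back in front of the dispatcher, which can hand them to a fresh agent. Checking the quality of finished output, as in the slide example, would be a separate step.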

If you only take one design rule from this: make the coordination layer durable and dumb, and let the agents be smart and disposable. The opposite (smart coordination, precious agents) is how you end up with a system that works in the demo and falls over on Tuesday.

When NOT to orchestrate

Multi-agent is overkill for a lot of jobs. If your task is a straight line of steps with no real parallelism, one agent with good skills and solid memory will beat a swarm and cost less. Reach for orchestration when you have genuine fan-out (many independent units of the same work) or genuine specialization (research, build, test, all needing different tools and prompts).

And do not cheap out on the orchestrator's model. The coordination layer makes the routing decisions; a tiny model fumbles them. Use a strong model for the orchestrator and cheap models for the disposable workers. That split is also better for your token budget than running everything on the expensive tier.

FAQ

What is multi-agent orchestration?

It is the practice of coordinating several AI agents so they work on one goal in parallel without duplicating effort or corrupting each other's output. In practice it means a shared coordination layer (usually a board or database), atomic task claiming, and clear dependencies between tasks.

How do you stop AI agents from doing duplicate work?

Make them claim tasks atomically from a single shared store. SQLite serializes writes, so when each claim is a conditional update in its own transaction, only one agent can claim a given card; any second claim matches no rows. Agents coordinate through card state, never by talking to each other directly.

How many agents can run in parallel?

There is no hard ceiling from the pattern itself. Setups running 18 workers at once are common. The real limits are your compute, your API rate limits, and your budget, not the orchestration model.

Do I need a message queue for multi-agent orchestration?

No, and a durable board is usually better for agent work. A queue moves messages but does not serve as a reliable record of what finished. A SQLite-backed board survives restarts, supports dependency resolution, and doubles as your audit log.

Build it without starting from scratch

You do not need to invent the dispatcher, the scoring rubric, and the approval gate from a blank file. OpenClawCrew starter kits ship the orchestration patterns above as real, editable files (profiles for each specialist role, the Kanban coordination layer, the rubric template, and the one-gate approval flow), set up to run on your own machine on Hermes or OpenClaw. Grab the $49 starter kit to wire it yourself, or book a done-for-you setup if you would rather have your fleet running by the end of the week. Either way you skip the part where eighteen agents fight over the same Reddit thread.

Related reading: AI agent cron automation, AI agent webhooks, connecting your agent to your tools, and least-privilege agent security.