Opus 4.8 Runs a Thousand Subagents. Your Tests Are the Bottleneck.

Claude Opus 4.8 dynamic workflows can spawn up to 1,000 subagents in a single session, up to 16 at a time. The model isn't the limit. Your test suite is.

June 9, 2026·5 min read

Anthropic released Opus 4.8 on May 28. The headline feature: dynamic workflows. One orchestrator session, up to 1,000 subagents per run, up to 16 of them running at once, each in its own context window.

The model doesn't write code and check it. It plans the work, spawns a fleet, runs the fleet in parallel, and hands you the final answer.

That's not iteration. That's a different kind of machine.

What Dynamic Workflows Actually Do

The architecture is worth understanding because it's genuinely different from anything that shipped before.

When you trigger a dynamic workflow, Claude doesn't just spawn more agents the way you'd imagine — it writes an orchestration script. JavaScript, running in the background. The plan lives in script variables, not Claude's context window. The noise from hundreds of subagent runs — the dead ends, the search results, the file dumps — none of that surfaces in your session.

What you get back is just the answer.

Each subagent has its own context window, its own tools, its own focus. They run in parallel, up to 16 concurrently and up to 1,000 total per run, and the orchestrator waits for them, collects their outputs, and synthesizes. A job that would take the sum of all subtasks sequentially now takes roughly that sum divided by the concurrency limit, and never less than the longest individual subtask.
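To put rough numbers on that speedup, here is a minimal sketch of the scheduling arithmetic. It assumes a simple first-come queue feeding 16 slots, which is my guess at the behavior, not a documented scheduler. The names `wall_clock` and `lower_bound` are made up for illustration.

```python
import heapq


def wall_clock(durations, max_concurrent=16):
    """Simulate a first-come queue of subtasks served by a fixed number of slots.

    Returns the time at which the last subtask finishes.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    slots = []  # min-heap of finish times for the subtasks currently running
    finish = 0.0
    for duration in durations:
        if len(slots) < max_concurrent:
            start = 0.0  # a slot is still free at time zero
        else:
            start = heapq.heappop(slots)  # wait for the earliest slot to free up
        end = start + duration
        heapq.heappush(slots, end)
        finish = max(finish, end)
    return finish


def lower_bound(durations, max_concurrent=16):
    """No schedule can beat the longest subtask or total work divided by slots."""
    if not durations:
        return 0.0
    return max(max(durations), sum(durations) / max_concurrent)


if __name__ == "__main__":
    one_minute_each = [1.0] * 1000  # hypothetical: 1,000 subtasks, one minute each
    print("sequential:", sum(one_minute_each))
    print("16 slots:  ", wall_clock(one_minute_each))
    print("floor:     ", lower_bound(one_minute_each))
```

With 1,000 equal one-minute subtasks the simulated run finishes in 63 minutes instead of 1,000. Uneven subtasks land somewhere between the floor and that figure, depending on queue order.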

For a codebase migration touching hundreds of thousands of lines, that's not a small speedup. That's a category shift.

The Sentence Anthropic Said Quietly

Anthropic described one use case like this: "Claude Code alongside Opus 4.8 can now carry out codebase-scale migrations across hundreds of thousands of lines of code from kickoff to merge, with the existing test suite as its bar."

Most people skimmed past "with the existing test suite as its bar."

That sentence is carrying the whole weight of the feature.

Dynamic workflows don't verify correctness themselves. They delegate verification to your tests. The orchestrator produces the code, runs your test suite, checks whether it passes, and uses that signal to decide whether the work is done.

No tests? No signal. No signal means the orchestrator has no way to know if a thousand agents did the right thing or the wrong thing. It ships anyway.

The Uncomfortable Take

The bottleneck just moved. It used to be generation: how fast you could get working code out of the AI. That problem is solved. A fleet of up to 1,000 agents solved it.

The bottleneck is now verification.

If you're building with dynamic workflows on a codebase that has 40% test coverage, you're not getting 10x faster. You're getting 10x faster at shipping wrong things. The agents will generate at scale. Your test suite will catch only what it covers. Everything in the gap lands in production.
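A back-of-envelope version of that gap, as a toy model only. It treats coverage as the chance that a bad change gets exercised by a test, which real coverage numbers don't guarantee. The 5% defect rate is a made-up figure and `expected_escapes` is a hypothetical helper.

```python
def expected_escapes(changes, defect_rate, coverage):
    """Expected number of defective changes that no test exercises.

    Toy model: every change is defective with probability defect_rate, and a
    defect is caught only if it falls inside the covered fraction of the code.
    """
    if changes < 0:
        raise ValueError("changes must be non-negative")
    for name, value in (("defect_rate", defect_rate), ("coverage", coverage)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return changes * defect_rate * (1.0 - coverage)


if __name__ == "__main__":
    # Hypothetical inputs: 1,000 changes, 5% of them wrong.
    for coverage in (0.40, 0.80, 0.95):
        print(coverage, round(expected_escapes(1000, 0.05, coverage), 1))
```

Under those assumptions, 40% coverage lets about 30 bad changes through per 1,000, and 95% lets through about 2 or 3. The exact numbers mean little. What matters is that the gap scales with the number of changes.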

I've seen this framing get misread as "go write more tests before you can use Opus 4.8." That's not the point. The point is that test coverage has always been a proxy for how confidently you can change code, and dynamic workflows make that relationship explicit and unavoidable.

Before this, you could ignore low coverage and still ship slowly but somewhat safely because a human was reading the diff. Now the diff is coming from a fleet, and a human is reading the summary. The safety net changed. Your test suite is the only thing that kept pace with the upgrade.

What This Looks Like in Practice

I've been running dynamic workflows on my portfolio project — well-covered, good surface area. What it feels like:

  1. Describe the task. A refactor, a migration, a set of feature adds across components.
  2. The orchestrator writes the script, spawns the subagents.
  3. They work in parallel. 16 concurrently, rotating through the queue.
  4. The test suite runs. Green means done. Red means the orchestrator re-routes.
  5. I see the final diff.
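The fan-out in step 3 can be sketched with a semaphore. This is not the script Claude writes (that one is JavaScript, and I haven't seen its source). It is a Python illustration of the pattern, and `run_fleet`, `fake_subagent`, and the file names are all hypothetical.

```python
import asyncio


async def run_fleet(tasks, worker, max_concurrent=16):
    """Run worker(task) for every task, with at most max_concurrent in flight."""
    gate = asyncio.Semaphore(max_concurrent)

    async def guarded(task):
        async with gate:  # queued tasks wait here until a slot frees up
            return await worker(task)

    # gather returns results in the same order as the input tasks
    return await asyncio.gather(*(guarded(task) for task in tasks))


async def demo(task_count=100):
    """Drive the fleet with a stand-in subagent and record peak concurrency."""
    running = 0
    peak = 0

    async def fake_subagent(name):
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.001)  # placeholder for real work
        running -= 1
        return f"{name}: done"

    names = [f"file-{i}" for i in range(task_count)]
    results = await run_fleet(names, fake_subagent)
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(demo())
    print(len(results), "results, peak concurrency", peak)
```

The orchestrator only ever holds the returned values, not the intermediate noise each worker produced. That is the same reason the session context stays clean.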

The session never fills up with search noise or file reads. Context stays clean. I'm in the manager role, the same one skills and subagents built up, except now the team scales to a thousand.

Fastest run so far: a component library refactor touching 23 files, one session, under 20 minutes. That used to be a half-day with breaks to recover my attention.

What I Changed in My Setup

Two concrete things.

First, I prioritized test coverage for the parts of my codebase I actually want dynamic workflows touching. Not exhaustive coverage everywhere — targeted coverage on the modules I want the orchestrator to be able to move fast in. No tests on a module means I'm not pointing the fleet at it yet.

Second, I added a step before dispatching any large dynamic job: check what's tested. That's just flight recorder thinking applied upfront. Know the safety surface before you start, not after something breaks.
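My pre-dispatch check boils down to something like the sketch below. The 80% threshold is my own arbitrary line, the module names are invented, and `preflight` is a hypothetical helper. It expects a plain mapping of module to coverage percentage, however you produce that from your coverage tool.

```python
def preflight(coverage_by_module, targets, minimum=80.0):
    """Split target modules into those safe to dispatch and those to hold back.

    A module missing from the coverage data counts as untested.
    """
    ready, blocked = [], []
    for module in targets:
        percent = coverage_by_module.get(module, 0.0)
        if percent >= minimum:
            ready.append(module)
        else:
            blocked.append(module)
    return ready, blocked


if __name__ == "__main__":
    # Hypothetical coverage numbers for illustration only.
    coverage = {"ui/buttons": 92.0, "billing": 35.5}
    ready, blocked = preflight(coverage, ["ui/buttons", "billing", "legacy"])
    print("dispatch:", ready)
    print("hold:    ", blocked)
```

Anything in the hold list gets tests first, or stays out of the job description.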

Dynamic workflows are available as a research preview in Claude Code on Enterprise, Team, and Max plans. If you're on Max, it's already there. You just have to ask for it.

# In Claude Code on Opus 4.8, ask Claude to plan and execute at scale:
# "Run a dynamic workflow to migrate all button components to the new design tokens"
# Claude writes the orchestration script. The fleet does the rest.

The orchestrator does the work. But it does the work using your tests as the judge.

The Model Isn't the Limit Anymore

That's the real story of Opus 4.8.

The compute is there. The scale is there. A thousand agents can now work in parallel on your codebase, your docs, your pipeline — whatever you're building. Generation at this scale was the hard problem. It's solved.

The developers who'll use this well aren't waiting for Anthropic to make agents smarter. They're the ones who already made their codebases verifiable.

Write the tests. Point the fleet at them.