
Agent teams are not sub-agents

Anthropic shipped agent teams in Claude Code last week, alongside Opus 4.6. At first glance they look like sub-agents. They work differently.

Sub-agents are workers. You give them a task, they disappear into their own context window, and come back with a result. If you spin up three sub-agents, each works blind to what the others are doing. They report to you. You synthesize.

[Diagram: with sub-agents, you delegate to A1, A2, and A3, and the agents work in isolation; with agent teams, a lead and a shared task list let A1, A2, and A3 coordinate directly.]

Agent teams talk to each other. They share a task list, and when one agent discovers something that affects another's work, it sends a message directly. The lead coordinates, but the teammates adjust based on what others have found. The shared task list also prevents agents from stepping on each other's work. Each agent sees what the others have claimed, started, and finished before it touches anything.
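The claim-before-touch behavior is easy to picture as a small shared data structure. This is a conceptual sketch in Python, not Claude Code's actual internals; TaskBoard, claim, and finish are invented names for illustration.

```python
import threading
from dataclasses import dataclass, field

@dataclass
class TaskBoard:
    """Shared task list: every agent sees what others have claimed or finished."""
    tasks: dict = field(default_factory=dict)                    # task name -> status
    lock: threading.Lock = field(default_factory=threading.Lock)

    def claim(self, task: str, agent: str) -> bool:
        """An agent claims a task only if no one else owns it yet."""
        with self.lock:
            if self.tasks.get(task, "open") != "open":
                return False          # another agent already claimed it
            self.tasks[task] = f"claimed:{agent}"
            return True

    def finish(self, task: str, agent: str) -> None:
        with self.lock:
            self.tasks[task] = f"done:{agent}"

board = TaskBoard({"map RLS policies": "open", "assessment flow": "open"})
assert board.claim("map RLS policies", "rls-agent")
assert not board.claim("map RLS policies", "assessment-agent")  # already claimed
```

The point of the sketch: because claiming is visible to everyone, no agent starts work another agent has already taken.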

I tested this on Andelo's codebase. Andelo has around 90 database migrations and modules that depend on each other in ways that aren't obvious from the schema alone. I set up a team: one agent on the assessment flow for external valuers, one on improvement tracking, one on row-level security (RLS) across both.

With sub-agents, each would have finished its own analysis and reported back. I'd then reconcile their findings, catching where one agent's assumptions conflicted with another's work.

With teams, the RLS agent flagged that assessment assignments used a different access pattern than the improvement module. It messaged the assessment agent, which adjusted its analysis on the fly. I didn't broker that exchange. The agents did.

On a second run I added a fourth agent as a devil's advocate. Its only job: challenge the other three. The RLS agent had assumed that assessment data should inherit the same row-level policies as improvement data. The devil's advocate caught that external valuers need a different access scope than internal users. The other agents had glossed over the distinction. One pushback, one assumption corrected before it became policy.

The adversarial agent costs extra tokens. What it catches is worth the overhead.

Parallel and sequential

Not every team should run in parallel. Some tasks need a sequential handoff where agent A finishes before agent B starts, because B depends on A's output.

I used this when planning a migration for Andelo's assessment module. First agent: map every schema dependency. Second agent: draft the migration plan from that map. Third agent: check the plan against our RLS policies. Each waited for the previous one to finish. Running them in parallel would have produced a migration plan built on incomplete dependency data.

Parallel works when agents have distinct scopes. Sequential works when each step depends on the last. Picking the right pattern matters more than prompt wording.

[Diagram: in parallel, A1, A2, and A3 all run at once and feed a synthesis step; in sequential, B waits for A and C waits for B, each waiting for the previous.]
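The two patterns, sketched with stand-in coroutines. The agent function here is a placeholder that just records its name, not a real Claude Code agent.

```python
import asyncio

async def agent(name: str, found: list) -> str:
    """Stand-in for an agent: record that it ran, return a result."""
    await asyncio.sleep(0)            # pretend to do work
    found.append(name)
    return f"{name}: done"

async def parallel(found: list) -> list:
    # Distinct scopes: all three run at once, then you synthesize.
    return await asyncio.gather(
        agent("A1", found), agent("A2", found), agent("A3", found)
    )

async def sequential(found: list) -> list:
    # Each step depends on the last: B waits for A, C waits for B.
    results = []
    for name in ("A", "B", "C"):
        results.append(await agent(name, found))
    return results
```

In the sequential version, each await completes before the next agent starts, which is exactly the property the migration-planning pipeline needed.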

Steering mid-run

Once a team starts, you can open any individual agent's session and send it instructions while the others keep running.

I used this when the dependency-mapping agent spent too long on a set of legacy tables already marked for deletion. I told it to skip those and focus on the active schema. It adjusted without disrupting the other agents.

You can also set approval gates: tell the team to pause at a milestone and wait for your sign-off before continuing. Useful when the next step is expensive or hard to reverse.
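The gate logic amounts to: run steps until the milestone, then ask before continuing. A minimal sketch with invented names; run_with_gate is not a Claude Code API.

```python
def run_with_gate(steps: list, gate_after: str, approve) -> list:
    """Run steps in order; after the gated milestone, only continue if approved."""
    done = []
    for step in steps:
        done.append(step)
        if step == gate_after and not approve(step):
            break                     # stop here; the expensive step never runs
    return done

steps = ["map dependencies", "draft migration", "apply migration"]
# Gate before the hard-to-reverse step; a human says no, so nothing is applied:
run_with_gate(steps, gate_after="draft migration", approve=lambda s: False)
# -> ["map dependencies", "draft migration"]
```

In practice approve would be your sign-off, not a lambda; the shape of the control flow is the point.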

Combining sub-agents and teams

The two work well together. I've started using sub-agents for upfront grunt work (analyzing a module's structure, summarizing a schema) and then handing that summary to an agent team for the interdependent analysis.

Sub-agents are cheaper for isolated tasks. Teams coordinate better on overlapping ones. Running a sub-agent first keeps the team's token budget focused on work that actually needs coordination.

Practical limits

Three to five agents is the useful range. Beyond five, coordination overhead grows faster than output quality, and token costs climb. A team run on Andelo consumed 150,000 to 300,000 tokens depending on scope.

Task scoping matters. One large task ("analyze the entire assessment system") produces worse results than three focused deliverables ("map the dependencies," "find the RLS conflicts," "draft the migration"). Smaller tasks let each agent finish and share findings before the next step begins. Other agents can adjust course instead of building on stale assumptions.

You can assign different models to different agents. Haiku for the analysis grunt work, Opus for architectural decisions. Match cost to complexity.
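The mapping is simple to write down. The role names and model labels below are illustrative, not actual Claude Code configuration syntax.

```python
# Hypothetical cost-to-complexity mapping for a three-agent team.
team_models = {
    "dependency-mapper": "claude-haiku",   # grunt work: cheap and fast
    "rls-checker":       "claude-sonnet",  # mid-complexity analysis
    "migration-planner": "claude-opus",    # architectural decisions
}
```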

Setup is one environment variable: CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in your Claude Code settings. It's still a research preview, and the token usage is real. But for codebases where modules are coupled, this is the first multi-agent setup I've used where the AI caught cross-cutting concerns on its own, and when pushed, challenged its own conclusions.
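For reference, the variable named above, exported in a shell (it can also live in your Claude Code settings):

```shell
# Enable agent teams (research preview) in Claude Code
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```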