AI assisted Software Engineering: Scaffolding Your Way from One Agent to a Team

AI assisted Software Engineering: Scaffolding Your Way from One Agent to a Team
Photo by Vlad Hilitanu / Unsplash

How agent-teams-scaffold turns any repository into a launchpad for Claude Code Agent Teams — and why that's the cleanest jump from Level 6 to Level 7 of AI adoption.


The wall between "an agent" and "a team of agents"

If you've spent time with coding agents, you've probably felt a ceiling. One agent, however capable, is still one context window doing one thing at a time. You hand it a task, it works, you review. That's genuinely useful — but it's a single worker, not a team.

Every's Eight Levels of AI Adoption names this ceiling precisely. The levels run:

  1. Chatbot — submit a task, get a response from a standalone model.
  2. Copilot — AI embedded in your files, collaborating in real time.
  3. Agent — describe a task, the agent executes step-by-step with approvals.
  4. Autopilot — the agent finishes independently, then you review.
  5. Workflows — structured systems with guardrails around agents.
  6. Assistant — an always-on agent that works proactively across one domain.
  7. Multi-agent — several long-running agents in parallel, each with its own role.
  8. (and beyond)

The gap between Level 6 (Assistant) and Level 7 (Multi-agent) is the one most people get stuck at. At Level 6 you supervise one autonomous worker. At Level 7 you stop supervising a worker and start leading a team — multiple specialized agents running simultaneously, each owning a distinct responsibility, coordinating with each other rather than all reporting back to you.

That shift is real work. You need each agent to know the codebase, know its lane, and not collide with its teammates. Setting that up by hand, per repository, every time, is exactly the kind of friction that keeps people parked at Level 6.

agent-teams-scaffold removes that friction.

claude plugin marketplace add TechPreacher/agent-teams-scaffold
claude plugin install agent-teams-scaffold@techpreacher

What it is

agent-teams-scaffold is a Claude Code
skill (packaged as a self-contained plugin) that takes a plain repository and lays down everything needed to run Claude Code Agent Teams — Claude Code's experimental multi-agent mode where one lead session coordinates several teammates, each with its own context window, messaging each other directly instead of merely reporting up to a parent.

Point the skill at a repo and it generates a .claude/ setup tailored to that codebase:

File Purpose
.claude/agents/security-reviewer.md A read-only reviewer subagent, assigned one scope per spawn
.claude/settings.json Merged with the experimental teams flag CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
.claude/TEAM_PROMPTS.md Ready-to-paste prompts that spawn a review swarm
.claude/launch-team.sh A bash/zsh + tmux launcher (executable)
CLAUDE.md Project-context scaffold — the shared ground truth teammates load at startup

The default flavor it sets up is a security review team: one teammate per scope (authentication, input validation, supply chain, secrets), each read-only, each auditing its lane and then cross-challenging the others to converge on a single severity-ranked report. But the pattern generalizes to any read-heavy, parallelize-able job — architecture review, test-gap analysis, dependency-upgrade impact.

The design idea that makes it trustworthy

The most important thing about this tool is a deliberate split between two kinds of work:

  • A deterministic Python generator writes the boilerplate. It auto-detects your stack (build/test/lint commands), writes safe defaults and TODO markers, and is rigorously non-destructive: it merges into an existing settings.json instead of overwriting, and if you already have a root CLAUDE.md it writes a snippet alongside for you to merge rather than clobbering your file. It never needs a model to run, and it never destroys your data.
  • The model supplies the judgment. After the script runs, Claude reads your actual repository and replaces the TODOs with real module boundaries (so teammates don't edit the same files), corrected build/test/lint commands, and stack-specific security focus (an HTTP API → authn middleware and request validation; a published package → the supply-chain scope).

Boilerplate by script, tailoring by model. You get the reproducibility of a generator and the context-awareness of an agent, without either pretending to be the other.

How to use it

Install as a plugin:

claude plugin marketplace add TechPreacher/agent-teams-scaffold
claude plugin install agent-teams-scaffold@techpreacher

Scaffold a repo — just ask Claude:

Use the agent-teams-scaffold skill on ~/code/my-service.

Claude runs the generator, then reads the repo and fills in the real details. (You can also run the generator directly with python3 .../scaffold.py --repo ~/code/my-service --scopes auth,input,supplychain,secrets, which skips the model-tailoring step.)

Launch the team:

~/code/my-service/.claude/launch-team.sh

Then paste a prompt from .claude/TEAM_PROMPTS.md to the lead, and press Shift+Tab to lock the lead into coordination-only (delegate) mode. From there you're leading a team — that's Level 7.

A worked example: reviewing an Express API

Say you have a small Node/Express service, payments-api, and you want a security pass before shipping. Here's the whole loop.

1. Scaffold it. In Claude Code:

Use the agent-teams-scaffold skill on ~/code/payments-api.

The generator detects the stack (JavaScript/TypeScript → npm test, npm run lint) and writes the .claude/ files. Claude then reads the repo and tailors them — it notices this is an HTTP API with JWT auth and a Postgres layer, so it sharpens the reviewer's focus on auth middleware and query construction, and fills CLAUDE.md's module boundaries:

## Module boundaries
- `src/routes/` — HTTP handlers; each file one resource
- `src/auth/` — JWT issuance + verification middleware
- `src/db/` — Postgres query layer
- `src/lib/` — shared helpers (logging, config)

2. Launch the team and spawn the swarm.

~/code/payments-api/.claude/launch-team.sh

Paste the security-swarm prompt from .claude/TEAM_PROMPTS.md to the lead, then Shift+Tab to put the lead in coordination-only mode. It spawns four read-only teammates, one per scope:

auth-reviewer: token issuance/validation, session lifecycle, access-control, privilege escalationinput-reviewer: untrusted input, SQL/command/template injection, unsafe deserialization, path traversalsupplychain-reviewer: lockfile integrity, known-vulnerable deps, post-install scripts, pinningsecrets-reviewer: hardcoded credentials, secrets in history, insecure defaults, sensitive logging

Each teammate works its own lane in its own context window — input-reviewer greps src/db/ for string-built queries while supplychain-reviewer runs npm audit, in parallel, neither blocking the other.

3. Each reviewer reports a severity-ranked table. For example, input-reviewer:

Severity Location (file:line) Issue Recommendation
CRITICAL src/db/users.js:42 SQL built by string interpolation of req.query.email — injectable Use a parameterized query ($1 placeholder)
MEDIUM src/routes/refunds.js:88 amount parsed with parseFloat on raw body, no bounds check Validate type and range before use

…and auth-reviewer:

Severity Location (file:line) Issue Recommendation
HIGH src/auth/verify.js:17 JWT verified with algorithms unset — accepts alg: none Pin algorithms: ['RS256']
LOW src/auth/verify.js:31 Expiry checked but no clock-skew leeway Add small clockTolerance

4. They cross-challenge, then converge. This is the part a single agent can't do. Once all four report, the lead has them critique each other. secrets-reviewer flagged a hardcoded key in src/lib/config.js:5 — but supplychain-reviewer points out it's a test fixture only loaded under NODE_ENV=test, so it's downgraded from HIGH to INFO. A false positive dies before it reaches you.

The team then merges into one deduplicated, severity-ranked report:

Severity Location Issue
CRITICAL src/db/users.js:42 SQL injection via interpolated email
HIGH src/auth/verify.js:17 JWT accepts alg: none
MEDIUM src/routes/refunds.js:88 Unvalidated refund amount
LOW src/auth/verify.js:31 No JWT clock-skew leeway
INFO src/lib/config.js:5 Test-only hardcoded key (not a prod risk)

Four specialists, four lanes, one converged report with the false positive already filtered — in roughly the wall-clock time one reviewer would have taken to do a single scope. That's the Level 7 payoff, and you got there without writing a line of coordination plumbing.

What to expect

  • It's a head start, not a black box. The generated files are meant to be read and edited. The TODO markers are the point: they tell you and the model exactly what still needs human-or-model judgment. Expect to spend a few minutes reviewing what got tailored.
  • It's genuinely non-destructive. Run it twice, run it on a repo that already has a .claude/ setup — it merges and snippets rather than overwriting. Safe to re-run as your repo evolves.
  • Teams cost more. A team runs roughly 3–7× the tokens of a single session. The right workflow is to plan in plan mode first (cheap), then hand the approved plan to the team. For 4+ teammates, give each its own git worktree so they don't collide on files.
  • It's experimental. Agent Teams is off by default and requires Claude Code v2.1.32+. Split-pane UX needs tmux 3.2+ or iTerm2; otherwise teammates run in-process in one terminal, and there's no session resume for in-process teammates — closing the terminal loses them.

Who should use it

  • Engineers stuck at Level 6 who have gotten comfortable with a single autonomous agent and want the next rung without hand-rolling multi-agent plumbing for every repo.
  • Security-minded teams who want a repeatable, read-only review swarm wired into their repos — the default scope set (auth, input, supply chain, secrets) is ready to run.
  • Teams managing many repositories who want a consistent, reproducible Agent Teams setup across all of them instead of bespoke .claude/ directories that drift apart.
  • Anyone exploring parallel-agent workflows who wants a safe, non-destructive starting point they can read, edit, and re-run.

If you've never run a single agent autonomously, start there first — this tool assumes you're ready to graduate, not starting from scratch.

Why this is the 6 → 7 lever

Level 7 isn't just "run more agents." It's a change in role: from supervising one worker's workflow to coordinating several specialists at once. The thing standing between most people and that change isn't capability — the agents are capable — it's setup cost and coordination discipline. Each agent needs to know the repo, stay in its lane, and not stomp on its teammates' files.

agent-teams-scaffold encodes exactly that discipline into files: a shared CLAUDE.md so every teammate starts from the same ground truth, disjoint module boundaries so they don't collide, scoped reviewer roles so each agent owns one responsibility, and paste-ready spawn prompts so launching a team is a thirty-second operation instead of an afternoon of plumbing.

In other words, it doesn't just let you run a team — it bakes in the coordination structure that makes a team work. That's the difference between bouncing off the Level 7 wall and walking through it.


agent-teams-scaffold is open source. The generator is pure-stdlib Python with a unittest suite and CI; the whole thing is a self-referential single-plugin marketplace you can install, read, and fork.