Merge per-task reviews into one task reviewer (iteration 2)

Iteration-1 profiling: implementers and per-dispatch overhead dominate
(429 of 686 subagent turns; controller coordination is half the dollars
and scales with dispatch count), reviewers are individually lean, and
the controller pasted the diff in only 2 of 22 review dispatches when
the guidance was phrased as optional.

Changes: spec-reviewer-prompt.md + code-quality-reviewer-prompt.md
replaced by task-reviewer-prompt.md (one reviewer, one reading of a
pasted diff, two verdicts: spec compliance //⚠️ and task quality);
one fix dispatch can address both kinds of findings; controller now
runs git diff itself and pastes it (imperative, not optional);
implementers run focused tests while iterating and the full suite once
before committing; flowchart, example, Red Flags, tool tables updated.
The broad final whole-branch review is unchanged.
This commit is contained in:
Jesse Vincent
2026-06-09 23:58:28 -07:00
parent 3e3e1e701e
commit e3c74fc1c9
7 changed files with 198 additions and 295 deletions

View File

@@ -5,11 +5,11 @@ description: Use when executing implementation plans with independent tasks in t
# Subagent-Driven Development
Execute plan by dispatching fresh subagent per task, with two-stage review after each: spec compliance review first, then code quality review.
Execute plan by dispatching a fresh implementer subagent per task, a combined task review (spec compliance + code quality, one reviewer, one reading of the diff) after each, and a broad whole-branch review at the end.
**Why subagents:** You delegate tasks to specialized agents with isolated context. By precisely crafting their instructions and context, you ensure they stay focused and succeed at their task. They should never inherit your session's context or history — you construct exactly what they need. This also preserves your own context for coordination work.
**Core principle:** Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration
**Core principle:** Fresh subagent per task + one task review (spec + quality verdicts) + broad final review = high quality, fast iteration
**Continuous execution:** Do not pause to check in with your human partner between tasks. Execute all tasks from the plan without stopping. The only reasons to stop are: BLOCKED status you cannot resolve, ambiguity that genuinely prevents progress, or all tasks complete. "Should I continue?" prompts and progress summaries waste their time — they asked you to execute the plan, so execute it.
@@ -36,7 +36,7 @@ digraph when_to_use {
**vs. Executing Plans (parallel session):**
- Same session (no context switch)
- Fresh subagent per task (no context pollution)
- Two-stage review after each task: spec compliance first, then code quality
- Combined review after each task (spec compliance + code quality verdicts), broad review at the end
- Faster iteration (no human-in-loop between tasks)
## The Process
@@ -51,12 +51,9 @@ digraph process {
"Implementer subagent asks questions?" [shape=diamond];
"Answer questions, provide context" [shape=box];
"Implementer subagent implements, tests, commits, self-reviews" [shape=box];
"Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [shape=box];
"Spec reviewer subagent confirms code matches spec?" [shape=diamond];
"Implementer subagent fixes spec gaps" [shape=box];
"Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [shape=box];
"Code quality reviewer subagent approves?" [shape=diamond];
"Implementer subagent fixes quality issues" [shape=box];
"Run git diff, dispatch task reviewer subagent (./task-reviewer-prompt.md)" [shape=box];
"Task reviewer reports spec ✅ and quality approved?" [shape=diamond];
"Dispatch fix subagent for Critical/Important findings" [shape=box];
"Mark task complete in todo list" [shape=box];
}
@@ -70,15 +67,11 @@ digraph process {
"Implementer subagent asks questions?" -> "Answer questions, provide context" [label="yes"];
"Answer questions, provide context" -> "Dispatch implementer subagent (./implementer-prompt.md)";
"Implementer subagent asks questions?" -> "Implementer subagent implements, tests, commits, self-reviews" [label="no"];
"Implementer subagent implements, tests, commits, self-reviews" -> "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)";
"Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" -> "Spec reviewer subagent confirms code matches spec?";
"Spec reviewer subagent confirms code matches spec?" -> "Implementer subagent fixes spec gaps" [label="no"];
"Implementer subagent fixes spec gaps" -> "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [label="re-review"];
"Spec reviewer subagent confirms code matches spec?" -> "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [label="yes"];
"Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" -> "Code quality reviewer subagent approves?";
"Code quality reviewer subagent approves?" -> "Implementer subagent fixes quality issues" [label="no"];
"Implementer subagent fixes quality issues" -> "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [label="re-review"];
"Code quality reviewer subagent approves?" -> "Mark task complete in todo list" [label="yes"];
"Implementer subagent implements, tests, commits, self-reviews" -> "Run git diff, dispatch task reviewer subagent (./task-reviewer-prompt.md)";
"Run git diff, dispatch task reviewer subagent (./task-reviewer-prompt.md)" -> "Task reviewer reports spec ✅ and quality approved?";
"Task reviewer reports spec ✅ and quality approved?" -> "Dispatch fix subagent for Critical/Important findings" [label="no"];
"Dispatch fix subagent for Critical/Important findings" -> "Run git diff, dispatch task reviewer subagent (./task-reviewer-prompt.md)" [label="re-review"];
"Task reviewer reports spec ✅ and quality approved?" -> "Mark task complete in todo list" [label="yes"];
"Mark task complete in todo list" -> "More tasks remain?";
"More tasks remain?" -> "Dispatch implementer subagent (./implementer-prompt.md)" [label="yes"];
"More tasks remain?" -> "Dispatch final code reviewer subagent (../requesting-code-review/code-reviewer.md)" [label="no"];
@@ -119,7 +112,7 @@ single-file mechanical fixes.
Implementer subagents report one of four statuses. Handle each appropriately:
**DONE:** Proceed to spec compliance review.
**DONE:** Run `git diff BASE..HEAD`, then dispatch the task reviewer.
**DONE_WITH_CONCERNS:** The implementer completed the work but flagged doubts. Read the concerns before proceeding. If the concerns are about correctness or scope, address them before review. If they're observations (e.g., "this file is getting large"), note them and proceed to review.
@@ -133,12 +126,12 @@ Implementer subagents report one of four statuses. Handle each appropriately:
**Never** ignore an escalation or force the same model to retry without changes. If the implementer said it's stuck, something needs to change.
## Handling Spec Reviewer ⚠️ Items
## Handling Reviewer ⚠️ Items
The spec reviewer may report "⚠️ Cannot verify from diff" items — requirements
that live in unchanged code or span tasks. These do not block dispatching the
code quality reviewer, but you must resolve each one yourself before marking
the task complete: you hold the plan and cross-task context the reviewer
The task reviewer may report "⚠️ Cannot verify from diff" items — requirements
that live in unchanged code or span tasks. These do not block the rest of the
review, but you must resolve each one yourself before marking the task
complete: you hold the plan and cross-task context the reviewer
lacks. If you confirm an item is a real gap, treat it as a failed spec
review — send it back to the implementer and re-review.
@@ -160,17 +153,17 @@ final whole-branch review. When you fill a reviewer template:
- Include the spec/design's global constraints that bind the task (version
floors, naming and copy rules, platform requirements) in the requirements
you paste — a reviewer can only enforce what you hand them.
- Paste the task's diff (`git diff BASE..HEAD` output) into the reviewer
prompt when it fits comfortably (up to a few hundred lines). A reviewer
with the diff in hand needs few or no tool calls.
- Run `git diff BASE..HEAD` yourself and paste the output into the reviewer
prompt (`--stat` plus the relevant hunks if it exceeds a few hundred
lines). A reviewer with the diff in hand needs few or no tool calls; do
not make reviewers re-derive the diff.
- Dispatch fix subagents for Critical and Important findings. Record Minor
findings and move on — they roll up to the final whole-branch review.
## Prompt Templates
- [implementer-prompt.md](implementer-prompt.md) - Dispatch implementer subagent
- [spec-reviewer-prompt.md](spec-reviewer-prompt.md) - Dispatch spec compliance reviewer subagent
- [code-quality-reviewer-prompt.md](code-quality-reviewer-prompt.md) - Dispatch code quality reviewer subagent
- [task-reviewer-prompt.md](task-reviewer-prompt.md) - Dispatch task reviewer subagent (spec compliance + code quality, one dispatch)
- Final whole-branch review: use superpowers:requesting-code-review's [code-reviewer.md](../requesting-code-review/code-reviewer.md)
## Example Workflow
@@ -198,11 +191,9 @@ Implementer: "Got it. Implementing now..."
- Self-review: Found I missed --force flag, added it
- Committed
[Dispatch spec compliance reviewer]
Spec reviewer: Spec compliant - all requirements met, nothing extra
[Get git SHAs, dispatch code quality reviewer]
Code reviewer: Strengths: Good test coverage, clean. Issues: None. Task quality: Approved.
[Run git diff, dispatch task reviewer with the diff pasted in]
Task reviewer: Spec - all requirements met, nothing extra.
Strengths: Good test coverage, clean. Issues: None. Task quality: Approved.
[Mark Task 1 complete]
@@ -218,25 +209,17 @@ Implementer:
- Self-review: All good
- Committed
[Dispatch spec compliance reviewer]
Spec reviewer: ❌ Issues:
[Run git diff, dispatch task reviewer with the diff pasted in]
Task reviewer: Spec ❌:
- Missing: Progress reporting (spec says "report every 100 items")
- Extra: Added --json flag (not requested)
Issues (Important): Magic number (100)
[Implementer fixes issues]
Implementer: Removed --json flag, added progress reporting
[Dispatch fix subagent with all findings]
Fixer: Removed --json flag, added progress reporting, extracted PROGRESS_INTERVAL constant
[Spec reviewer reviews again]
Spec reviewer: Spec compliant now
[Dispatch code quality reviewer]
Code reviewer: Strengths: Solid. Issues (Important): Magic number (100)
[Implementer fixes]
Implementer: Extracted PROGRESS_INTERVAL constant
[Code reviewer reviews again]
Code reviewer: ✅ Task quality: Approved
[Task reviewer reviews again]
Task reviewer: Spec ✅. Task quality: Approved.
[Mark Task 2 complete]
@@ -270,13 +253,13 @@ Done!
**Quality gates:**
- Self-review catches issues before handoff
- Two-stage review: spec compliance, then code quality
- Task review carries two verdicts: spec compliance and code quality
- Review loops ensure fixes actually work
- Spec compliance prevents over/under-building
- Code quality ensures implementation is well-built
**Cost:**
- More subagent invocations (implementer + 2 reviewers per task)
- More subagent invocations (implementer + reviewer per task)
- Controller does more prep work (extracting all tasks upfront)
- Review loops add iterations
- But catches issues early (cheaper than debugging later)
@@ -285,20 +268,19 @@ Done!
**Never:**
- Start implementation on main/master branch without explicit user consent
- Skip reviews (spec compliance OR code quality)
- Skip task review, or accept a report missing either verdict (spec compliance AND task quality are both required)
- Proceed with unfixed issues
- Dispatch multiple implementation subagents in parallel (conflicts)
- Make subagent read plan file (provide full text instead)
- Skip scene-setting context (subagent needs to understand where task fits)
- Ignore subagent questions (answer before letting them proceed)
- Accept "close enough" on spec compliance (spec reviewer found issues = not done)
- Accept "close enough" on spec compliance (reviewer found spec issues = not done)
- Skip review loops (reviewer found issues = implementer fixes = review again)
- Let implementer self-review replace actual review (both are needed)
- Tell a reviewer what not to flag, or pre-rate a finding's severity in the
dispatch prompt ("treat it as Minor at most") — the plan's example code is
a starting point, not evidence that its weaknesses were chosen
- **Start code quality review before spec compliance is ✅** (wrong order)
- Move to next task while either review has open issues
- Move to next task while the review has open Critical/Important issues
**If subagent asks questions:**
- Answer clearly and completely