Compare commits

...

7 Commits

Author SHA1 Message Date
Drew Ritter
60f617489c refine(skills): staff-review round — fix spec-access contradiction, qualify constant bumps
Staff-review findings (4-reviewer panel):
- CONTRADICTION FIX: Spec Context said "Subagents never read the spec
  file themselves" while spec-reviewer-prompt grants exactly that
  access. Now: implementers never read it; the spec reviewer may, at
  the cited path.
- "a constant bump" was an unqualified trivial example — a one-line
  BCRYPT_ROUNDS or session-TTL change is a security-posture change;
  now qualified "with no security or behavioral consequences"
  (matching brainstorming's config-change qualifier). The diff-property
  definition adds "nothing security-relevant".
- Proportionality rewritten 146→~115 words (house style; one statement
  of the multi-task containment instead of two).
- Red Flags Never-line trimmed 33→14 words (pointer to Proportionality
  instead of third in-file restatement).
- Prompt-template rationale tails cut (the controller just read Spec
  Context; subagents need the pasted text, not the policy rationale).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 18:26:11 -07:00
Drew Ritter
a6ce936ac1 harden(skills): SDD proportionality resists over-use; pipeline consumes cited specs
Adversarial + consistency review findings (B1, B2, B3, B5, F1):
- Red Flags line read literally licensed skipping reviews on trivial
  tasks INSIDE multi-task plans; now states the only exception is a
  whole-plan trivial change and never-skip within multi-task plans.
- "a one-line edit" example blessed one-line behavioral changes
  (e.g. adding "|| user.isOwner"); dropped. Trivial is now defined as
  a property of the diff (no logic/control-flow/behavior change), not
  of the plan's self-description. The "nothing for review to catch"
  justification proved too much; replaced with the cost argument.
- "verify it" was undefined on the trivial path; now concrete (run
  tests/command, confirm output, verification-before-completion).
- Flowchart diamond now matches the prose: "fully-specified" + "any
  doubt = no" (the failing agents execute the flowchart literally).
- New Spec Context section + prompt-template updates: the controller
  reads the spec cited in the plan header and pastes cited sections
  into implementer/spec-reviewer prompts; the spec reviewer's
  diff-only rule gets a spec-document exception. Without this, the
  stack's reference-not-restate rule starves the SDD pipeline of
  requirements.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 18:24:57 -07:00
Drew Ritter
3c9870febe fix(skills): SDD review fanout scales with the change (SUP-333 #2)
subagent-driven-development mandated implementer + two-stage review +
final reviewer unconditionally — agy and opencode each dispatched 4
subagents for a one-line console.log in the 2026-06-09 quorum sweep
(cost-trivial-task-review-fanout), and the agents that passed did so
only by disobeying the skill.

- Proportionality rule: when the entire plan is one trivial,
  fully-specified mechanical change, implement directly, verify,
  commit — no review fanout. "When in doubt, it is not trivial."
  Within a multi-task plan the full pipeline still applies to every
  task regardless of size.
- Flowchart gets the trivial-exit diamond (the failing agents follow
  the flowchart literally; prose alone would not redirect them).
- Red Flags "never skip reviews" amended to reference the exception so
  the skill does not contradict itself.

TDD evidence (quorum):
- RED: agy 025324Z + opencode batches — 4 dispatches for 1 line
- GREEN: cost-trivial-task-review-fanout-opencode-20260610T002518Z-f3f5
  pass — 0 dispatches, $0.04, change landed on main checkout
- Canary: sdd-rejects-extra-features-claude-20260610T002901Z-458a pass —
  multi-task plan still runs implementer + two-stage review per task
  (tool-called Agent ✓, spec reviewer as YAGNI gate after each task)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 18:24:57 -07:00
Drew Ritter
81874ec5b1 refine(skills): staff-review round — trim reference rule, close executing-plans spec gap
Staff-review findings (4-reviewer panel):
- Reference paragraph rewritten 170→123 words preserving every
  behavioral condition (paraphrase/summarize coverage, no-skip guard,
  WHAT-WHY/HOW split, No Placeholders boundary, drift counter,
  zero-context rescope); fixes the "(brainstorming did)" syntax.
- **Spec:** header bracket: cut the never-skip sermon duplicated from
  the Overview (same loaded document); the conditional none-branch
  stays.
- executing-plans Step 1 now reads the spec the plan cites — plans are
  no longer self-contained, and the non-subagent execution path was
  never told (the eval only exercised the SDD consumer).
- writing-plans plan-location preference line gets the same
  existing-dir-is-not-a-preference guard as the spec path.
- brainstorming: deduplicate the docs/specs/ prohibition (step 6
  parenthetical stays; After-the-Design bullet was the second
  statement in one file).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 18:24:57 -07:00
Drew Ritter
49a91fd404 harden(skills): no-spec branch cannot be used to skip writing the spec
Eval-caught regression: the no-spec branch added to the **Spec:**
header gave the agent a sanctioned path to skip the spec doc entirely
("avoiding duplication by skipping the spec" —
cost-spec-plan-duplication-claude-20260610T213934Z-8e5b, fail). The
branch is now scoped: if brainstorming happened the spec exists and
must be cited; "none — requirements:" applies only when requirements
arrived conversationally and no spec doc was ever produced. The
reference-discipline paragraph states the same rule up front.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:49:01 -07:00
Drew Ritter
64d194a08e harden(skills): close paraphrase/no-spec/preference loopholes in plan reference rule
Adversarial review findings (C1, C2, C3, C5, A8, F3):
- "never restate" did not cover paraphrase/summary — the actual failure
  mode in the RED evidence; now "never restate, paraphrase, or summarize".
- The No Placeholders intra-plan repetition mandate gave a symmetric
  argument for re-inlining the spec; the rule now draws the line:
  repetition WITHIN the plan is required, copying FROM the spec is not.
- Drift argument was invertible ("snapshot to avoid drift"); now states
  snapshots hide drift.
- **Spec:** header gets a no-spec branch (state requirements once in
  the header, not per task) instead of inviting "no spec, rule is moot".
- Brainstorming path bullet: an existing differently-named docs dir is
  not a "user preference" override.
- Execution Handoff now notes review fanout scales (forward-ref to
  SDD's Proportionality rule) instead of promising unconditional
  two-stage review.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:34:56 -07:00
Drew Ritter
fa07663322 fix(skills): plans reference the spec instead of restating it (SUP-333 #1)
writing-plans told agents to "document everything they need to know"
assuming zero context — every agent in the 2026-06-09 six-agent quorum
sweep obeyed and restated the entire spec inline in the plan
(cost-spec-plan-duplication failed 5/5 completed agents; pi's plan was
683 lines of duplicated spec).

- writing-plans: state the division of labor — spec owns WHAT/WHY,
  plan owns HOW; cite the spec by path/section, never restate it.
  "Zero context" means mechanically executable steps, not duplication.
  Add a **Spec:** line to the plan header template.
- brainstorming: close the path loophole the re-run exposed — claude
  shortened docs/superpowers/specs/ to docs/specs/ in 2/2 runs; both
  path mentions now explicitly forbid the shortening.

TDD evidence (quorum):
- RED: batch-20260609T023452Z-68aa et al — 5/5 agents fail
- GREEN: cost-spec-plan-duplication-claude-20260609T234142Z-9625 pass
  (plan: "this plan does not restate them" + spec cited by path;
  both docs in docs/superpowers/)
- Canary: triggering-writing-plans-claude pass (skill still fires)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-09 16:52:21 -07:00
6 changed files with 27 additions and 10 deletions

View File

@@ -26,7 +26,7 @@ You MUST create a task for each of these items and complete them in order:
3. **Ask clarifying questions** — one at a time, understand purpose/constraints/success criteria
4. **Propose 2-3 approaches** — with trade-offs and your recommendation
5. **Present design** — in sections scaled to their complexity, get user approval after each section
6. **Write design doc** — save to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md` and commit
6. **Write design doc** — save to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md` and commit (exactly this path — not `docs/specs/`)
7. **Spec self-review** — quick inline check for placeholders, contradictions, ambiguity, scope (see below)
8. **User reviews written spec** — ask user to review the spec file before proceeding
9. **Transition to implementation** — invoke writing-plans skill to create implementation plan
@@ -109,7 +109,7 @@ digraph brainstorming {
**Documentation:**
- Write the validated design (spec) to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md`
- (User preferences for spec location override this default)
- (An explicit user instruction overrides this default; an existing differently-named docs directory does not)
- Use elements-of-style:writing-clearly-and-concisely skill if available
- Commit the design document to git

View File

@@ -16,7 +16,7 @@ Load plan, review critically, execute all tasks, report when complete.
## The Process
### Step 1: Load and Review Plan
1. Read plan file
1. Read plan file, and the spec it cites in its `**Spec:**` header (plans reference requirements rather than restating them)
2. Review critically - identify any questions or concerns about the plan
3. If concerns: Raise them with your human partner before starting
4. If no concerns: Create todos for the plan items and proceed

View File

@@ -11,6 +11,8 @@ Execute plan by dispatching fresh subagent per task, with two-stage review after
**Core principle:** Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration
**Proportionality:** Review fanout scales with the change. When the entire plan is one trivial, fully-specified mechanical change — a log statement, a typo fix, a constant bump with no security or behavioral consequences — implement it directly (or with a single implementer subagent), verify per superpowers:verification-before-completion (run the relevant command, confirm output), commit, and skip all review subagents, including the final reviewer: three review dispatches cost more than a one-line diff. Trivial is a property of the diff — it changes no logic, no control flow, and nothing security-relevant — not of the plan's self-description. Any doubt means not trivial: run the full pipeline. Within a multi-task plan, never skip reviews, regardless of task size.
**Continuous execution:** Do not pause to check in with your human partner between tasks. Execute all tasks from the plan without stopping. The only reasons to stop are: BLOCKED status you cannot resolve, ambiguity that genuinely prevents progress, or all tasks complete. "Should I continue?" prompts and progress summaries waste their time — they asked you to execute the plan, so execute it.
## When to Use
@@ -61,11 +63,16 @@ digraph process {
}
"Read plan, extract all tasks with full text, note context, create todos" [shape=box];
"Entire plan = one trivial, fully-specified mechanical change? (any doubt = no)" [shape=diamond];
"Implement directly, verify, commit (no review fanout)" [shape=box];
"More tasks remain?" [shape=diamond];
"Dispatch final code reviewer subagent for entire implementation" [shape=box];
"Use superpowers:finishing-a-development-branch" [shape=box style=filled fillcolor=lightgreen];
"Read plan, extract all tasks with full text, note context, create todos" -> "Dispatch implementer subagent (./implementer-prompt.md)";
"Read plan, extract all tasks with full text, note context, create todos" -> "Entire plan = one trivial, fully-specified mechanical change? (any doubt = no)";
"Entire plan = one trivial, fully-specified mechanical change? (any doubt = no)" -> "Implement directly, verify, commit (no review fanout)" [label="yes — see Proportionality"];
"Implement directly, verify, commit (no review fanout)" -> "Use superpowers:finishing-a-development-branch";
"Entire plan = one trivial, fully-specified mechanical change? (any doubt = no)" -> "Dispatch implementer subagent (./implementer-prompt.md)" [label="no"];
"Dispatch implementer subagent (./implementer-prompt.md)" -> "Implementer subagent asks questions?";
"Implementer subagent asks questions?" -> "Answer questions, provide context" [label="yes"];
"Answer questions, provide context" -> "Dispatch implementer subagent (./implementer-prompt.md)";
@@ -86,6 +93,10 @@ digraph process {
}
```
## Spec Context
If the plan's header cites a spec (`**Spec:** <path>`), read it once during plan extraction. Plans reference requirements rather than restating them — when a task cites a spec section, paste that section's text into the implementer and spec-reviewer prompts along with the task text. Implementer subagents never read the spec file themselves; the spec reviewer may additionally read it at the cited path (its prompt says so).
## Model Selection
Use the least powerful model that can handle each role to conserve cost and increase speed.
@@ -237,7 +248,7 @@ Done!
**Never:**
- Start implementation on main/master branch without explicit user consent
- Skip reviews (spec compliance OR code quality)
- Skip reviews — sole exception: a plan that is entirely one trivial change (see Proportionality)
- Proceed with unfixed issues
- Dispatch multiple implementation subagents in parallel (conflicts)
- Make subagent read plan file (provide full text instead)

View File

@@ -12,6 +12,8 @@ Subagent (general-purpose):
[FULL TEXT of task from plan - paste it here, don't make subagent read file]
[If the task cites spec sections, paste the cited sections' text here too]
## Context
[Scene-setting: where this fits, dependencies, architectural context]

View File

@@ -12,7 +12,7 @@ Subagent (general-purpose):
## What Was Requested
[FULL TEXT of task requirements]
[FULL TEXT of task requirements, including the text of any spec sections the task cites]
## What Implementer Claims They Built
@@ -28,7 +28,7 @@ Subagent (general-purpose):
git diff [BASE_SHA]..[HEAD_SHA]
```
Only read files in this diff. Do not crawl the broader codebase.
Only read files in this diff. Do not crawl the broader codebase. (One exception: if the requirements cite a spec document, you may read that spec at its cited path.)
## Read-Only Review

View File

@@ -7,16 +7,18 @@ description: Use when you have a spec or requirements for a multi-step task, bef
## Overview
Write comprehensive implementation plans assuming the engineer has zero context for our codebase and questionable taste. Document everything they need to know: which files to touch for each task, code, testing, docs they might need to check, how to test it. Give them the whole plan as bite-sized tasks. DRY. YAGNI. TDD. Frequent commits.
Write comprehensive implementation plans assuming the engineer has zero context for our codebase and questionable taste. Document everything they need to execute: which files to touch for each task, code, testing, docs they might need to check, how to test it. Give them the whole plan as bite-sized tasks. DRY. YAGNI. TDD. Frequent commits.
Assume they are a skilled developer, but know almost nothing about our toolset or problem domain. Assume they don't know good test design very well.
**Plans reference the spec; they never restate, paraphrase, or summarize it.** The spec owns the WHAT and WHY — requirements, acceptance criteria, design decisions; the plan owns the HOW — tasks, files, code, commands. Cite it by path in the header and by section where a task needs context. Reference discipline never means skipping the spec: if brainstorming produced one, it exists and the plan cites it. No Placeholders still requires repeating code and commands WITHIN the plan; copying FROM the spec is different: a step that needs a requirement's prose is under-specified — turn it into a concrete action. Snapshotting spec text into the plan hides drift, not prevents it. "Zero context" means each step is mechanically executable, not that the plan repeats the spec.
**Announce at start:** "I'm using the writing-plans skill to create the implementation plan."
**Context:** If working in an isolated worktree, it should have been created via the `superpowers:using-git-worktrees` skill at execution time.
**Save plans to:** `docs/superpowers/plans/YYYY-MM-DD-<feature-name>.md`
- (User preferences for plan location override this default)
- (An explicit user instruction overrides this default; an existing differently-named docs directory does not)
## Scope Check
@@ -53,6 +55,8 @@ This structure informs the task decomposition. Each task should produce self-con
**Goal:** [One sentence describing what this builds]
**Spec:** [Path to the spec doc, e.g. `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md` — requirements and design decisions live there; do not restate them here. Only if no spec doc exists (requirements arrived conversationally; brainstorming never ran): write "none — requirements:" and state them once here, not per task]
**Architecture:** [2-3 sentences about approach]
**Tech Stack:** [Key technologies/libraries]
@@ -145,7 +149,7 @@ After saving the plan, offer execution choice:
**If Subagent-Driven chosen:**
- **REQUIRED SUB-SKILL:** Use superpowers:subagent-driven-development
- Fresh subagent per task + two-stage review
- Fresh subagent per task + two-stage review (review fanout scales with the change — see that skill's Proportionality rule)
**If Inline Execution chosen:**
- **REQUIRED SUB-SKILL:** Use superpowers:executing-plans