fix(skills): plans reference the spec instead of restating it — end to end (SUP-333 A)

Consolidates the reference-discipline change with every consumer of it, so this PR is independently mergeable (previously split across two stacked PRs whose intermediate state left the SDD spec reviewer blind). writing-plans: plans reference the spec — never restate, paraphrase, or summarize it; spec owns WHAT/WHY, plan owns HOW; cite by path in the header (**Spec:** template line) and by section where a task needs context; No Placeholders repetition stays intra-plan; no-spec branch scoped to conversational-requirements-only (eval-caught: an agent used an unscoped no-spec branch to skip writing the spec entirely). brainstorming: spec path loophole closed (claude shortened docs/superpowers/specs/ to docs/specs/, documented run); an existing differently-named docs dir is not a "user preference". subagent-driven-development: Spec Context section — the controller reads the plan-cited spec and pastes cited sections into implementer and spec-reviewer prompts; the spec reviewer's diff-only rule gets a spec-document exception. Without this, reference discipline starves the pipeline of requirements. executing-plans: Step 1 reads the spec the plan cites (the non-subagent path; plans are no longer self-contained). Eval evidence (quorum, full-stack text): cost-spec-plan-duplication claude 3/3 pass (RED: 5/5 agents failed), codex pass, pi pass (the 683-line duplication RED agent); sdd-spec-context-consumed functional pass with deterministic dispatch-prompt check; writing-plans-no-spec-conversational 2/2 pass; triggering-writing-plans canary 3/3. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
chore(evals): bump submodule for Claude Haiku target
2026-07-04 00:29:04 +08:00 · 2026-06-11 00:20:49 -07:00 · 2026-06-10 16:31:16 -07:00
4 changed files with 7 additions and 24 deletions
--- a/2
+++ b/2
--- a/skills/brainstorming/SKILL.md
+++ b/skills/brainstorming/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: brainstorming
-description: "You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation. The one exception: a request that leaves zero design decisions open needs no design - implement it directly without invoking this skill (e.g. 'add a basic checkbox, nothing fancy' where the context leaves nothing to choose). Decisions ARE open - so invoke - if the change adds a file or dependency, touches a schema, API contract, or persisted data (even when the user stated the outcome), deletes or disables working functionality (even when the deletion is exactly what was asked), touches security posture at all (auth, sessions, permissions, CORS, crypto - even with the exact value stated), alters user-visible behavior beyond the stated change, has more than one plausible reading, or the user frames it as a feature or project to think through."
+description: "You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation."
 ---

 # Brainstorming Ideas Into Designs
@@ -10,22 +10,12 @@ Help turn ideas into fully formed designs and specs through natural collaborativ
 Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design and get user approval.

 <HARD-GATE>
-Do NOT invoke any implementation skill, write any code, scaffold any project, or take any implementation action until you have presented a design and the user has approved it. This applies to EVERY project regardless of perceived simplicity, with exactly one exception.
-
-Exception — nothing to design: when the exception in this skill's description applies (zero open design decisions; its tripwire list puts the gate back on), implement directly. TDD and verification-before-completion still apply. Brainstorming exists to surface decisions; when there are none, the user's request IS the design. Any doubt: the gate holds.
+Do NOT invoke any implementation skill, write any code, scaffold any project, or take any implementation action until you have presented a design and the user has approved it. This applies to EVERY project regardless of perceived simplicity.
 </HARD-GATE>

 ## Anti-Pattern: "This Is Too Simple To Need A Design"

-Anything with open decisions goes through this process. A todo list, a single-function utility, a data migration — "simple" projects are where unexamined assumptions cause the most wasted work. The design can be short (a few sentences for truly simple projects), but if anything remains to decide, you MUST present it and get approval. Do not confuse this with the nothing-to-design exception above: "this seems simple, I'll skip the design" is a rationalization whenever decisions exist.
-
-| Excuse | Reality |
-|--------|---------|
-| "The codebase has an established pattern, so nothing is open" | A pattern answers HOW, not WHETHER or WHAT. Those decisions are still open unless the user made them. |
-| "I can infer the obvious choice" | If there is a choice to infer, a decision is open. Invoke. |
-| "The user said keep it simple / nothing fancy" | A hedge describes the solution's size, not the request's completeness. Check what remains undecided, not the tone. |
-| "Asking would waste the user's time" | One design question costs seconds; an unexamined assumption costs a rewrite. |
-| "The user already made that decision — they told me to delete it" | A requested deletion still has consequences the user may not have weighed (working feature, no usage data, alternatives). Surface them first; the tripwire applies to requested deletions. |
+Every project goes through this process. A todo list, a single-function utility, a config change — all of them. "Simple" projects are where unexamined assumptions cause the most wasted work. The design can be short (a few sentences for truly simple projects), but you MUST present it and get approval.

 ## Checklist

--- a/skills/subagent-driven-development/SKILL.md
+++ b/skills/subagent-driven-development/SKILL.md
@@ -11,8 +11,6 @@ Execute plan by dispatching fresh subagent per task, with two-stage review after

 **Core principle:** Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration

-**Proportionality:** Review fanout scales with the change. When the entire plan is one trivial, fully-specified mechanical change — a log statement, a typo fix, a constant bump with no security or behavioral consequences — implement it directly (or with a single implementer subagent), verify per superpowers:verification-before-completion (run the relevant command, confirm output), commit, and skip all review subagents, including the final reviewer: three review dispatches cost more than a one-line diff. Trivial is a property of the diff — it changes no logic, no control flow, and nothing security-relevant — not of the plan's self-description. Any doubt means not trivial: run the full pipeline. Within a multi-task plan, never skip reviews, regardless of task size.
-
 **Continuous execution:** Do not pause to check in with your human partner between tasks. Execute all tasks from the plan without stopping. The only reasons to stop are: BLOCKED status you cannot resolve, ambiguity that genuinely prevents progress, or all tasks complete. "Should I continue?" prompts and progress summaries waste their time — they asked you to execute the plan, so execute it.

 ## When to Use
@@ -63,16 +61,11 @@ digraph process {
    }

    "Read plan, extract all tasks with full text, note context, create todos" [shape=box];
-    "Entire plan = one trivial, fully-specified mechanical change? (any doubt = no)" [shape=diamond];
-    "Implement directly, verify, commit (no review fanout)" [shape=box];
    "More tasks remain?" [shape=diamond];
    "Dispatch final code reviewer subagent for entire implementation" [shape=box];
    "Use superpowers:finishing-a-development-branch" [shape=box style=filled fillcolor=lightgreen];

-    "Read plan, extract all tasks with full text, note context, create todos" -> "Entire plan = one trivial, fully-specified mechanical change? (any doubt = no)";
-    "Entire plan = one trivial, fully-specified mechanical change? (any doubt = no)" -> "Implement directly, verify, commit (no review fanout)" [label="yes — see Proportionality"];
-    "Implement directly, verify, commit (no review fanout)" -> "Use superpowers:finishing-a-development-branch";
-    "Entire plan = one trivial, fully-specified mechanical change? (any doubt = no)" -> "Dispatch implementer subagent (./implementer-prompt.md)" [label="no"];
+    "Read plan, extract all tasks with full text, note context, create todos" -> "Dispatch implementer subagent (./implementer-prompt.md)";
    "Dispatch implementer subagent (./implementer-prompt.md)" -> "Implementer subagent asks questions?";
    "Implementer subagent asks questions?" -> "Answer questions, provide context" [label="yes"];
    "Answer questions, provide context" -> "Dispatch implementer subagent (./implementer-prompt.md)";
@@ -248,7 +241,7 @@ Done!

 **Never:**
 - Start implementation on main/master branch without explicit user consent
- Skip reviews — sole exception: a plan that is entirely one trivial change (see Proportionality)
+- Skip reviews (spec compliance OR code quality)
 - Proceed with unfixed issues
 - Dispatch multiple implementation subagents in parallel (conflicts)
 - Make subagent read plan file (provide full text instead)
--- a/skills/writing-plans/SKILL.md
+++ b/skills/writing-plans/SKILL.md
@@ -149,7 +149,7 @@ After saving the plan, offer execution choice:

 **If Subagent-Driven chosen:**
 - **REQUIRED SUB-SKILL:** Use superpowers:subagent-driven-development
- Fresh subagent per task + two-stage review (review fanout scales with the change — see that skill's Proportionality rule)
+- Fresh subagent per task + two-stage review

 **If Inline Execution chosen:**
 - **REQUIRED SUB-SKILL:** Use superpowers:executing-plans