fix(skills): brainstorming gate exempts nothing-to-design requests; description exceptions are authoritative (SUP-333 C)

Consolidates the brainstorming exception with its routing-layer semantics, so this PR is independently mergeable (previously split across two stacked PRs whose intermediate state left the always- injected routing text contradicting the shipped description). brainstorming: the nothing-to-design exception, earned by a tripwire scan stated in one line before acting. Tripwires precede the permission (skimmers stop at "implement directly"); security-posture touches re-gate even with the exact value stated; requested deletions re-gate; rationalization table per writing-skills bulletproofing. Description 971/1024 chars, YAML-validated. using-superpowers: description-level exceptions are authoritative (compliance, not rationalization); doubt means invoke; only the description can define one; the skip must state its scan; flowchart routes the exempt path through the scan statement; <EXTREMELY-IMPORTANT> defers in one parenthetical. writing-skills: negative triggering conditions are scope (allowed, required at the description) vs workflow summaries (still forbidden) — prevents a future checklist pass from stripping the exception. Eval evidence (quorum): RED cost-checkbox-over-trigger failed 5/6 agents (pi ⊘); GREEN claude 3/3, codex ✓, antigravity ✓ (kimi unchanged from baseline — does not read description exceptions); gate-still-fires: brainstorming-resists 2/2 + codex, spec-plan brainstorm leg 3/3. Boundary scenarios (security one-liner, requested deletion): pre-stack dev baseline 0/3 + 0/3 (silent edit every time — the blanket gate never fired on one-liners); this text 2/3 + 2/3, the first text in the corpus to catch these at any rate; scenarios ship as regression instruments (proposed in prime-radiant-inc/superpowers-evals#11, open). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Canary-caught addition: on the assembled text, triggering-writing-plans went 0/3 with claude citing "your explicit instruction wins per the priority rules" to skip writing-plans under the scenario's "don't ask me any questions" pressure — the Instruction Priority section read as licensing ad-hoc pressure to skip workflow steps. User Instructions now distinguishes pressure phrasing (changes interaction style) from instructions that name what to skip (honored), and tags the quoted rationalization.
2026-06-12 05:39:05 +08:00 · 2026-06-10 23:48:44 -07:00
parent 0cb1960068
commit 5c3af5f195
3 changed files with 30 additions and 6 deletions
--- a/skills/using-superpowers/SKILL.md
+++ b/skills/using-superpowers/SKILL.md
@@ -12,7 +12,7 @@ If you think there is even a 1% chance a skill might apply to what you are doing

 IF A SKILL APPLIES TO YOUR TASK, YOU DO NOT HAVE A CHOICE. YOU MUST USE IT.

-This is not negotiable. This is not optional. You cannot rationalize your way out of this.
+This is not negotiable. This is not optional. You cannot rationalize your way out of this. (The single carve-out: a skill whose own description says it does not apply — see The Rule.)
 </EXTREMELY-IMPORTANT>

 ## Instruction Priority
@@ -49,6 +49,10 @@ Skills speak in actions ("dispatch a subagent", "create a todo", "read a file")

 **Invoke relevant or requested skills BEFORE any response or action.** Even a 1% chance a skill might apply means that you should invoke the skill to check. If an invoked skill turns out to be wrong for the situation, you don't need to use it.

+**Documented exceptions in a skill's own description are authoritative.** When a description itself says the skill does not apply to a request (e.g. brainstorming's nothing-to-design exception), not invoking it is compliance, not rationalization. Any doubt about whether the exception's conditions hold means invoke. Only the skill's description can define such an exception; you cannot infer one.
+
+**An exception skip must be stated, never silent.** Before your first action, write one line naming the exception and the tripwire scan that came up empty — e.g. "Skipping brainstorming per its exception: no security/deletion/schema/new-file tripwires; outcome fully specified." If you did not write the scan line, you did not scan — invoke the skill instead.
+
 ```dot
 digraph skill_flow {
    "User message received" [shape=doublecircle];
@@ -69,7 +73,12 @@ digraph skill_flow {
    "Invoke brainstorming skill" -> "Might any skill apply?";

    "User message received" -> "Might any skill apply?";
-    "Might any skill apply?" -> "Invoke the skill" [label="yes, even 1%"];
+    "Might any skill apply?" -> "Skill's own description exempts this request?" [label="yes, even 1%"];
+    "Skill's own description exempts this request?" [shape=diamond];
+    "Skill's own description exempts this request?" -> "Invoke the skill" [label="no / any doubt"];
+    "Skill's own description exempts this request?" -> "State the one-line tripwire scan, then proceed" [label="yes, clearly"];
+    "State the one-line tripwire scan, then proceed" [shape=box];
+    "State the one-line tripwire scan, then proceed" -> "Respond (including clarifications)";
    "Might any skill apply?" -> "Respond (including clarifications)" [label="definitely not"];
    "Invoke the skill" -> "Announce: 'Using [skill] to [purpose]'";
    "Announce: 'Using [skill] to [purpose]'" -> "Has checklist?";
@@ -94,6 +103,7 @@ These thoughts mean STOP—you're rationalizing:
 | "I remember this skill" | Skills evolve. Read current version. |
 | "This doesn't count as a task" | Action = task. Check for skills. |
 | "The skill is overkill" | Simple things become complex. Use it. |
+| "Too trivial to scan the tripwire list" | The scan is one sentence. Write it or invoke the skill. |
 | "I'll just do this one thing first" | Check BEFORE doing anything. |
 | "This feels productive" | Undisciplined action wastes time. Skills prevent this. |
 | "I know what that means" | Knowing the concept ≠ using the skill. Invoke it. |
@@ -118,4 +128,6 @@ The skill itself tells you which.

 ## User Instructions

-Instructions say WHAT, not HOW. "Add X" or "Fix Y" doesn't mean skip workflows.
+Instructions say WHAT, not HOW. "Add X" or "Fix Y" doesn't mean skip workflows — unless a skill's own description exempts the request (see The Rule above).
+
+Pressure phrasing — "don't ask questions", "make assumptions", "just build it" — changes how you interact (state assumptions instead of asking), not which skills you invoke. Only an instruction that names what to skip ("don't write a plan", "skip TDD") or a description exception skips a workflow step. "Your instruction wins per the priority rules" applied to an unnamed workflow step is a rationalization.