From 95795c2e7b4cd6a2ac7e26516300a2aefbdfca39 Mon Sep 17 00:00:00 2001 From: Jesse Vincent Date: Thu, 11 Jun 2026 10:20:24 -0700 Subject: [PATCH] feat(writing-skills): form-selection table + micro-test wording method RED battery (35 opus authoring samples against the current skill) showed authors default to prohibition+rationalization-table for composition- shaping problems (T1: 5/5), where that form measurably backfires (prohibition 4.4 vs 3.6 no-guidance control vs 3.0 recipe restatement errors), and design only full-subagent verification with no wording micro-tests, no mandatory no-guidance control, no manual inspection of automated matches, no variance signal (T7: 5/5). Adds: Match the Form to the Failure (failure-type -> form table, nuance/ exemption rules), scope note on Bulletproofing, Micro-Test Wording subsection, two checklist lines. Deliberately narrow: T3/T4/T5/T6 RED samples showed Iron Law / elicit-first behavior already strong. --- skills/writing-skills/SKILL.md | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/skills/writing-skills/SKILL.md b/skills/writing-skills/SKILL.md index d7393dc7..7dcd5383 100644 --- a/skills/writing-skills/SKILL.md +++ b/skills/writing-skills/SKILL.md @@ -456,10 +456,29 @@ Different skill types need different test approaches: **All of these mean: Test before deploying. No exceptions.** +## Match the Form to the Failure + +Before writing guidance, classify the baseline failure. The form that bulletproofs one failure type measurably backfires on another. + +| Baseline failure | Right form | Wrong form | +|---|---|---| +| Skips/violates a rule under pressure (knows better, does it anyway) | Prohibition + rationalization table + red flags (see Bulletproofing below) | Soft guidance ("prefer...", "consider...") | +| Complies, but output has the wrong shape (bloated prompt, buried verdict, restated spec) | Positive recipe or contract: state what the output IS — its parts, in order | Prohibition list ("don't restate", "never narrate") | +| Omits a required element from something they already produce | Structural: REQUIRED field or slot in the template they fill in | Prose reminders near the template | +| Behavior should depend on a condition | Conditional keyed to an observable predicate ("if the brief exists, reference it") | Unconditional rule + exemption clauses | + +**Why prohibitions backfire on shaping problems:** under a competing incentive ("make the prompt self-contained"), agents negotiate with "don't X" and produce MORE of the unwanted content than with no guidance at all — measured head-to-head, prohibition wording scored worse than the no-guidance control while recipe wording scored best. A recipe leaves nothing to negotiate: the output matches the stated shape or it doesn't. + +**Rules for whichever form you pick:** +- **No nuance clauses.** "Don't X unless it matters" reopens the negotiation and measurably dilutes compliance. Express a real exception as its own conditional on an observable predicate. +- **Exemption clauses don't scope.** "This limit doesn't apply to code blocks" still suppresses code blocks. If part of the output must be exempt, restructure so the rule can't reach it. + ## Bulletproofing Skills Against Rationalization Skills that enforce discipline (like TDD) need to resist rationalization. Agents are smart and will find loopholes when under pressure. +**Scope:** this toolkit is for discipline failures — an agent that knows the rule and skips it under pressure. For wrong-shaped output or omitted elements, prohibition-based bulletproofing backfires; use the forms in Match the Form to the Failure instead. + **Psychology note:** Understanding WHY persuasion techniques work helps you apply them systematically. See persuasion-principles.md for research foundation (Cialdini, 2021; Meincke et al., 2025) on authority, commitment, scarcity, social proof, and unity principles. ### Close Every Loophole Explicitly @@ -553,6 +572,18 @@ Run same scenarios WITH skill. Agent should now comply. Agent found new rationalization? Add explicit counter. Re-test until bulletproof. +### Micro-Test Wording Before Full Scenarios + +Full pressure-scenario runs are the final gate, but they are slow and expensive per iteration. Verify the wording itself first with micro-tests: + +1. **One API call per sample.** System prompt = the realistic context the guidance will live in (the full skill or prompt template, not the guidance in isolation); user message = a task that tempts the failure. +2. **Always include a no-guidance control.** If the control doesn't exhibit the failure, there is nothing to fix — stop, don't author the guidance. +3. **5+ reps per variant.** Single samples lie. +4. **Manually read every flagged match.** Score programmatically if you like, but template echoes and quoted counter-examples masquerade as hits; automated counts alone overstate both failure and success. +5. **Variance is a metric.** When guidance lands, reps converge on the same shape. Five different interpretations across five reps means the wording isn't binding — tighten the form before adding words. + +Micro-tests verify wording; they do not replace pressure scenarios for discipline skills. + **Testing methodology:** See [testing-skills-with-subagents.md](testing-skills-with-subagents.md) for the complete testing methodology: - How to write pressure scenarios - Pressure types (time, sunk cost, authority, exhaustion) @@ -610,6 +641,8 @@ Deploying untested skills = deploying untested code. It's a violation of quality - [ ] Keywords throughout for search (errors, symptoms, tools) - [ ] Clear overview with core principle - [ ] Address specific baseline failures identified in RED +- [ ] Guidance form matches the failure type (see Match the Form to the Failure) +- [ ] Wording micro-tested against a no-guidance control (5+ reps, every flagged match read manually) - [ ] Code inline OR link to separate file - [ ] One excellent example (not multi-language) - [ ] Run scenarios WITH skill - verify agents now comply