mirror of
https://github.com/obra/superpowers.git
synced 2026-06-13 14:19:05 +08:00
d71eb57d712a16e2ba26f852adc5382054e81cc9
20 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
fa07663322 |
fix(skills): plans reference the spec instead of restating it (SUP-333 #1)
writing-plans told agents to "document everything they need to know" assuming zero context. Every agent in the 2026-06-09 six-agent quorum sweep obeyed and restated the entire spec inline in the plan (cost-spec-plan-duplication failed 5/5 completed agents; pi's plan was 683 lines of duplicated spec).

- writing-plans: state the division of labor: spec owns WHAT/WHY, plan owns HOW; cite the spec by path/section, never restate it. "Zero context" means mechanically executable steps, not duplication. Add a **Spec:** line to the plan header template.
- brainstorming: close the path loophole the re-run exposed: claude shortened docs/superpowers/specs/ to docs/specs/ in 2/2 runs; both path mentions now explicitly forbid the shortening.

TDD evidence (quorum):

- RED: batch-20260609T023452Z-68aa et al: 5/5 agents fail
- GREEN: cost-spec-plan-duplication-claude-20260609T234142Z-9625 pass (plan: "this plan does not restate them" + spec cited by path; both docs in docs/superpowers/)
- Canary: triggering-writing-plans-claude pass (skill still fires)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> |
||
|
|
f2cbfbefeb |
Release v5.1.0 (#1468)
* docs: add Codex App compatibility design spec (PRI-823)
  Design for making using-git-worktrees, finishing-a-development-branch, and subagent-driven-development skills work in the Codex App's sandboxed worktree environment. Read-only environment detection via git-dir vs git-common-dir comparison, ~48 lines across 4 files, zero breaking changes.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: address spec review feedback for PRI-823
  Fix three Important issues from spec review:
  - Clarify Step 1.5 placement relative to existing Steps 2/3
  - Re-derive environment state at cleanup time instead of relying on earlier skill output
  - Acknowledge pre-existing Step 5 cleanup inconsistency

  Also: precise step references, exact codex-tools.md content, clearer Integration section update instructions.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: address team review feedback for PRI-823 spec
  - Add commit SHA + data loss warning to handoff payload (HIGH)
  - Add explicit commit step before handoff (HIGH)
  - Remove misleading "mark as externally managed" from Path B
  - Add executing-plans 1-line edit (was missing)
  - Add branch name derivation rules
  - Add conditional UI language for non-App environments
  - Add sandbox fallback for permission errors
  - Add STOP directive after Step 0 reporting

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: clarify executing-plans in What Does NOT Change section
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add cleanup guard test (#5) and sandbox fallback test (#10) to spec
  Both tests address real risk scenarios:
  - #5: cleanup guard bug would delete Codex App's own worktree (data loss)
  - #10: Local thread sandbox fallback needs manual Codex App validation

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add implementation plan for Codex App compatibility (PRI-823)
  8 tasks covering: environment detection in using-git-worktrees, Step 1.5 + cleanup guard in finishing-a-development-branch, Integration line updates, codex-tools.md docs, automated tests, and final verification.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(codex-tools): add named agent dispatch mapping for Codex (#647)
* fix(writing-skills): correct false 'only two fields' frontmatter claim (#882)
* Replace subagent review loops with lightweight inline self-review
  The subagent review loop (dispatching a fresh agent to review plans/specs) doubled execution time (~25 min overhead) without measurably improving plan quality. Regression testing across 5 versions (v3.6.0 through v5.0.4) with 5 trials each showed identical plan sizes, task counts, and quality scores regardless of whether the review loop ran.
  Changes:
  - writing-plans: Replace subagent Plan Review Loop with inline Self-Review checklist (spec coverage, placeholder scan, type consistency)
  - writing-plans: Add explicit "No Placeholders" section listing plan failures (TBD, vague descriptions, undefined references, "similar to Task N")
  - brainstorming: Replace subagent Spec Review Loop with inline Spec Self-Review (placeholder scan, internal consistency, scope check, ambiguity check)
  - Both skills now use "look at it with fresh eyes" framing

  Testing: 5 trials with the new skill show self-review catches 3-5 real bugs per run (spawn positions, API mismatches, seed bugs, grid indexing) in ~30s instead of ~25 min. Remaining defects are comparable to the subagent approach.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Revert "Replace subagent review loops with lightweight inline self-review"
  This reverts commit |
||
|
|
3f80f1c769 |
Reapply "Replace subagent review loops with lightweight inline self-review"
This reverts commit
|
||
|
|
4ae1a3d6a6 |
Revert "Replace subagent review loops with lightweight inline self-review"
This reverts commit
|
||
|
|
e6221a48c5 |
Replace subagent review loops with lightweight inline self-review
The subagent review loop (dispatching a fresh agent to review plans/specs) doubled execution time (~25 min overhead) without measurably improving plan quality. Regression testing across 5 versions (v3.6.0 through v5.0.4) with 5 trials each showed identical plan sizes, task counts, and quality scores regardless of whether the review loop ran.

Changes:

- writing-plans: Replace subagent Plan Review Loop with inline Self-Review checklist (spec coverage, placeholder scan, type consistency)
- writing-plans: Add explicit "No Placeholders" section listing plan failures (TBD, vague descriptions, undefined references, "similar to Task N")
- brainstorming: Replace subagent Spec Review Loop with inline Spec Self-Review (placeholder scan, internal consistency, scope check, ambiguity check)
- Both skills now use "look at it with fresh eyes" framing

Testing: 5 trials with the new skill show self-review catches 3-5 real bugs per run (spawn positions, API mismatches, seed bugs, grid indexing) in ~30s instead of ~25 min. Remaining defects are comparable to the subagent approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
f34ee479b7 |
fix: Windows brainstorm server lifecycle, restore execution choice
- Skip OWNER_PID monitoring on Windows/MSYS2 where the PID namespace is invisible to Node.js, preventing server self-termination after 60s (#770)
- Document run_in_background: true for Claude Code on Windows (#767)
- Restore user choice between subagent-driven and inline execution after plan writing; subagent-driven is recommended but no longer mandatory
- Add Windows lifecycle test script verified on Windows 11 VM
- Note #723 (stop-server.sh reliability) as already fixed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
2c6a8a352d |
Tone down review loops: single-pass plan review, raise issue bar
- Remove chunk-based plan review in favor of single whole-plan review
- Add Calibration sections to both reviewer prompts so only serious issues block approval
- Reduce max review iterations from 5 to 3
- Streamline reviewer checklists (spec: 7→5, plan: 7→4 categories) |
||
|
|
9ccce3bf07 |
Add context isolation principle to all delegation skills
Subagents should never inherit the parent session's context or history. The dispatcher constructs exactly what each subagent needs, keeping both sides focused: the subagent on its task, the controller on coordination. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
f3083e55b0 | Replace 'For Claude' with 'For agentic workers' in plan headers | ||
|
|
d48b14e5ac |
Add project-level scope assessment to brainstorming pipeline
Brainstorming now assesses whether a project is too large for a single spec and helps decompose into sub-projects. Scope check is inline in the understanding phase (testing showed it was skipped as a separate step). Spec reviewer also checks scope. Writing-plans has a backstop. |
||
|
|
daa3fb2322 |
Add architecture guidance and capability-aware escalation to skills
Add design-for-isolation and working-in-existing-codebases guidance to brainstorming. Add file size awareness and escalation prompts to SDD implementer and code quality reviewer. Writing-plans gets architecture section sizing guidance. Spec and plan reviewers get architecture and file size checks. |
||
|
|
7b99c39c08 |
Add plan review loop and checkbox syntax to writing-plans skill
Plans now include a review loop dispatching plan-document-reviewer subagent. Checkbox syntax (- [ ]) on steps for tracking progress. |
||
|
|
5e51c3ee5a |
feat: enforce subagent-driven-development on capable harnesses
- Subagent-driven-development is now mandatory when harness supports it
- No longer offer choice between subagent-driven and executing-plans
- Executing-plans reserved for harnesses without subagent capability
- Update plan header to reference both execution paths

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> |
||
|
|
f57638a747 |
refactor: restructure specs and plans directories
- Specs (brainstorming output) now go to docs/superpowers/specs/
- Plans (writing-plans output) now go to docs/superpowers/plans/
- User preferences for locations override these defaults
- Update all skill references and test files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> |
||
|
|
19df3db59b | fix(writing-plans): use 4-backtick fence for nested code blocks in Task Structure template | ||
|
|
030a222af1 |
Fix skill descriptions: remove workflow summaries that override flowcharts
Testing revealed that descriptions summarizing workflow cause Claude to follow the description instead of reading the skill body. Changed all descriptions to "when to use" triggers only:

- dispatching-parallel-agents: 2+ independent tasks without shared state
- executing-plans: have a written plan to execute with review checkpoints
- requesting-code-review: completing tasks, features, or before merging
- systematic-debugging: encountering bugs before proposing fixes
- test-driven-development: implementing features before writing code
- writing-plans: have spec/requirements for multi-step task before coding
- writing-skills: updated with "description trap" documentation

The description trap: workflow summaries in descriptions create shortcuts Claude takes, skipping the skill body entirely. |
||
|
|
79436abffa |
Update all superpowers skill references to use namespace prefix
Skills are now namespaced as superpowers:<name> when referenced. Updated all REQUIRED SUB-SKILL, RECOMMENDED SUB-SKILL, and REQUIRED BACKGROUND references to use the superpowers: prefix. Also added -design suffix to brainstorming skill's design document filename to distinguish from implementation plan documents.

Files updated:

- brainstorming: Added -design suffix, updated skill references
- executing-plans: Updated finishing-a-development-branch reference
- subagent-driven-development: Updated finishing-a-development-branch reference
- systematic-debugging: Updated root-cause-tracing and test-driven-development references
- testing-skills-with-subagents: Updated test-driven-development reference
- writing-plans: Updated executing-plans and subagent-driven-development references
- writing-skills: Updated test-driven-development, systematic-debugging, and testing-skills-with-subagents references |
||
|
|
141953a4be |
Improve skill cross-references for clarity and compliance
Update all skill references to use explicit requirement markers:

- REQUIRED BACKGROUND: For prerequisite understanding
- REQUIRED SUB-SKILL: For mandatory workflow dependencies
- Complementary skills: For optional but helpful related skills

Changes:

- Remove old path format (skills/collaboration/X → X)
- Add explicit "REQUIRED" markers to make dependencies clear
- Update Integration sections with categorized skill relationships
- Fix non-existent skill references
- Update cross-reference documentation in writing-skills

This makes it immediately clear which skills MUST be used vs optional references, helping Claude understand and comply with skill dependencies. |
||
|
|
48410c7f19 |
Standardize skill frontmatter names to lowercase and kebab-case
- Update all 20 skill frontmatter names to match their directory names in lowercase
- Fix defense-in-depth name (was Defense-in-Depth-Validation)
- Fix receiving-code-review name (was Code-Review-Reception)
- Update all skill announcements and cross-references to use lowercase names
- Update commands redirects to reference lowercase skill names

Ensures consistent naming: skill directory names, frontmatter names, and documentation references all use lowercase kebab-case format (e.g., brainstorming, test-driven-development) |
||
|
|
9c9547cc04 | Now that skills are a first-class thing in Claude Code, restore them to the primary plugin |
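
The naming rule described in commit 48410c7f19 (a skill's frontmatter name matches its directory name, in lowercase kebab-case) can be sketched as a small check. This is an illustration only, not code from the repository: the function names `is_kebab_case` and `check_skill_name` are hypothetical, and the regular expression is an assumed definition of "lowercase kebab-case".

```python
import re

# Assumed definition of "lowercase kebab-case": lowercase alphanumeric
# words joined by single hyphens. The repository may define it differently.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_kebab_case(name: str) -> bool:
    """Return True if name is lowercase kebab-case under the assumed rule."""
    return KEBAB_CASE.fullmatch(name) is not None


def check_skill_name(directory_name: str, frontmatter_name: str) -> list[str]:
    """Return a list of problems; an empty list means the name is consistent."""
    problems = []
    if not is_kebab_case(frontmatter_name):
        problems.append(f"{frontmatter_name!r} is not lowercase kebab-case")
    if frontmatter_name != directory_name:
        problems.append(
            f"frontmatter name {frontmatter_name!r} does not match "
            f"directory {directory_name!r}"
        )
    return problems
```

Under these assumptions, the old name from the commit message, `Defense-in-Depth-Validation`, fails both checks against the directory `defense-in-depth`, while the corrected name `defense-in-depth` produces no problems.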