Commit Graph

479 Commits

Author SHA1 Message Date
Drew Ritter
49bf5ad6dc Align Pi mapping with action vocabulary 2026-05-13 17:58:46 -07:00
Drew Ritter
4bd0973879 Bump evals submodule for Pi backend 2026-05-13 17:58:46 -07:00
Jesse Vincent
452f1ed40b chore: keep pi extension under .pi 2026-05-13 17:58:46 -07:00
Jesse Vincent
cafbc5a4bd feat: add pi superpowers package extension 2026-05-13 17:58:46 -07:00
Jesse Vincent
da35948daf docs: plan pi extension and evals work 2026-05-13 17:58:46 -07:00
Drew Ritter
d4d99117f2 Tighten cross-platform tool references 2026-05-13 17:46:28 -07:00
Jesse Vincent
01034bcf8f Phase E: action-language tool vocabulary
Replace Claude-Code-specific tool names in skill prose, prompt
templates, and OpenCode-facing docs with action-language descriptions
that resolve to each runtime's native tool via the per-platform refs.

Changes by category:

- Prose mentions ("Use TodoWrite to track...", "Use Task tool with
  general-purpose type") → action language ("Track each item as a
  todo", "Dispatch a general-purpose subagent")

- Prompt template headers (6 files): "Task tool (general-purpose):"
  → "Subagent (general-purpose):" — preserves the type information
  without naming Claude Code's specific dispatch tool

- DOT flowchart node labels: "Invoke Skill tool" → "Invoke the
  skill"; "Create TodoWrite todo per item" → "Create a todo per
  item"

- OpenCode INSTALL.md and docs/README.opencode.md: replace the old
  "TodoWrite → todowrite, Task → @mention" mapping (which both
  taught a vocabulary skills no longer use AND was wrong about
  @mention being a real OpenCode syntax) with an action-language
  mapping verified against the installed OpenCode CLI's tool
  inventory.

The platform-tools refs landed in Phase B already document each
runtime's resolution; skills now speak in the actions those refs
map. Tool names that genuinely belong only in the per-platform
dispatch section ("In Claude Code: Use the `Skill` tool") and the
Claude-Code-specific Bash run_in_background flag note in
visual-companion remain — those are intentional carve-outs.
2026-05-13 17:46:28 -07:00
Jesse Vincent
b87a5e4721 Phase D: cross-runtime tweaks (visual-companion, executing-plans, test)
Misc platform/runtime statements and adjacencies that don't fit the
prose, config-ref, README-ordering, or tool-vocabulary buckets:

- visual-companion frame template: rename CSS/HTML id #claude-content
  → #frame-content. The id is purely styling — nothing external
  references it. The brainstorm-server test that asserted the old
  string is updated in lockstep.

- visual-companion launch instructions: add a Copilot CLI section
  alongside Claude Code, Codex, and Gemini CLI; combine the Claude
  Code (macOS / Linux) and (Windows) sections so heading style
  matches the other (non-OS-qualified) platforms.

- visual-companion: "Use Write tool" → "Use your file-creation tool"
  for the cat/heredoc warning. The prohibition is what's load-
  bearing, not the tool name.

- executing-plans/SKILL.md: list all subagent-capable runtimes
  (Claude Code, Codex CLI, Codex App, Copilot CLI, Gemini CLI) and
  point at the per-platform tool refs as the source of truth.

- executing-plans/SKILL.md: relative path "using-superpowers/
  references/" → "../using-superpowers/references/" to resolve
  correctly from the executing-plans/ directory.

No bundled spec doc here — Phase D was scope-extension work that
took place across rounds, with no standalone spec authored.
2026-05-13 17:46:28 -07:00
Jesse Vincent
e47d6f4f85 Phase C: alphabetize README platform listings + spec
Quickstart link list and the per-harness install sub-sections both
reorder to strict alphabetical:

  Claude Code, Codex App, Codex CLI, Cursor, Factory Droid,
  Gemini CLI, GitHub Copilot CLI, OpenCode

Three blocks moved (Codex App swaps with Codex CLI; Cursor moves up
two slots; GitHub Copilot CLI moves up one). Claude Code stays first
by alphabetical chance.

Each install sub-section's content is byte-identical pre/post —
only the positions change. Quickstart anchors verified against the
new heading order.
2026-05-13 17:46:28 -07:00
Jesse Vincent
5c0402736e Phase B: config-file refs + per-platform tool refs + spec
Two structural changes:

1. Generalize CLAUDE.md-specific guidance:
   - "Project-specific conventions (put in CLAUDE.md)" → "(put in
     your instructions file)" in writing-skills/SKILL.md
   - "(explicit CLAUDE.md violation)" → "(explicit instruction-file
     violation)" in receiving-code-review/SKILL.md
   - The instruction-priority list in using-superpowers/SKILL.md
     stays inclusive (CLAUDE.md, GEMINI.md, AGENTS.md) — that's
     load-bearing, not a substitution opportunity.

2. Per-platform tool reference files at skills/using-superpowers/
   references/{claude-code,codex,copilot,gemini}-tools.md. Each ref
   documents:
   - The runtime's preferred instructions file (CLAUDE.md, AGENTS.md,
     GEMINI.md, etc.) and how it loads
   - The runtime's personal-skills directory + cross-runtime
     ~/.agents/skills/ path where applicable
   - Action-language → tool-name mapping table

Tool names and table content reflect the source-verified state from
direct inspection of openai/codex, google-gemini/gemini-cli,
sst/opencode, and the installed @github/copilot package. Filenames
and behaviors are sourced from each runtime's official docs.

Files in this commit also pick up later-phase changes that
accumulated on the same files (using-superpowers/SKILL.md "How to
Access Skills" overhaul, action-language flowchart, refs' final
table content). The bundled spec records original scope.
2026-05-13 17:46:28 -07:00
Jesse Vincent
d0e413b591 Phase A: agent-neutral prose + CSO → SDO + spec
Replace generic third-person "Claude" with "agents" / "your agent"
forms across active skill prose, the README intro, and the vendored
anthropic-best-practices.md reference. Carve-outs preserved:
historical attribution paths, the "Variant C: Claude.AI Emphatic
Style" example label, model identifiers (Haiku/Sonnet/Opus), and the
"In Claude Code:" per-platform skill-dispatch list.

Coined-term rename: "Claude Search Optimization (CSO)" → "Skill
Discovery Optimization (SDO)" in writing-skills/SKILL.md.

Files in this commit also pick up later-phase changes that
accumulated on the same files (dispatching-parallel-agents code-
example transformation, writing-skills numbering and path fixes).
The bundled spec at docs/superpowers/specs/ records the original
scope and the carve-outs.

README.md gets only its prose change here; the alphabetization
lands in Phase C's commit.
2026-05-13 17:46:28 -07:00
Drew Ritter
d25618db58 Move eval harness to submodule (#1541) 2026-05-13 12:25:41 -07:00
Drew Ritter
3d6dc90c6d fix(tdd): link testing anti-patterns reference (#1532)
Fixes #1529.
2026-05-12 17:22:42 -07:00
Drew Ritter
a152bb3932 [codex] replace Circle K signal with generic review guidance (#1531)
* Remove Circle K signal from review skill

* Add generic review hesitation guidance

* Use Jesse wording for review hesitation guidance
2026-05-12 17:22:19 -07:00
Drew Ritter
3dfb376268 fix: remove global worktree path fallback (#1476) 2026-05-12 10:24:45 -07:00
Drew Ritter
491df7360c fix(using-git-worktrees): repair skipped Step 2 numbering (#1522) 2026-05-11 17:50:01 -07:00
fuleinist
9088f563e7 fix: remove stale Cursor plugin refs 2026-05-11 17:04:35 -07:00
Stable Genius
d4cf61b4c8 fix(writing-skills): use markdown link for testing methodology reference 2026-05-11 16:51:00 -07:00
Drew Ritter
7f02ccd91b evals: use pre-commit hooks 2026-05-06 15:47:39 -07:00
Drew Ritter
35e42a16ce evals: add Gemini 2.5 Flash backend 2026-05-06 15:47:39 -07:00
Drew Ritter
58082d04f8 evals: drop drill source marker 2026-05-06 15:47:39 -07:00
Drew Ritter
3dc0ea6876 evals: remove unreleased wave scenarios 2026-05-06 15:47:39 -07:00
Jesse Vincent
0bf37499b4 Address adversarial review findings
- evals/README.md, evals/CLAUDE.md: fix uv install command from
  'uv sync --dev' to 'uv sync --extra dev'. Drill's pyproject.toml
  uses [project.optional-dependencies], so --dev is a no-op for
  pytest/ruff/ty; --extra dev is the correct invocation.
- tests/claude-code/run-skill-tests.sh: drop test-requesting-code-review.sh
  from integration_tests array (file deleted earlier in this branch).
- tests/claude-code/README.md: replace test-requesting-code-review.sh
  section with test-worktree-native-preference.sh (the worktree test
  is kept; the code-review test was lifted into drill).
- docs/testing.md, CLAUDE.md: remove "Copilot CLI" from the harness
  list. evals/backends/ has claude*, codex, gemini configs but no
  copilot.yaml, so the claim was unsupported.

Adversarial review credit: reviewer #2 found four legitimate issues
(uv-sync, run-skill-tests stale ref, README stale ref via #1, and
Copilot CLI fabrication); reviewer #1 found two distinct issues
(run-skill-tests + tests/claude-code/README.md). Reviewer #2 wins
this round.
2026-05-06 15:47:39 -07:00
Jesse Vincent
f7c5312265 docs: introduce evals/ as the canonical skill-behavior eval harness
- docs/testing.md split into Plugin tests + Skill behavior evals.
  Plugin tests section enumerates the bash tests that survive
  (kept by drill-coverage analysis or as describe-skill tests).
- CLAUDE.md adds Eval harness section pointing at evals/.
- README.md Contributing section mentions evals/ alongside tests/.
- .gitignore adds evals/{results,.venv,.env} as belt-and-suspenders
  (evals/.gitignore covers these locally; root-level entries help
  tooling that does not recurse into nested ignore files).
2026-05-06 15:47:39 -07:00
Jesse Vincent
f5175fb31a docs: annotate dated artifacts referencing lifted bash tests
- RELEASE-NOTES.md: note that test-requesting-code-review.sh and
  test-document-review-system.sh were lifted into drill scenarios
  on 2026-05-06; references are preserved as dated artifacts.
- docs/superpowers/plans/2026-03-23-codex-app-compatibility.md:
  note that tests/skill-triggering/ was lifted into drill scenarios
  on 2026-05-06; the run-all.sh reference is a dated artifact.

Subagent second-pass scrub confirmed no other active references in
the tree (excluding evals/ and the spec/plan for this work itself).
2026-05-06 15:47:39 -07:00
Jesse Vincent
45c7dc2cce tests: annotate three kept bash tests with drill coverage notes
- test-worktree-native-preference.sh: drill covers PRESSURE phase only;
  RED + GREEN baselines have no drill counterpart and are kept so
  the RED-GREEN-REFACTOR validation remains rerunnable end-to-end.
- test-subagent-driven-development-integration.sh: drill covers the
  YAGNI subset (forbidden exports + reviewer-as-gate). Bash adds
  >=3 commits, >=2 subagent dispatches, TodoWrite usage, test file
  existence check, and token-budget telemetry. Kept until drill
  scenario covers those or they are retired.
- test-subagent-driven-development.sh: tests agent's ability to
  *describe* SDD (string matches against expected keywords). Drill
  scenarios test behavior, not description-recall. Kept by design.

Subagent verification recorded in commit messages of subsequent
deletions; gap analyses driving these annotations are also in the
verification subagent reports for the gating sweep.
2026-05-06 15:47:39 -07:00
Jesse Vincent
39d29a6c28 tests: remove test-requesting-code-review.sh (covered by drill code-review-catches-planted-bugs)
Subagent verification: every bash assertion (skill invocation,
subagent dispatch, SQL injection flagged, credential handling
flagged, no merge approval) maps to drill verify checks. Drill is
stricter: bundles severity (Critical/Important) into the same
criteria as the finding itself (bash split severity into a separate
test). Setup parity covered (src/db.js with string concat + identity
hash, two commits).

The drill scenario header explicitly says it is the
"cross-harness, semantically-judged replacement for the bash test."
2026-05-06 15:47:39 -07:00
Jesse Vincent
f1d2005de3 tests: remove test-document-review-system.sh (covered by drill spec-reviewer-catches-planted-flaws)
Subagent verification: every bash assertion (TODO in Requirements
section flagged, "specified later" deferral flagged, Issues section
present, did-not-approve verdict) maps to drill verify.criteria
entries. Setup parity covered by setup.assertions (test-feature-design.md
exists with TODO + 'specified later' content). Drill is stricter:
asserts tool-called Agent (subagent dispatch) which the bash test
did not check.
2026-05-06 15:47:39 -07:00
Jesse Vincent
c0a65f1b4d tests: remove subagent-driven-dev fixtures (covered by drill sdd-go-fractals + sdd-svelte-todo)
The bash test had ZERO output assertions — it just ran claude -p
and printed token usage. Drill's scenarios are strictly more
rigorous:

go-fractals: skill-called SDD + tool-called Agent + go test ./...
passes + cmd/fractals/main.go exists + >=4 commits + LLM criteria
verifying real SDD workflow.

svelte-todo: skill-called SDD + tool-called Agent + npm test passes
+ playwright e2e passes + package.json + svelte.config.js or
vite.config.ts + >=4 commits + LLM criteria.

design.md and plan.md are byte-identical between bash fixtures and
drill fixtures (evals/fixtures/sdd-{go-fractals,svelte-todo}/).
Drill's setup helper (scaffold_sdd_*) forces git init -b main
(stricter than bash's reliance on init.defaultBranch). The
.claude/settings.local.json from bash scaffold.sh is unnecessary
for drill since permissions are managed via backend YAML.

Subagent verification: SAFE TO DELETE for both.
2026-05-06 15:47:39 -07:00
Jesse Vincent
f10cddac0d tests: remove run-claude-describes-sdd.sh (covered by drill mid-conversation-skill-invocation)
Subagent verification: every bash assertion (Skill tool invoked +
specific skill name 'subagent-driven-development' loaded after the
agent describes it conversationally in turn 1) maps to the drill
scenario's skill-called assertion + criteria paragraph requiring
the skill to fire in direct response to the second user message.
Drill additionally asserts tool-called Agent (subagent dispatch)
which is stricter than the bash test.

Other runners in tests/explicit-skill-requests/ (haiku, multiturn,
extended-multiturn) and their prompt files are preserved — they
have no drill coverage and exercise different behaviors.
2026-05-06 15:47:39 -07:00
Jesse Vincent
371f41596b tests: remove skill-triggering bash prompts (covered by drill triggering-* scenarios)
Subagent verification confirmed each prompt's intent matches its
corresponding drill scenario's turns[].intent verbatim, and each
scenario has both a deterministic skill-called assertion and a
semantic LLM criterion confirming the matching skill was loaded
(actually a stronger check than the bash test, which only confirms
the skill fires anywhere in the stream).

All 6 prompts deleted. The runner had no remaining prompts to drive,
so run-test.sh and run-all.sh deleted as well.
2026-05-06 15:47:39 -07:00
Jesse Vincent
6f0adebe96 evals: drop SUPERPOWERS_ROOT setup step from README/CLAUDE
The cli.py helper now defaults the env var. Mention as override only.
2026-05-06 15:47:39 -07:00
Jesse Vincent
fd5b53cb85 evals: drop SUPERPOWERS_ROOT from codex/gemini required_env
These backends only read SUPERPOWERS_ROOT via engine.py/setup.py's
os.environ access, which the new cli.py default helper supplies
automatically. claude*.yaml keep SUPERPOWERS_ROOT in required_env
because they interpolate ${SUPERPOWERS_ROOT} into --plugin-dir args.
2026-05-06 15:47:39 -07:00
Jesse Vincent
be0357f98a evals: default SUPERPOWERS_ROOT to parent of evals/ if unset
Adds _set_superpowers_root_default() to drill/cli.py, called at
module import after load_dotenv(). PROJECT_ROOT resolves to evals/
post-lift; its parent is the superpowers repo root, which is the
correct value for SUPERPOWERS_ROOT.

Existing env values are respected as overrides via os.environ.setdefault.

Tests:
- helper sets default when var is unset
- helper does not override when var is already set
2026-05-06 15:47:39 -07:00
Jesse Vincent
3b412a3836 Lift drill into evals/ at 013fcb8b7dbefd6d3fa4653493e5d2ec8e7f985b
rsync of obra/drill@013fcb8b7d into superpowers/evals/, excluding
.git/, .venv/, results/, .env/, __pycache__/, *.egg-info/,
.private-journal/.

The drill repo is unaffected by this commit; archival is a separate
manual step after this PR merges.

Source SHA recorded at evals/.drill-source-sha for divergence
detection.
2026-05-06 15:47:39 -07:00
Jesse Vincent
2e46e9590d Plan: lift drill into superpowers as evals/
15-task implementation plan derived from the design spec at
docs/superpowers/specs/2026-05-06-lift-drill-into-evals-design.md.

Each task is bite-sized (2-5 min steps) with exact commands, exact
file paths, and exact code where required. Subagent verification
gates per the spec are written out as concrete prompt templates.

Self-review:
- Spec coverage: every spec section maps to a task
- Placeholder scan: no TBD/TODO/placeholder/fill-in-later language
- Type consistency: helper named _set_superpowers_root_default
  consistently; drill SHA recorded in evals/.drill-source-sha
  consistently
2026-05-06 15:47:39 -07:00
Jesse Vincent
58f821314d Spec: address adversarial review findings
Two parallel reviewers raised legitimate issues against the lift-drill-
into-evals spec. Updates:

- Coverage map for tests/explicit-skill-requests/ corrected: 6 run-*.sh
  scripts + prompts, not "2 scenarios cover all". Several scripts
  (Haiku, multi-turn, please-use-brainstorming, use-systematic-debugging)
  have no drill counterpart and stay.
- tests/claude-code/test-subagent-driven-development.sh marked as
  meta/documentation test (asks agent to describe SDD); no drill
  scenario covers description tests; defaults to keep.
- Path-defaults section now shows verified evidence: PROJECT_ROOT
  resolves to evals/ post-move; only claude*.yaml substitute
  ${SUPERPOWERS_ROOT} in args (codex/gemini use it via os.environ
  in pre-run hooks); helper invocation order specified (after
  load_dotenv, before click definitions).
- Step 2 copy uses explicit rsync excludes (.git, .venv, results,
  .env, __pycache__, *.egg-info, .private-journal); checksum-level
  verification rather than file-count.
- Drill SHA recorded at copy time in commit message and
  evals/.drill-source-sha for divergence detection.
- evals/tests/ pytest suite added to verification protocol.
- Reference scrub list expanded: RELEASE-NOTES.md,
  docs/superpowers/plans/, .codex-plugin/ (corrected from .codex/),
  lefthook.yml. Excluded dirs called out (node_modules/, .venv/,
  evals/).
- Historical plan docs / RELEASE-NOTES handling: annotate, don't
  rewrite.
- evals/lefthook.yml move documented (drill ships its own;
  contributors run cd evals && lefthook run pre-commit manually).
- PR description checklist includes archival action item for
  obra/drill post-merge.

False finding rejected: svelte-todo fixture is complete on disk
(design.md + plan.md + scaffold.sh present); reviewer #1 #3 dropped.
2026-05-06 15:47:39 -07:00
Jesse Vincent
81472cc9e6 Spec: lift drill into superpowers as evals/
Records scope, branching, architecture, deletion gate, verification
protocol, path/config edits, migration ordering, and post-implementation
verification. Frames CI integration, scenario co-location, and Python
package rename as deferred work.

Per-file deletion of bash tests under superpowers/tests/ is gated by a
subagent that compares each bash assertion to its drill scenario's
verify block. Default keeps the bash test if any assertion is unmatched.

Branching: independent off dev (f/evals-lift), not stacked on
f/cross-platform.
2026-05-06 15:47:39 -07:00
robotsnh
b4363df1b9 docs: turned the dash in "- Jesse" into an escape sequence (#1474)
Replaced the bullet point next to "Jesse" in the sponsorship section of the `README` into a dash. This is needed so the `README` renders properly on markdown viewers.
2026-05-06 11:22:19 -07:00
Jesse Vincent
f2cbfbefeb Release v5.1.0 (#1468)
* docs: add Codex App compatibility design spec (PRI-823)

Design for making using-git-worktrees, finishing-a-development-branch,
and subagent-driven-development skills work in the Codex App's sandboxed
worktree environment. Read-only environment detection via git-dir vs
git-common-dir comparison, ~48 lines across 4 files, zero breaking changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: address spec review feedback for PRI-823

Fix three Important issues from spec review:
- Clarify Step 1.5 placement relative to existing Steps 2/3
- Re-derive environment state at cleanup time instead of relying on
  earlier skill output
- Acknowledge pre-existing Step 5 cleanup inconsistency

Also: precise step references, exact codex-tools.md content, clearer
Integration section update instructions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: address team review feedback for PRI-823 spec

- Add commit SHA + data loss warning to handoff payload (HIGH)
- Add explicit commit step before handoff (HIGH)
- Remove misleading "mark as externally managed" from Path B
- Add executing-plans 1-line edit (was missing)
- Add branch name derivation rules
- Add conditional UI language for non-App environments
- Add sandbox fallback for permission errors
- Add STOP directive after Step 0 reporting

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify executing-plans in What Does NOT Change section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add cleanup guard test (#5) and sandbox fallback test (#10) to spec

Both tests address real risk scenarios:
- #5: cleanup guard bug would delete Codex App's own worktree (data loss)
- #10: Local thread sandbox fallback needs manual Codex App validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add implementation plan for Codex App compatibility (PRI-823)

8 tasks covering: environment detection in using-git-worktrees,
Step 1.5 + cleanup guard in finishing-a-development-branch,
Integration line updates, codex-tools.md docs, automated tests,
and final verification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs(codex-tools): add named agent dispatch mapping for Codex (#647)

* fix(writing-skills): correct false 'only two fields' frontmatter claim (#882)

* Replace subagent review loops with lightweight inline self-review

The subagent review loop (dispatching a fresh agent to review plans/specs)
doubled execution time (~25 min overhead) without measurably improving plan
quality. Regression testing across 5 versions (v3.6.0 through v5.0.4) with
5 trials each showed identical plan sizes, task counts, and quality scores
regardless of whether the review loop ran.

Changes:
- writing-plans: Replace subagent Plan Review Loop with inline Self-Review
  checklist (spec coverage, placeholder scan, type consistency)
- writing-plans: Add explicit "No Placeholders" section listing plan failures
  (TBD, vague descriptions, undefined references, "similar to Task N")
- brainstorming: Replace subagent Spec Review Loop with inline Spec Self-Review
  (placeholder scan, internal consistency, scope check, ambiguity check)
- Both skills now use "look at it with fresh eyes" framing

Testing: 5 trials with the new skill show self-review catches 3-5 real bugs
per run (spawn positions, API mismatches, seed bugs, grid indexing) in ~30s
instead of ~25 min. Remaining defects are comparable to the subagent approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Revert "Replace subagent review loops with lightweight inline self-review"

This reverts commit bf8f7572eb.

* Reapply "Replace subagent review loops with lightweight inline self-review"

This reverts commit b045fa3950.

* Add v5.0.6 release notes

* Move brainstorm server metadata to .meta/ subdirectory

Metadata files (.server-info, .events, .server.pid, .server.log,
.server-stopped) were stored in the same directory served over HTTP,
making them accessible via the /files/ route. They now live in a .meta/
subdirectory that is not web-accessible.

Also fixes a stale test assertion ("Waiting for Claude" → "Waiting for
the agent").

Reported-By: 吉田仁

* Revert "Move brainstorm server metadata to .meta/ subdirectory"

This reverts commit ab500dade6.

* Separate brainstorm server content and state into peer directories

The session directory now contains two peers: content/ (HTML served to
the browser) and state/ (events, server-info, pid, log). Previously
all files shared a single directory, making server state and user
interaction data accessible over the /files/ HTTP route.

Also fixes stale test assertion ("Waiting for Claude" → "Waiting for
the agent").

Reported-By: 吉田仁

* Fix owner-PID false positive when owner runs as different user

ownerAlive() treated EPERM (permission denied) the same as ESRCH
(process not found), causing the server to self-terminate within 60s
whenever the owner process ran as a different user. This affected WSL
(owner is a Windows process), Tailscale SSH, and any cross-user
scenario.

The fix: `return e.code === 'EPERM'` — if we get permission denied,
the process is alive; we just can't signal it.

Tested on Linux via Tailscale SSH with a root-owned grandparent PID:
- Server survives past the 60s lifecycle check (EPERM = alive)
- Server still shuts down when owner genuinely dies (ESRCH = dead)

Fixes #879

* Fix owner-PID lifecycle monitoring for cross-platform reliability

Two bugs caused the brainstorm server to self-terminate within 60s:

1. ownerAlive() treated EPERM (permission denied) as "process dead".
   When the owner PID belongs to a different user (Tailscale SSH,
   system daemons), process.kill(pid, 0) throws EPERM — but the
   process IS alive. Fixed: return e.code === 'EPERM'.

2. On WSL, the grandparent PID resolves to a short-lived subprocess
   that exits before the first 60s lifecycle check. The PID is
   genuinely dead (ESRCH), so the EPERM fix alone doesn't help.
   Fixed: validate the owner PID at server startup — if it's already
   dead, it was a bad resolution, so disable monitoring and rely on
   the 30-minute idle timeout.

This also removes the Windows/MSYS2-specific OWNER_PID="" carve-out
from start-server.sh, since the server now handles invalid PIDs
generically at startup regardless of platform.

Tested on Linux (magic-kingdom) via Tailscale SSH:
- Root-owned owner PID (EPERM): server survives ✓
- Dead owner PID at startup (WSL sim): monitoring disabled, survives ✓
- Valid owner that dies: server shuts down within 60s ✓

Fixes #879

* Release v5.0.6: inline self-review, brainstorm server restructure, owner-PID fixes

* fix: add Copilot CLI platform detection for sessionStart context injection

Copilot CLI v1.0.11 reads `additionalContext` from sessionStart hook
output, but the session-start script only emits the Claude Code-specific
nested format. Add COPILOT_CLI env var detection so Copilot CLI gets the
SDK-standard top-level `additionalContext` while Claude Code continues
getting `hookSpecificOutput`.

Based on PR #910 by @culinablaz.

* feat: add Copilot CLI tool mapping, docs, and install instructions

- Add references/copilot-tools.md with full tool equivalence table
- Add Copilot CLI to using-superpowers skill platform instructions
- Add marketplace install instructions to README
- Add changelog entry crediting @culinablaz for the hook fix

* fix(opencode): align skills path across bootstrap, runtime, and tests

The bootstrap text advertised a configDir-based skills path that didn't
match the runtime path (resolved relative to the plugin file). Tests
used yet another hardcoded path and referenced a nonexistent lib/ dir.

- Remove misleading skills path from bootstrap text; the agent should
  use the native skill tool, not read files by path
- Fix test setup to create a consistent layout matching the plugin's
  ../../skills resolution
- Export SUPERPOWERS_SKILLS_DIR from setup.sh so tests use a single
  source of truth
- Add regression test that bootstrap doesn't advertise the old path
- Remove broken cp of nonexistent lib/ directory

Fixes #847

* docs: add OpenCode path fix to release notes

* fix(opencode): inject bootstrap as user message instead of system message

Move bootstrap injection from experimental.chat.system.transform to
experimental.chat.messages.transform, prepending to the first user
message instead of adding a system message.

This avoids two issues:
- System messages repeated every turn inflate token usage (#750)
- Multiple system messages break Qwen and other models (#894)

Tested on OpenCode 1.3.2 with Claude Sonnet 4.5 — brainstorming skill
fires correctly on "Let's make a React to do list" prompt.

* docs: update release notes with OpenCode bootstrap change

* docs: add worktree rototill design spec (PRI-974)

Design for detect-and-defer worktree support. Superpowers defers to
native harness worktree systems when available, falls back to manual
git worktree creation when not. Covers Phases 0-2: detection, consent,
native tool preference, finishing state detection, and three bug fixes
(#940, #999, #238).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: address SWE review feedback on worktree rototill spec

- Fix Bug #999 order: merge → verify → remove worktree → delete branch
  (avoids losing work if merge fails after worktree removal)
- Add submodule guard to Step 0 detection (GIT_DIR != GIT_COMMON is also
  true in submodules)
- Preserve global path (~/.config/superpowers/worktrees/) in detection for
  backward compatibility, just stop offering it to new users
- Add step numbering note and implementation notes section
- Expand provenance heuristic to cover global path and manual creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: honest spec revisions after issue/PR deep dive

- Step 1a is the load-bearing assumption, not just a risk — if it fails,
  the entire design needs rework. TDD validation must be first impl task.
- #1009 resolution depends on Step 1a working, stated explicitly
- #574 honestly deferred, not "partially addressed"
- Add hooks symlink to Step 1b (PR #965 idea, prevents silent hook loss)
- Add stale worktree pruning to Step 5 (PR #1072 idea, one-line self-heal)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add worktree rototill implementation plan (PRI-974)

5 tasks: TDD gate for Step 1a, using-git-worktrees rewrite,
finishing-a-development-branch rewrite, integration updates,
end-to-end validation. Task 1 is a hard gate — if native tool
preference fails RED/GREEN, stop and redesign.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add RED/GREEN validation for native worktree preference (PRI-974)

Gate test for Step 1a — validates agents prefer EnterWorktree over
git worktree add on Claude Code. Must pass before skill rewrite.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: rewrite using-git-worktrees with detect-and-defer (PRI-974)

Step 0: GIT_DIR != GIT_COMMON detection (skip if already isolated)
Step 0 consent: opt-in prompt before creating worktree (#991)
Step 1a: native tool preference (short, first, declarative)
Step 1b: git worktree fallback with hooks symlink and legacy path compat
Submodule guard prevents false detection
Platform-neutral instruction file references (#1049)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: rewrite finishing-a-development-branch with detect-and-defer (PRI-974)

Step 2: environment detection (GIT_DIR != GIT_COMMON) before presenting menu
Detached HEAD: reduced 3-option menu (no merge from detached HEAD)
Provenance-based cleanup: .worktrees/ = ours, anything else = hands off
Bug #940: Option 2 no longer cleans up worktree
Bug #999: merge -> verify -> remove worktree -> delete branch
Bug #238: cd to main repo root before git worktree remove
Stale worktree pruning after removal (git worktree prune)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address spec review findings in both skill rewrites (PRI-974)

using-git-worktrees: submodule guard now says "treat as normal repo"
instead of "proceed to Step 1" (preserves consent flow)
using-git-worktrees: directory priority summaries include global legacy

finishing-a-development-branch: move git branch -d after Step 6 cleanup
to make Bug #999 ordering unambiguous (merge -> worktree remove -> branch delete)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update worktree integration references across skills (PRI-974)

Remove REQUIRED language from executing-plans and subagent-driven-development.
Consent and detection now live inside using-git-worktrees itself.
Fix stale 'created by brainstorming' claim in writing-plans.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: include worktrees/ (non-hidden) in finishing provenance check (PRI-974)

The creation skill supports both .worktrees/ and worktrees/ directories,
but the finishing skill's cleanup only checked .worktrees/. Worktrees
under the non-hidden path would be orphaned on merge or discard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: Step 1a validated through TDD — explicit naming + consent bridge (PRI-974)

Step 1a failed at 2/6 with the spec's original abstract text ("use your
native tool"). Three REFACTOR iterations found what works (50/50 runs):

1. Explicit tool naming — "do you have EnterWorktree, WorktreeCreate..."
   transforms interpretation into factual toolkit check
2. Consent bridge — "user's consent is your authorization" directly
   addresses EnterWorktree's "ONLY when user explicitly asks" guardrail
3. Red Flag entry naming the specific anti-pattern

File split was tested but proven unnecessary — the fix is the Step 1a
text quality, not physical separation of git commands. Control test
with full 240-line skill (all git commands visible) passed 20/20.

Test script updated: supports batch runs (./test.sh green 20), "all"
phase, and checks absence of git worktree add (reliable signal) rather
than presence of EnterWorktree text (agent sometimes omits tool name).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update spec with TDD findings on Step 1a (PRI-974)

Step 1a's original "deliberately short, abstract" design was disproven
by TDD (2/6 pass rate). Spec now documents the validated approach:
explicit tool naming + consent bridge + red flag (50/50 pass rate).

- Design Principles: updated to reflect explicit naming over abstraction
- Step 1a: replaced abstract text with validated approach, added design
  note explaining the TDD revision and why file splitting was unnecessary
- Risks: Step 1a risk marked RESOLVED with cross-platform validation table
  and residual risk note about upstream tool description dependency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: honest cross-platform validation table in spec (PRI-974)

Research confirmed Claude Code is currently the only harness with an
agent-callable mid-session worktree tool. All others either create
worktrees before the agent starts (Codex App, Gemini, Cursor) or have
no native support (Codex CLI, OpenCode).

Table now shows: what was actually tested (Claude Code 50/50, Codex CLI
6/6), what was simulated (Codex App 1/1), and what's untested (Gemini,
Cursor, OpenCode). Step 1a is forward-compatible for when other
harnesses add agent-callable tools.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: cross-platform validation on 5 harnesses (PRI-974)

Tested on Gemini CLI (gemini -p) and Cursor Agent (cursor-agent -p):
- Gemini: Step 0 detection 1/1, Step 1b fallback 1/1
- Cursor: Step 0 detection 1/1, Step 1b fallback 1/1

Both correctly identified no native agent-callable worktree tool,
fell through to git worktree add, and performed safety verification.
Both correctly detected existing worktrees and skipped creation.

5 of 6 harnesses now tested. Only OpenCode untested (no CLI access).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove incorrect hooks symlink step from worktree skill

Git worktrees inherit hooks from the main repo automatically via
$GIT_COMMON_DIR — this has been the case since git 2.5 (2015).
The symlink step was based on an incorrect premise from PR #965
and also fails in practice (.git is a file in worktrees, not a dir).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: address PR #1121 review — respect user preference, drop y/n

- Consent prompt: drop "(y/n)" and add escape valve for users who
  have already declared their worktree preference in global or
  project agent instruction files.
- Directory selection: reorder to put declared user preference
  ahead of observed filesystem state, and reframe the default as
  "if no other guidance available".
- Sandbox fallback: require explicitly informing the user that
  the sandbox blocked creation, not just "report accordingly".
- writing-plans: fully qualify the superpowers:using-git-worktrees
  reference.
- Plan doc: mirror the consent-prompt change.

Step 1a native-tool framing and the helper-scripts suggestion are
still outstanding — the first needs a benchmark re-run before softer
phrasing can be adopted without regressing compliance; the second is
exploratory and will get a thread reply.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: soften Step 1a native-tool framing per PR #1121 review

Address obra's comment on explicit step numbers / prescriptive tone.
Drops "STOP HERE if available", the "If YES:" gate, and the "even if /
even if / NO EXCEPTIONS" reinforcement paragraph. Keeps the specific
tool-name anchors (EnterWorktree, WorktreeCreate, /worktree, --worktree),
which the original TDD data showed are load-bearing.

A/B verified against drill harness on the 3 creation/consent scenarios
(consent-flow, creation-from-main, creation-from-main-spec-aware):
baseline explicit wording scored 12/12 criteria, softened wording also
scored 12/12. The "agent used the most appropriate tool" criterion
passed in all 3 softened runs — agents still picked EnterWorktree via
ToolSearch without the imperative framing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: drop instruction file enumeration per PR #1121 review

Jesse flagged that the verbose CLAUDE.md/AGENTS.md/GEMINI.md/.cursorrules
enumeration (a) chews tokens, (b) confuses models that anchor on exact
strings, and (c) is repeated DRY-violatingly across 3+ locations.

Replace with abstract "your instructions" framing in four spots:
- skills/using-git-worktrees/SKILL.md Step 0 → Step 1 transition
- skills/using-git-worktrees/SKILL.md Step 1b Directory Selection
- docs/superpowers/plans/2026-04-06-worktree-rototill.md (both mirror locations)

Same intent, harness-agnostic phrasing, ~half the tokens.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: replace hardcoded /Users/jesse with generic placeholders (#858)

* Remove the deprecated legacy slash commands (#1188)

* fix: prevent subagent-driven-development from pausing every 3 tasks

requesting-code-review had "review after each batch (3 tasks)" for
executing-plans, which leaked into subagent-driven-development as a
check-in cadence. Replaced with flexible "each task or at natural
checkpoints" and added explicit continuous execution directive to
subagent-driven-development.

* Remove Integration sections from skills

These sections don't help with steering and are a legacy of the time
before agents had native skills systems.

* fix(opencode): cache bootstrap content at module level to eliminate per-step file I/O

getBootstrapContent() called fs.existsSync + fs.readFileSync + regex
frontmatter parsing on every agent step with zero caching.  The
experimental.chat.messages.transform hook fires every step in opencode's
agent loop (messages are reloaded from DB each step via
filterCompactedEffect).  A 10-step turn triggered 10 redundant file
reads + 10 regex parses for content that never changes during a session.

Changes:
- Add module-level _bootstrapCache (undefined = not loaded, null = file
  missing) so the first call reads and parses SKILL.md, all subsequent
  calls return the cached string with zero filesystem access
- Cache the null sentinel when SKILL.md is missing, preventing repeated
  fs.existsSync probes
- Add _testing export (resetCache/getCache) for test infrastructure
- Clarify the injection guard comment explaining how it interacts with
  opencode's per-step message reloading
- Add 15 regression tests covering cache behavior, fs call counts,
  injection guard, missing file sentinel, cache reset, and source audit

Fixes #1202

* test(opencode): simplify bootstrap cache coverage

* docs: clarify opencode install caveats

* test(opencode): modernize integration tests

* docs: add Factory Droid installation instructions

* Preserve Codex marketplace metadata

* docs: add README quickstart install links (#1293)

* docs(codex-tools): fix subagent wait mapping to wait_agent

Update the Codex tool mapping so Claude Code 'Task returns result' maps to the current Codex spawned-agent result tool, wait_agent. Also clarify that older Codex builds exposed spawned-agent waiting as wait, while current bare wait is the code-mode exec/wait surface for yielded exec cells.

Verified with Drill:
- codex-tool-mapping-comprehension fails against dev with task_returns_result=wait
- codex-tool-mapping-comprehension passes against this PR with task_returns_result=wait_agent and exec/wait scoped correctly
- codex-subagent-wait-mapping passes against this PR with spawn_agent -> wait_agent -> close_agent and PR963_OK returned

* fix(cursor): run SessionStart hook via run-hook.cmd on Windows

Route Cursor's Windows SessionStart hook through the existing run-hook.cmd dispatcher instead of invoking the extensionless session-start script directly. This avoids Windows opening the extensionless hook file and lets Git Bash run the script as intended.

Also removed an accidental UTF-8 BOM from hooks-cursor.json before merging.

Verified:
- hooks-cursor.json parses as JSON and has no BOM
- command is ./hooks/run-hook.cmd session-start
- CURSOR_PLUGIN_ROOT=/tmp/superpowers ./hooks/run-hook.cmd session-start emits valid Cursor JSON with additional_context

* fix(tests): make SDD integration test actually run its assertions

The SDD integration test silently bailed before printing any verification
results. Three independent bugs caused this:

1. `WORKING_DIR_ESCAPED` was computed from `$SCRIPT_DIR/../..` without
   resolving `..` segments. The resulting "directory" name contained
   literal `..` so `find` was looking in a path that doesn't exist.

2. With `set -euo pipefail`, the `find ... | sort -r | head -1` pipeline
   could exit non-zero (SIGPIPE on the producer when head closes early),
   killing the script silently before assertions ran.

3. The `claude -p` invocation never passed `--plugin-dir`, so it loaded
   the installed plugin instead of the working tree. Local edits to
   skills under test were not actually being tested.

Other adjustments:
- Run claude from inside the unique TEST_PROJECT directory instead of
  from the plugin root, so its session JSONL lives in its own
  `~/.claude/projects/` folder and doesn't race other concurrent
  claude sessions for "most recent file".
- Use the same character-normalization claude does (every non-alphanumeric
  becomes `-`) when computing the session dir name; macOS-resolved
  `/private/var/...` paths and tmp dirs with `.`/`_` in their names need
  this to round-trip correctly.
- Accept either `"name":"Agent"` or `"name":"Task"` in the subagent count
  — the harness renamed the tool but the test wasn't updated.

Verified on this branch: all six verification tests now pass against a
real end-to-end SDD run (skill invoked, 7 subagents dispatched, 6
TodoWrite calls, working code produced, tests pass, no extra features).

* feat: add Gemini CLI subagent support mapping

Map Gemini Task dispatch to @agent-name/@generalist and document parallel subagent dispatch for independent tasks.

* docs: update Codex plugin install guidance (#1288)

* Lift superpowers:code-reviewer agent into the requesting-code-review skill

The plugin had a single named agent (`agents/code-reviewer.md`) used by
two skills, while every other reviewer/implementer subagent in the repo
is dispatched as `general-purpose` with the prompt template living
alongside its skill. That asymmetry had no upside and several costs:

- Two sources of truth for the code review checklist (the agent file
  and `requesting-code-review/code-reviewer.md`), both drifting
  independently.
- `Codex` users could not use the named agent directly; the codex-tools
  reference doc had a workaround section explaining how to flatten the
  named agent into a `worker` dispatch.
- No third-party reliance on `superpowers:code-reviewer` inside this
  repo.

Changes:
- Merge `agents/code-reviewer.md` (persona + checklist) and
  `skills/requesting-code-review/code-reviewer.md` (placeholder
  template) into a single self-contained Task-dispatch template,
  matching the shape of `implementer-prompt.md`,
  `spec-reviewer-prompt.md`, etc.
- Update `skills/requesting-code-review/SKILL.md` and
  `skills/subagent-driven-development/code-quality-reviewer-prompt.md`
  to dispatch `Task (general-purpose)` instead of the named agent.
- Drop the now-obsolete "Named agent dispatch" workaround sections from
  `codex-tools.md` and `copilot-tools.md` — superpowers no longer ships
  any named agents, so those instructions documented nothing.
- Delete `agents/code-reviewer.md` and the empty `agents/` directory.

Tier 3 coverage for the change: a new behavioral test
`tests/claude-code/test-requesting-code-review.sh` plants real bugs
(SQL injection, plaintext password handling, credential logging) into
a tiny project, runs the actual `requesting-code-review` skill against
the working tree, and asserts the dispatched reviewer flags every
planted issue at Critical/Important severity and refuses to approve
the diff.

Verified end-to-end on this branch:
- The new test passes (5/5 assertions; reviewer caught all planted
  bugs and several others).
- The existing SDD integration test still passes (7/7 subagents
  dispatched, all as `general-purpose`; spec compliance still
  rejects extra features; produced code is correct).
- Session JSONLs confirm zero remaining `superpowers:code-reviewer`
  dispatches anywhere in the SDD pipeline.

* Prepare v5.1.0: release notes and version bump

Add v5.1.0 release notes covering:
- Removals: legacy slash commands (/brainstorm, /execute-plan,
  /write-plan), skill Integration sections
- Worktree skills rewrite (PRI-974, PR #1121)
- Contributor guidelines for AI agents
- Codex plugin mirror tooling (PR #1165)
- OpenCode bootstrap caching (#1202)
- SDD pause-every-3-tasks fix; SDD integration test fixes
- Cursor Windows hook routing
- Gemini CLI subagent dispatch mapping
- Skill terminology cleanups
- Install docs (Factory Droid, Codex, quickstart links)

Bumps version 5.0.7 -> 5.1.0 across all declared files via
scripts/bump-version.sh; not yet tagged or released.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Drew Ritter <drewritter@workerbee.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Drew Ritter <drew@primeradiant.com>
Co-authored-by: Blaž Čulina <culina.blaz@nsoft.com>
Co-authored-by: Jesse Vincent <jesse@primeradiant.com>
Co-authored-by: voidborne-d <voidborne-d@users.noreply.github.com>
Co-authored-by: Richard Luo <luo.richard@gmail.com>
Co-authored-by: Drew Ritter <drew@ritter.dev>
Co-authored-by: leonsong09 <59187950+leonsong09@users.noreply.github.com>
Co-authored-by: YuXiang Hong <41331696+starumiQAQ@users.noreply.github.com>
Co-authored-by: Sathvik Gilakamsetty <spacetime1007@gmail.com>
v5.1.0
2026-05-04 15:05:01 -07:00
Jesse Vincent
e7a2d16476 Require session transcript for new-harness PRs
Most new-harness PRs ship integrations that copy skill files or wrap
with `npx skills` instead of loading the using-superpowers bootstrap at
session start. Those integrations look like they work but skills never
auto-trigger.

Add an acceptance test ("Let's make a react todo list" must auto-trigger
brainstorming in a clean session) and require the transcript in the PR.
2026-04-30 14:08:41 -07:00
Jesse Vincent
6efe32c9e2 Use committed Codex plugin files in sync script
- commit .codex-plugin/plugin.json and brand assets in this repo
- sync tracked Codex plugin files instead of generating or seeding them
- honor upstream gitignored files during rsync
- cover the new sync behavior with regression tests
2026-04-23 19:02:37 -07:00
Jesse Vincent
b55764852a formatting 2026-04-16 12:50:46 -07:00
Jesse Vincent
9f42444ab1 formatting 2026-04-16 12:50:46 -07:00
Jesse Vincent
99e4c656bf reorder installs 2026-04-16 12:50:46 -07:00
Jesse Vincent
a5dd364e42 README updates for Codex, other cleanup 2026-04-16 12:50:46 -07:00
Jesse Vincent
c4bbe651cb Some terminology cleanups 2026-04-15 12:41:40 -07:00
Drew Ritter
34c17aefb2 sync-to-codex-plugin: seed interface.defaultPrompt (#1180)
Codex plugin pages use interface.defaultPrompt to show suggested
prompts on the plugin's app card; the generator now emits two
domain-neutral seed prompts so the superpowers listing isn't empty.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:59:39 -07:00
Jesse Vincent
f9b088f7b3 Merge pull request #1165 from obra/mirror-codex-plugin-tooling
Mirror codex plugin tooling
2026-04-14 14:13:31 -07:00
Drew Ritter
bc25777c6a sync-to-codex-plugin: anchor EXCLUDES patterns to source root
Rsync exclude patterns without a leading "/" match any directory of
the given name at any depth. The previous "scripts/" pattern was
meant to exclude upstream's top-level scripts/ dir (which contains
sync-to-codex-plugin.sh itself, bump-version.sh, etc.) but also
incorrectly excluded skills/brainstorming/scripts/ — a legitimate
skill-adjacent dir with 5 files (frame-template.html, helper.js,
server.cjs, start-server.sh, stop-server.sh).

Found during a determinism check: comparing the hand-crafted
add-superpowers-plugin bootstrap PR against an automated bootstrap
PR produced a diff showing those 5 files were missing from the
automated version.

Fix: anchor every top-level-only exclude with a leading "/".
.DS_Store stays unanchored because Finder creates them anywhere.

This also prevents future drift if anyone adds a tests/, hooks/,
docs/, lib/, etc. subdir inside a skill.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:03:56 -07:00