The plugin had a single named agent (`agents/code-reviewer.md`) used by
two skills, while every other reviewer/implementer subagent in the repo
is dispatched as `general-purpose` with the prompt template living
alongside its skill. That asymmetry had no upside and several costs:
- Two sources of truth for the code review checklist (the agent file
and `requesting-code-review/code-reviewer.md`), both drifting
independently.
- `Codex` users could not use the named agent directly; the codex-tools
reference doc had a workaround section explaining how to flatten the
named agent into a `worker` dispatch.
- No third-party reliance on `superpowers:code-reviewer` inside this
repo.
Changes:
- Merge `agents/code-reviewer.md` (persona + checklist) and
`skills/requesting-code-review/code-reviewer.md` (placeholder
template) into a single self-contained Task-dispatch template,
matching the shape of `implementer-prompt.md`,
`spec-reviewer-prompt.md`, etc.
- Update `skills/requesting-code-review/SKILL.md` and
`skills/subagent-driven-development/code-quality-reviewer-prompt.md`
to dispatch `Task (general-purpose)` instead of the named agent.
- Drop the now-obsolete "Named agent dispatch" workaround sections from
`codex-tools.md` and `copilot-tools.md` — superpowers no longer ships
any named agents, so those instructions documented nothing.
- Delete `agents/code-reviewer.md` and the empty `agents/` directory.
Tier 3 coverage for the change: a new behavioral test
`tests/claude-code/test-requesting-code-review.sh` plants real bugs
(SQL injection, plaintext password handling, credential logging) into
a tiny project, runs the actual `requesting-code-review` skill against
the working tree, and asserts the dispatched reviewer flags every
planted issue at Critical/Important severity and refuses to approve
the diff.
Verified end-to-end on this branch:
- The new test passes (5/5 assertions; reviewer caught all planted
bugs and several others).
- The existing SDD integration test still passes (7/7 subagents
dispatched, all as `general-purpose`; spec compliance still
rejects extra features; produced code is correct).
- Session JSONLs confirm zero remaining `superpowers:code-reviewer`
dispatches anywhere in the SDD pipeline.
The SDD integration test silently bailed before printing any verification
results. Three independent bugs caused this:
1. `WORKING_DIR_ESCAPED` was computed from `$SCRIPT_DIR/../..` without
resolving `..` segments. The resulting "directory" name contained
literal `..` so `find` was looking in a path that doesn't exist.
2. With `set -euo pipefail`, the `find ... | sort -r | head -1` pipeline
could exit non-zero (SIGPIPE on the producer when head closes early),
killing the script silently before assertions ran.
3. The `claude -p` invocation never passed `--plugin-dir`, so it loaded
the installed plugin instead of the working tree. Local edits to
skills under test were not actually being tested.
Other adjustments:
- Run claude from inside the unique TEST_PROJECT directory instead of
from the plugin root, so its session JSONL lives in its own
`~/.claude/projects/` folder and doesn't race other concurrent
claude sessions for "most recent file".
- Use the same character-normalization claude does (every non-alphanumeric
becomes `-`) when computing the session dir name; macOS-resolved
`/private/var/...` paths and tmp dirs with `.`/`_` in their names need
this to round-trip correctly.
- Accept either `"name":"Agent"` or `"name":"Task"` in the subagent count
— the harness renamed the tool but the test wasn't updated.
Verified on this branch: all six verification tests now pass against a
real end-to-end SDD run (skill invoked, 7 subagents dispatched, 6
TodoWrite calls, working code produced, tests pass, no extra features).
Route Cursor's Windows SessionStart hook through the existing run-hook.cmd dispatcher instead of invoking the extensionless session-start script directly. This avoids Windows opening the extensionless hook file and lets Git Bash run the script as intended.
Also removed an accidental UTF-8 BOM from hooks-cursor.json before merging.
Verified:
- hooks-cursor.json parses as JSON and has no BOM
- command is ./hooks/run-hook.cmd session-start
- CURSOR_PLUGIN_ROOT=/tmp/superpowers ./hooks/run-hook.cmd session-start emits valid Cursor JSON with additional_context
Update the Codex tool mapping so Claude Code 'Task returns result' maps to the current Codex spawned-agent result tool, wait_agent. Also clarify that older Codex builds exposed spawned-agent waiting as wait, while current bare wait is the code-mode exec/wait surface for yielded exec cells.
Verified with Drill:
- codex-tool-mapping-comprehension fails against dev with task_returns_result=wait
- codex-tool-mapping-comprehension passes against this PR with task_returns_result=wait_agent and exec/wait scoped correctly
- codex-subagent-wait-mapping passes against this PR with spawn_agent -> wait_agent -> close_agent and PR963_OK returned
getBootstrapContent() called fs.existsSync + fs.readFileSync + regex
frontmatter parsing on every agent step with zero caching. The
experimental.chat.messages.transform hook fires every step in opencode's
agent loop (messages are reloaded from DB each step via
filterCompactedEffect). A 10-step turn triggered 10 redundant file
reads + 10 regex parses for content that never changes during a session.
Changes:
- Add module-level _bootstrapCache (undefined = not loaded, null = file
missing) so the first call reads and parses SKILL.md, all subsequent
calls return the cached string with zero filesystem access
- Cache the null sentinel when SKILL.md is missing, preventing repeated
fs.existsSync probes
- Add _testing export (resetCache/getCache) for test infrastructure
- Clarify the injection guard comment explaining how it interacts with
opencode's per-step message reloading
- Add 15 regression tests covering cache behavior, fs call counts,
injection guard, missing file sentinel, cache reset, and source audit
Fixes#1202
requesting-code-review had "review after each batch (3 tasks)" for
executing-plans, which leaked into subagent-driven-development as a
check-in cadence. Replaced with flexible "each task or at natural
checkpoints" and added explicit continuous execution directive to
subagent-driven-development.
- commit .codex-plugin/plugin.json and brand assets in this repo
- sync tracked Codex plugin files instead of generating or seeding them
- honor upstream gitignored files during rsync
- cover the new sync behavior with regression tests
Codex plugin pages use interface.defaultPrompt to show suggested
prompts on the plugin's app card; the generator now emits two
domain-neutral seed prompts so the superpowers listing isn't empty.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rsync exclude patterns without a leading "/" match any directory of
the given name at any depth. The previous "scripts/" pattern was
meant to exclude upstream's top-level scripts/ dir (which contains
sync-to-codex-plugin.sh itself, bump-version.sh, etc.) but also
incorrectly excluded skills/brainstorming/scripts/ — a legitimate
skill-adjacent dir with 5 files (frame-template.html, helper.js,
server.cjs, start-server.sh, stop-server.sh).
Found during a determinism check: comparing the hand-crafted
add-superpowers-plugin bootstrap PR against an automated bootstrap
PR produced a diff showing those 5 files were missing from the
automated version.
Fix: anchor every top-level-only exclude with a leading "/".
.DS_Store stays unanchored because Finder creates them anywhere.
This also prevents future drift if anyone adds a tests/, hooks/,
docs/, lib/, etc. subdir inside a skill.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two coupled changes:
1. Add assets/ to EXCLUDES. A normal sync run was deleting
plugins/superpowers/assets/ via --delete because the corresponding
directory doesn't exist upstream. Confirmed via dry-run that the
previous version would wipe both brand asset files on next sync.
2. Add --bootstrap and --assets-src flags to support creating the
initial plugin PR from scratch. Bootstrap mode skips the
"plugin must exist on base" preflight, creates the plugin
directory, rsyncs upstream content, then copies
PrimeRadiant_Favicon.{svg,png} from --assets-src into
plugins/superpowers/assets/ as superpowers-small.svg and
app-icon.png. Run once by one team member to open the initial
PR; every subsequent run is a normal (non-bootstrap) sync.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The live .codex-plugin/plugin.json in the downstream fork was cleaned up
(websiteURL, privacyPolicyURL, termsOfServiceURL, and defaultPrompt
removed) and icon fields were added (composerIcon, logo pointing at
assets/superpowers-small.svg and assets/app-icon.png). Update the
heredoc to produce the same shape so future sync runs don't wipe the
icon fields or reintroduce the removed URL fields.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove CODE_OF_CONDUCT.md from EXCLUDES so it syncs from upstream
(per PR #1165 review feedback on the exclude list)
- Remove the agents/openai.yaml overlay generator and its exclude entry
— the file duplicates fields already in .codex-plugin/plugin.json and
only 6 of 28 upstream plugins ship one, so we match the 22-plugin
majority shape
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous version was a local rsync helper that required a hand-maintained
destination path. This rewrite makes it path/user-agnostic and gives every team
member the same flow:
- Clones prime-radiant-inc/openai-codex-plugins fresh into a temp dir per run
(trap EXIT cleans up)
- Auto-detects upstream from the script's own location
- Preflight: rsync, git, gh auth, python3, upstream package.json
- Reads upstream version from package.json and bakes it into the regenerated
.codex-plugin/plugin.json, so version bumps flow through
- Regenerates both overlay files (.codex-plugin/plugin.json and
agents/openai.yaml) inline via heredoc — single source of truth
- Pushes a sync/superpowers-<sha>-<UTC-timestamp> branch and opens a PR via
gh pr create; prints PR URL and /files diff URL on completion
- --dry-run, --yes, --base BRANCH, --local PATH flags for all the usual modes
- Deterministic: two runs against the same upstream SHA produce PRs with
identical diffs, so the tool itself can be sanity-checked by running twice
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Jesse flagged that the verbose CLAUDE.md/AGENTS.md/GEMINI.md/.cursorrules
enumeration (a) chews tokens, (b) confuses models that anchor on exact
strings, and (c) is repeated DRY-violatingly across 3+ locations.
Replace with abstract "your instructions" framing in four spots:
- skills/using-git-worktrees/SKILL.md Step 0 → Step 1 transition
- skills/using-git-worktrees/SKILL.md Step 1b Directory Selection
- docs/superpowers/plans/2026-04-06-worktree-rototill.md (both mirror locations)
Same intent, harness-agnostic phrasing, ~half the tokens.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address obra's comment on explicit step numbers / prescriptive tone.
Drops "STOP HERE if available", the "If YES:" gate, and the "even if /
even if / NO EXCEPTIONS" reinforcement paragraph. Keeps the specific
tool-name anchors (EnterWorktree, WorktreeCreate, /worktree, --worktree),
which the original TDD data showed are load-bearing.
A/B verified against drill harness on the 3 creation/consent scenarios
(consent-flow, creation-from-main, creation-from-main-spec-aware):
baseline explicit wording scored 12/12 criteria, softened wording also
scored 12/12. The "agent used the most appropriate tool" criterion
passed in all 3 softened runs — agents still picked EnterWorktree via
ToolSearch without the imperative framing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Consent prompt: drop "(y/n)" and add escape valve for users who
have already declared their worktree preference in global or
project agent instruction files.
- Directory selection: reorder to put declared user preference
ahead of observed filesystem state, and reframe the default as
"if no other guidance available".
- Sandbox fallback: require explicitly informing the user that
the sandbox blocked creation, not just "report accordingly".
- writing-plans: fully qualify the superpowers:using-git-worktrees
reference.
- Plan doc: mirror the consent-prompt change.
Step 1a native-tool framing and the helper-scripts suggestion are
still outstanding — the first needs a benchmark re-run before softer
phrasing can be adopted without regressing compliance; the second is
exploratory and will get a thread reply.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Git worktrees inherit hooks from the main repo automatically via
$GIT_COMMON_DIR — this has been the case since git 2.5 (2015).
The symlink step was based on an incorrect premise from PR #965
and also fails in practice (.git is a file in worktrees, not a dir).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tested on Gemini CLI (gemini -p) and Cursor Agent (cursor-agent -p):
- Gemini: Step 0 detection 1/1, Step 1b fallback 1/1
- Cursor: Step 0 detection 1/1, Step 1b fallback 1/1
Both correctly identified no native agent-callable worktree tool,
fell through to git worktree add, and performed safety verification.
Both correctly detected existing worktrees and skipped creation.
5 of 6 harnesses now tested. Only OpenCode untested (no CLI access).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Research confirmed Claude Code is currently the only harness with an
agent-callable mid-session worktree tool. All others either create
worktrees before the agent starts (Codex App, Gemini, Cursor) or have
no native support (Codex CLI, OpenCode).
Table now shows: what was actually tested (Claude Code 50/50, Codex CLI
6/6), what was simulated (Codex App 1/1), and what's untested (Gemini,
Cursor, OpenCode). Step 1a is forward-compatible for when other
harnesses add agent-callable tools.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step 1a's original "deliberately short, abstract" design was disproven
by TDD (2/6 pass rate). Spec now documents the validated approach:
explicit tool naming + consent bridge + red flag (50/50 pass rate).
- Design Principles: updated to reflect explicit naming over abstraction
- Step 1a: replaced abstract text with validated approach, added design
note explaining the TDD revision and why file splitting was unnecessary
- Risks: Step 1a risk marked RESOLVED with cross-platform validation table
and residual risk note about upstream tool description dependency
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step 1a failed at 2/6 with the spec's original abstract text ("use your
native tool"). Three REFACTOR iterations found what works (50/50 runs):
1. Explicit tool naming — "do you have EnterWorktree, WorktreeCreate..."
transforms interpretation into factual toolkit check
2. Consent bridge — "user's consent is your authorization" directly
addresses EnterWorktree's "ONLY when user explicitly asks" guardrail
3. Red Flag entry naming the specific anti-pattern
File split was tested but proven unnecessary — the fix is the Step 1a
text quality, not physical separation of git commands. Control test
with full 240-line skill (all git commands visible) passed 20/20.
Test script updated: supports batch runs (./test.sh green 20), "all"
phase, and checks absence of git worktree add (reliable signal) rather
than presence of EnterWorktree text (agent sometimes omits tool name).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The creation skill supports both .worktrees/ and worktrees/ directories,
but the finishing skill's cleanup only checked .worktrees/. Worktrees
under the non-hidden path would be orphaned on merge or discard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove REQUIRED language from executing-plans and subagent-driven-development.
Consent and detection now live inside using-git-worktrees itself.
Fix stale 'created by brainstorming' claim in writing-plans.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
using-git-worktrees: submodule guard now says "treat as normal repo"
instead of "proceed to Step 1" (preserves consent flow)
using-git-worktrees: directory priority summaries include global legacy
finishing-a-development-branch: move git branch -d after Step 6 cleanup
to make Bug #999 ordering unambiguous (merge -> worktree remove -> branch delete)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step 2: environment detection (GIT_DIR != GIT_COMMON) before presenting menu
Detached HEAD: reduced 3-option menu (no merge from detached HEAD)
Provenance-based cleanup: .worktrees/ = ours, anything else = hands off
Bug #940: Option 2 no longer cleans up worktree
Bug #999: merge -> verify -> remove worktree -> delete branch
Bug #238: cd to main repo root before git worktree remove
Stale worktree pruning after removal (git worktree prune)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gate test for Step 1a — validates agents prefer EnterWorktree over
git worktree add on Claude Code. Must pass before skill rewrite.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>