Gate test for Step 1a — validates agents prefer EnterWorktree over
git worktree add on Claude Code. Must pass before skill rewrite.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5 tasks: TDD gate for Step 1a, using-git-worktrees rewrite,
finishing-a-development-branch rewrite, integration updates,
end-to-end validation. Task 1 is a hard gate — if native tool
preference fails RED/GREEN, stop and redesign.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Step 1a is the load-bearing assumption, not just a risk — if it fails,
the entire design needs rework. TDD validation must be first impl task.
- State explicitly that resolving #1009 depends on Step 1a working
- Mark #574 as deferred rather than "partially addressed"
- Add hooks symlink to Step 1b (PR #965 idea, prevents silent hook loss)
- Add stale worktree pruning to Step 5 (PR #1072 idea, one-line self-heal)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix Bug #999 order: merge → verify → remove worktree → delete branch
(avoids losing work if merge fails after worktree removal)
- Add submodule guard to Step 0 detection (GIT_DIR != GIT_COMMON is also
true in submodules)
- Preserve global path (~/.config/superpowers/worktrees/) in detection for
backward compatibility, just stop offering it to new users
- Add step numbering note and implementation notes section
- Expand provenance heuristic to cover global path and manual creation
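A minimal sketch of the Step 0 detection with the submodule guard, for illustration only. The type and function names are hypothetical, and the sketch assumes the three inputs are the raw outputs of `git rev-parse --git-dir`, `git rev-parse --git-common-dir`, and `git rev-parse --show-superproject-working-tree`, with both directory paths already in the same (absolute) form.

```typescript
// Hypothetical classifier; none of these names come from the skill itself.
type WorktreeState = "normal-checkout" | "linked-worktree" | "submodule";

interface GitPaths {
  gitDir: string; // output of `git rev-parse --git-dir`
  gitCommonDir: string; // output of `git rev-parse --git-common-dir`
  superprojectRoot: string; // output of `git rev-parse --show-superproject-working-tree`
}

// Strip whitespace and trailing slashes so "/repo/.git/" and "/repo/.git" compare equal.
function normalize(p: string): string {
  return p.trim().replace(/\/+$/, "");
}

function classify(paths: GitPaths): WorktreeState {
  // Submodule guard first: the superproject query prints a path only inside a submodule,
  // where a git-dir/git-common-dir mismatch must not be read as "linked worktree".
  if (paths.superprojectRoot.trim() !== "") return "submodule";
  return normalize(paths.gitDir) !== normalize(paths.gitCommonDir)
    ? "linked-worktree"
    : "normal-checkout";
}
```

Checking the submodule case before comparing directories keeps the comparison itself simple; whether the real skill orders the checks this way is an assumption.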
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Design for detect-and-defer worktree support. Superpowers defers to
native harness worktree systems when available, falls back to manual
git worktree creation when not. Covers Phases 0-2: detection, consent,
native tool preference, finishing state detection, and three bug fixes
(#940, #999, #238).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move bootstrap injection from experimental.chat.system.transform to
experimental.chat.messages.transform, prepending to the first user
message instead of adding a system message.
This avoids two issues:
- System messages repeated every turn inflate token usage (#750)
- Multiple system messages break Qwen and other models (#894)
Tested on OpenCode 1.3.2 with Claude Sonnet 4.5 — brainstorming skill
fires correctly on "Let's make a React to do list" prompt.
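A rough sketch of the prepend-to-first-user-message idea. The `ChatMessage` shape and `injectBootstrap` name are hypothetical; the real payload passed to experimental.chat.messages.transform may look different. The idempotency check is an added assumption, not something the change above states.

```typescript
// Hypothetical message shape, used only for this illustration.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  text: string;
}

function injectBootstrap(messages: ChatMessage[], bootstrap: string): ChatMessage[] {
  const firstUser = messages.findIndex((m) => m.role === "user");
  if (firstUser === -1) return messages; // no user message to attach to

  // Assumption: skip if already injected, so repeated transforms do not stack copies.
  if (messages[firstUser].text.startsWith(bootstrap)) return messages;

  // Prepend to the first user message only; no system message is added.
  return messages.map((m, i) =>
    i === firstUser ? { ...m, text: `${bootstrap}\n\n${m.text}` } : m
  );
}
```

Because the bootstrap rides along inside an existing user message, the number of system messages in the conversation is unchanged.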
The bootstrap text advertised a configDir-based skills path that didn't
match the runtime path (resolved relative to the plugin file). Tests
used yet another hardcoded path and referenced a nonexistent lib/ dir.
- Remove misleading skills path from bootstrap text; the agent should
use the native skill tool, not read files by path
- Fix test setup to create a consistent layout matching the plugin's
../../skills resolution
- Export SUPERPOWERS_SKILLS_DIR from setup.sh so tests use a single
source of truth
- Add regression test that bootstrap doesn't advertise the old path
- Remove broken cp of nonexistent lib/ directory
Fixes #847
Copilot CLI v1.0.11 reads `additionalContext` from sessionStart hook
output, but the session-start script only emits the Claude Code-specific
nested format. Add COPILOT_CLI env var detection so Copilot CLI gets the
SDK-standard top-level `additionalContext` while Claude Code continues
getting `hookSpecificOutput`.
Based on PR #910 by @culinablaz.
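The two output shapes can be sketched as follows. This is an illustration, not the session-start script itself: the `sessionStartOutput` helper is hypothetical, treating any non-empty COPILOT_CLI value as "running under Copilot CLI" is an assumption, and the `hookEventName` field inside `hookSpecificOutput` is assumed from the Claude Code hook format.

```typescript
// Hypothetical helper; the environment is passed in so the choice is easy to test.
function sessionStartOutput(
  context: string,
  env: Record<string, string | undefined>
): string {
  if (env.COPILOT_CLI) {
    // Copilot CLI: top-level additionalContext.
    return JSON.stringify({ additionalContext: context });
  }
  // Claude Code: nested under hookSpecificOutput (inner field names assumed).
  return JSON.stringify({
    hookSpecificOutput: {
      hookEventName: "SessionStart",
      additionalContext: context,
    },
  });
}
```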
Two bugs caused the brainstorm server to self-terminate within 60s:
1. ownerAlive() treated EPERM (permission denied) as "process dead".
When the owner PID belongs to a different user (Tailscale SSH,
system daemons), process.kill(pid, 0) throws EPERM — but the
process IS alive. Fixed: return e.code === 'EPERM'.
2. On WSL, the grandparent PID resolves to a short-lived subprocess
that exits before the first 60s lifecycle check. The PID is
genuinely dead (ESRCH), so the EPERM fix alone doesn't help.
Fixed: validate the owner PID at server startup — if it's already
dead, it was a bad resolution, so disable monitoring and rely on
the 30-minute idle timeout.
This also removes the Windows/MSYS2-specific OWNER_PID="" carve-out
from start-server.sh, since the server now handles invalid PIDs
generically at startup regardless of platform.
Tested on Linux (magic-kingdom) via Tailscale SSH:
- Root-owned owner PID (EPERM): server survives ✓
- Dead owner PID at startup (WSL sim): monitoring disabled, survives ✓
- Valid owner that dies: server shuts down within 60s ✓
Fixes #879
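A sketch of both fixes under stated assumptions. The signal function is injected so the logic can be exercised without real processes; in the server it would presumably be `process.kill`. `shouldMonitorOwner` and `SignalFn` are hypothetical names for the startup validation described above.

```typescript
// Stand-in for process.kill(pid, signal); throws an error carrying a `code` on failure.
type SignalFn = (pid: number, signal: number) => unknown;

function ownerAlive(pid: number, kill: SignalFn): boolean {
  try {
    kill(pid, 0); // signal 0 probes for existence without delivering a signal
    return true;
  } catch (e) {
    // EPERM: the process exists but belongs to another user, so it is alive.
    // ESRCH (or anything else): treat the process as gone.
    return (e as { code?: string }).code === "EPERM";
  }
}

// Startup validation: only monitor the owner if it is alive when the server starts.
// A PID that is already dead was a bad resolution, so the idle timeout takes over.
function shouldMonitorOwner(ownerPid: number | null, kill: SignalFn): boolean {
  return ownerPid !== null && ownerAlive(ownerPid, kill);
}
```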
ownerAlive() treated EPERM (permission denied) the same as ESRCH
(process not found), causing the server to self-terminate within 60s
whenever the owner process ran as a different user. This affected WSL
(owner is a Windows process), Tailscale SSH, and any cross-user
scenario.
The fix: `return e.code === 'EPERM'` — if we get permission denied,
the process is alive; we just can't signal it.
Tested on Linux via Tailscale SSH with a root-owned grandparent PID:
- Server survives past the 60s lifecycle check (EPERM = alive)
- Server still shuts down when owner genuinely dies (ESRCH = dead)
Fixes #879
The session directory now contains two peers: content/ (HTML served to
the browser) and state/ (events, server-info, pid, log). Previously
all files shared a single directory, making server state and user
interaction data accessible over the /files/ HTTP route.
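One way to picture the split, as a sketch only: the helper names below are hypothetical, and the request path is assumed to be already URL-decoded. The point is that the /files/ route resolves strictly inside content/, so nothing under state/ is reachable.

```typescript
interface SessionLayout {
  contentDir: string; // HTML served to the browser
  stateDir: string; // events, server-info, pid, log
}

function sessionLayout(sessionDir: string): SessionLayout {
  const base = sessionDir.replace(/\/+$/, "");
  return { contentDir: `${base}/content`, stateDir: `${base}/state` };
}

// Map a /files/ request to a path under content/, or null if the request is refused.
function resolveServedFile(layout: SessionLayout, requestPath: string): string | null {
  const prefix = "/files/";
  if (!requestPath.startsWith(prefix)) return null;
  const segments = requestPath.slice(prefix.length).split("/");
  // Refuse empty, ".", "..", and backslash segments so the path cannot leave content/.
  const unsafe = segments.some(
    (s) => s === "" || s === "." || s === ".." || s.includes("\\")
  );
  return unsafe ? null : `${layout.contentDir}/${segments.join("/")}`;
}
```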
Also fixes a stale test assertion ("Waiting for Claude" → "Waiting for
the agent").
Reported-By: 吉田仁
Metadata files (.server-info, .events, .server.pid, .server.log,
.server-stopped) were stored in the same directory served over HTTP,
making them accessible via the /files/ route. They now live in a .meta/
subdirectory that is not web-accessible.
Also fixes a stale test assertion ("Waiting for Claude" → "Waiting for
the agent").
Reported-By: 吉田仁
The subagent review loop (dispatching a fresh agent to review plans/specs)
doubled execution time (~25 min overhead) without measurably improving plan
quality. Regression testing across 5 versions (v3.6.0 through v5.0.4) with
5 trials each showed identical plan sizes, task counts, and quality scores
regardless of whether the review loop ran.
Changes:
- writing-plans: Replace subagent Plan Review Loop with inline Self-Review
checklist (spec coverage, placeholder scan, type consistency)
- writing-plans: Add explicit "No Placeholders" section listing plan failures
(TBD, vague descriptions, undefined references, "similar to Task N")
- brainstorming: Replace subagent Spec Review Loop with inline Spec Self-Review
(placeholder scan, internal consistency, scope check, ambiguity check)
- Both skills now use "look at it with fresh eyes" framing
Testing: 5 trials with the new skill show self-review catches 3-5 real bugs
per run (spawn positions, API mismatches, seed bugs, grid indexing) in ~30s
instead of ~25 min. Remaining defects are comparable to the subagent approach.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8 tasks covering: environment detection in using-git-worktrees,
Step 1.5 + cleanup guard in finishing-a-development-branch,
Integration line updates, codex-tools.md docs, automated tests,
and final verification.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Design for making using-git-worktrees, finishing-a-development-branch,
and subagent-driven-development skills work in the Codex App's sandboxed
worktree environment. Read-only environment detection via git-dir vs
git-common-dir comparison, ~48 lines across 4 files, zero breaking changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Four templates: bug report (with environment table and platform-vs-plugin
gate), feature request (with problem statement and core-appropriateness
question), IDE/platform support request, and a config that disables
blank issues and redirects questions to Discord.
Requires contributors to articulate the problem they're solving,
confirm human review, document eval methodology, and check for
duplicate PRs. Informed by patterns in ~90 closed-without-merge PRs.
- Skip OWNER_PID monitoring on Windows/MSYS2 where the PID namespace is
invisible to Node.js, preventing server self-termination after 60s (#770)
- Document run_in_background: true for Claude Code on Windows (#767)
- Restore user choice between subagent-driven and inline execution after
plan writing; subagent-driven is recommended but no longer mandatory
- Add Windows lifecycle test script verified on Windows 11 VM
- Note #723 (stop-server.sh reliability) as already fixed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>