Compare commits

..

12 Commits

Author SHA1 Message Date
Jesse Vincent
8811b0f2d7 Revert "Make visual-companion.md script paths skill-rooted, not plugin-rooted"
This reverts commit e9f5188289.
2026-05-23 17:01:46 -07:00
Jesse Vincent
d48bec6cc3 Revert "Probe per-user Git Bash and Scoop before falling back to PATH on Windows"
This reverts commit a8f0738e3a.
2026-05-23 17:00:15 -07:00
Jesse Vincent
a8f0738e3a Probe per-user Git Bash and Scoop before falling back to PATH on Windows
Stock Windows 10/11 ships C:\Windows\System32\bash.exe (the WSL
launcher) as the first match for `where bash`. WSL's bash cannot
execute Windows-style script paths, so when Git Bash is installed
outside the two standard system locations -- specifically the
per-user "Only for me" Git for Windows installer
(%LOCALAPPDATA%\Programs\Git) or a Scoop install
(%USERPROFILE%\scoop\apps\git\current\usr\bin) -- run-hook.cmd
silently fails: WSL prints "Windows Subsystem for Linux must be
updated", the script returns 0, and Superpowers' SessionStart
bootstrap is never injected. From the user's perspective skills
auto-trigger inconsistently or not at all, with no surfaced error.

Add explicit probes for both locations between the existing system-
wide Git for Windows checks and the `where bash` fallback. Also add
a comment to the fallback documenting the WSL-launcher trap so future
maintainers understand why the explicit probes must come first.

Verified on a Windows 11 VM (dockur/windows 11, Git Bash 2.x, Node
22):
- System Git present: existing probe still matches (no regression)
- System Git absent, per-user Git present via junction: new probe
  matches, hook produces valid 6422-byte JSON, exit 0
- All Git probes absent: confirmed WSL trap fires
  ("Windows Subsystem for Linux must be updated") and the hook exits 0
  silently, demonstrating the original bug

Existing tests/hooks/test-session-start.sh still passes on macOS (7/7).

Reported by @ytchenak in #1607.

Co-authored-by: ytchenak <ytchenak@users.noreply.github.com>
Closes #1607.
2026-05-23 16:58:56 -07:00
Jesse Vincent
f36bad5b78 Pipe SessionStart hook printf through cat to absorb EPIPE on Windows
On Windows + Git Bash, the SessionStart hook prints a confusing
diagnostic at every startup ("printf: write error: Permission denied")
when Claude Code closes the hook's stdout pipe before the printf has
finished writing. The hook still runs to completion and context still
gets injected, but the diagnostic surfaces every session because
Git Bash's printf reports EPIPE as "Permission denied" (not "Broken
pipe" like Linux) and our `set -euo pipefail` lets that error escape.

Piping each printf through `cat` makes the external cat process the
recipient of any SIGPIPE / EPIPE. cat's failure does not propagate to
the parent bash under pipefail because cat is the last command in the
pipeline and exits cleanly when the pipe stays open long enough to
hold the data. On macOS/Linux the cat passthrough is transparent (no
behavior change, no measurable cost).

Verified:
- Existing tests/hooks/test-session-start.sh: 7/7 pass on macOS
- Manual run on Windows 11 + Git Bash 5.2 + Node 22 produces valid JSON,
  clean stderr, and exit 0
- JSON output is byte-identical to the unpatched hook

Reported by @silvertakana in #1612, attribution preserved in the
Co-authored-by trailer below — this is the same fix shape the original
PR proposed.

Co-authored-by: silvertakana <silvertakana@users.noreply.github.com>
Closes #1612.
2026-05-23 16:55:46 -07:00
Nick Galatis
21ad401e90 fix(systematic-debugging): defuse Claude Code ultrathink keyword scanner trigger (#1558)
The "Signals You're Doing It Wrong" bullet in systematic-debugging/SKILL.md
contains the literal token Claude Code's runtime scans for in tool result
bodies. Every Skill-tool invocation of this skill caused the harness to
inject a spurious system-reminder claiming the user requested deeper
reasoning, silently bumping every session into extended thinking.

Replace the bullet's spelling so the contiguous letter sequence the scanner
matches is broken with a hyphen. The signal text remains recognizable to
the agent and the documented action ("Question fundamentals, not just
symptoms") is unchanged.

Fixes obra/superpowers#1283
2026-05-23 16:51:00 -07:00
Jesse Vincent
e9f5188289 Make visual-companion.md script paths skill-rooted, not plugin-rooted
Issue #1134: agents reading visual-companion.md see bare commands like
`scripts/start-server.sh`, correctly identify the plugin install
directory, then look for `<plugin>/scripts/start-server.sh` instead of
`<plugin>/skills/brainstorming/scripts/start-server.sh`. The file
doesn't exist at the plugin-rooted path, so the agent concludes the
visual companion isn't available and falls back to text-only
brainstorming.

Multiple independent reproductions in the issue thread, plus one user's
agent self-reported: "I assumed the scripts folder was in the root
directory of the plugin, it didn't realize it could have been talking
about the skill folder itself."

Change all `scripts/<file>` references in visual-companion.md to
`skills/brainstorming/scripts/<file>`. Agents that correctly identify
the plugin root will now join to the right path.

Closes #1134.
2026-05-23 16:42:13 -07:00
Jesse Vincent
eef50b96f0 Align windows-lifecycle test with current brainstorm server layout
The test had drifted behind three server implementation changes and no
longer ran against the actual server:

- Server entrypoint renamed from server.js to server.cjs; the test still
  invoked node on server.js and failed with MODULE_NOT_FOUND.
- Server state moved to a state/ subdirectory (state/server-info,
  state/server.pid); the test still waited on .server-info and wrote
  .server.pid at the session root.
- Owner-PID startup validation now keeps the server running when the
  owner PID is dead at startup: it logs owner-pid-invalid, disables
  owner monitoring, and falls back to the idle timeout. The test still
  expected the server to self-terminate within 60s of a dead-at-startup
  owner.

Update file/path references to match the current server, and rewrite
the dead-at-startup test to assert the current behavior: server
survives, log contains owner-pid-invalid, log does not contain a
spurious "owner process exited" line.

Verified locally: 9 passed, 0 failed, 3 skipped (Windows-only).
2026-05-23 16:36:45 -07:00
Jesse Vincent
e1d3f71e0d Convert curly to square brackets in code-reviewer.md placeholders
Matches the style used by the spec-reviewer-prompt.md and
code-quality-reviewer-prompt.md call sites, which already use square
brackets ([VAR] or [VAR — description]). No semantic change — these
placeholders are filled in by the controller; nothing programmatic
substitutes them.
2026-05-23 16:14:24 -07:00
Jesse Vincent
b2212dc913 Scope spec reviewer to task diff and make reviewers read-only
Two problems with the SDD reviewer prompts on dev:

- spec-reviewer-prompt.md never received a git range, so the
  general-purpose subagent had to crawl the entire codebase to find what
  changed. Reporter measured 20-33 minute spec reviews on simple tasks
  (#1538).
- Neither reviewer prompt told the subagent that review is read-only.
  A spec reviewer running `git checkout <parent-sha>` for historical
  comparison silently detached HEAD on the controller's branch, then
  subsequent task commits accumulated on the detached HEAD and were
  effectively orphaned (#1543, reproduced independently in #1543's
  thread).

Add a Git Range to Review section to spec-reviewer-prompt.md that
mirrors the one code-reviewer.md already has, plus a Read-Only Review
section in both reviewer prompt templates stating the principle: do
not mutate the working tree, the index, HEAD, or branch state. Allow
inspecting other revisions via a separate temporary worktree, so the
read-only rule does not block legitimate historical comparison.

Closes #1538.
Closes #1543.
2026-05-23 16:14:05 -07:00
Jesse Vincent
180f009090 @mhat reported that his claude got confused about 'debugging' being named as a skill in the bootstrap 2026-05-21 17:23:25 -04:00
Drew Ritter
8c1f7c5dae Bump superpowers-evals submodule 2026-05-14 16:32:24 -07:00
Drew Ritter
201f945838 [codex] support native Codex plugin hooks (#1540)
* docs: specify Codex native hooks parity

* docs: refine Codex hooks spec after review

* docs: record Codex hook contract spike

* docs: plan Codex native hooks implementation

* feat: support Codex native plugin hooks

* test: add Codex native hook drill coverage

* Simplify Codex hook entrypoint
2026-05-14 15:59:38 -07:00
8 changed files with 79 additions and 44 deletions

2
evals

Submodule evals updated: 29957de826...e2b37138c8

View File

@@ -37,13 +37,13 @@ session_context="<EXTREMELY_IMPORTANT>\nYou have superpowers.\n\n**Below is the
# See: https://github.com/obra/superpowers/issues/571
if [ -n "${CURSOR_PLUGIN_ROOT:-}" ]; then
# Cursor sets CURSOR_PLUGIN_ROOT (may also set CLAUDE_PLUGIN_ROOT)
printf '{\n "additional_context": "%s"\n}\n' "$session_context"
printf '{\n "additional_context": "%s"\n}\n' "$session_context" | cat
elif [ -n "${CLAUDE_PLUGIN_ROOT:-}" ] && [ -z "${COPILOT_CLI:-}" ]; then
# Claude Code sets CLAUDE_PLUGIN_ROOT without COPILOT_CLI
printf '{\n "hookSpecificOutput": {\n "hookEventName": "SessionStart",\n "additionalContext": "%s"\n }\n}\n' "$session_context"
printf '{\n "hookSpecificOutput": {\n "hookEventName": "SessionStart",\n "additionalContext": "%s"\n }\n}\n' "$session_context" | cat
else
# Copilot CLI (sets COPILOT_CLI=1) or unknown platform — SDK standard format
printf '{\n "additionalContext": "%s"\n}\n' "$session_context"
printf '{\n "additionalContext": "%s"\n}\n' "$session_context" | cat
fi
exit 0

View File

@@ -21,6 +21,6 @@ escape_for_json() {
using_superpowers_escaped=$(escape_for_json "$using_superpowers_content")
session_context="<EXTREMELY_IMPORTANT>\nYou have superpowers.\n\n**Below is the full content of your 'superpowers:using-superpowers' skill - your introduction to using skills. For all other skills, follow the Codex skill-loading instructions in that skill:**\n\n${using_superpowers_escaped}\n</EXTREMELY_IMPORTANT>"
printf '{\n "hookSpecificOutput": {\n "hookEventName": "SessionStart",\n "additionalContext": "%s"\n }\n}\n' "$session_context"
printf '{\n "hookSpecificOutput": {\n "hookEventName": "SessionStart",\n "additionalContext": "%s"\n }\n}\n' "$session_context" | cat
exit 0

View File

@@ -14,22 +14,26 @@ Subagent (general-purpose):
## What Was Implemented
{DESCRIPTION}
[DESCRIPTION]
## Requirements / Plan
{PLAN_OR_REQUIREMENTS}
[PLAN_OR_REQUIREMENTS]
## Git Range to Review
**Base:** {BASE_SHA}
**Head:** {HEAD_SHA}
**Base:** [BASE_SHA]
**Head:** [HEAD_SHA]
```bash
git diff --stat {BASE_SHA}..{HEAD_SHA}
git diff {BASE_SHA}..{HEAD_SHA}
git diff --stat [BASE_SHA]..[HEAD_SHA]
git diff [BASE_SHA]..[HEAD_SHA]
```
## Read-Only Review
Your review is read-only on this checkout. Do not mutate the working tree, the index, HEAD, or branch state in any way. Use tools like `git show`, `git diff`, and `git log` to inspect history. If you need a working copy of a different revision, check it out into a separate temporary directory (e.g. `git worktree add /tmp/review-[SHA] [SHA]`) — never move HEAD on this checkout.
## What to Check
**Plan alignment:**
@@ -122,10 +126,10 @@ Subagent (general-purpose):
```
**Placeholders:**
- `{DESCRIPTION}` — brief summary of what was built
- `{PLAN_OR_REQUIREMENTS}` — what it should do (plan file path, task text, or requirements)
- `{BASE_SHA}` — starting commit
- `{HEAD_SHA}` — ending commit
- `[DESCRIPTION]` — brief summary of what was built
- `[PLAN_OR_REQUIREMENTS]` — what it should do (plan file path, task text, or requirements)
- `[BASE_SHA]` — starting commit
- `[HEAD_SHA]` — ending commit
**Reviewer returns:** Strengths, Issues (Critical / Important / Minor), Recommendations, Assessment

View File

@@ -18,6 +18,22 @@ Subagent (general-purpose):
[From implementer's report]
## Git Range to Review
**Base:** [BASE_SHA — commit before this task]
**Head:** [HEAD_SHA — current commit]
```bash
git diff --stat [BASE_SHA]..[HEAD_SHA]
git diff [BASE_SHA]..[HEAD_SHA]
```
Only read files in this diff. Do not crawl the broader codebase.
## Read-Only Review
Your review is read-only on this checkout. Do not mutate the working tree, the index, HEAD, or branch state in any way. Use tools like `git show`, `git diff`, and `git log` to inspect history. If you need a working copy of a different revision, check it out into a separate temporary directory (e.g. `git worktree add /tmp/review-[SHA] [SHA]`) — never move HEAD on this checkout.
## CRITICAL: Do Not Trust the Report
The implementer finished suspiciously quickly. Their report may be incomplete,

View File

@@ -237,7 +237,7 @@ If you catch yourself thinking:
- "Is that not happening?" - You assumed without verifying
- "Will it show us...?" - You should have added evidence gathering
- "Stop guessing" - You're proposing fixes without understanding
- "Ultrathink this" - Question fundamentals, not just symptoms
- "Ultra-think this" - Question fundamentals, not just symptoms
- "We're stuck?" (frustrated) - Your approach isn't working
**When you see these:** STOP. Return to Phase 1.

View File

@@ -102,15 +102,15 @@ These thoughts mean STOP—you're rationalizing:
When multiple skills could apply, use this order:
1. **Process skills first** (brainstorming, debugging) - these determine HOW to approach the task
1. **Process skills first** (brainstorming, systematic-debugging) - these determine HOW to approach the task
2. **Implementation skills second** (frontend-design, mcp-builder) - these guide execution
"Let's build X" → brainstorming first, then implementation skills.
"Fix this bug" → debugging first, then domain-specific skills.
"Fix this bug" → systematic-debugging first, then domain-specific skills.
## Skill Types
**Rigid** (TDD, debugging): Follow exactly. Don't adapt away discipline.
**Rigid** (TDD, systematic-debugging): Follow exactly. Don't adapt away discipline.
**Flexible** (patterns): Adapt principles to context.

View File

@@ -1,9 +1,11 @@
#!/usr/bin/env bash
# Windows lifecycle tests for the brainstorm server.
#
# Verifies that the brainstorm server survives the 60-second lifecycle
# check on Windows, where OWNER_PID monitoring is disabled because the
# MSYS2 PID namespace is invisible to Node.js.
# Verifies brainstorm server lifecycle behavior, including:
# - Windows/MSYS2 foreground mode and empty OWNER_PID handling
# - Server survival past the 60-second lifecycle check window
# - Dead-at-startup OWNER_PID validation (logged, monitoring disabled)
# - Clean stop-server.sh shutdown
#
# Requirements:
# - Node.js in PATH
@@ -20,7 +22,7 @@ SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
REPO_ROOT="${SUPERPOWERS_ROOT:-$(cd "$SCRIPT_DIR/../.." && pwd)}"
START_SCRIPT="$REPO_ROOT/skills/brainstorming/scripts/start-server.sh"
STOP_SCRIPT="$REPO_ROOT/skills/brainstorming/scripts/stop-server.sh"
SERVER_JS="$REPO_ROOT/skills/brainstorming/scripts/server.js"
SERVER_SCRIPT="$REPO_ROOT/skills/brainstorming/scripts/server.cjs"
TEST_DIR="${TMPDIR:-/tmp}/brainstorm-win-test-$$"
@@ -64,7 +66,7 @@ skip() {
wait_for_server_info() {
local dir="$1"
for _ in $(seq 1 50); do
if [[ -f "$dir/.server-info" ]]; then
if [[ -f "$dir/state/server-info" ]]; then
return 0
fi
sleep 0.1
@@ -73,9 +75,9 @@ wait_for_server_info() {
}
get_port_from_info() {
# Read the port from .server-info. Use grep/sed instead of Node.js
# Read the port from state/server-info. Use grep/sed instead of Node.js
# to avoid MSYS2-to-Windows path translation issues.
grep -o '"port":[0-9]*' "$1/.server-info" | head -1 | sed 's/"port"://'
grep -o '"port":[0-9]*' "$1/state/server-info" | head -1 | sed 's/"port"://'
}
http_check() {
@@ -214,11 +216,11 @@ BRAINSTORM_HOST="127.0.0.1" \
BRAINSTORM_URL_HOST="localhost" \
BRAINSTORM_OWNER_PID="" \
BRAINSTORM_PORT=$((49152 + RANDOM % 16383)) \
node "$SERVER_JS" > "$TEST_DIR/survival/.server.log" 2>&1 &
node "$SERVER_SCRIPT" > "$TEST_DIR/survival/.server.log" 2>&1 &
SERVER_PID=$!
if ! wait_for_server_info "$TEST_DIR/survival"; then
fail "Server starts successfully" "Server did not write .server-info within 5 seconds"
fail "Server starts successfully" "Server did not write state/server-info within 5 seconds"
kill "$SERVER_PID" 2>/dev/null || true
SERVER_PID=""
else
@@ -254,10 +256,15 @@ else
SERVER_PID=""
fi
# ========== Test 5: Bad OWNER_PID causes shutdown (control) ==========
# ========== Test 5: Dead-at-startup OWNER_PID is logged but does not kill the server ==========
#
# The server validates BRAINSTORM_OWNER_PID at startup. If it's already dead,
# the PID resolution was wrong (common on WSL, Tailscale SSH, cross-user
# scenarios). The server logs 'owner-pid-invalid', disables owner monitoring,
# and continues running. The idle timeout becomes the only shutdown trigger.
echo ""
echo "--- Control: Bad OWNER_PID causes shutdown ---"
echo "--- Dead-at-startup OWNER_PID: server survives, logs owner-pid-invalid ---"
mkdir -p "$TEST_DIR/control"
@@ -272,33 +279,41 @@ BRAINSTORM_HOST="127.0.0.1" \
BRAINSTORM_URL_HOST="localhost" \
BRAINSTORM_OWNER_PID="$BAD_PID" \
BRAINSTORM_PORT=$((49152 + RANDOM % 16383)) \
node "$SERVER_JS" > "$TEST_DIR/control/.server.log" 2>&1 &
node "$SERVER_SCRIPT" > "$TEST_DIR/control/.server.log" 2>&1 &
CONTROL_PID=$!
if ! wait_for_server_info "$TEST_DIR/control"; then
fail "Control server starts" "Server did not write .server-info within 5 seconds"
fail "Control server starts" "Server did not write state/server-info within 5 seconds"
kill "$CONTROL_PID" 2>/dev/null || true
CONTROL_PID=""
else
pass "Control server starts with bad OWNER_PID=$BAD_PID"
pass "Control server starts with dead-at-startup OWNER_PID=$BAD_PID"
echo " Waiting ~75s for lifecycle check to kill server..."
echo " Waiting ~75s to verify server survives past lifecycle check..."
sleep 75
if kill -0 "$CONTROL_PID" 2>/dev/null; then
fail "Control server self-terminates with bad OWNER_PID" \
"Server is still alive (expected it to die)"
kill "$CONTROL_PID" 2>/dev/null || true
pass "Server survives with dead-at-startup OWNER_PID (owner monitoring disabled)"
else
pass "Control server self-terminates with bad OWNER_PID"
fail "Server survives with dead-at-startup OWNER_PID" \
"Server died unexpectedly. Log tail: $(tail -5 "$TEST_DIR/control/.server.log" 2>/dev/null)"
fi
if grep -q "owner-pid-invalid" "$TEST_DIR/control/.server.log" 2>/dev/null; then
pass "Server logs 'owner-pid-invalid' for dead-at-startup PID"
else
fail "Server logs 'owner-pid-invalid' for dead-at-startup PID" \
"Log tail: $(tail -5 "$TEST_DIR/control/.server.log" 2>/dev/null)"
fi
if grep -q "owner process exited" "$TEST_DIR/control/.server.log" 2>/dev/null; then
pass "Control server logs 'owner process exited'"
fail "No spurious 'owner process exited' log" \
"Found 'owner process exited' but owner monitoring should be disabled"
else
fail "Control server logs 'owner process exited'" \
"Log tail: $(tail -5 "$TEST_DIR/control/.server.log" 2>/dev/null)"
pass "No spurious 'owner process exited' log"
fi
kill "$CONTROL_PID" 2>/dev/null || true
fi
wait "$CONTROL_PID" 2>/dev/null || true
@@ -309,16 +324,16 @@ CONTROL_PID=""
echo ""
echo "--- Clean Shutdown ---"
mkdir -p "$TEST_DIR/stop-test"
mkdir -p "$TEST_DIR/stop-test/state"
BRAINSTORM_DIR="$TEST_DIR/stop-test" \
BRAINSTORM_HOST="127.0.0.1" \
BRAINSTORM_URL_HOST="localhost" \
BRAINSTORM_OWNER_PID="" \
BRAINSTORM_PORT=$((49152 + RANDOM % 16383)) \
node "$SERVER_JS" > "$TEST_DIR/stop-test/.server.log" 2>&1 &
node "$SERVER_SCRIPT" > "$TEST_DIR/stop-test/.server.log" 2>&1 &
STOP_TEST_PID=$!
echo "$STOP_TEST_PID" > "$TEST_DIR/stop-test/.server.pid"
echo "$STOP_TEST_PID" > "$TEST_DIR/stop-test/state/server.pid"
if ! wait_for_server_info "$TEST_DIR/stop-test"; then
fail "Stop-test server starts" "Server did not start"