superpowers

mirror of https://github.com/obra/superpowers.git synced 2026-07-27 12:44:01 +08:00

Author	SHA1	Message	Date
Jesse Vincent	c7900f1698	Close three review blind spots found by defect tracing Live eval deliverables shipped five polish defects; tracing each through the transcripts showed three mechanisms, each now addressed: - reviewers answered pointed checklist items with unsupported yes (evidence rule: every What-to-Check answer needs file:line evidence) - no reviewer ever saw the design's global constraints (controllers now paste binding constraints into task requirements) - test output noise was invisible everywhere (pristine-output checks in implementer self-review and quality review)	2026-06-09 21:19:08 -07:00
Jesse Vincent	5cfdb75b94	Require explicit model on subagent dispatch In live eval runs, controllers given judgment-based model selection stopped passing a model at all; the omitted parameter inherits the session's top-tier model, silently making every subagent maximally expensive (one run dispatched 26/26 reviewers on the session model).	2026-06-09 21:11:45 -07:00
Jesse Vincent	87825ff193	Forbid controllers pre-judging reviewer findings A live eval run of sdd-quality-reviewer-catches-planted-defect caught the SDD controller fabricating a plan constraint and instructing the quality reviewer not to flag the planted DRY violation. The duplication shipped. Constructing Reviewer Prompts now bans suppression directives alongside open-ended broadening directives.	2026-06-09 18:28:24 -07:00
Jesse Vincent	09cb4d7361	Sync plan: escaped pre() pattern in Task 5 checks block	2026-06-09 18:19:00 -07:00
Jesse Vincent	b3bb9a68d7	Fix plan doc: correct Task 1 grep expectation; sync Task 5 story block	2026-06-09 17:21:06 -07:00
Jesse Vincent	71dc271a08	Sync plan's Task 5 blocks with review fixes	2026-06-09 17:13:03 -07:00
Jesse Vincent	5aea3dca31	SDD controller: reviewer prompt budgets, ⚠️ handling, final-review pointer, model judgment	2026-06-09 16:59:05 -07:00
Jesse Vincent	b3281c0227	Implementer prompt: re-run covering tests after fixing review findings	2026-06-09 16:56:28 -07:00
Jesse Vincent	c14c1de552	Scope spec reviewer's Your Job wording to the diff	2026-06-09 16:55:28 -07:00
Jesse Vincent	be8a6269c4	Spec reviewer: judge from the diff, grounded skepticism, ⚠️ verdict channel	2026-06-09 16:53:30 -07:00
Jesse Vincent	da41209243	Use bare placeholder names in quality reviewer prompt body	2026-06-09 16:51:54 -07:00
Jesse Vincent	2cc449b6d4	Make per-task quality reviewer prompt self-contained and task-scoped	2026-06-09 16:47:27 -07:00
Jesse Vincent	f8dcd1ed3d	Add implementation plan for task-scoped review dispatch	2026-06-09 16:42:50 -07:00
Jesse Vincent	4192572d19	Harden review-dispatch spec per adversarial review findings	2026-06-09 16:33:44 -07:00
Jesse Vincent	5da15d7eba	Add design spec: task-scoped review dispatch for SDD	2026-06-09 16:26:00 -07:00
Jesse Vincent	f55642e0dd	Require contributors to disclose authoring environment and target dev Add a mandatory self-identification disclosure (model, harness, harness version, all installed plugins) to the PR template and all three issue templates, and document the requirement in the contributor guidelines. We weigh contributions differently depending on what produced them: content reasoned from documentation is held to a different bar than work grounded in a real session. Also state explicitly, in both CLAUDE.md and the PR template, that all PRs must target the dev branch rather than main.	2026-06-08 22:14:34 -07:00
Drew Ritter	ae1eefb7f9	chore(evals): bump submodule to --scenarios filter (ff3ee83) Adds `run-all --scenarios` for resuming a scenario subset across the Code Assist rate-limit windows. Follows the agy rate-limit fix (79f9963). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 22:46:00 -07:00
Drew Ritter	617168aff5	chore(evals): bump submodule to antigravity rate-limit fix (79f9963) Serialize antigravity against the Gemini Code Assist rate limit (max_concurrency=1), diagnose 429/RESOURCE_EXHAUSTED honestly instead of as auth, fail-fast on a latched window, and tolerant preflight OK match. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 16:27:35 -07:00
Rahul	d7c260a978	fix(brainstorming): cap websocket frame payloads	2026-06-02 11:24:02 -07:00
Drew Ritter	f3f0789c5c	Add shell lint script	2026-06-01 19:48:28 -07:00
Drew Ritter	16a1719988	Tighten Kimi plugin porting coverage	2026-06-01 19:41:58 -07:00
Drew Ritter	c74c22daa7	docs: restore Kimi direct install command	2026-06-01 19:41:58 -07:00
Drew Ritter	773bbf61d6	docs: simplify Kimi README install steps	2026-06-01 19:41:58 -07:00
Drew Ritter	6b76158550	fix: wire Kimi plugin into release metadata	2026-06-01 19:41:58 -07:00
Drew Ritter	7fec40bb55	fix: align Kimi manifest with supported fields	2026-06-01 19:41:58 -07:00
qer	2a8e54735b	feat: add Kimi Code plugin manifest	2026-06-01 19:41:58 -07:00
Matt Van Horn	f776394360	feat(subagent-dev): add TDD RED evidence to implementer report format Add a conditional TDD Evidence field to the implementer report format so controllers can verify RED and GREEN output when TDD was required. The field asks for the command run, relevant RED/GREEN output, and the expected RED failure reason rather than raw full logs. Fixes #994.	2026-06-01 16:15:05 -07:00
Drew Ritter	7301c81b4d	docs(windows): trim polyglot hook implementation copy	2026-06-01 16:07:01 -07:00
dev_Hakaze	9d3e68a5ad	docs(windows): update polyglot hook docs Rewrite the Windows polyglot hook documentation to match the current run-hook.cmd dispatcher and update the porting guide cross-reference. Fixes #1653.	2026-06-01 15:57:30 -07:00
nestorluiscamachopaz	81c3052416	fix: foreground mode saves node PID and clears OWNER_PID on Windows/MSYS2 Verified on real Windows Git Bash: lifecycle test passed 12/12, manual start/stop released the port, and no brainstorm node processes remained.	2026-06-01 14:26:22 -07:00
nawfal	c879454a0d	fix(finishing-a-development-branch): remove gh-specific PR creation instruction Per obra's guidance on #1609: remove the github-specific instruction rather than replacing it with a platform-detection table. Agents already know their forge tooling; the skill only needs to cover the push step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-01 13:58:22 -07:00
nawfal	ff213eb2cf	fix(finishing-a-development-branch): detect remote platform before creating PR/MR Replaces hardcoded `gh pr create` in Option 2 with a platform-neutral note: check `git remote get-url origin` first, then use gh (GitHub), glab (GitLab), or fall back to the compare URL for unknown platforms. Adds matching Red Flag entry so agents don't skip the detection step. Fixes #1609 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-01 13:58:22 -07:00
Jesse Vincent	da00e59958	feat: add Antigravity CLI (agy) support Antigravity (Google's `agy` CLI) installs the existing Superpowers plugin directly: agy plugin install https://github.com/obra/superpowers agy imports the bundled skills and runs the plugin's SessionStart hook, so using-superpowers bootstraps from the first message — verified on agy 1.0.3: a fresh session given "Let's make a react todo list" auto-triggers the brainstorming skill instead of writing code. agy discovers skills natively and, having no Skill tool, loads them by reading SKILL.md with view_file. No scaffold, installer, or generated context file is needed. This adds only: - README.md: an Antigravity install section + Quickstart link - skills/using-superpowers/SKILL.md: reference to the agy tool mapping - skills/using-superpowers/references/antigravity-tools.md: action->tool mapping for agy (view_file, write_to_file, invoke_subagent, manage_task, and skill loading via view_file on SKILL.md) - tests/antigravity/: structural test for the tool mapping, mirroring tests/pi/	2026-06-01 11:42:09 -07:00
Jesse Vincent	deceaec78d	docs: add 'Porting Superpowers to a New Harness' guide An evergreen guide for adding support for a new harness (IDE, CLI, or agent runner). Teaches the invariants — automatic session-start bootstrap, skill discovery/invocation, tool mapping, the acceptance test — and points at the closest reference integration shape (shell-hook, in-process plugin, instructions-file / declared context file) to copy. Covers discovery, build, local install, tmux-driven verification, distribution, and PR submission, with a live reference-integration index and a gotchas appendix. Two non-negotiable rules: (1) never edit skill bodies; (2) everything ships through the harness's own install mechanism — never edit the user's config. When a plugin installer strips undeclared files, declare the bootstrap as a recognized component (a manifest contextFileName-style context file the installer preserves and the harness loads every session), generated at install time from the live SKILL.md + tool mapping. Surfaced-skill-description bootstrap is the softer fallback. Hardened against real end-to-end ports (Antigravity CLI): shapes can compose; a fork doesn't inherit its parent's behavior; a hook system != a usable session-start event; verify @-includes AND context-file preservation with a marker; web-search the docs and study existing plugins; reverse-engineer undocumented harnesses; print/headless modes may hang; workspace-trust gates stall tmux; declared context files survive plugin install while undeclared files are stripped; skills-path registration is per-harness.	2026-06-01 10:07:38 -07:00
Jesse Vincent	e63e44bedf	fix(sync-to-codex-plugin): exclude /.pi/ so the pi extension doesn't leak into the Codex plugin The .pi/ directory holds the pi-harness extension (.pi/extensions/superpowers.ts), which is tracked (not git-ignored), so the git-ignored-path exclusion helpers never caught it. It was also missing from the static EXCLUDES list alongside the other harness dotdirs (.opencode, .cursor-plugin, .claude-plugin), so a sync would rsync pi's files into the Codex plugin distribution. Add /.pi/ to EXCLUDES.	2026-05-29 15:05:38 -07:00
Jesse Vincent	8811b0f2d7	Revert "Make visual-companion.md script paths skill-rooted, not plugin-rooted" This reverts commit `e9f5188289`.	2026-05-23 17:01:46 -07:00
Jesse Vincent	d48bec6cc3	Revert "Probe per-user Git Bash and Scoop before falling back to PATH on Windows" This reverts commit `a8f0738e3a`.	2026-05-23 17:00:15 -07:00
Jesse Vincent	a8f0738e3a	Probe per-user Git Bash and Scoop before falling back to PATH on Windows Stock Windows 10/11 ships C:\Windows\System32\bash.exe (the WSL launcher) as the first match for `where bash`. WSL's bash cannot execute Windows-style script paths, so when Git Bash is installed outside the two standard system locations -- specifically the per-user "Only for me" Git for Windows installer (%LOCALAPPDATA%\Programs\Git) or a Scoop install (%USERPROFILE%\scoop\apps\git\current\usr\bin) -- run-hook.cmd silently fails: WSL prints "Windows Subsystem for Linux must be updated", the script returns 0, and Superpowers' SessionStart bootstrap is never injected. From the user's perspective skills auto-trigger inconsistently or not at all, with no surfaced error. Add explicit probes for both locations between the existing system-wide Git for Windows checks and the `where bash` fallback. Also add a comment to the fallback documenting the WSL-launcher trap so future maintainers understand why the explicit probes must come first. Verified on a Windows 11 VM (dockur/windows 11, Git Bash 2.x, Node 22): - System Git present: existing probe still matches (no regression) - System Git absent, per-user Git present via junction: new probe matches, hook produces valid 6422-byte JSON, exit 0 - All Git probes absent: confirmed WSL trap fires ("Windows Subsystem for Linux must be updated") and the hook exits 0 silently, demonstrating the original bug Existing tests/hooks/test-session-start.sh still passes on macOS (7/7). Reported by @ytchenak in #1607. Co-authored-by: ytchenak <ytchenak@users.noreply.github.com> Closes #1607.	2026-05-23 16:58:56 -07:00
Jesse Vincent	f36bad5b78	Pipe SessionStart hook printf through cat to absorb EPIPE on Windows On Windows + Git Bash, the SessionStart hook prints a confusing diagnostic at every startup ("printf: write error: Permission denied") when Claude Code closes the hook's stdout pipe before the printf has finished writing. The hook still runs to completion and context still gets injected, but the diagnostic surfaces every session because Git Bash's printf reports EPIPE as "Permission denied" (not "Broken pipe" like Linux) and our `set -euo pipefail` lets that error escape. Piping each printf through `cat` makes the external cat process the recipient of any SIGPIPE / EPIPE. cat's failure does not propagate to the parent bash under pipefail because cat is the last command in the pipeline and exits cleanly when the pipe stays open long enough to hold the data. On macOS/Linux the cat passthrough is transparent (no behavior change, no measurable cost). Verified: - Existing tests/hooks/test-session-start.sh: 7/7 pass on macOS - Manual run on Windows 11 + Git Bash 5.2 + Node 22 produces valid JSON, clean stderr, and exit 0 - JSON output is byte-identical to the unpatched hook Reported by @silvertakana in #1612, attribution preserved in the Co-authored-by trailer below — this is the same fix shape the original PR proposed. Co-authored-by: silvertakana <silvertakana@users.noreply.github.com> Closes #1612.	2026-05-23 16:55:46 -07:00
Nick Galatis	21ad401e90	fix(systematic-debugging): defuse Claude Code ultrathink keyword scanner trigger (#1558) The "Signals You're Doing It Wrong" bullet in systematic-debugging/SKILL.md contains the literal token Claude Code's runtime scans for in tool result bodies. Every Skill-tool invocation of this skill caused the harness to inject a spurious system-reminder claiming the user requested deeper reasoning, silently bumping every session into extended thinking. Replace the bullet's spelling so the contiguous letter sequence the scanner matches is broken with a hyphen. The signal text remains recognizable to the agent and the documented action ("Question fundamentals, not just symptoms") is unchanged. Fixes obra/superpowers#1283	2026-05-23 16:51:00 -07:00
Jesse Vincent	e9f5188289	Make visual-companion.md script paths skill-rooted, not plugin-rooted Issue #1134: agents reading visual-companion.md see bare commands like `scripts/start-server.sh`, correctly identify the plugin install directory, then look for `<plugin>/scripts/start-server.sh` instead of `<plugin>/skills/brainstorming/scripts/start-server.sh`. The file doesn't exist at the plugin-rooted path, so the agent concludes the visual companion isn't available and falls back to text-only brainstorming. Multiple independent reproductions in the issue thread, plus one user's agent self-reported: "I assumed the scripts folder was in the root directory of the plugin, it didn't realize it could have been talking about the skill folder itself." Change all `scripts/<file>` references in visual-companion.md to `skills/brainstorming/scripts/<file>`. Agents that correctly identify the plugin root will now join to the right path. Closes #1134.	2026-05-23 16:42:13 -07:00
Jesse Vincent	eef50b96f0	Align windows-lifecycle test with current brainstorm server layout The test had drifted behind three server implementation changes and no longer ran against the actual server: - Server entrypoint renamed from server.js to server.cjs; the test still invoked node on server.js and failed with MODULE_NOT_FOUND. - Server state moved to a state/ subdirectory (state/server-info, state/server.pid); the test still waited on .server-info and wrote .server.pid at the session root. - Owner-PID startup validation now keeps the server running when the owner PID is dead at startup: it logs owner-pid-invalid, disables owner monitoring, and falls back to the idle timeout. The test still expected the server to self-terminate within 60s of a dead-at-startup owner. Update file/path references to match the current server, and rewrite the dead-at-startup test to assert the current behavior: server survives, log contains owner-pid-invalid, log does not contain a spurious "owner process exited" line. Verified locally: 9 passed, 0 failed, 3 skipped (Windows-only).	2026-05-23 16:36:45 -07:00
Jesse Vincent	e1d3f71e0d	Convert curly to square brackets in code-reviewer.md placeholders Matches the style used by the spec-reviewer-prompt.md and code-quality-reviewer-prompt.md call sites, which already use square brackets ([VAR] or [VAR — description]). No semantic change — these placeholders are filled in by the controller; nothing programmatic substitutes them.	2026-05-23 16:14:24 -07:00
Jesse Vincent	b2212dc913	Scope spec reviewer to task diff and make reviewers read-only Two problems with the SDD reviewer prompts on dev: - spec-reviewer-prompt.md never received a git range, so the general-purpose subagent had to crawl the entire codebase to find what changed. Reporter measured 20-33 minute spec reviews on simple tasks (#1538). - Neither reviewer prompt told the subagent that review is read-only. A spec reviewer running `git checkout <parent-sha>` for historical comparison silently detached HEAD on the controller's branch, then subsequent task commits accumulated on the detached HEAD and were effectively orphaned (#1543, reproduced independently in #1543's thread). Add a Git Range to Review section to spec-reviewer-prompt.md that mirrors the one code-reviewer.md already has, plus a Read-Only Review section in both reviewer prompt templates stating the principle: do not mutate the working tree, the index, HEAD, or branch state. Allow inspecting other revisions via a separate temporary worktree, so the read-only rule does not block legitimate historical comparison. Closes #1538. Closes #1543.	2026-05-23 16:14:05 -07:00
Jesse Vincent	180f009090	@mhat reported that his claude got confused about 'debugging' being named as a skill in the bootstrap	2026-05-21 17:23:25 -04:00
Drew Ritter	8c1f7c5dae	Bump superpowers-evals submodule	2026-05-14 16:32:24 -07:00
Drew Ritter	201f945838	[codex] support native Codex plugin hooks (#1540) * docs: specify Codex native hooks parity * docs: refine Codex hooks spec after review * docs: record Codex hook contract spike * docs: plan Codex native hooks implementation * feat: support Codex native plugin hooks * test: add Codex native hook drill coverage * Simplify Codex hook entrypoint	2026-05-14 15:59:38 -07:00
Drew Ritter	49bf5ad6dc	Align Pi mapping with action vocabulary	2026-05-13 17:58:46 -07:00
Drew Ritter	4bd0973879	Bump evals submodule for Pi backend	2026-05-13 17:58:46 -07:00
Jesse Vincent	452f1ed40b	chore: keep pi extension under .pi	2026-05-13 17:58:46 -07:00

526 Commits