mirror of
https://github.com/obra/superpowers.git
synced 2026-05-05 16:49:04 +08:00
153 lines
6.0 KiB
Markdown
---
name: writing-plans
description: Use when you have a spec or requirements for a multi-step task, before touching code
---

# Writing Plans

## Overview

Write comprehensive implementation plans assuming the engineer has zero context for our codebase and questionable taste. Document everything they need to know: which files to touch for each task, the code, the tests, docs they might need to check, and how to test it. Give them the whole plan as bite-sized tasks. DRY. YAGNI. TDD. Frequent commits.

Assume they are a skilled developer who knows almost nothing about our toolset or problem domain. Assume they don't know good test design very well.

**Announce at start:** "I'm using the writing-plans skill to create the implementation plan."

**Context:** If working in an isolated worktree, it should have been created via the `superpowers:using-git-worktrees` skill at execution time.

**Save plans to:** `docs/superpowers/plans/YYYY-MM-DD-<feature-name>.md`
- (User preferences for plan location override this default)

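
The default save location can be built mechanically from the date and the feature name. A minimal sketch, assuming a hypothetical helper `default_plan_path` and a slug rule (lowercase, runs of other characters collapsed to hyphens) that this skill does not itself prescribe:

```python
import re
from datetime import date
from pathlib import PurePosixPath

# Default directory named in this skill; user preferences override it.
PLANS_DIR = PurePosixPath("docs/superpowers/plans")


def default_plan_path(feature_name: str, today: date) -> PurePosixPath:
    """Build docs/superpowers/plans/YYYY-MM-DD-<feature-name>.md.

    The slug rule is an assumption made for illustration only.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", feature_name.lower()).strip("-")
    if not slug:
        raise ValueError("feature name must contain a letter or digit")
    return PLANS_DIR / f"{today.isoformat()}-{slug}.md"
```

Passing the date in explicitly, rather than reading the clock inside the function, keeps the result reproducible.
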
## Scope Check

If the spec covers multiple independent subsystems, it should have been broken into sub-project specs during brainstorming. If it wasn't, suggest breaking this into separate plans, one per subsystem. Each plan should produce working, testable software on its own.

## File Structure

Before defining tasks, map out which files will be created or modified and what each one is responsible for. This is where decomposition decisions get locked in.

- Design units with clear boundaries and well-defined interfaces. Each file should have one clear responsibility.
- You reason best about code you can hold in context at once, and your edits are more reliable when files are focused. Prefer smaller, focused files over large ones that do too much.
- Files that change together should live together. Split by responsibility, not by technical layer.
- In existing codebases, follow established patterns. If the codebase uses large files, don't unilaterally restructure them, but if a file you're modifying has grown unwieldy, including a split in the plan is reasonable.

This structure informs the task decomposition. Each task should produce self-contained changes that make sense independently.

## Bite-Sized Task Granularity

**Each step is one action (2-5 minutes):**
- "Write the failing test" - step
- "Run it to make sure it fails" - step
- "Implement the minimal code to make the test pass" - step
- "Run the tests and make sure they pass" - step
- "Commit" - step

## Plan Document Header

**Every plan MUST start with this header:**

```markdown
# [Feature Name] Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** [One sentence describing what this builds]

**Architecture:** [2-3 sentences about approach]

**Tech Stack:** [Key technologies/libraries]

---
```

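
Because the header is mandatory, a saved plan can be checked against it before handoff. A rough sketch, assuming a hypothetical `header_problems` function; the checks mirror the template above, but the exact matching rules are illustrative and not part of this skill:

```python
import re

# Field labels copied from the required header template.
REQUIRED_FIELDS = (
    "> **For agentic workers:**",
    "**Goal:**",
    "**Architecture:**",
    "**Tech Stack:**",
)


def header_problems(plan_text: str) -> list[str]:
    """List what is missing from the required plan header, if anything."""
    problems = []
    lines = plan_text.lstrip().splitlines()
    first_line = lines[0] if lines else ""
    if not re.match(r"^# .+ Implementation Plan$", first_line):
        problems.append("title must be '# [Feature Name] Implementation Plan'")
    # Everything before the first '---' rule counts as the header.
    header, separator, _ = plan_text.partition("\n---\n")
    if not separator:
        problems.append("header must end with a '---' rule")
    for field in REQUIRED_FIELDS:
        if field not in header:
            problems.append(f"missing {field}")
    return problems
```

An empty list means the header has the expected shape; it says nothing about whether the Goal or Architecture text is any good.
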
## Task Structure

````markdown
### Task N: [Component Name]

**Files:**
- Create: `exact/path/to/file.py`
- Modify: `exact/path/to/existing.py:123-145`
- Test: `tests/exact/path/to/test.py`

- [ ] **Step 1: Write the failing test**

```python
def test_specific_behavior():
    result = function(input)
    assert result == expected
```

- [ ] **Step 2: Run test to verify it fails**

Run: `pytest tests/path/test.py::test_name -v`
Expected: FAIL with "function not defined"

- [ ] **Step 3: Write minimal implementation**

```python
def function(input):
    return expected
```

- [ ] **Step 4: Run test to verify it passes**

Run: `pytest tests/path/test.py::test_name -v`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add tests/path/test.py src/path/file.py
git commit -m "feat: add specific feature"
```
````

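
The checkbox steps in this task structure make progress easy to tally. A minimal sketch, assuming a hypothetical `step_progress` helper and the usual task-list convention that a finished step is marked `- [x]` (the skill only specifies the unchecked `- [ ]` form):

```python
import re

TASK_RE = re.compile(r"^### Task (\d+): (.+)$")
STEP_RE = re.compile(r"^- \[( |x|X)\] \*\*Step (\d+): (.+)\*\*$")


def step_progress(plan_text: str) -> dict[str, tuple[int, int]]:
    """Return {task title: (checked steps, total steps)} in document order."""
    progress: dict[str, list[int]] = {}
    current = None
    for raw_line in plan_text.splitlines():
        line = raw_line.strip()
        task = TASK_RE.match(line)
        if task:
            current = f"Task {task.group(1)}: {task.group(2)}"
            progress[current] = [0, 0]
            continue
        step = STEP_RE.match(line)
        if step and current is not None:
            progress[current][1] += 1  # count every step
            if step.group(1).lower() == "x":
                progress[current][0] += 1  # count finished steps
    return {title: (done, total) for title, (done, total) in progress.items()}
```

The parser is line-based and does not skip fenced code, so a plan that quotes task headings inside a code block would need a smarter reader.
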
## No Placeholders

Every step must contain the actual content an engineer needs. These are **plan failures**; never write them:
- "TBD", "TODO", "implement later", "fill in details"
- "Add appropriate error handling" / "add validation" / "handle edge cases"
- "Write tests for the above" (without actual test code)
- "Similar to Task N" (repeat the code; the engineer may be reading tasks out of order)
- Steps that describe what to do without showing how (code blocks required for code steps)
- References to types, functions, or methods not defined in any task

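
The phrase-based failures in the list above can be searched for mechanically. A minimal sketch, assuming a hypothetical `find_placeholders` helper; the phrases come from the list, but the regular expressions are illustrative and will miss paraphrases:

```python
import re

# Phrases taken from the "No Placeholders" list; the regexes are illustrative.
PLACEHOLDER_PATTERNS = [
    r"\bTBD\b",
    r"\bTODO\b",
    r"implement later",
    r"fill in details",
    r"add appropriate error handling",
    r"add validation",
    r"handle edge cases",
    r"write tests for the above",
    r"similar to task \d+",
]
PLACEHOLDER_RE = re.compile("|".join(PLACEHOLDER_PATTERNS), re.IGNORECASE)


def find_placeholders(plan_text: str) -> list[tuple[int, str]]:
    """Return (line number, matched phrase) for every red-flag phrase."""
    hits = []
    for lineno, line in enumerate(plan_text.splitlines(), start=1):
        for match in PLACEHOLDER_RE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Treat hits as prompts to reread the step, not as verdicts. The last two items in the list (steps without code, undefined references) are not phrase matches and still need a manual read.
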
## Remember
- Exact file paths always
- Complete code in every step: if a step changes code, show the code
- Exact commands with expected output
- DRY, YAGNI, TDD, frequent commits

## Self-Review

After writing the complete plan, look at the spec with fresh eyes and check the plan against it. This is a checklist you run yourself, not a subagent dispatch.

**1. Spec coverage:** Skim each section/requirement in the spec. Can you point to a task that implements it? List any gaps.

**2. Placeholder scan:** Search your plan for red flags: any of the patterns from the "No Placeholders" section above. Fix them.

**3. Type consistency:** Do the types, method signatures, and property names you used in later tasks match what you defined in earlier tasks? A function called `clearLayers()` in Task 3 but `clearFullLayers()` in Task 7 is a bug.

If you find issues, fix them inline. No need to re-review; just fix and move on. If you find a spec requirement with no task, add the task.

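
The type-consistency check can be given a head start by listing called names that look almost, but not quite, alike. A heuristic sketch, assuming a hypothetical `near_miss_names` helper and an arbitrary similarity threshold of 0.8; it only sees names written with a trailing `(` and cannot tell which spelling was intended:

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Any identifier immediately followed by "(" is treated as a called name.
CALL_RE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\(")


def near_miss_names(plan_text: str, threshold: float = 0.8) -> list[tuple[str, str]]:
    """Return pairs of distinct called names that look suspiciously alike."""
    names = sorted(set(CALL_RE.findall(plan_text)))
    pairs = []
    for first, second in combinations(names, 2):
        if SequenceMatcher(None, first, second).ratio() >= threshold:
            pairs.append((first, second))
    return pairs
```

On the example from item 3, `clearLayers` and `clearFullLayers` score about 0.85 and are reported as a pair. Genuinely distinct names such as `get_user` and `get_users` will be flagged too, so each pair needs a human decision.
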
## Execution Handoff

After saving the plan, offer execution choice:

**"Plan complete and saved to `docs/superpowers/plans/<filename>.md`. Two execution options:**

**1. Subagent-Driven (recommended)** - I dispatch a fresh subagent per task, review between tasks, fast iteration

**2. Inline Execution** - Execute tasks in this session using executing-plans, batch execution with checkpoints

**Which approach?"**

**If Subagent-Driven chosen:**
- **REQUIRED SUB-SKILL:** Use superpowers:subagent-driven-development
- Fresh subagent per task + two-stage review

**If Inline Execution chosen:**
- **REQUIRED SUB-SKILL:** Use superpowers:executing-plans
- Batch execution with checkpoints for review