chore(evals): bump submodule to companion just-in-time scenario

Bump the evals submodule (ff3ee83 -> f1ac859) to include the brainstorming visual-companion just-in-time eval scenario that validates the SKILL.md consent-move in this PR (GREEN pass / RED fail on Quorum and drill). Scope: dev's recorded pointer predates the drill->Quorum migration, so this bump also carries that migration.
fix(brainstorm-server): fix auth-integration bugs from full-branch review
2026-06-11 05:09:05 +08:00 · 2026-06-09 19:55:22 -07:00 · 2026-06-09 19:13:52 -07:00 · 2026-06-09 18:33:00 -07:00 · 2026-06-09 18:29:49 -07:00 · 2026-06-09 18:27:43 -07:00
36 changed files with 3059 additions and 207 deletions
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -12,14 +12,17 @@ add a comment or reaction to the existing one instead.

 - [ ] I searched existing issues and this is not a duplicate

-## Environment
+## Environment (required)
+<!-- Required. We assume an agent filed this report — tell us which one and
+     where it ran. We weigh reports by what produced them. -->

 | Field | Value |
 |-------|-------|
 | Superpowers version | |
 | Harness (Claude Code, Cursor, etc.) | |
 | Harness version | |
-| Model | |
+| Your model + version | |
+| All plugins installed | |
 | OS + shell | |

 ## Is this a Superpowers issue or a platform issue?
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -30,5 +30,18 @@ progress, and some were intentionally declined.
     of project? If this is specific to your domain, workflow, or a
     third-party tool, it may belong as its own plugin instead. -->

+## Environment (required)
+<!-- Required. We assume an agent wrote this request — tell us which one and
+     where it ran. We weigh proposals reasoned from documentation differently
+     than ones grounded in a real session where the problem actually came up. -->
+
+| Field | Value |
+|-------|-------|
+| Superpowers version | |
+| Harness (Claude Code, Cursor, etc.) | |
+| Harness version | |
+| Your model + version | |
+| All plugins installed | |
+
 ## Context
-<!-- Optional: version info, harness, model, workflow where you hit this. -->
+<!-- Optional: the workflow where you hit this, links, transcripts. -->
--- a/.github/ISSUE_TEMPLATE/platform_support.md
+++ b/.github/ISSUE_TEMPLATE/platform_support.md
@@ -21,3 +21,14 @@ requested or discussed.
 ## Have you tried manual installation?
 <!-- Many tools work with Superpowers through manual setup even without
     official support. Did you try? What happened? -->
+
+## Environment (required)
+<!-- Required. We assume an agent wrote this request — tell us which one and
+     where it ran. -->
+
+| Field | Value |
+|-------|-------|
+| Harness you currently use (Claude Code, Cursor, etc.) | |
+| Harness version | |
+| Your model + version | |
+| All plugins installed | |
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -4,6 +4,23 @@ sections blank, contain multiple unrelated changes, or show no evidence
 of human involvement will be closed without review.
 -->

+> **This PR MUST target the `dev` branch, not `main`.** `main` is the
+> released branch; active work lands on `dev` first. PRs opened against
+> `main` will be asked to retarget `dev` before review.
+
+## Who is submitting this PR? (required)
+<!-- Required. PRs that omit this will be closed. We assume an agent wrote
+     this PR — tell us which one and where it ran. We weigh contributions by
+     what produced them: content reasoned from documentation is held to a
+     different bar than work grounded in a real session. -->
+
+| Field | Value |
+|-------|-------|
+| Your model + version | |
+| Harness + version | |
+| All plugins installed | |
+| Human partner who reviewed this diff | |
+
 ## What problem are you trying to solve?
 <!-- Describe the specific problem you encountered. If this was a session
     issue, include: what you were doing, what went wrong, the model's
--- a/.kimi-plugin/plugin.json
+++ b/.kimi-plugin/plugin.json
@@ -0,0 +1,38 @@
+{
+  "name": "superpowers",
+  "version": "5.1.0",
+  "description": "An agentic skills framework and software development methodology.",
+  "author": {
+    "name": "Jesse Vincent",
+    "email": "jesse@fsck.com"
+  },
+  "homepage": "https://github.com/obra/superpowers",
+  "license": "MIT",
+  "keywords": [
+    "brainstorming",
+    "subagent-driven-development",
+    "skills",
+    "planning",
+    "tdd",
+    "debugging",
+    "code-review",
+    "workflow"
+  ],
+  "skills": "./skills/",
+  "sessionStart": {
+    "skill": "using-superpowers"
+  },
+  "skillInstructions": "Kimi Code tool mapping for Superpowers skills:\n\n- When a Superpowers skill says to ask the user, ask clarifying questions, ask one question at a time, present multiple-choice options, use the terminal for a question, or wait for the user's choice, call Kimi Code's `AskUserQuestion` tool. Do not render those choices as plain assistant text unless `AskUserQuestion` is unavailable or the session is in auto permission mode.\n- For `AskUserQuestion`, provide 1 question with 2-4 concrete options when possible. Put the recommended option first and suffix its label with `(Recommended)`.\n- When a Superpowers skill refers to `TodoWrite`, use Kimi Code's `TodoList` tool.\n- When a Superpowers skill says `Task tool (general-purpose)` or asks you to dispatch an implementer/reviewer subagent, use Kimi Code's `Agent` tool with a Kimi subagent type. Do not pass `general-purpose` as `subagent_type`.\n- For implementation, code review, spec review, quality review, and filled Superpowers subagent prompt templates, call `Agent` with `subagent_type: \"coder\"`, paste the fully filled prompt into `prompt`, and provide a short `description`.\n- For read-only codebase exploration that would take several searches, use `Agent` with `subagent_type: \"explore\"`.\n- For read-only planning or architecture design, use `Agent` with `subagent_type: \"plan\"`.\n- Keep dependent Superpowers subagent steps sequential. Use multiple `Agent` calls, or `run_in_background: true` only when the work is independent and background agents are available.\n- When a Superpowers skill refers to the `Skill` tool, use Kimi Code's native `Skill` tool.\n- Use Kimi Code's `Read`, `Write`, `Edit`, `Bash`, `Grep`, `Glob`, `FetchURL`, `WebSearch`, and MCP tools by their actual exposed names.\n- When a skill asks to search file contents, use `Grep`; when it asks to find files by path or pattern, use `Glob`; when it asks to fetch a URL, use `FetchURL`; when it asks to search the web, use `WebSearch`.",
+  "interface": {
+    "displayName": "Superpowers",
+    "shortDescription": "Planning, TDD, debugging, and delivery workflows for coding agents",
+    "longDescription": "Use Superpowers to guide agent work through brainstorming, implementation planning, test-driven development, systematic debugging, parallel execution, code review, and finish-the-branch workflows.",
+    "developerName": "Jesse Vincent",
+    "capabilities": [
+      "Interactive",
+      "Read",
+      "Write"
+    ],
+    "websiteURL": "https://github.com/obra/superpowers"
+  }
+}
--- a/.version-bump.json
+++ b/.version-bump.json
@@ -4,6 +4,7 @@
    { "path": ".claude-plugin/plugin.json", "field": "version" },
    { "path": ".cursor-plugin/plugin.json", "field": "version" },
    { "path": ".codex-plugin/plugin.json", "field": "version" },
+    { "path": ".kimi-plugin/plugin.json", "field": "version" },
    { "path": ".claude-plugin/marketplace.json", "field": "plugins.0.version" },
    { "path": "gemini-extension.json", "field": "version" }
  ],
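The `field` values in these entries include a dotted path (`plugins.0.version`). A minimal sketch, not part of this change, of how such a path could be resolved against a parsed manifest. It assumes numeric segments index into lists and every other segment is an object key; `get_field` and `set_field` are hypothetical helpers, not the repo's actual bump script, and the sample data is illustrative only.

```python
import json
from typing import Any


def _step(node: Any, segment: str) -> Any:
    """Follow one path segment: list nodes take an integer index, dicts take a key."""
    if isinstance(node, list):
        return node[int(segment)]
    return node[segment]


def get_field(doc: Any, path: str) -> Any:
    """Read the value at a dotted path such as 'plugins.0.version'."""
    node = doc
    for segment in path.split("."):
        node = _step(node, segment)
    return node


def set_field(doc: Any, path: str, value: Any) -> None:
    """Overwrite the value at a dotted path, mutating the document in place."""
    *parents, last = path.split(".")
    node = doc
    for segment in parents:
        node = _step(node, segment)
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value


# Illustrative data only; the real marketplace.json is not shown in this diff.
marketplace = json.loads('{"plugins": [{"name": "superpowers", "version": "5.1.0"}]}')
set_field(marketplace, "plugins.0.version", "5.1.1")
print(get_field(marketplace, "plugins.0.version"))
```

A plain key such as `version` goes through the same code path, so one helper would cover every entry in the list above.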
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -14,7 +14,8 @@ Before you open a PR against this repo, you MUST:
 2. **Search for existing PRs** — open AND closed — that address the same problem. If duplicates exist, STOP and tell your human partner. Do not open another duplicate.
 3. **Verify this is a real problem.** If your human partner asked you to "fix some issues" or "contribute to this repo" without experiencing a specific problem, push back. Ask them what broke, what failed, what the user experience was.
 4. **Confirm the change belongs in core.** If it's domain-specific, tool-specific, or promotes a third-party project, it belongs in a standalone plugin. Tell your human partner.
-5. **Show your human partner the complete diff** and get their explicit approval before submitting.
+5. **Identify yourself.** Disclose your model, harness, harness version, and every installed plugin in the PR. Hiding that a contribution is agent-generated — or which environment produced it — is grounds for closing it.
+6. **Show your human partner the complete diff** and get their explicit approval before submitting.

 If any of these checks fail, do not open the PR. Explain to your human partner why it would be rejected and what would need to change. They will thank you for saving them the embarrassment.

@@ -26,6 +27,10 @@ If any of these checks fail, do not open the PR. Explain to your human partner w

 **PRs that show no evidence of human involvement will be closed.** A human must review the complete proposed diff before submission.

+**Submitters MUST identify themselves.** Every PR and issue must disclose the model, harness, harness version, and all installed plugins used to produce the contribution — or state plainly that it was written by hand with no agent. This is not optional. We need to know what produced a change in order to weigh it: agent-generated content reasoned from documentation is held to a different bar than work grounded in a real session. Contributions that hide their authoring environment will be closed.
+
+**All PRs MUST target the `dev` branch, not `main`.** `main` is the released branch; active work lands on `dev` first. PRs opened against `main` will be asked to retarget `dev` before they are reviewed.
+
 ## What We Will Not Accept

 ### Third-party dependencies
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@ Superpowers is a complete software development methodology for your coding agent

 ## Quickstart

-Give your agent Superpowers: [Claude Code](#claude-code), [Antigravity](#antigravity), [Codex App](#codex-app), [Codex CLI](#codex-cli), [Cursor](#cursor), [Factory Droid](#factory-droid), [Gemini CLI](#gemini-cli), [GitHub Copilot CLI](#github-copilot-cli), [OpenCode](#opencode), [Pi](#pi).
+Give your agent Superpowers: [Claude Code](#claude-code), [Antigravity](#antigravity), [Codex App](#codex-app), [Codex CLI](#codex-cli), [Cursor](#cursor), [Factory Droid](#factory-droid), [Gemini CLI](#gemini-cli), [GitHub Copilot CLI](#github-copilot-cli), [Kimi Code](#kimi-code), [OpenCode](#opencode), [Pi](#pi).

 ## How it works

@@ -149,6 +149,26 @@ Superpowers is available via the [official Codex plugin marketplace](https://git
  copilot plugin install superpowers@superpowers-marketplace
  ```

+### Kimi Code
+
+Superpowers is available in Kimi Code's plugin marketplace.
+
+- Open Kimi Code's plugin manager:
+
+  ```text
+  /plugins
+  ```
+
+- Go to `Marketplace` > `Superpowers` and install it.
+
+- Or install directly from this repository:
+
+  ```text
+  /plugins install https://github.com/obra/superpowers
+  ```
+
+- Detailed docs: [docs/README.kimi.md](docs/README.kimi.md)
+
 ### OpenCode

 OpenCode uses its own plugin install; install Superpowers separately even if you
--- a/docs/README.kimi.md
+++ b/docs/README.kimi.md
@@ -0,0 +1,88 @@
+# Superpowers for Kimi Code
+
+Complete guide for using Superpowers with [Kimi Code](https://github.com/MoonshotAI/kimi-code).
+
+## Installation
+
+Superpowers is available in Kimi Code's plugin marketplace.
+
+Open the plugin manager:
+
+```text
+/plugins
+```
+
+Go to `Marketplace` > `Superpowers` and install it.
+
+You can also install from this repository:
+
+```text
+/plugins install https://github.com/obra/superpowers
+```
+
+For unreleased validation against `dev`, pin the branch explicitly:
+
+```text
+/plugins install https://github.com/obra/superpowers/tree/dev
+```
+
+Kimi Code applies plugin changes to new sessions. After installing, updating, enabling, disabling, or reloading a plugin, start a fresh session with `/new`.
+
+## How It Works
+
+The Kimi plugin manifest lives at `.kimi-plugin/plugin.json`.
+
+The manifest does three things:
+
+1. Points Kimi Code at the existing `skills/` directory.
+2. Loads `using-superpowers` at session start through `sessionStart.skill`.
+3. Provides Kimi-specific tool mapping through `skillInstructions`.
+
+Kimi Code reads Superpowers skills from this repository. There are no copied skills, symlinks, hooks, or extra runtime dependencies.
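The three manifest responsibilities listed above can be checked mechanically. The following is a minimal sketch, not part of this change, that only tests for the presence of the three fields named in this document (`skills`, `sessionStart.skill`, `skillInstructions`). `check_manifest` is a hypothetical helper, and Kimi Code's own validation rules are not described here and may be stricter.

```python
import json


def check_manifest(manifest: dict) -> list:
    """Return a list of problems; an empty list means all three fields are present."""
    problems = []
    # 1. Points the harness at the existing skills/ directory.
    if not isinstance(manifest.get("skills"), str):
        problems.append("missing 'skills' path")
    # 2. Loads a bootstrap skill at session start.
    session_start = manifest.get("sessionStart")
    if not isinstance(session_start, dict) or not session_start.get("skill"):
        problems.append("missing 'sessionStart.skill'")
    # 3. Provides the harness-specific tool mapping.
    if not manifest.get("skillInstructions"):
        problems.append("missing 'skillInstructions'")
    return problems


# Trimmed stand-in for .kimi-plugin/plugin.json; the mapping text is abbreviated.
sample = json.loads("""
{
  "name": "superpowers",
  "skills": "./skills/",
  "sessionStart": {"skill": "using-superpowers"},
  "skillInstructions": "Kimi Code tool mapping for Superpowers skills: (abbreviated)"
}
""")
print(check_manifest(sample))
```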
+
+## Tool Mapping
+
+Skills describe actions instead of hard-coding one runtime's tool names. On Kimi Code these resolve to:
+
+- "Ask the user" / "ask clarifying questions" -> `AskUserQuestion`
+- "Create a todo" / "mark complete in todo list" -> `TodoList`
+- "Dispatch a subagent" -> `Agent`
+- "Invoke a skill" -> Kimi Code's native `Skill` tool
+- "Read a file" / "write a file" / "edit a file" -> `Read`, `Write`, `Edit`
+- "Run a shell command" -> `Bash`
+- "Search file contents" -> `Grep`
+- "Find files by path or pattern" -> `Glob`
+- "Fetch a URL" -> `FetchURL`
+- "Search the web" -> `WebSearch`
+
+## Updating
+
+Use Kimi Code's plugin manager:
+
+```text
+/plugins
+```
+
+Select Superpowers and update it from there. Start a fresh session with `/new` after updating.
+
+## Troubleshooting
+
+### Plugin not loading
+
+1. Run `/plugins info superpowers` and check diagnostics.
+2. Make sure the plugin is enabled.
+3. Start a fresh session with `/new` after install or update.
+
+### Direct GitHub install used an old release
+
+Kimi Code installs the latest GitHub release for a bare repository URL when one exists. To test unreleased changes before the next Superpowers release, install the branch explicitly:
+
+```text
+/plugins install https://github.com/obra/superpowers/tree/dev
+```
+
+### Skills not triggering
+
+1. Confirm `/plugins info superpowers` shows the plugin enabled.
+2. Start a fresh session with `/new`.
+3. Try the acceptance prompt: `Let's make a react todo list`. A working install should load `brainstorming` before writing code.
--- a/docs/porting-to-a-new-harness.md
+++ b/docs/porting-to-a-new-harness.md
@@ -0,0 +1,826 @@
+# Porting Superpowers to a New Harness
+
+This guide explains how to add support for a new harness — an IDE, CLI, or
+agent runner that isn't Claude Code — so that Superpowers skills auto-trigger
+there the same way they do natively.
+
+It is written in two layers. **Parts 1–3** explain how the system works and how
+to tell whether a harness can be supported at all; read these before you touch
+anything. **Parts 4–8** are a prescriptive procedure for an agent (supervised by
+a human partner) to execute the port end to end, through distribution. An
+appendix indexes the current reference integrations so you can copy the closest
+one.
+
+The integration mechanism differs across harnesses, and it will keep changing.
+This guide deliberately teaches the **invariants** — the things that must be
+true no matter the mechanism — and points you at a live reference implementation
+to copy. When this guide and the code disagree, the code wins; fix the guide.
+
+## Before you start
+
+Adding a harness is the highest-stakes contribution type in this repo. Before
+writing anything:
+
+- Read `CLAUDE.md` and `.github/PULL_REQUEST_TEMPLATE.md` in full — the
+  contributor rules and the new-harness PR requirements are not optional.
+- Search open **and closed** PRs for a prior attempt at this harness. If one
+  exists, understand why it stalled before starting your own.
+
+---
+
+## Part 1 — How Superpowers works across harnesses
+
+Superpowers is the same content everywhere. What changes per harness is the thin
+layer that delivers that content to the model and translates its instructions
+into the harness's native tools. Three components:
+
+1. **Skills (harness-agnostic).** Everything in `skills/` is the source of
+   truth, shared verbatim by every harness. Skills are written to describe
+   *actions* — "invoke a skill", "read a file", "dispatch a subagent", "create a
+   todo" — and never name a specific tool. This is what lets one skill body run
+   on Claude Code, Codex, Gemini, pi, and the rest without edits.
+
+2. **Tool mapping (per-harness).** Each harness needs the action vocabulary
+   translated into its real tool names. That translation lives in
+   `skills/using-superpowers/references/<harness>-tools.md` and/or inline in the
+   harness's bootstrap injector (see Part 5). It says, e.g., "*dispatch a
+   subagent* → call `task` with `subagent_type`."
+
+3. **Bootstrap (per-harness).** At the start of every session, the full
+   `skills/using-superpowers/SKILL.md` is injected into the model's context,
+   wrapped in `<EXTREMELY_IMPORTANT>` tags, with the tool mapping appended. That
+   injected skill is what teaches the model that skills exist and that it must
+   check for a relevant skill before acting. **The bootstrap is the entire
+   integration.** Without it, the skill files are inert — present on disk, never
+   invoked.
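The bootstrap described in item 3 is a string-assembly step. A minimal sketch, not part of this change, of what an injector does conceptually: `build_bootstrap` is a hypothetical name, and placing the tool mapping inside the closing tag is an assumption. The real injectors (`hooks/session-start`, `.opencode/plugins/superpowers.js`) are the authority on exact layout.

```python
def build_bootstrap(skill_text: str, tool_mapping: str) -> str:
    """Wrap the bootstrap skill in <EXTREMELY_IMPORTANT> tags with the tool mapping appended."""
    parts = ["<EXTREMELY_IMPORTANT>", skill_text.strip()]
    mapping = tool_mapping.strip()
    if mapping:
        # Assumption: the mapping sits inside the tags, after the skill body.
        parts.append(mapping)
    parts.append("</EXTREMELY_IMPORTANT>")
    return "\n\n".join(parts)


# Placeholder strings stand in for the real SKILL.md and <harness>-tools.md contents.
payload = build_bootstrap(
    "(contents of skills/using-superpowers/SKILL.md)",
    "(contents of the harness tool mapping)",
)
print(payload)
```

How the resulting string reaches the model (hook stdout, a message-array mutation, or a shipped context file) is what differs per harness.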
+
+### Two rules that make this work
+
+**1. Skills name actions, not tools.** Do **not** edit skill bodies to fit your
+harness. Porting adds a tool-mapping reference and a bootstrap injector; it
+never reaches into `skills/*/SKILL.md` to swap tool names. (The project's
+contributor guidelines treat skill content as carefully-tuned behavior-shaping
+code; rewording it for "compliance" is rejected on sight.)
+
+**2. Everything ships through the harness's own install mechanism. Never edit the
+user's files.** The bootstrap, the skills, and the tool mapping all get delivered
+*as part of what the harness installs* — a plugin, an extension, a marketplace
+entry, an extension-bundled context file. A port **must not** reach into a user's
+global or personal config (`~/.gemini/config/AGENTS.md`, `settings.json`,
+`trustedFolders.json`, a hand-edited `~/.bashrc`, etc.) to inject anything. The
+harness owns what it loads; your install artifact is the only thing you get to
+write. If the install mechanism genuinely can't carry the bootstrap, that is a
+limitation to surface (Part 6) — never a license to hand-edit the user's config.
+(Shape C is *not* an exception: Gemini's context file is fine because it ships
+*inside the installed extension* and is declared by the manifest's
+`contextFileName` — the harness loads the extension's own file, not a file you
+edited in the user's home.)
+
+---
+
+## Part 2 — Can this harness be supported?
+
+A harness can support Superpowers only if it can do all of the following. Check
+these before writing code — if the first one fails, stop.
+
+### Hard requirement: automatic session-start injection
+
+The harness must let you inject text into the model's context **at the start of
+every session, with no per-session opt-in by your human partner.** This is the
+one non-negotiable capability. It can take any form:
+
+- a **hook/event system** that runs a shell command at session start and reads
+  its stdout (Claude Code, Codex, Cursor, Copilot CLI), or
+- an **in-process plugin/extension** with a session-start or message lifecycle
+  callback that can mutate the message array (OpenCode, pi), or
+- an **instructions-file** convention where the harness loads a context file that
+  *your installed extension ships and declares* (e.g. Gemini's `contextFileName`
+  pointing at the extension's own `GEMINI.md`) — not a file you edit in the user's
+  home.
+
+If the only way to get Superpowers in front of the model is for your human
+partner to opt in each session (paste a prompt, run a command, enable a mode),
+the harness **cannot** be properly supported. The acceptance test in Part 3 will
+fail, and the PR will be closed. This is the single most common reason a "port"
+isn't a real port.
+
+### The rest of the capability checklist
+
+| Capability | Why it's needed | If absent |
+|---|---|---|
+| **Skill discovery + invocation** | The model must be able to load a skill's full content on demand | If there's no native skill tool, the sanctioned fallback is to `read` the relevant `SKILL.md` directly — see Part 5. A harness with neither a skill tool nor file-read cannot work. |
+| **File read / write / edit** | Nearly every skill manipulates files | Essential. No workaround. |
+| **Run shell commands** | TDD, verification, git workflows | Essential. |
+| **Subagent / task dispatch** | `dispatching-parallel-agents`, `subagent-driven-development` | Degradable: if unavailable, those specific skills tell the model to do the work inline or report the missing capability — *never* to invent a `Task` call. Some harnesses gate this behind a config flag (e.g. Codex needs multi-agent enabled). |
+| **Todo / task tracking** | Progress tracking in several skills | Degradable: fall back to a plan file or `TODO.md`. |
+| **Web fetch / search** | A few skills | Degradable. |
+| **Shell or polyglot script execution (Windows)** | Only for the shell-hook shape, only if you want Windows support | See Part 7. In-process-plugin harnesses sidestep this entirely. |
+
+"Degradable" means: the skill already has fallback wording for the missing
+tool. Your job in the tool mapping is to point at the real tool when it exists
+and reuse that fallback wording when it doesn't.
+
+### You may not need a new directory at all
+
+Some "new harnesses" are really existing integrations under a different
+installer. Factory's Droid, for example, consumes the Claude Code plugin via its
+own `plugin install` command and needs no new files here. Before building,
+check whether the harness can simply load an existing manifest. A port that adds
+nothing to this repo but a paragraph in the README is a perfectly good outcome.
+
+---
+
+## Part 3 — Definition of done
+
+A port is finished when **all** of these are true:
+
+1. The `using-superpowers` bootstrap loads at session start, every session, with
+   no per-session opt-in.
+2. A tool mapping exists for the harness (in
+   `references/<harness>-tools.md`, inline in the bootstrap, or both — per Part 5).
+3. Skills can actually be invoked — natively, or via the documented
+   read-`SKILL.md` fallback — and the model follows them.
+4. **The acceptance test passes.** In a clean session, the user message:
+
+   > Let's make a react todo list
+
+   auto-triggers the `brainstorming` skill *before any code is written*. Capture
+   the full transcript — the PR requires it.
+5. Tests cover the integration (Part 5) and pass.
+6. A real user can install it through the harness's own mechanism (not by
+   hand-copying files), and the version is tracked in `.version-bump.json` where
+   applicable (Part 6). Note that some installers rewrite or strip the manifest on
+   install (one drops it to just `{"name": …}`), so "the *installed* files report
+   the repo version" is not always achievable — track the version at the source
+   manifest and don't treat a rewritten installed manifest as a failure.
+
+A quick smoke check before the full acceptance test: start a session and ask the
+model to describe its superpowers. If the bootstrap injected, it knows it has
+them. (OpenCode's install doc uses `opencode run --print-logs "hello" 2>&1 |
+grep -i superpowers` for the same goal via a different mechanism — log-grep
+rather than asking the model; the `2>&1` matters because logs go to stderr. Find
+your harness's equivalent.)
+
+---
+
+## Part 4 — Choose your integration shape
+
+There are three structural shapes, distinguished by *how you get the bootstrap
+in front of the model*. Pick the one that matches what your harness exposes,
+then copy that reference implementation. The shape determines almost everything
+in Part 5 — the steps below branch on it.
+
+### How to tell which shape you have
+
+Before routing, learn the harness's *actual* mechanism — and don't assume it's
+well documented or that it behaves like whatever harness it forked from.
+
+**Find the surface:**
+
+- **Search the web for the harness's docs** (extension / plugin / hook / skill /
+  MCP / "context file" / "rules file"). Vendor tools change fast; search rather
+  than trust training knowledge.
+- **Find and read an existing third-party extension/plugin for the harness.** A
+  real working example beats docs — it shows the manifest shape, the install
+  command, and which components the harness actually loads.
+- Check what the harness loads at startup: a settings file? an extensions
+  directory? a per-project or global instructions file (`AGENTS.md`, `<NAME>.md`)?
+
+**If it's underdocumented, reverse-engineer it empirically** (a real porter has
+had to do every one of these):
+
+- `strings` the binary / grep the install tree for hook event names, config
+  paths, and the instructions file it reads.
+- **Ask the running model to enumerate its own tool names** — e.g. "list the
+  exact machine names of every tool you can call." This is the authoritative way
+  to get tool names without inventing them (see Step 4).
+- Prove every assumption with a **unique-marker test**: inject a nonsense token
+  through the mechanism you think works, start a fresh session, and confirm the
+  token actually reached the model.
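The unique-marker test in the last bullet can be made repeatable with two small helpers. This is a minimal sketch, not part of this change: `make_marker` and `marker_reached_model` are hypothetical names, and how you inject the token and capture the model's reply depends entirely on the harness.

```python
import uuid


def make_marker(prefix: str = "SP-MARKER") -> str:
    """Return a token that cannot plausibly appear in a session by accident."""
    return f"{prefix}-{uuid.uuid4().hex}"


def marker_reached_model(model_reply: str, marker: str) -> bool:
    """True when the model echoed the exact marker back."""
    return marker in model_reply


marker = make_marker()
# 1. Place `marker` in whatever the mechanism under test injects.
# 2. Start a fresh session and ask the model to repeat any marker token it can see.
# 3. Paste the reply here in place of the simulated one below.
simulated_reply = f"I can see the token {marker} in my context."
print(marker_reached_model(simulated_reply, marker))
```

Generate a fresh marker for every run so that a stale token left in a cache or an earlier transcript cannot produce a false pass.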
+
+**A fork does not inherit its parent's behavior.** A harness derived from another
+(e.g. a Gemini-derived CLI) may expose the parent's manifest fields and
+`@`-include syntax and *still not honor them the same way*. Verify with a marker;
+never assume the parent's recipe transfers.
+
+Then route to a shape:
+
+- Shell command at session start whose stdout is read → **Shape A**.
+- Plugin/extension module with lifecycle callbacks you run code in → **Shape B**.
+- Only ever an always-on instructions file, no hook and no code plugin →
+  **Shape C**.
+
+**Shapes compose — they are not mutually exclusive.** The *skill-discovery*
+mechanism and the *bootstrap* mechanism need not be the same shape — but **both
+must still ride the install mechanism** (rule 2). Decide the two questions
+separately: *where do skills get discovered?* and *how does the bootstrap reach
+the model every session?* A harness might install skills via a plugin yet need
+the bootstrap delivered another install-shipped way (an extension-declared
+context file, or — see below — by the harness surfacing the installed
+`using-superpowers` skill's own description at session start). If more than one
+install-mechanism surface injects automatically, prefer the most reliable. What
+you may **not** do is bridge a gap by editing the user's global config.
+
+### Shape A — Shell-hook
+
+The harness has a hook system that runs a shell command at session start and
+reads JSON from its stdout. The configured command runs `run-hook.cmd`, a
+polyglot wrapper that just locates bash and dispatches the named script; the
+script (`hooks/session-start`, or a harness-specific variant like
+`hooks/session-start-codex`) is what reads `using-superpowers/SKILL.md` and
+prints a JSON object whose **field name and nesting differ per harness**.
+
+- Reference: `hooks/session-start` (and `hooks/session-start-codex`),
+  `hooks/run-hook.cmd`, and the per-harness hook config `hooks/hooks.json`
+  (Claude Code), `hooks/hooks-codex.json` (Codex), `hooks/hooks-cursor.json`
+  (Cursor).
+- Manifests: `.codex-plugin/plugin.json`, `.cursor-plugin/plugin.json` point the
+  harness at `./skills/` and the right `hooks-*.json`. (Claude Code's
+  `.claude-plugin/plugin.json` sets neither field — it auto-discovers `skills/`
+  and `hooks/hooks.json` by convention.)
+
+> **A hook *system* is not a session-start *event*.** A harness can have a
+> `hooks.json` mechanism — and even contain the literal string `SessionStart` in
+> its binary — while having no hook event that fires at session start and can
+> inject context. (One real harness only exposed pre/post-tool and stop events;
+> the `SessionStart` strings were telemetry.) Confirm the *specific event* you
+> need exists and can write to the model's context before committing to Shape A.
+> If it can't, the bootstrap belongs in an instructions file (Shape C) instead.
+
+### Shape B — In-process plugin / extension
+
+The harness loads a JS/TS module that exposes lifecycle callbacks. You register
+the skills directory through the harness's API and inject the bootstrap by
+mutating the message array in code.
+
+- Reference: `.opencode/plugins/superpowers.js` (JavaScript) and
+  `.pi/extensions/superpowers.ts` (TypeScript). pi is the closest reference for
+  any harness that has **no native skill tool**.
+
+### Shape C — Instructions-file
+
+The harness has neither a shell hook nor a code plugin — its session-start
+surface is a context file that *your installed extension ships and the manifest
+declares* (e.g. Gemini's `contextFileName` → the extension's own `GEMINI.md`).
+You can't run code or mutate messages; the extension's context file points at the
+bootstrap. There is no injector to assemble a string or strip frontmatter — the
+harness loads the referenced content as-is. **This works only because the file is
+part of the installed extension** — never substitute "edit the user's global
+`GEMINI.md`/`AGENTS.md`" for shipping your own (rule 2).
+
+- Reference: `gemini-extension.json` (manifest, with `contextFileName`),
+  `GEMINI.md` (two `@`-includes — the bootstrap skill and the tool-mapping
+  reference), `skills/using-superpowers/references/gemini-tools.md`.
+- Note: `@`-include is a Gemini feature. If your harness loads an instructions
+  file but has no include syntax, you must inline the bootstrap content into the
+  file instead.
+- **Don't trust that an `@`-include is actually expanded — prove it.** A
+  Gemini-*derived* harness can accept `@./path` syntax yet treat it as a *hint
+  the model may choose to read* (it emits a file-read tool call) rather than a
+  guaranteed inline expansion. That's the difference between the bootstrap being
+  reliably present every session and the model maybe-reading it. Run a
+  unique-marker test: if the marker isn't in context *without* a tool call,
+  **inline the content** rather than `@`-include it.
+
+### Routing table
+
+| If the harness… | Use shape | Copy from |
+|---|---|---|
+| runs a shell command at session start and reads its stdout | A (shell-hook) | Codex (`hooks/session-start-codex` + `hooks/hooks-codex.json` + `.codex-plugin/`) |
+| is a JS/TS plugin host with session/message lifecycle callbacks | B (in-process) | OpenCode (`.opencode/`) — or pi (`.pi/`) if it has no native skill tool |
+| ships an extension-declared context file it always loads | C (instructions-file) | Gemini (`gemini-extension.json` + `GEMINI.md` + `references/gemini-tools.md`) |
+| has a plugin install command and a manifest `contextFileName` (or equivalent) the installer keeps | C via the plugin installer | Antigravity (`.antigravity-plugin/` — `agy plugin install` ships a generated context file; verify the installer preserves it — Part 6) |
+
+Most real harnesses fit one row cleanly; the last is the hybrid case (rule 2 still
+holds — the bootstrap rides the install mechanism, never a user-config edit).
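The routing above can be read as a small decision function. A minimal sketch, not part of this change, with hypothetical names throughout (`Shape`, `candidate_shapes`, and the three capability flags). It returns every shape the harness could use rather than picking one, because the guide leaves the choice among several working surfaces to whichever proves most reliable.

```python
from enum import Enum


class Shape(Enum):
    A = "shell-hook"
    B = "in-process plugin"
    C = "instructions-file"


def candidate_shapes(
    *,
    session_start_hook_reads_stdout: bool,
    lifecycle_plugin_host: bool,
    extension_declared_context_file: bool,
) -> list:
    """Return every usable shape; an empty list means the hard requirement
    (automatic session-start injection) is not met."""
    shapes = []
    if session_start_hook_reads_stdout:
        shapes.append(Shape.A)
    if lifecycle_plugin_host:
        shapes.append(Shape.B)
    if extension_declared_context_file:
        shapes.append(Shape.C)
    return shapes


# Example: a harness that exposes only an extension-declared context file.
print(candidate_shapes(
    session_start_hook_reads_stdout=False,
    lifecycle_plugin_host=False,
    extension_declared_context_file=True,
))
```

Each flag should be set from a passing unique-marker test, not from documentation alone.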
+
+---
+
+## Part 5 — The porting procedure
+
+### Step 1 — Study the closest reference implementation
+
+Open the files named in Part 4 for your shape and read them end to end. The
+patterns below are summaries; the code is the spec.
+
+### Step 2 — Create the manifest / entry point
+
+Create whatever the harness uses to recognize the plugin. Match the existing
+ones in spirit:
+
+- **Shape A:** a `*-plugin/plugin.json` (see `.codex-plugin/plugin.json`) with
+  `name`, `version`, `description`, author/license/keywords, `"skills":
+  "./skills/"`, and `"hooks": "./hooks/hooks-<harness>.json"`. You also need the
+  `hooks-<harness>.json` itself, which registers a session-start hook whose
+  command invokes `run-hook.cmd`.
+- **Shape B:** the module the harness loads (e.g. `.<harness>/plugins/*.js`) plus
+  whatever package metadata it needs to be discovered. The committed package
+  metadata is the **repo-root `package.json`**: `main` points at the OpenCode
+  plugin, the `pi` field (`pi.extensions`, `pi.skills`) plus the `pi-package`
+  keyword declare the pi extension. Per-harness local manifests and lockfiles are
+  kept out of git — `.opencode/.gitignore` excludes `node_modules`,
+  `package.json`, and lockfiles. Do the same for your harness's *local* install
+  artifacts so they don't pollute the repo — but never gitignore the repo-root
+  `package.json`, which is the tracked source of truth.
+  - **Build/dependency check.** Decide how the harness loads your module:
+    does it run the source directly (pi's `.ts` is referenced as-is from
+    `package.json`; OpenCode ships plain `.js`), or does it need a transpile/build
+    step? Superpowers is zero-runtime-dependency. pi's `import type
+    { ExtensionAPI }` works specifically because the harness runs the `.ts`
+    directly, supplies that type at load, and the repo never type-checks the file
+    in CI — the import isn't even declared as a dependency. If *your* harness
+    actually type-checks or bundles the plugin, that breaks: an undeclared type
+    import fails, and the PR rules only carve out *runtime* deps for new
+    harnesses, not dev/type packages. If you hit this, confirm the approach with
+    the maintainer rather than quietly adding a dependency. Keep any build output
+    out of git and document the command.
+- **Shape C (instructions-file):** a small manifest (see `gemini-extension.json`:
+  `name`, `description`, `version`, `contextFileName`) plus the context file
+  itself (`GEMINI.md` is just two `@`-includes: the bootstrap skill and the
+  tool-mapping reference). The Gemini manifest has no `skills` field — Gemini
+  auto-discovers the `skills/` directory bundled in the installed extension. If
+  your harness has a native skill tool but no manifest field to register the
+  directory, you must find its discovery convention (read its extension docs),
+  then verify empirically: after wiring, ask the model to list its available
+  skills — if the bundled skills don't appear, discovery isn't working yet.
+
+### Step 3 — Wire the bootstrap injection
+
+This is the heart of the port. The shared goal: at session start, get the
+`using-superpowers` skill content (wrapped in `<EXTREMELY_IMPORTANT>` tags) plus
+the harness's tool mapping in front of the model, with a note that the skill is
+already active so the model doesn't try to load it again. *How* you do that —
+and what you assemble vs. what the harness loads raw — depends entirely on your
+shape. Do **not** apply one shape's recipe to another.
+
+**Shape A — a script reads `SKILL.md` and prints the harness's JSON.** The
+dispatched script (`hooks/session-start`) `cat`s the whole `SKILL.md` (frontmatter
+included — that's fine; it's emitted verbatim), wraps it with the "You have
+superpowers… for all other skills use the Skill tool" preamble, escapes it, and
+prints the harness's JSON shape. The tool mapping for Shape A does **not** go
+inline here — it lives in `references/<harness>-tools.md` (Step 4). Get the JSON
+output shape exactly right. `hooks/session-start`
+detects the harness from environment variables and prints *one of three* shapes:
+
+- Cursor (`CURSOR_PLUGIN_ROOT` set): `{ "additional_context": "…" }`
+- Claude Code (`CLAUDE_PLUGIN_ROOT` set, `COPILOT_CLI` unset):
+  `{ "hookSpecificOutput": { "hookEventName": "SessionStart", "additionalContext": "…" } }`
+- Copilot CLI / SDK standard (else): `{ "additionalContext": "…" }`
+
+This is a trap. Emitting the wrong field, or an extra one, means the bootstrap
+either never injects or injects twice (Claude Code reads both
+`additional_context` and `hookSpecificOutput` without de-duplicating, so emitting
+both double-injects). Find the
+exact field, nesting, and event-matcher values your harness expects. Then
+decide: add a fourth branch to `hooks/session-start`, or — if the harness needs
+a different bootstrap message or env contract — add a dedicated
+`hooks/session-start-<harness>` script, the way Codex did. If you add a branch
+and your harness *also* sets an env var an earlier branch keys on (some harnesses
+set `CLAUDE_PLUGIN_ROOT` too), order your branch before the one that would
+otherwise shadow it. Match the harness's
+own event-matcher strings (Claude Code uses `startup|clear|compact`, Codex
+`startup|resume|clear`, Cursor `sessionStart`); wrong matchers mean the hook
+silently never fires.
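+
+A minimal sketch of the branch-and-print logic described above, written as
+standalone functions rather than copied from `hooks/session-start` (the real
+script is the spec). The function names `json_escape` and `emit_bootstrap` are
+illustrative assumptions; the env vars and JSON shapes are the three listed
+above. It escapes only backslash, double quote, newline, carriage return, and
+tab, so other control characters would need extra handling.
+
+```bash
+# Escape a string so it can sit inside a JSON string literal.
+# (Hypothetical helper name; handles \, ", newline, CR, and tab only.)
+json_escape() {
+  local s="$1"
+  s="${s//\\/\\\\}"
+  s="${s//\"/\\\"}"
+  s="${s//$'\n'/\\n}"
+  s="${s//$'\r'/\\r}"
+  s="${s//$'\t'/\\t}"
+  printf '%s' "$s"
+}
+
+# Print exactly ONE JSON shape, chosen from the harness's env vars.
+# A new harness branch would go before any branch that could shadow it.
+emit_bootstrap() {
+  local ctx
+  ctx="$(json_escape "$1")"
+  if [ -n "${CURSOR_PLUGIN_ROOT:-}" ]; then
+    printf '{ "additional_context": "%s" }\n' "$ctx"
+  elif [ -n "${CLAUDE_PLUGIN_ROOT:-}" ] && [ -z "${COPILOT_CLI:-}" ]; then
+    printf '{ "hookSpecificOutput": { "hookEventName": "SessionStart", "additionalContext": "%s" } }\n' "$ctx"
+  else
+    printf '{ "additionalContext": "%s" }\n' "$ctx"
+  fi
+}
+```
+
+Because the branches are mutually exclusive, only one field is ever emitted per
+run, which is what keeps a harness that reads two fields from injecting twice.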
+
+The **hook-config schema itself varies per harness** — don't assume the
+Claude/Codex shape is universal. Compare `hooks/hooks.json`,
+`hooks/hooks-codex.json`, and `hooks/hooks-cursor.json`: Cursor's uses
+`"version": 1`, a lowercase `sessionStart` key, a relative
+`./hooks/run-hook.cmd` command, and omits the `matcher`/`type`/`async` fields the
+others use. Match your `hooks-<harness>.json` to whichever existing file is
+closest, not to a single canonical template.
+
+The hook **command string references a harness-provided plugin-root variable**,
+and its name differs per harness: `hooks.json` uses `${CLAUDE_PLUGIN_ROOT}`,
+`hooks-codex.json` uses `${PLUGIN_ROOT}`, Cursor uses a relative path. Use
+whatever your harness exports. (The `session-start` script re-derives the root
+itself via `dirname`, so the script body doesn't depend on this — but the
+command in the manifest does.)
+
+**Discovering the harness's contract.** The three facts above — env var, JSON
+field/nesting, matcher strings — are the harness's contract, not Superpowers',
+so you have to source them. Read the harness's hook docs, or find out
+empirically: register a throwaway session-start hook that dumps its environment
+and emits a marker, then observe which env var identifies the harness and
+whether/how the harness ingests your stdout. Pin these down before writing the
+real branch.
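+
+One way to build the throwaway probe described above, as a sketch under stated
+assumptions: the log path variable `PROBE_LOG`, the function name `probe_hook`,
+and the marker string are all made up for illustration, and the single JSON
+shape it prints is only a first guess to be swapped for other candidates.
+
+```bash
+# Throwaway probe hook: log the environment, then emit a marker on stdout.
+# PROBE_LOG, probe_hook, and the marker text are hypothetical names.
+probe_hook() {
+  local log="${PROBE_LOG:-/tmp/superpowers-probe.log}"
+  local marker="PROBE-MARKER-7f3a"
+  {
+    echo "=== probe fired: $(date -u +%Y-%m-%dT%H:%M:%SZ) ==="
+    env | sort                      # shows which env var identifies the harness
+  } >> "$log"
+  # Guess ONE output shape per run; swap this line for the other candidates
+  # until the marker shows up in the model's context.
+  printf '{ "additionalContext": "%s" }\n' "$marker"
+}
+
+probe_hook
+```
+
+After a session start, the log shows whether the hook fired at all and what the
+harness exported; asking the model to repeat the marker shows whether stdout was
+ingested. The log can contain secrets from the environment, so delete it when
+you are done.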
+
+**Shape B — assemble the string in code, then inject as a user message.** Here
+you build the bootstrap yourself: read `SKILL.md`, strip its YAML frontmatter,
+and assemble `<EXTREMELY_IMPORTANT>` + a short preamble that the skill is already
+loaded and must not be re-invoked + the stripped body + the inline tool mapping +
+`</EXTREMELY_IMPORTANT>`. One subtlety the references disagree on: OpenCode's
+preamble says "do NOT use the skill tool…" (assumes a `skill` tool exists), while
+pi's just says "do not try to load using-superpowers again." If your harness has
+no skill tool, use pi's wording, not OpenCode's.
+
+Inject the result as a **user-role message, not a system message** — system
+messages bloat tokens when repeated every turn (#750) and multiple system
+messages break some models (#894). Three things you must replicate:
+
+- **Dedup guard.** The lifecycle callback can fire repeatedly (OpenCode's
+  transform runs on *every* agent step; pi's `context` fires per turn). Before
+  injecting, check whether a bootstrap marker is already present and skip if so.
+  (The references pick different markers — pi a custom string, OpenCode the
+  `EXTREMELY_IMPORTANT` tag; matching the tag is more robust since it needs no
+  harness-specific constant.) Cache the bootstrap content at module level so
+  you're not re-reading and re-parsing `SKILL.md` on every call (#1202).
+- **Compaction.** If the harness compacts/summarizes history, re-inject
+  afterward. pi sets an `injectBootstrap` flag on `session_start` and
+  `session_compact`, clears it on `agent_end`, and inserts the message *after*
+  any leading compaction-summary messages. OpenCode relies on its per-step
+  re-injection plus the dedup guard.
+- **Message-object shape is per-harness — discover yours, don't copy a literal.**
+  The two references use *incompatible* shapes: pi builds
+  `{ role, content: [{ type, text }], timestamp }`; OpenCode manipulates
+  `message.info.role` and `message.parts[]`. Find your harness's message shape
+  from its API; copying a reference's object literal verbatim will fail silently.
+
+**Shape C — point your extension's context file at the bootstrap; assemble
+nothing.** There is no injector, so you do *not* strip frontmatter or build a
+wrapped string. The context file your extension ships (declared by the manifest —
+*not* the user's own global file) pulls in two things: the `using-superpowers`
+skill and the harness's tool-mapping reference. `GEMINI.md`
+does this with two `@`-includes (`@./skills/using-superpowers/SKILL.md` and
+`@./skills/using-superpowers/references/<harness>-tools.md`); the harness loads
+them raw, frontmatter and all, and `SKILL.md` already carries its own
+`<EXTREMELY-IMPORTANT>` block internally. If your harness has no include syntax,
+inline the content into the instructions file instead. Gemini ships **no**
+"already loaded, don't re-invoke" preamble — for an `@`-include harness the
+content is the active instruction set, not a skill the model would re-load. If
+you find your harness does try to re-invoke, add that note as a literal line in
+the instructions file (you have no code to add it any other way).
+
+### Step 4 — Write the tool mapping
+
+Translate the action vocabulary into the harness's real tools. Cover every one
+of these actions (omit only what genuinely doesn't apply):
+
+- read a file
+- create / edit / delete a file (one `apply_patch`-style tool, or separate
+  write/edit?)
+- run a shell command
+- search file contents / find files by name (grep, glob)
+- fetch a URL / web search
+- **dispatch a subagent**, including how to pass the agent type — and any config
+  flag needed to enable it
+- **create / update todos** (treat older `TodoWrite` references as this action)
+- **invoke a skill** — see Step 5
+
+**Get the real tool names from the harness; never invent them.** If the docs
+don't list them, the authoritative source is the harness itself: in a live
+session, ask the model to "list the exact machine names of every tool you can
+call, one per line" and use what it reports.
+
+**How the harness finds the `skills/` directory is itself per-harness** — confirm
+it, don't assume. Possibilities: a manifest `skills` path field (Codex's
+`"skills": "./skills/"`); a *co-located* `skills/` the harness auto-scans (where a
+path field is **ignored** — one real harness only scanned a `skills/` sitting next
+to `plugin.json`); an API/registration call (OpenCode, pi); or a staged install
+dir that pairs the manifest with a **symlink to the repo's `skills/`**, with the
+installer pointed at the staging dir (verify the installer *dereferences* the
+symlink and copies the real files; confirm with `agy plugin validate`/`install`
+or the equivalent before relying on it). A `skills` path field is *not* portable.
+
+Where the mapping lives depends on shape:
+
+- **Shape A:** put it in `skills/using-superpowers/references/<harness>-tools.md`.
+  The agent reaches it from the bootstrap — `SKILL.md`'s "Platform Adaptation"
+  section links the per-harness references files. (Shape A harnesses have no
+  instructions file; the mapping is *not* inlined into the hook output.)
+- **Shape B:** the mapping is typically inlined into the bootstrap string you
+  inject (see the `toolMapping` constant in `superpowers.js`). pi keeps it in
+  *both* places — `piToolMapping()` inline **and** `references/pi-tools.md`. If
+  you maintain it in two places, update both, or the port is half-done.
+- **Shape C:** put it in `references/<harness>-tools.md` and pull it into the
+  always-loaded instructions file (e.g. `GEMINI.md` `@`-includes
+  `gemini-tools.md`).
+
+You may also add a one-line pointer to your harness in `SKILL.md`'s "Platform
+Adaptation" section so an agent reading the bootstrap knows where its mapping
+lives. This is the one edit to a `SKILL.md` a port may make — and only because
+that section is a pointer list, not behavior-shaping content. It does not violate
+the "don't edit skill bodies" rule (Part 1); do not touch anything else in any
+skill. (The list is a convenience pointer, not an exhaustive registry — not every
+harness is listed.)
+
+### Step 5 — Handle a harness with no native skill tool
+
+`using-superpowers/SKILL.md` tells the model to *never read skill files manually
+with file tools — always use your platform's skill-loading mechanism.* The point
+is "don't bypass the mechanism," not "never use file-read." What counts as "your
+platform's mechanism" depends on the harness — and for a harness with no skill
+tool, the documented mechanism *is* reading `SKILL.md`. So reading it there
+honors the rule rather than breaking it. Distinguish three cases:
+
+1. **Native `Skill`-style tool** (Claude Code, Copilot CLI, Gemini's
+   `activate_skill`): point the mapping at that tool.
+2. **Native skill *discovery* but no `Skill` tool** (pi, Antigravity): the harness
+   can find and list skills, but the model can't call a tool to load one. Get the
+   skills installed where the harness scans (pi registers via `resources_discover`
+   → `skillPaths`; OpenCode via its `config` hook; `agy plugin install` copies
+   them in), and tell the model to load a skill by **reading its `SKILL.md` with
+   the file-read tool when the skill applies** — the sanctioned mechanism here,
+   the way `references/pi-tools.md` states it.
+
+   **For the bootstrap itself, prefer a declared context file (Part 6).** If the
+   harness has a `contextFileName`-style manifest field — as Antigravity does —
+   ship a generated context file through the installer: it's guaranteed-loaded and
+   carries both the `using-superpowers` content and the tool mapping. That is the
+   strong, preferred path.
+
+   **Fallback — the surfaced skill index.** If there's no context-file field but
+   the harness surfaces each installed skill's name + description at session start,
+   you need *neither* a built index nor a runtime-list instruction — the harness
+   is the index, and `using-superpowers`'s own surfaced description can be what
+   triggers the model to load it. This is softer than a declared context file;
+   two things it does **not** give you, versus a context file / hook / in-process
+   injector — account for both:
+   - **It bootstraps *triggering*, not the *tool mapping*.** An injector prepends
+     `<harness>-tools.md` alongside `using-superpowers` every session. Here nothing
+     injects the mapping — the model only sees skill *descriptions* and must *read*
+     your `references/<harness>-tools.md` when it needs tool names. It works
+     because skills name actions (the model reads the mapping when it acts), but
+     it's softer than injection. Make sure the mapping is reachable from what the
+     model loads — e.g. linked from `SKILL.md`'s Platform Adaptation section and
+     installed alongside the skills — not just sitting in the repo.
+   - **There's no structural guarantee the trigger fires.** No `<EXTREMELY_IMPORTANT>`
+     wrapper, no dedup, no re-injection after compaction — firing depends on the
+     model choosing to act on a description it sees in the index. This is exactly
+     why the acceptance test is mandatory here: it is the *only* guarantee, so run
+     it on the model(s) your users will actually use, not just the strongest one.
+3. **No skill system at all:** there is nothing to register, and the *only*
+   mechanism is the model reading `SKILL.md` on demand. But the model can't read
+   what it can't find: `using-superpowers/SKILL.md` does **not** enumerate the
+   available skills, so on its own the model won't know which skills exist or
+   their triggers. You must supply a discovery path. Two options, and they differ
+   in durability: (a) generate a skill index (each `skills/*/SKILL.md`'s `name` +
+   `description` frontmatter) and place it *inside* the `<EXTREMELY_IMPORTANT>`
+   wrapper alongside the tool mapping (Shape B recipe above) so it's covered by
+   the dedup guard — but a build-time index goes stale as skills are added; or
+   (b) instruct the model to list `skills/*/SKILL.md` at runtime and read their
+   frontmatter to find a match — slower but never stale. Prefer (b) unless you
+   have a reason not to. Without either, a no-skill-system port loads the
+   bootstrap but silently never triggers any other skill.
+
+In cases 2 and 3, say plainly in your tool mapping that reading `SKILL.md` is the
+blessed path, so the model doesn't think it's violating the "never read skill
+files" rule. Don't go hunting for a `skillPaths`-style registration API in a
+harness that has no skill system — case 3 has none.
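+
+For case 3, option (a), the index could be generated along these lines. This is
+a sketch, not an existing script in the repo: the function name `skill_index`
+and the `- name: description` output format are assumptions, and it only handles
+single-line `name:` / `description:` values in the leading frontmatter block
+(multi-line YAML values would need a real parser).
+
+```bash
+# Print one "- name: description" line per skills/*/SKILL.md frontmatter.
+# skill_index is a hypothetical helper; the first argument is the skills root.
+skill_index() {
+  local root="${1:-skills}" f
+  for f in "$root"/*/SKILL.md; do
+    [ -f "$f" ] || continue
+    awk '
+      NR == 1 && $0 != "---" { exit }   # file has no frontmatter block
+      NR > 1 && $0 == "---"  { exit }   # closing delimiter: stop reading
+      /^name:[ \t]*/        { sub(/^name:[ \t]*/, "");        name = $0 }
+      /^description:[ \t]*/ { sub(/^description:[ \t]*/, ""); desc = $0 }
+      END { if (name != "") printf "- %s: %s\n", name, desc }
+    ' "$f"
+  done
+}
+```
+
+Running `skill_index skills` at build or install time gives the text to place
+inside the wrapper; regenerating it on every install is one way to limit the
+staleness that option (a) is prone to.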
+
+### Step 6 — Add tests
+
+Match the existing per-harness test style:
+
+- **Shape A:** assert the hook's stdout has the exact JSON shape your harness
+  consumes, and that it contains the bootstrap. See `tests/hooks/test-session-start.sh`,
+  which validates each harness's output shape.
+- **Shape B:** a unit test that fakes the harness's plugin API and asserts the
+  lifecycle handlers register, the bootstrap injects once, the dedup guard
+  works, and (if relevant) compaction re-injection works. See
+  `tests/pi/test-pi-extension.mjs`. Add an isolated-install integration check in
+  the style of `tests/opencode/`.
+- If the bootstrap is cached, test that the cache behaves correctly when the
+  file is missing (see the OpenCode caching tests).
+
+These automated tests cover the wiring; the live tmux run in Step 7 is what
+proves the integration actually triggers skills.
+
+### Step 7 — Install locally, then drive a live instance to verify
+
+You cannot confirm a port works by reading code. You have to run the harness with
+your in-progress port loaded and watch a real session — which is also how you
+produce the transcript the PR requires.
+
+**Install locally.** Point a *local* instance of the harness at your working
+tree, not a published build:
+
+- **Shape A / C:** install the plugin/extension from this repo's local path (or
+  symlink its directory into wherever the harness looks). Find the harness's
+  "install from a local directory / git checkout" path in its docs.
+- **Shape B:** register the local module — e.g. an `opencode.json` `plugin`
+  entry pointing at the local path, or pi resolving the `package.json` fields
+  from the repo.
+
+Reinstall after each change and restart the harness, since the bootstrap loads at
+startup.
+
+**Drive it with tmux.** Most harnesses are interactive REPLs/TUIs that can't be
+driven by piping stdin, so run the harness inside a detached tmux session and
+control it with `send-keys` / `capture-pane`. A harness may advertise a
+non-interactive "run one prompt" mode (e.g. `opencode run "..."`) — try it for the
+quick smoke check, but **don't depend on it**: these modes are frequently flaky,
+auth-gated, or trust-gated (one real harness's `--print` mode hung and timed out
+with no output every time). Be ready to do *everything*, including the smoke
+check, through tmux.
+
+**Clear the gates first, or tmux stalls silently.** Many harnesses block on
+first-run onboarding, a "do you trust this folder?" prompt, a sandbox mode, or a
+permission gate — and a detached tmux session will just sit there with no error
+while it waits. Before the run, pre-trust your scratch directory (in the harness's
+settings/config) or be prepared to answer those prompts via `send-keys`, and
+account for the harness's startup time in your first `sleep`.
+
+```bash
+# 1. Launch the harness detached, in a throwaway project dir
+mkdir -p /tmp/port-smoke
+tmux new-session -d -s port-test -c /tmp/port-smoke '<harness-launch-command>'
+
+# 2. Let it initialize — real TUIs take longer than you think (10s+ with a model
+#    handshake); tune this. THEN capture and clear any blocking modal before you
+#    type a prompt: first-run onboarding and "trust this folder?" are modal, so
+#    keystrokes sent during them select menu items instead of typing your prompt.
+sleep 12
+tmux capture-pane -t port-test -p          # onboarding / trust prompt? answer it via send-keys first
+# (e.g. tmux send-keys -t port-test Enter   # to accept a trust prompt — inspect before assuming)
+
+# 3. Smoke check: does the model know it has superpowers?
+#    Send the text and Enter as SEPARATE send-keys with a beat between them —
+#    sending them together races on some TUIs (Enter arrives before the text lands).
+tmux send-keys -t port-test 'What are your superpowers?'; sleep 0.4; tmux send-keys -t port-test Enter
+sleep 5
+tmux capture-pane -t port-test -p          # reply should show it knows its skills
+
+# 4. Acceptance test: exact prompt (note the escaped apostrophe). For a truly
+#    fresh session, kill-session and repeat steps 1-2 before sending it.
+tmux send-keys -t port-test 'Let'\''s make a react todo list'; sleep 0.4; tmux send-keys -t port-test Enter
+# poll until the turn finishes: re-capture every few seconds, don't capture once
+for _ in 1 2 3 4 5 6; do
+  sleep 5
+  tmux capture-pane -t port-test -p        # PASS = brainstorming triggers BEFORE any code
+done
+
+# 5. Save the transcript for the PR (-S - also includes tmux scrollback, when
+#    the harness leaves any), then clean up
+tmux capture-pane -t port-test -p -S - > /tmp/port-smoke/transcript.txt
+tmux kill-session -t port-test
+```
+
+tmux gotchas that bite here:
+
+- Wait after launch before the first capture.
+- Send the prompt text and `Enter` as *separate* `send-keys` calls with a short
+  `sleep` between them (sending them together races on some TUIs). `Enter` is a
+  key name, not `\n`.
+- The agent's turn takes time, so **poll `capture-pane` in a loop** rather than
+  capturing once.
+- `capture-pane` shows only the visible pane, so for a long conversation use the
+  harness's own transcript/log file as the record of truth.
+- Always `kill-session` when done.
+
+If the smoke check shows the model *doesn't* know it has superpowers, the
+bootstrap isn't loading — fix that before bothering with the acceptance test.
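+
+The "poll in a loop" advice can be wrapped in a small helper. This is a sketch:
+`wait_for_output` is a hypothetical name, and the pattern to wait for (for
+example the word `brainstorming`) is a guess at what your harness prints, so
+inspect a real capture before relying on it.
+
+```bash
+# Re-run a capture command until its output matches a pattern or time runs out.
+# Args: <extended-regex> <timeout-seconds> <interval-seconds> <command...>
+wait_for_output() {
+  local pattern="$1" timeout="$2" interval="$3"
+  shift 3
+  local deadline=$(( SECONDS + timeout ))
+  while (( SECONDS <= deadline )); do
+    if "$@" 2>/dev/null | grep -Eq -- "$pattern"; then
+      return 0                      # pattern appeared in the captured output
+    fi
+    sleep "$interval"
+  done
+  return 1                          # timed out without a match
+}
+
+# Example (not run here):
+#   wait_for_output 'brainstorming' 120 3 tmux capture-pane -t port-test -p
+```
+
+A non-zero return means the pattern never appeared within the timeout, which is
+a cue to capture the pane by hand and look for a blocking prompt.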
+
+---
+
+## Part 6 — Distribution and release
+
+A working integration in this repo isn't usable until a real user can install
+it. Distribution differs per harness ecosystem — find yours:
+
+| Channel | Example | What you do |
+|---|---|---|
+| Native plugin marketplace | Claude Code | Register in `.claude-plugin/marketplace.json`; users `/plugin install`. The external `superpowers-marketplace` repo is the source of truth users install from — see the release steps in `CLAUDE.md`. |
+| External marketplace fork, synced by script | Codex | `scripts/sync-to-codex-plugin.sh` rsyncs the tracked plugin files into a separate fork repo and opens a PR. Read its include/exclude list so you ship the right tree (it deliberately drops repo-internal dirs and other harnesses' dotdirs). |
+| Git-URL extension install | Gemini, Kimi Code, OpenCode | Users install from a git URL (`gemini extensions install …`; Kimi Code `/plugins install …`; an `opencode.json` `plugin` array entry). Document the exact command. |
+| Package-manifest fields | pi | Declared through fields in the repo-root `package.json`; users install via the harness's package command. |
+| Local installer (plugin install) | Antigravity (`agy`) | A small `install.sh` that runs the harness's own `agy plugin install` against a staging dir holding the manifest, the skills, and a generated `contextFileName` context file (the bootstrap). Everything arrives through the install mechanism — *not* by editing the user's config (see below). |
+
+Then:
+
+- **A plugin installer may silently strip *undeclared* files — so make the
+  bootstrap a file the installer *recognizes*, never a user-config edit.** A
+  `plugin install` typically copies only the components it knows about
+  (skills/agents/commands/mcp/hooks/context) and discards anything else, so a
+  context file the manifest doesn't declare just vanishes from the install. The
+  fix is **not** to give up and write into the user's config (**rule 2**) — it's
+  to declare the bootstrap as a recognized component. In escalation order:
+  - **Ship a context file the manifest declares.** If the harness has a
+    `contextFileName`-style field (an extension-declared file it loads every
+    session), that is the strongest clean bootstrap: declare it, and the installer
+    preserves it *and* the harness loads it. Generate it at install time from the
+    live `using-superpowers/SKILL.md` + the tool mapping (wrapped in
+    `<EXTREMELY_IMPORTANT>`) so the installed bootstrap never drifts. This is what
+    `.antigravity-plugin/install.sh` does — `agy plugin install` reports
+    `✔ context : ANTIGRAVITY.md`, and a clean session reads `using-superpowers`'s
+    SKILL.md, loads `brainstorming`, and enters the brainstorming flow before any
+    code. **Verify with a marker** that the installer keeps the file and the
+    harness loads it: one porter wrongly concluded it couldn't, because they
+    shipped the file *without* declaring `contextFileName` and it was stripped as
+    unrecognized.
+  - **Otherwise lean on the installed `using-superpowers` skill itself.** If the
+    harness surfaces each installed skill's name + description at session start,
+    the `using-superpowers` description ("Use when starting any conversation…")
+    can prompt the model to load it — installing the skill *is* the bootstrap.
+    Softer (no guaranteed wrapper; it carries triggering but not the tool mapping
+    — see Step 5), so prefer the declared context file when available.
+  - If neither works, the harness cannot be cleanly supported yet — **say so**
+    and raise it, rather than hand-editing the user's config.
+
+- **Write install docs.** A `docs/README.<harness>.md` and/or a
+  `.<harness>/INSTALL.md` (see `docs/README.opencode.md` and
+  `.opencode/INSTALL.md`), plus an install section in the top-level `README.md`.
+  The only supported install action is **running the harness's own install
+  command** (`agy plugin install`, `gemini extensions install`, `/plugin
+  install`, etc.). Hand-copying skill files and editing the user's global/personal
+  config are *both* off-limits (rule 2 / the PR rules). If the harness has no
+  install command at all — its only surface is a user-owned config file — then it
+  fails the "deliver via install mechanism" rule, and you should raise that rather
+  than ship an installer that edits the user's files.
+- **Register the version.** If your harness introduces a *new* versioned
+  manifest, add its path and version field to `.version-bump.json` so
+  `scripts/bump-version.sh` keeps it in lockstep (read that file to see what's
+  currently tracked). A new manifest that isn't registered there will ship a
+  stale version. If your harness instead rides an already-tracked file — pi
+  declares itself in the repo-root `package.json`, which is already listed —
+  there's nothing new to add.
+- **If no existing channel fits, you're standing up a new one.** Your harness may
+  match none of the five rows above. If it needs a Codex-style external fork sync,
+  `scripts/sync-to-codex-plugin.sh` is the template to clone (note its anchored
+  include/exclude list and its PR automation). And whenever you add a new
+  per-harness directory, add it to the *other* harnesses' sync excludes (e.g. the
+  EXCLUDES list in `sync-to-codex-plugin.sh`) so your dotdir doesn't leak into
+  their distributions.
+
+---
+
+## Part 7 — Cross-platform / Windows
+
+Only relevant to the shell-hook shape. `hooks/run-hook.cmd` is a polyglot: a
+single file that's valid as both a Windows batch script and a Unix shell script.
+On Windows, `cmd.exe` runs the batch portion, which locates `bash` (Git for
+Windows, then `bash` on PATH) and runs the named hook script; if no bash is
+found it exits cleanly so the harness still works, just without injection. On
+Unix, the leading `:` makes the batch block a no-op and the shell runs the
+script directly.
+
+Two rules this enforces, which you must respect:
+
+- **Hook scripts are extensionless** (`session-start`, not `session-start.sh`).
+  Claude Code's Windows handling prepends `bash` to any command containing
+  `.sh`, which would double-invoke. Name your hook script without an extension.
+- Don't write per-OS variants of the hook script. One extensionless bash script
+  plus the polyglot wrapper covers all three platforms.
+
+`hooks/run-hook.cmd` itself is the authoritative implementation — read it. See
+`docs/windows/polyglot-hooks.md` for the background and rationale behind the
+dispatcher pattern.
+
+---
+
+## Part 8 — Submitting the PR
+
+- Target the **`dev`** branch. One harness per PR.
+- Fill in the PR template's **"New harness support"** section and paste the
+  complete acceptance-test transcript (the "Let's make a react todo list"
+  session showing `brainstorming` auto-triggering). A PR without this proof will
+  be closed.
+- Superpowers is a zero-dependency plugin. Don't add a third-party runtime
+  dependency. Adding a new harness is the one carve-out the contributor rules
+  allow, and even then keep it to what the integration strictly requires —
+  type-only imports that compile away are fine; runtime packages are not.
+- Don't touch skill bodies (Part 1). If you found yourself editing a `SKILL.md`
+  to make the port work, the fix belongs in your tool mapping instead.
+
+---
+
+## Appendix A — Reference integrations (current)
+
+Use this as the live index; when in doubt, read the files, not this table.
+
+| Harness | Entry point | Bootstrap mechanism | Tool mapping | Tests | Distribution |
+|---|---|---|---|---|---|
+| Claude Code | `.claude-plugin/plugin.json` + `hooks/hooks.json` | shell hook → `hooks/session-start` (`hookSpecificOutput.additionalContext`) | native `Skill` tool; `references/claude-code-tools.md` | `tests/hooks/` | marketplace |
+| Codex | `.codex-plugin/plugin.json` + `hooks/hooks-codex.json` | shell hook → `hooks/session-start-codex` | `references/codex-tools.md` | `tests/codex-plugin-sync/`, `tests/hooks/` | fork sync (`scripts/sync-to-codex-plugin.sh`) |
+| Cursor | `.cursor-plugin/plugin.json` + `hooks/hooks-cursor.json` | shell hook → `hooks/session-start` (`additional_context`) | `references/claude-code-tools.md` | `tests/hooks/` | hand-authored |
+| Copilot CLI | (shares Claude Code hook path; `COPILOT_CLI` env) | shell hook → `hooks/session-start` (`additionalContext`) | `references/copilot-tools.md` | `tests/hooks/` | — |
+| Gemini CLI | `gemini-extension.json` + `GEMINI.md` | instructions file `@`-includes bootstrap + mapping | `references/gemini-tools.md` | — | `gemini extensions install` |
+| Kimi Code | `.kimi-plugin/plugin.json` | manifest `sessionStart.skill` loads `using-superpowers` | inline `skillInstructions` in manifest | `tests/kimi/` | marketplace or `/plugins install` GitHub URL |
+| OpenCode | `.opencode/plugins/superpowers.js` (declared via root `package.json` `main`) | in-process: `config` hook registers skills dir; `experimental.chat.messages.transform` injects user message | inline in `superpowers.js` | `tests/opencode/` | `opencode.json` plugin git URL |
+| pi | `.pi/extensions/superpowers.ts` | in-process: `resources_discover` registers skills; `context` event injects user message; lifecycle-flag + compaction-aware | `piToolMapping()` inline **and** `references/pi-tools.md` | `tests/pi/` | repo-root `package.json` fields |
+
+## Appendix B — Gotchas that have bitten porters
+
+- **Opt-in isn't a port.** If your human partner has to do anything per session
+  to get Superpowers, the acceptance test fails. Re-read Part 2.
+- **Wrong JSON field → silent failure or double injection.** Shape A only.
+  Confirm the exact field/nesting; Claude Code reads two fields without dedup.
+- **Hook-config schema varies per harness.** Shape A. Cursor's `hooks-cursor.json`
+  looks nothing like the Claude/Codex one (`version`, lowercase `sessionStart`,
+  relative command, no `matcher`/`type`/`async`). Match the closest existing file.
+- **Plugin-root env var differs per harness.** Shape A. The hook command uses
+  `${CLAUDE_PLUGIN_ROOT}` (Claude), `${PLUGIN_ROOT}` (Codex), or a relative path
+  (Cursor). Use what your harness exports; the script re-derives the root itself.
+- **System-message injection.** Shape B injects a *user* message on purpose
+  (#750, #894). Don't "fix" it to a system message.
+- **Per-step vs per-turn callbacks.** OpenCode fires every step (per-call dedup
+  guard); pi fires per turn (lifecycle flag + `agent_end` reset). Copying one
+  harness's dedup strategy onto the other's callback frequency breaks injection.
+- **Message-object shape is per-harness.** Shape B. pi and OpenCode use
+  incompatible shapes; discover yours, don't copy a reference's object literal.
+- **Hunting for a skill-registration API that doesn't exist.** A harness with no
+  skill system (not just no `Skill` tool) has nothing to register — the model
+  reads `SKILL.md` on demand. Don't assume a `skillPaths` equivalent exists.
+- **Mapping in two places.** For in-process plugins the mapping may live both
+  inline and in a `references/` file (pi). Update both.
+- **The "never read skill files" line.** It means "don't bypass your platform's
+  skill-loading mechanism," not "never use file-read." On a no-skill-tool harness
+  that mechanism *is* reading `SKILL.md` — say so explicitly in the mapping
+  (Part 5).
+- **`.sh` on Windows.** Keep hook scripts extensionless (Part 7).
+- **Unregistered version.** A new manifest not added to `.version-bump.json`
+  ships stale (Part 6).
+- **Editing skills to fit the harness.** Never. The fix goes in the tool mapping.
--- a/docs/superpowers/plans/2026-06-09-visual-companion-issues.md
+++ b/docs/superpowers/plans/2026-06-09-visual-companion-issues.md
@@ -0,0 +1,350 @@
+# Visual Brainstorming Companion — Issue & Change Catalog
+
+**Date:** 2026-06-09
+**Status:** Analysis / triage. We are implementing these ourselves; the referenced
+community PRs are evidence and reference material, **not** code we intend to merge.
+
+## Purpose
+
+A single place that captures every open issue and PR touching the visual
+brainstorming companion (the local server in `skills/brainstorming/scripts/`),
+distilled to the underlying problem and the change we'd make. Each item is
+grounded against the current code, not the PR author's description.
+
+## Scope decisions (Jesse, 2026-06-09)
+
+- **Not vendoring Alpine.js.** PR #1639 (interactive mockups via a vendored
+  Alpine build) is **dropped**. See E3.
+- **E1 (terminal-vs-HTML hard gate) is a workshop item.** We'll design it
+  together; it is not specced here.
+- **E2 (storage location, #975/#977) is deferred** for now.
+- **Remote serving is a first-class scenario.** Superpowers is general-purpose;
+  users connect from remote (SSH tunnel, Tailscale, `--host 0.0.0.0`). The
+  security fix MUST protect those users, not just loopback. **Decision: a
+  per-session secret key**, not a Host/Origin allowlist. The allowlist only
+  defends the loopback browser-confused-deputy; a direct remote client just
+  sends the expected `Host`, so the allowlist is theater for remote exposure. A
+  secret key is the only thing that authenticates a client uniformly across
+  loopback, tunnel, and direct-remote, and it also defeats DNS rebinding. See A1.
+
+## Component map
+
+| File | Role |
+|------|------|
+| `skills/brainstorming/scripts/server.cjs` | Zero-dep HTTP + WebSocket server (RFC 6455 hand-rolled). Serves the newest screen, watches `content/`, records events to `state/events`. |
+| `skills/brainstorming/scripts/helper.js` | Injected into every page. WebSocket client, click capture, `window.brainstorm` API. |
+| `skills/brainstorming/scripts/frame-template.html` | Frame (header, theme CSS, status dot, indicator bar) wrapped around content fragments. |
+| `skills/brainstorming/scripts/start-server.sh` | Launch wrapper. Session dir, host/url-host, owner-PID resolution, platform backgrounding. |
+| `skills/brainstorming/scripts/stop-server.sh` | Kills the server by PID file, cleans `/tmp` sessions. |
+| `skills/brainstorming/visual-companion.md` | Operator guide the agent reads when it accepts the companion. |
+| `skills/brainstorming/SKILL.md` | Where the companion is offered and the per-question decision lives. |
+
+## Disposition summary
+
+| ID | Item | Source | Disposition |
+|----|------|--------|-------------|
+| A1 | Per-session secret key on `/`, `/files/*`, and WS (supersedes Host/Origin) | issue #1014, PRs #1110/#1553 | **Do** — chosen approach |
+| A2 | ~~Host/Origin allowlist~~ | PRs #1110/#1553 | Dropped — subsumed by A1 |
+| A3 | Crash on `null` / non-object WS payload | PR #1504 | Do |
+| A4 | Frame-length bound in `decodeFrame` | issue #1446 | Already fixed — verify/close |
+| B1 | Dotfile screens served as content (`._*.html`) | PR #950 | Do |
+| B2 | `stop-server.sh` kills reused/stale PID | PR #1703 | Do |
+| B3 | WS client reconnect backoff + status indicator | PR #856 | Do |
+| C1 | Idle timeout too short / not configurable; WS not closed on shutdown | issue #1237 (PR #1689) | Do |
+| C2 | Server death is invisible to user/agent | issue #1237 (residual) | Do |
+| D1 | Permanent opt-out of the companion | issue #892 | Do — design choice open |
+| D2 | Free-text feedback from the browser | issue #957 | Do — needs server change too |
+| D3 | Auto-open the companion URL | PR #759 (#755) | Do |
+| D4 | Light/dark contrast helpers in the frame | PR #1683 | Do |
+| E1 | Hard-gate terminal-vs-HTML per question | PR #1037 | **Workshop** |
+| E2 | Move session state out of the working tree | issue #975 (PR #977) | **Deferred** |
+| E3 | Vendor Alpine.js for interactive mockups | PR #1639 | **Dropped** |
+| E4 | Shell-lint warnings in start/stop scripts | PR #1677 | Opportunistic only |
+
+---
+
+## A. Server security hardening (`server.cjs`)
+
+### A1 — Per-session secret key (chosen approach)
+
+**Threat model.** Two assets: confidentiality of the served screen (`/`) and
+files (`/files/*`), and integrity of `state/events` — a WebSocket client with a
+truthy `choice` writes there (`server.cjs:243-246`), and the agent reads it next
+turn as the user's selection, i.e. **prompt injection into a live session with
+full tool access**. Who can reach it: with the default `127.0.0.1` bind, a malicious
+page in the user's browser (a confused deputy — runs attacker JS *and* can reach
+loopback); with a remote bind (`--host 0.0.0.0`, tailnet/LAN), any host that can
+route to the port, directly, with no same-origin policy in the way. Today
+`handleUpgrade` (`server.cjs:176`) checks only `Sec-WebSocket-Key`, and
+`handleRequest` (`server.cjs:138`) checks nothing — both are wide open.
+
+**Why a key, not an allowlist.** A Host/Origin allowlist only defends the
+loopback browser-deputy. A direct remote client just sends the expected `Host`
+and forges/omits `Origin`, so the allowlist is theater for exactly the remote
+case we must protect. A per-session secret authenticates the client uniformly
+across loopback, SSH tunnel, and direct-remote, and it also kills DNS rebinding
+(the rebound page neither knows the key nor receives the host-scoped cookie).
+So the key **supersedes** A2's allowlist entirely: no `BRAINSTORM_ALLOWED_HOSTS`.
+
+**Design.** Random token (`crypto.randomBytes(32)` hex), generated in
+`server.cjs` at startup (overridable via `BRAINSTORM_TOKEN` for deterministic
+tests):
+
+1. **URL carries it** as `?key=<token>`. The server already builds `url` in its
+   `server-started` JSON (`server.cjs:351`) and writes it to `state/server-info`
+   — appending `?key=` there means `start-server.sh` (greps and prints that
+   JSON) and the skill (hands the user that URL) need **no change**.
+2. **Cookie bootstrap.** A valid `?key` on `/` sets
+   `brainstorm-key-<port>=<token>; HttpOnly; SameSite=Strict; Path=/`. The
+   browser then auto-attaches it to same-origin subresources (`/files/*`) and
+   the WebSocket handshake, so the agent can write any URL style and it works,
+   and `helper.js` needs no change. Cookie name is **per-port** to avoid the
+   Jupyter multi-server collision (cookies aren't port-scoped).
+   `SameSite=Strict` is safe for CDN/Unsplash content — that cookie is host-
+   scoped, so outbound CDN requests never carry it; SameSite only governs
+   requests back to our origin, which are all same-site.
+3. **Auth gate** = valid `?key` **OR** valid cookie (compared with
+   `crypto.timingSafeEqual`) on `/`, `/files/*`, and the WS upgrade. Missing/bad
+   key → friendly **403 HTML page** ("this page needs the full URL your coding
+   agent gave you, including `?key=…`" — generic "coding agent", not "Claude",
+   since this ships on Codex/Gemini/Copilot too). WS upgrade → destroy socket.
+
+The query token is the source of truth; the cookie is a convenience that never
+bears initial-auth load.
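The gate in steps 1 to 3 might look roughly like this. It is a minimal sketch under stated assumptions, not the branch's actual code: the helper names (`makeToken`, `safeEqual`, `cookieValue`, `isAuthorized`, `bootstrapCookie`) are hypothetical, and only `crypto.randomBytes(32)`, `crypto.timingSafeEqual`, `BRAINSTORM_TOKEN`, the `?key=` parameter, and the `brainstorm-key-<port>` cookie attributes come from the design above.

```javascript
const crypto = require('crypto');

// Hypothetical helper: 32 random bytes as hex, overridable for deterministic tests.
function makeToken(env = process.env) {
  return env.BRAINSTORM_TOKEN || crypto.randomBytes(32).toString('hex');
}

// Constant-time comparison; anything that is not a string never matches.
function safeEqual(candidate, token) {
  if (typeof candidate !== 'string') return false;
  const a = Buffer.from(candidate);
  const b = Buffer.from(token);
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

// Hypothetical minimal Cookie-header parser: returns the named value or undefined.
function cookieValue(header, name) {
  for (const part of (header || '').split(';')) {
    const idx = part.indexOf('=');
    if (idx === -1) continue;
    if (part.slice(0, idx).trim() === name) return part.slice(idx + 1).trim();
  }
  return undefined;
}

// Auth gate: a valid ?key= OR a valid per-port cookie.
function isAuthorized(req, token, port) {
  const url = new URL(req.url, 'http://localhost');
  if (safeEqual(url.searchParams.get('key'), token)) return true;
  return safeEqual(cookieValue(req.headers.cookie, `brainstorm-key-${port}`), token);
}

// Set-Cookie value sent after a valid ?key= on `/`.
function bootstrapCookie(token, port) {
  return `brainstorm-key-${port}=${token}; HttpOnly; SameSite=Strict; Path=/`;
}
```

The length check before `crypto.timingSafeEqual` is needed because that call throws when the buffers differ in length; since the token length is fixed, the check reveals nothing useful to a caller.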
+
+**Blast radius.** `server.cjs` (all logic). `helper.js` optional one-liner
+(append `?key=` from `location.search` to the WS URL as a cookie-blocked
+fallback). `start-server.sh` none. `visual-companion.md` doc note (URL now has
+`?key=`; don't strip it). Tests updated to pass the token.
+
+### A2 — Host/Origin allowlist — DROPPED
+
+Subsumed by A1. The secret key closes the WS-injection vector (#1014), the
+HTTP/WS DNS-rebinding read vector (PR #1553), and the cross-origin WS vector
+(PR #1110) in one mechanism, and unlike an allowlist it actually protects the
+remote-bind case. No `Host`/`Origin` validation, no `BRAINSTORM_ALLOWED_HOSTS`.
+
+### A3 — Server crashes on `null` / primitive WS payload
+
+**Problem.** `handleMessage` (`server.cjs:233`) does `JSON.parse(text)` then
+`if (event.choice)` at `server.cjs:243`. A client that sends the 4-byte text
+frame `null` yields `event === null`, and `null.choice` throws. The throw is
+**not** caught — `handleMessage` is called from the `socket.on('data')` handler
+(`server.cjs:207`) outside the `try/catch`, which only wraps `decodeFrame`. The
+result is an uncaught exception and process exit. Any local client can kill the
+server.
+
+**Change.** Guard the access: `if (event && event.choice)`. Minimal and exact —
+`JSON.parse` can't produce `undefined`, and primitives return `undefined` for
+`.choice` without throwing, so only `null` is the live hazard. (Avoid the
+broader fixes — a top-level `try/catch` or `process.on('uncaughtException')`
+would mask other bugs.)
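To illustrate why only `null` is the live hazard, the following sketch runs several JSON payloads through an unguarded and a guarded check. The function names `unguarded` and `guarded` are hypothetical stand-ins for the condition inside `handleMessage`, not code from `server.cjs`.

```javascript
// Unguarded access, as the plan describes the current check.
const unguarded = (event) => Boolean(event.choice);
// Guarded access, as proposed.
const guarded = (event) => Boolean(event && event.choice);

for (const text of ['null', '42', '"hi"', 'true', '[]', '{"choice":"a"}']) {
  const event = JSON.parse(text);
  let threw = false;
  try {
    unguarded(event);
  } catch {
    threw = true; // property access on null raises a TypeError
  }
  console.log(text, 'unguarded throws:', threw, 'guarded result:', guarded(event));
}
```

Numbers, strings, booleans, and arrays all return `undefined` for `.choice`, so the short-circuit on `event` is enough.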
+
+### A4 — Frame-length bound in `decodeFrame` (adjacent)
+
+Referenced by PR #1504 as #1446. The current code **already** bounds extended
+frame lengths: `MAX_FRAME_PAYLOAD_BYTES = 10MB` (`server.cjs:10`) is enforced at
+`server.cjs:58-67` before any `Buffer.alloc`. Action: verify #1446 against
+current `dev` and close if already resolved, rather than re-implementing.
+
+---
+
+## B. Server robustness / correctness
+
+### B1 — macOS resource-fork dotfiles served as screen content
+
+**Problem.** The newest-screen selector filters on `f.endsWith('.html')` only
+(`server.cjs:127-128`). On macOS/ExFAT, `._screen.html` resource-fork files pass
+that filter and, being written alongside the real file, can sort newest — so the
+browser gets binary metadata instead of the mockup. Four read sites share the
+weak filter: `getNewestScreen` (`server.cjs:127`), `knownFiles` init
+(`server.cjs:279`), the `fs.watch` handler (`server.cjs:286`), and the `/files/`
+endpoint (`server.cjs:154-156`).
+
+**Change.** Reject dotfiles (`!f.startsWith('.')`) at all four sites. Covers
+`._*`, `.DS_Store`, etc.
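A shared predicate is one way to keep the four sites consistent. This is a sketch only: `isDotfile` and `isScreenFile` are hypothetical names, and whether the `/files/` endpoint should use the stricter `.html` test or just the dotfile test depends on what that endpoint is meant to serve.

```javascript
// Hypothetical helpers; the plan only specifies the `!f.startsWith('.')` condition.
function isDotfile(name) {
  return name.startsWith('.');
}

// Screen candidates: HTML files that are not dotfiles (rejects `._screen.html`).
function isScreenFile(name) {
  return name.endsWith('.html') && !isDotfile(name);
}

const listing = ['screen.html', '._screen.html', '.DS_Store', 'notes.txt'];
console.log(listing.filter(isScreenFile));
```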
+
+### B2 — `stop-server.sh` can kill a reused PID
+
+**Problem.** `stop-server.sh` reads the PID from `state/server.pid`
+(`stop-server.sh:20`) and `kill`s it (`:23`, escalating to `-9` at `:35`)
+without confirming the PID still belongs to our server. After a reboot or PID
+wraparound the file can point at an unrelated process, which we'd then SIGKILL.
+
+**Change.** Before signalling, verify ownership — the PID's command is `node`
+running our `server.cjs`, ideally matching this session. If ownership can't be
+proven, fail closed (report `stale_pid`, don't kill). Keep the existing
+`stopped` / `not_running` outputs for the real cases.
+
+### B3 — WebSocket client: silent reconnect, stale "Connected"
+
+**Problem.** `helper.js` reconnects on a fixed 1s timer (`helper.js:21-23`),
+has no `onerror` handler, never nulls `ws` on close, and never clears a pending
+reconnect timer. The frame's status element is hardcoded to "Connected" with the
+dot pinned to `var(--success)` (`frame-template.html:77,200`). When the laptop
+sleeps or the server restarts, the page shows "Connected" over a dead socket and
+queues events with no feedback.
+
+**Change.**
+- `helper.js`: exponential backoff (500ms → ×2 → cap 30s, reset on open);
+  `onerror` delegating to `onclose`; `ws = null` on close; `clearTimeout` before
+  reconnecting.
+- `frame-template.html`: drive the status dot from a `--status-color` custom
+  property so JS can switch Connected (green) / Reconnecting (yellow) /
+  Disconnected (red).
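The backoff schedule described above (500ms, doubling, capped at 30s) reduces to one small function. `backoffDelay` is a hypothetical name, and resetting the attempt counter on open is left to the caller.

```javascript
// Hypothetical helper: delay in ms before reconnect attempt `attempt` (0-based).
function backoffDelay(attempt, base = 500, cap = 30000) {
  return Math.min(cap, base * 2 ** attempt);
}

// First eight delays: 500, 1000, 2000, 4000, 8000, 16000, 30000, 30000
console.log(Array.from({ length: 8 }, (_, i) => backoffDelay(i)).join(', '));
```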
+
+---
+
+## C. Lifecycle / timeout (issue #1237)
+
+### C1 — Idle timeout too short, not configurable, WS keeps process alive
+
+**Problem.** `IDLE_TIMEOUT_MS` is hardcoded to 30 minutes (`server.cjs:258`),
+enforced by the 60s lifecycle check (`server.cjs:329-332`). A single brainstorm
+question can sit longer than 30 min while the user thinks or steps away, so the
+server dies mid-session. Separately, `shutdown()` (`server.cjs:310-321`) calls
+`server.close()` but never closes the upgraded sockets in `clients`
+(`server.cjs:174`), so an open browser connection can keep the Node process
+alive past shutdown.
+
+**Change.**
+- Raise the default (≈2h) and make it configurable:
+  `--idle-timeout-minutes` in `start-server.sh` → an env var → `IDLE_TIMEOUT_MS`,
+  with validation against Node timer overflow.
+- Expose the effective timeout in the startup JSON / `state/server-info`.
+- In `shutdown()`, close every socket in `clients` so the process actually
+  exits.
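The timer-overflow validation could look like the sketch below. It assumes the value arrives as a string of minutes; `idleTimeoutMs` and the exact 120-minute default are hypothetical choices, since the plan says only "≈2h". Node clamps `setTimeout`/`setInterval` delays above 2^31 - 1 ms to 1 ms, which is the overflow being guarded against.

```javascript
const MAX_TIMER_MS = 2 ** 31 - 1; // Node's upper bound for timer delays
const DEFAULT_IDLE_MINUTES = 120; // assumed value for the plan's "≈2h" default

// Hypothetical parser for the value passed down from --idle-timeout-minutes.
function idleTimeoutMs(raw) {
  if (raw === undefined || raw === '') return DEFAULT_IDLE_MINUTES * 60 * 1000;
  const minutes = Number(raw);
  if (!Number.isFinite(minutes) || minutes <= 0) {
    throw new RangeError(`idle timeout must be a positive number of minutes, got ${JSON.stringify(raw)}`);
  }
  const ms = Math.round(minutes * 60 * 1000);
  if (ms > MAX_TIMER_MS) {
    throw new RangeError(`idle timeout of ${minutes} minutes exceeds the Node timer limit`);
  }
  return ms;
}
```

The limit works out to roughly 24.8 days, so the check only rejects obviously mistaken inputs.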
+
+### C2 — Server death is invisible
+
+**Problem.** When the server exits it writes `state/server-stopped` and removes
+`state/server-info` (`server.cjs:312-317`), and the skill is *told* to check
+those files (`visual-companion.md:98`) — but it's soft guidance the model skips,
+and the browser just shows a generic "can't be reached." The user diagnoses it
+manually; the agent keeps referring to a dead URL.
+
+**Change (two parts, independent of C1):**
+- **Browser-facing tombstone.** Leave something at the last-served URL that says
+  "this companion expired; ask your coding agent to restart it" instead of a connection
+  error. Options to weigh: `helper.js` rendering a banner when the socket stays
+  down past backoff (works only while the page is loaded), vs. a more involved
+  approach that keeps a minimal responder alive to serve a tombstone page.
+- **Harder skill check.** Tighten `visual-companion.md` / `SKILL.md` so
+  "check `server-info`/`server-stopped` before referring to the URL or pushing a
+  screen" is a required step, not a note. Keep it lightweight — possibly a
+  one-line helper the agent always runs.
+
+---
+
+## D. Features
+
+### D1 — Permanent opt-out of the visual companion (issue #892)
+
+**Problem.** The companion is offered as its own message every session
+(`SKILL.md:25,151-152`). A user who never wants it pays that round-trip — and
+HTML generation — every time. There's no way to say "never offer this."
+
+**Change.** Before the offer step, the skill checks a user-level setting and
+skips the offer entirely when opt-out is set.
+
+**Design choice open.** Mechanism isn't settled:
+- Env var (e.g. `SUPERPOWERS_VISUAL_COMPANION=off`) the skill is told to read —
+  simplest, matches what the issue asks for, lives in `.zshrc`.
+- A plugin-settings file (`.claude/superpowers.local.md` frontmatter) — more
+  structured, per-project capable, but heavier and project-scoped.
+- Reliability caveat from the issue: a separate "no-companion" skill competes on
+  trigger words and isn't reliable — rejected.
+
+Pick the mechanism, then it's a small `SKILL.md` change plus a documented knob.
+
+### D2 — Free-text feedback from the browser (issue #957)
+
+**Problem.** The client only captures clicks on `[data-choice]`
+(`helper.js:36-62`). A user who wants to annotate a mockup ("wrong shade of
+blue") has to switch to the terminal, breaking the visual flow.
+
+**Change.** Add a feedback `<textarea>` whose submit emits
+`{"type":"feedback","text":...,"timestamp":...}` via the existing
+`window.brainstorm.send` path (`helper.js:82-85`).
+
+**Cross-cutting — server change required.** `handleMessage` only persists events
+when `event.choice` is truthy (`server.cjs:243`). A `feedback` event has no
+`choice`, so today it would be logged but **never written to `state/events`**,
+and the agent wouldn't see it. The persistence condition must also accept
+`feedback` events. Document the new event shape in `visual-companion.md`
+(Browser Events Format, `:247-259`). Decide the submit trigger (button vs blur
+vs both) and where the textarea renders (frame-level vs opt-in per screen).
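The widened persistence condition might be expressed as a predicate like the one below. `shouldPersist` is a hypothetical name, and requiring non-empty `text` is an assumption beyond what the plan states.

```javascript
// Hypothetical replacement for the bare `event.choice` check in handleMessage.
function shouldPersist(event) {
  if (!event || typeof event !== 'object') return false;
  if (event.choice) return true; // existing click events
  // Assumed rule: feedback events need non-empty text to be worth recording.
  return event.type === 'feedback' && typeof event.text === 'string' && event.text.trim() !== '';
}

console.log(shouldPersist({ type: 'feedback', text: 'wrong shade of blue', timestamp: 1 }));
```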
+
+### D3 — Auto-open the companion URL (PR #759, issue #755)
+
+**Problem.** `start-server.sh` only prints the URL; the user opens it manually.
+In WSL2 especially, people expect the browser to open.
+
+**Change.** Best-effort opener after the `server-started` JSON is parsed:
+WSL → `cmd.exe /c start`, macOS → `open`, Linux → `xdg-open` only when
+`DISPLAY`/`WAYLAND_DISPLAY` is set. Swallow failures, never block startup, keep
+echoing the URL. Document in `visual-companion.md`. (Consider an opt-out for
+headless/remote runs where popping a browser is wrong — ties into D1's config
+mechanism.)
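The opener selection is a small decision table. The sketch below expresses it in JavaScript only to keep these examples in one language; the real change would live in `start-server.sh`. `pickOpener` and its argument shape are hypothetical, and WSL detection is assumed to happen elsewhere.

```javascript
// Hypothetical helper: returns [command, ...args], or null to skip opening.
function pickOpener({ platform, isWsl, env }, url) {
  // WSL reports platform "linux", so it has to be checked first.
  if (isWsl) return ['cmd.exe', '/c', 'start', url];
  if (platform === 'darwin') return ['open', url];
  if (platform === 'linux' && (env.DISPLAY || env.WAYLAND_DISPLAY)) return ['xdg-open', url];
  return null; // headless or unknown platform: keep printing the URL only
}

console.log(pickOpener({ platform: 'linux', isWsl: false, env: {} }, 'http://localhost:4000/'));
```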
+
+### D4 — Light/dark contrast helpers (PR #1683)
+
+**Problem.** Content fragments are wrapped in the OS-aware frame
+(`frame-template.html`). In dark mode, quick mockups often use white inline
+backgrounds while inheriting low-contrast frame text, making cards/panels hard
+to read.
+
+**Change.** Add `.light-surface` / `.dark-surface` helper classes plus a
+conservative fallback for common inline light backgrounds, and document them in
+`visual-companion.md`'s CSS reference. Pure CSS in `frame-template.html`.
+
+---
+
+## E. Workshop / deferred / dropped
+
+### E1 — Hard-gate terminal-vs-HTML per question (PR #1037) — WORKSHOP
+
+The soft guidance already exists: "decide per-question," with browser-vs-terminal
+tests in `SKILL.md:156-161` and `visual-companion.md:5-25`. The complaint is that
+the model renders HTML for purely textual content (A/B lists, clarifying
+questions), wasting tokens and a turn. PR #1037 wraps the decision in a
+`<HARD-GATE>`. **Per Jesse, we'll workshop the wording/mechanism together** —
+this is behavior-shaping skill content and not specced here.
+
+### E2 — Move session state out of the working tree (issue #975 / PR #977) — DEFERRED
+
+Today `--project-dir` writes session state to `<project>/.superpowers/brainstorm/`
+(`start-server.sh:80-84`) and the skill tells the user to gitignore it
+(`visual-companion.md:48`). The ask is a `--state-dir` / `SUPERPOWERS_STATE_DIR`
+default outside the repo (XDG), keeping `--project-dir` as an alias.
+**Deferred by Jesse for now.** Captured so it isn't lost.
+
+### E3 — Vendor Alpine.js for interactive mockups (PR #1639) — DROPPED
+
+Adds a vendored Alpine build so mockups can be interactive (tabs, accordions,
+forms) without hand-rolled JS. **Dropped per Jesse** — we are not taking on a
+vendored third-party dependency in the companion runtime. The underlying need
+(interactive mockups) is not being pursued via this route.
+
+### E4 — Shell-lint warnings (PR #1677) — OPPORTUNISTIC
+
+SC2034 (and friends) in `start-server.sh` / `stop-server.sh`. Trivial; fold into
+B2/C1/D3 when we're already editing those scripts rather than as its own change.
+
+---
+
+## Suggested grouping for implementation
+
+These cluster into a few coherent passes (each independently testable against
+`tests/brainstorm-server/`):
+
+1. **Security pass** (IN PROGRESS, branch `brainstorm-companion-session-key`) —
+   A1 per-session key (supersedes A2) + A3 null-crash guard. Verify/close A4.
+   *Highest priority.*
+2. **Lifecycle pass** — C1 + C2 together (both touch `shutdown()` and the
+   server-death story).
+3. **Robustness pass** — B1, B2, B3 (independent, small).
+4. **Feature pass** — D1, D2, D3, D4 (D2 needs the server persistence tweak;
+   D1's config mechanism is shared with D3's opt-out).
+
+E1 is a separate workshop session. E2/E3 are out of scope for this round.
--- a/docs/testing.md
+++ b/docs/testing.md
@@ -12,6 +12,7 @@ Live in `tests/`. Currently:
 - `tests/brainstorm-server/` — node test suite for the brainstorm server JS code.
 - `tests/opencode/` — bash tests for OpenCode plugin loading, bootstrap caching, and tool registration.
 - `tests/codex-plugin-sync/` — bash sync verification.
+- `tests/kimi/` — bash/Python checks for Kimi plugin manifest wiring.
 - `tests/claude-code/test-helpers.sh`, `analyze-token-usage.py` — utilities used by remaining bash tests.
 - `tests/claude-code/test-subagent-driven-development.sh` — agent-can-describe-SDD test (no drill counterpart; tests description-recall, not behavior).
 - `tests/claude-code/test-subagent-driven-development-integration.sh` — extended SDD integration with token analysis (drill covers the YAGNI subset; bash adds commit-count, Claude Code task-tracking, and token telemetry assertions).
--- a/docs/windows/polyglot-hooks.md
+++ b/docs/windows/polyglot-hooks.md
@@ -1,6 +1,8 @@
 # Cross-Platform Polyglot Hooks for Claude Code

-Claude Code plugins need hooks that work on Windows, macOS, and Linux. This document explains the polyglot wrapper technique that makes this possible.
+Claude Code plugins need hooks that work on Windows, macOS, and Linux. This document describes the single generic dispatcher pattern used in `hooks/run-hook.cmd`.
+
+> **Authoritative source:** `hooks/run-hook.cmd` is the canonical implementation. When this document and the code diverge, trust the code.

 ## The Problem

@@ -10,52 +12,22 @@ Claude Code runs hook commands through the system's default shell:

 This creates several challenges:

-1. **Script execution**: Windows CMD can't execute `.sh` files directly - it tries to open them in a text editor
+1. **Script execution**: Windows CMD can't execute `.sh` files directly
 2. **Path format**: Windows uses backslashes (`C:\path`), Unix uses forward slashes (`/path`)
 3. **Environment variables**: `$VAR` syntax doesn't work in CMD
-4. **No `bash` in PATH**: Even with Git Bash installed, `bash` isn't in the PATH when CMD runs
+4. **`.sh` auto-prepend**: Claude Code on Windows automatically prepends `bash` to any command that contains `.sh` in its path — this interferes with the dispatcher if scripts have extensions

-## The Solution: Polyglot `.cmd` Wrapper
+## The Solution: Extensionless Scripts + Single Generic Dispatcher

-A polyglot script is valid syntax in multiple languages simultaneously. Our wrapper is valid in both CMD and bash:
+The repo uses one generic `run-hook.cmd` dispatcher for all hooks. Hook scripts are **extensionless** (`session-start`, not `session-start.sh`). This is deliberate: it prevents Claude Code's Windows auto-detection from prepending `bash` to the dispatcher command and breaking it.

-```cmd
-: << 'CMDBLOCK'
-@echo off
-"C:\Program Files\Git\bin\bash.exe" -l -c "\"$(cygpath -u \"$CLAUDE_PLUGIN_ROOT\")/hooks/session-start.sh\""
-exit /b
-CMDBLOCK
-
-# Unix shell runs from here
-"${CLAUDE_PLUGIN_ROOT}/hooks/session-start.sh"
-```
-
-### How It Works
-
-#### On Windows (CMD.exe)
-
-1. `: << 'CMDBLOCK'` - CMD sees `:` as a label (like `:label`) and ignores `<< 'CMDBLOCK'`
-2. `@echo off` - Suppresses command echoing
-3. The bash.exe command runs with:
-   - `-l` (login shell) to get proper PATH with Unix utilities
-   - `cygpath -u` converts Windows path to Unix format (`C:\foo` → `/c/foo`)
-4. `exit /b` - Exits the batch script, stopping CMD here
-5. Everything after `CMDBLOCK` is never reached by CMD
-
-#### On Unix (bash/sh)
-
-1. `: << 'CMDBLOCK'` - `:` is a no-op, `<< 'CMDBLOCK'` starts a heredoc
-2. Everything until `CMDBLOCK` is consumed by the heredoc (ignored)
-3. `# Unix shell runs from here` - Comment
-4. The script runs directly with the Unix path
-
-## File Structure
+### File Structure

 ```
 hooks/
-├── hooks.json           # Points to the .cmd wrapper
-├── session-start.cmd    # Polyglot wrapper (cross-platform entry point)
-└── session-start.sh     # Actual hook logic (bash script)
+├── hooks.json          # Points to run-hook.cmd with extensionless script name
+├── run-hook.cmd        # Cross-platform dispatcher (the polyglot wrapper)
+└── session-start       # Actual hook logic — extensionless bash script
 ```

 ### hooks.json
@@ -65,11 +37,12 @@ hooks/
  "hooks": {
    "SessionStart": [
      {
-        "matcher": "startup|resume|clear|compact",
+        "matcher": "startup|clear|compact",
        "hooks": [
          {
            "type": "command",
-            "command": "\"${CLAUDE_PLUGIN_ROOT}/hooks/session-start.cmd\""
+            "command": "\"${CLAUDE_PLUGIN_ROOT}/hooks/run-hook.cmd\" session-start",
+            "async": false
          }
        ]
      }
@@ -78,41 +51,63 @@ hooks/
 }
 ```

-Note: The path must be quoted because `${CLAUDE_PLUGIN_ROOT}` may contain spaces on Windows (e.g., `C:\Program Files\...`).
+The path is quoted because `${CLAUDE_PLUGIN_ROOT}` may contain spaces.

-## Requirements
+## How `run-hook.cmd` Works at a High Level

-### Windows
- **Git for Windows** must be installed (provides `bash.exe` and `cygpath`)
- Default installation path: `C:\Program Files\Git\bin\bash.exe`
- If Git is installed elsewhere, the wrapper needs modification
+`run-hook.cmd` is a polyglot script: Windows treats the first block as batch
+commands, while Unix shells treat that block as a no-op heredoc and continue
+after it.

-### Unix (macOS/Linux)
- Standard bash or sh shell
- The `.cmd` file must have execute permission (`chmod +x`)
+Do not copy an implementation from this document. Read `hooks/run-hook.cmd`
+directly when changing the dispatcher, and run `tests/hooks/test-session-start.sh`
+afterward.
+
+### How it works on Windows (CMD.exe)
+
+1. The batch section validates the script name and resolves the hook directory
+   from the dispatcher's own location.
+2. It tries bash in three places:
+   - `C:\Program Files\Git\bin\bash.exe`
+   - `C:\Program Files (x86)\Git\bin\bash.exe`
+   - `bash` on `PATH` (MSYS2, Cygwin, or a non-default Git install)
+3. If bash is found, it runs the named extensionless hook script from the hooks
+   directory.
+4. If no bash is found, the dispatcher exits `0` silently — the plugin
+   continues working, it just skips the hook.
+5. `exit /b` stops CMD before it reaches the Unix section.
+
+### How it works on Unix (bash/sh)
+
+1. `: << 'CMDBLOCK'` opens a heredoc on a no-op command.
+2. The entire CMD batch block is consumed by the heredoc and ignored.
+3. After `CMDBLOCK`, bash resolves the script directory and `exec`s the named
+   extensionless script directly.
+
+### Key design decisions
+
+| Decision | Why |
+|----------|-----|
+| Extensionless scripts | Prevents Claude Code's Windows `.sh`-auto-prepend from interfering with the dispatcher command |
+| No `-l` (login shell) | Not needed; hook scripts should be self-contained and not depend on login-shell PATH setup |
+| No `cygpath` | Bash receives the Windows path directly and handles it correctly; `cygpath` was needed by the old `-c "..."` invocation pattern, not by direct exec |
+| Silent exit on no-bash | Avoids breaking the plugin for users who don't have Git for Windows; hook context injection is skipped gracefully |

 ## Writing Cross-Platform Hook Scripts

-Your actual hook logic goes in the `.sh` file. To ensure it works on Windows (via Git Bash):
+Your hook logic goes in the extensionless script file. A few portable patterns:

-### Do:
+### Do
 - Use pure bash builtins when possible
 - Use `$(command)` instead of backticks
 - Quote all variable expansions: `"$VAR"`
- Use `printf` or here-docs for output

-### Avoid:
- External commands that may not be in PATH (sed, awk, grep)
- If you must use them, they're available in Git Bash but ensure PATH is set up (use `bash -l`)
+### Avoid
+- Relying on PATH-dependent tools without fallbacks (the hook runs without `-l`, so login-shell PATH is not set)
+- Giving scripts a `.sh` extension — this triggers Claude Code's Windows auto-prepend

-### Example: JSON Escaping Without sed/awk
+### Example: JSON escaping without external tools

-Instead of:
-```bash
-escaped=$(echo "$content" | sed 's/\\/\\\\/g' | sed 's/"/\\"/g' | awk '{printf "%s\\n", $0}')
-```
-
-Use pure bash:
 ```bash
 escape_for_json() {
    local input="$1"
@@ -133,80 +128,21 @@ escape_for_json() {
 }
 ```

-## Reusable Wrapper Pattern
-
-For plugins with multiple hooks, you can create a generic wrapper that takes the script name as an argument:
-
-### run-hook.cmd
-```cmd
-: << 'CMDBLOCK'
-@echo off
-set "SCRIPT_DIR=%~dp0"
-set "SCRIPT_NAME=%~1"
-"C:\Program Files\Git\bin\bash.exe" -l -c "cd \"$(cygpath -u \"%SCRIPT_DIR%\")\" && \"./%SCRIPT_NAME%\""
-exit /b
-CMDBLOCK
-
-# Unix shell runs from here
-SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
-SCRIPT_NAME="$1"
-shift
-"${SCRIPT_DIR}/${SCRIPT_NAME}" "$@"
-```
-
-### hooks.json using the reusable wrapper
-```json
-{
-  "hooks": {
-    "SessionStart": [
-      {
-        "matcher": "startup",
-        "hooks": [
-          {
-            "type": "command",
-            "command": "\"${CLAUDE_PLUGIN_ROOT}/hooks/run-hook.cmd\" session-start.sh"
-          }
-        ]
-      }
-    ],
-    "PreToolUse": [
-      {
-        "matcher": "Bash",
-        "hooks": [
-          {
-            "type": "command",
-            "command": "\"${CLAUDE_PLUGIN_ROOT}/hooks/run-hook.cmd\" validate-bash.sh"
-          }
-        ]
-      }
-    ]
-  }
-}
-```
-
 ## Troubleshooting

 ### "bash is not recognized"
-CMD can't find bash. The wrapper uses the full path `C:\Program Files\Git\bin\bash.exe`. If Git is installed elsewhere, update the path.

-### "cygpath: command not found" or "dirname: command not found"
-Bash isn't running as a login shell. Ensure `-l` flag is used.
+CMD couldn't find bash in any of the three locations the dispatcher tries. The dispatcher exits silently (0) rather than erroring, so the hook is skipped. Install Git for Windows at the standard path or ensure `bash` is on `PATH`.

-### Path has weird `\/` in it
-`${CLAUDE_PLUGIN_ROOT}` expanded to a Windows path ending with backslash, then `/hooks/...` was appended. Use `cygpath` to convert the entire path.
+### Hook runs on Unix but does nothing on Windows

-### Script opens in text editor instead of running
-The hooks.json is pointing directly to the `.sh` file. Point to the `.cmd` wrapper instead.
+Check that the script filename is **extensionless** in `hooks.json`. A command like `run-hook.cmd session-start.sh` can trigger Claude Code's `.sh` auto-detection and bypass the intended CMD dispatcher path, or just try to run a non-existent `session-start.sh` script.

-### Works in terminal but not as hook
-Claude Code may run hooks differently. Test by simulating the hook environment:
-```powershell
-$env:CLAUDE_PLUGIN_ROOT = "C:\path\to\plugin"
-cmd /c "C:\path\to\plugin\hooks\session-start.cmd"
-```
+### Hook doesn't fire at all
+
+Verify the `matcher` in `hooks.json` matches the event type your harness emits. Claude Code uses `startup|clear|compact`; Codex uses `startup|resume|clear`. Check `hooks-codex.json` for the Codex variant.

 ## Related Issues

- [anthropics/claude-code#9758](https://github.com/anthropics/claude-code/issues/9758) - .sh scripts open in editor on Windows
- [anthropics/claude-code#3417](https://github.com/anthropics/claude-code/issues/3417) - Hooks don't work on Windows
- [anthropics/claude-code#6023](https://github.com/anthropics/claude-code/issues/6023) - CLAUDE_PROJECT_DIR not found
+- [anthropics/claude-code#9758](https://github.com/anthropics/claude-code/issues/9758) — `.sh` scripts open in editor on Windows
+- [anthropics/claude-code#3417](https://github.com/anthropics/claude-code/issues/3417) — Hooks don't work on Windows
--- a/2
+++ b/2
--- a/scripts/lint-shell.sh
+++ b/scripts/lint-shell.sh
@@ -0,0 +1,211 @@
+#!/usr/bin/env bash
+#
+# Lint shell scripts in this repository.
+#
+# Usage:
+#   scripts/lint-shell.sh [--all] [--format] [--strict] [file ...]
+#
+# By default, runs ShellCheck and shell syntax checks on changed shell scripts.
+# Use --format to format with shfmt before linting. Use --all for the full tracked
+# baseline, or pass files explicitly to lint a smaller set.
+set -euo pipefail
+
+usage() {
+  sed -n '2,10p' "$0" | sed 's/^# \{0,1\}//'
+}
+
+die() {
+  echo "error: $*" >&2
+  exit 1
+}
+
+require_tool() {
+  command -v "$1" >/dev/null 2>&1 || die "required tool '$1' is not on PATH"
+}
+
+is_shell_file() {
+  local path="$1"
+  local first_line=""
+
+  [[ -f "$path" ]] || return 1
+
+  case "$path" in
+    *.sh)
+      return 0
+      ;;
+  esac
+
+  IFS= read -r first_line <"$path" || true
+  [[ "$first_line" =~ ^#!.*[/[:space:]](bash|dash|ksh|sh)([[:space:]]|$) ]]
+}
+
+ensure_git_work_tree() {
+  git rev-parse --is-inside-work-tree >/dev/null 2>&1 \
+    || die "run this from inside a git work tree, or pass files explicitly"
+}
+
+add_shell_file() {
+  local path
+  local existing
+
+  path="$1"
+  if ! is_shell_file "$path"; then
+    return 0
+  fi
+
+  if [[ "${#files[@]}" -gt 0 ]]; then
+    for existing in "${files[@]}"; do
+      if [[ "$existing" == "$path" ]]; then
+        return 0
+      fi
+    done
+  fi
+
+  files+=("$path")
+}
+
+collect_all_shell_files() {
+  local path
+
+  ensure_git_work_tree
+
+  while IFS= read -r -d '' path; do
+    add_shell_file "$path"
+  done < <(git ls-files -z)
+}
+
+collect_changed_shell_files() {
+  local path
+
+  ensure_git_work_tree
+
+  if git rev-parse --verify HEAD >/dev/null 2>&1; then
+    while IFS= read -r -d '' path; do
+      add_shell_file "$path"
+    done < <(git diff --name-only -z --diff-filter=ACMR HEAD)
+
+    while IFS= read -r -d '' path; do
+      add_shell_file "$path"
+    done < <(git diff --cached --name-only -z --diff-filter=ACMR)
+  else
+    collect_all_shell_files
+  fi
+
+  while IFS= read -r -d '' path; do
+    add_shell_file "$path"
+  done < <(git ls-files --others --exclude-standard -z)
+}
+
+collect_requested_shell_files() {
+  local path
+
+  for path in "$@"; do
+    add_shell_file "$path"
+  done
+}
+
+syntax_shell_for() {
+  local path="$1"
+  local first_line=""
+
+  IFS= read -r first_line <"$path" || true
+
+  case "$first_line" in
+    *"/sh"* | *" env sh"* | *"/dash"* | *" env dash"*)
+      printf 'sh'
+      ;;
+    *)
+      printf 'bash'
+      ;;
+  esac
+}
+
+run_syntax_checks() {
+  local file
+  local shell_name
+
+  for file in "$@"; do
+    shell_name="$(syntax_shell_for "$file")"
+    case "$shell_name" in
+      sh)
+        sh -n "$file"
+        ;;
+      bash)
+        bash -n "$file"
+        ;;
+      *)
+        die "unsupported shell for syntax check: $shell_name"
+        ;;
+    esac
+  done
+}
+
+format=false
+strict=false
+all=false
+requested_files=()
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --all)
+      all=true
+      ;;
+    --format)
+      format=true
+      ;;
+    --strict)
+      strict=true
+      ;;
+    -h | --help)
+      usage
+      exit 0
+      ;;
+    --)
+      shift
+      requested_files+=("$@")
+      break
+      ;;
+    -*)
+      die "unknown option: $1"
+      ;;
+    *)
+      requested_files+=("$1")
+      ;;
+  esac
+  shift
+done
+
+require_tool shellcheck
+if [[ "$format" == true ]]; then
+  require_tool shfmt
+fi
+
+files=()
+if [[ "${#requested_files[@]}" -gt 0 ]]; then
+  collect_requested_shell_files "${requested_files[@]}"
+elif [[ "$all" == true ]]; then
+  collect_all_shell_files
+else
+  collect_changed_shell_files
+fi
+
+if [[ "${#files[@]}" -eq 0 ]]; then
+  echo "No shell files found."
+  exit 0
+fi
+
+if [[ "$format" == true ]]; then
+  echo "Formatting ${#files[@]} shell files"
+  shfmt_args=(-i 2 -ci -bn)
+  shfmt "${shfmt_args[@]}" -w "${files[@]}"
+fi
+
+echo "Linting ${#files[@]} shell files"
+
+shellcheck_args=(--severity=warning --external-sources --source-path=SCRIPTDIR)
+if [[ "$strict" == true ]]; then
+  shellcheck_args+=("--enable=check-extra-masked-returns,check-set-e-suppressed,quote-safe-variables,deprecate-which,avoid-nullary-conditions")
+fi
+
+shellcheck "${shellcheck_args[@]}" "${files[@]}"
+run_syntax_checks "${files[@]}"
--- a/scripts/sync-to-codex-plugin.sh
+++ b/scripts/sync-to-codex-plugin.sh
@@ -52,6 +52,7 @@ EXCLUDES=(
  "/.gitattributes"
  "/.github/"
  "/.gitignore"
+  "/.kimi-plugin/"
  "/.opencode/"
  "/.pi/"
  "/.version-bump.json"
--- a/skills/brainstorming/SKILL.md
+++ b/skills/brainstorming/SKILL.md
@@ -22,7 +22,7 @@ Every project goes through this process. A todo list, a single-function utility,
 You MUST create a task for each of these items and complete them in order:

 1. **Explore project context** — check files, docs, recent commits
-2. **Offer visual companion** (if topic will involve visual questions) — this is its own message, not combined with a clarifying question. See the Visual Companion section below.
+2. **Offer the visual companion just-in-time** — NOT upfront. The first time a question would genuinely be clearer shown than described, offer it then (its own message); on approval its browser tab opens for you. If no visual question ever arises, never offer it. See the Visual Companion section below.
 3. **Ask clarifying questions** — one at a time, understand purpose/constraints/success criteria
 4. **Propose 2-3 approaches** — with trade-offs and your recommendation
 5. **Present design** — in sections scaled to their complexity, get user approval after each section
@@ -36,8 +36,6 @@ You MUST create a task for each of these items and complete them in order:
 ```dot
 digraph brainstorming {
    "Explore project context" [shape=box];
-    "Visual questions ahead?" [shape=diamond];
-    "Offer Visual Companion\n(own message, no other content)" [shape=box];
    "Ask clarifying questions" [shape=box];
    "Propose 2-3 approaches" [shape=box];
    "Present design sections" [shape=box];
@@ -47,10 +45,7 @@ digraph brainstorming {
    "User reviews spec?" [shape=diamond];
    "Invoke writing-plans skill" [shape=doublecircle];

-    "Explore project context" -> "Visual questions ahead?";
-    "Visual questions ahead?" -> "Offer Visual Companion\n(own message, no other content)" [label="yes"];
-    "Visual questions ahead?" -> "Ask clarifying questions" [label="no"];
-    "Offer Visual Companion\n(own message, no other content)" -> "Ask clarifying questions";
+    "Explore project context" -> "Ask clarifying questions";
    "Ask clarifying questions" -> "Propose 2-3 approaches";
    "Propose 2-3 approaches" -> "Present design sections";
    "Present design sections" -> "User approves design?";
@@ -148,10 +143,10 @@ Wait for the user's response. If they request changes, make them and re-run the

 A browser-based companion for showing mockups, diagrams, and visual options during brainstorming. Available as a tool — not a mode. Accepting the companion means it's available for questions that benefit from visual treatment; it does NOT mean every question goes through the browser.

-**Offering the companion:** When you anticipate that upcoming questions will involve visual content (mockups, layouts, diagrams), offer it once for consent:
-> "Some of what we're working on might be easier to explain if I can show it to you in a web browser. I can put together mockups, diagrams, comparisons, and other visuals as we go. This feature is still new and can be token-intensive. Want to try it? (Requires opening a local URL)"
+**Offering the companion (just-in-time):** Do NOT offer it upfront. Wait until a question would genuinely be clearer shown than told — a real mockup / layout / diagram question, not merely a UI *topic*. The first time that happens, offer it then, as its own message:
+> "This next part might be easier if I show you — I can put together mockups, diagrams, and comparisons in a browser tab as we go. It's still new and can be token-intensive. Want me to? I'll open it for you."

-**This offer MUST be its own message.** Do not combine it with clarifying questions, context summaries, or any other content. The message should contain ONLY the offer above and nothing else. Wait for the user's response before continuing. If they decline, proceed with text-only brainstorming.
+**This offer MUST be its own message.** Only the offer — no clarifying question, summary, or other content. Wait for the user's response. If they accept, start the server with `--open` so their browser opens to the first screen automatically. If they decline, continue text-only and don't offer again unless they raise it.

 **Per-question decision:** Even after the user accepts, decide FOR EACH QUESTION whether to use the browser or the terminal. The test: **would the user understand this better by seeing it than reading it?**

--- a/skills/brainstorming/scripts/frame-template.html
+++ b/skills/brainstorming/scripts/frame-template.html
@@ -73,8 +73,8 @@
      flex-shrink: 0;
    }
    .header h1 { font-size: 0.85rem; font-weight: 500; color: var(--text-secondary); }
-    .header .status { font-size: 0.7rem; color: var(--success); display: flex; align-items: center; gap: 0.4rem; }
-    .header .status::before { content: ''; width: 6px; height: 6px; background: var(--success); border-radius: 50%; }
+    .header .status { font-size: 0.7rem; color: var(--status-color, var(--success)); display: flex; align-items: center; gap: 0.4rem; }
+    .header .status::before { content: ''; width: 6px; height: 6px; background: var(--status-color, var(--success)); border-radius: 50%; }

    .main { flex: 1; overflow-y: auto; }
    #frame-content { padding: 2rem; min-height: 100%; }
@@ -197,7 +197,7 @@
 <body>
  <div class="header">
    <h1><a href="https://github.com/obra/superpowers" style="color: inherit; text-decoration: none;">Superpowers Brainstorming</a></h1>
-    <div class="status">Connected</div>
+    <div class="status">Connecting…</div>
  </div>

  <div class="main">
--- a/skills/brainstorming/scripts/helper.js
+++ b/skills/brainstorming/scripts/helper.js
@@ -1,26 +1,99 @@
 (function() {
+  const MIN_RECONNECT_MS = 500;
+  const MAX_RECONNECT_MS = 30000;
+  const TOMBSTONE_AFTER_MS = 15000; // show the "paused" overlay after this long disconnected
+
+  // Pure: next backoff delay (doubles, capped). Exported for unit tests.
+  function nextReconnectDelay(current, max) {
+    return Math.min(current * 2, max);
+  }
+  if (typeof module !== 'undefined' && module.exports) {
+    module.exports = { nextReconnectDelay, MIN_RECONNECT_MS, MAX_RECONNECT_MS, TOMBSTONE_AFTER_MS };
+  }
+
+  // Everything below is browser-only; bail out when loaded in Node (tests).
+  if (typeof window === 'undefined') return;
+
  const WS_URL = 'ws://' + window.location.host;
  let ws = null;
  let eventQueue = [];
+  let reconnectDelay = MIN_RECONNECT_MS;
+  let reconnectTimer = null;
+  let disconnectedSince = null;
+  let everConnected = false;
+  let tombstoneShown = false;
+
+  // Reflect connection state in the frame's status pill (absent on full-doc screens).
+  function setStatus(state) {
+    const el = document.querySelector('.status');
+    if (!el) return;
+    const map = {
+      connecting:   ['Connecting…',   'var(--text-tertiary)'],
+      connected:    ['Connected',     'var(--success)'],
+      reconnecting: ['Reconnecting…', 'var(--warning)'],
+      disconnected: ['Disconnected',  'var(--error)']
+    };
+    const [text, color] = map[state] || map.disconnected;
+    el.textContent = text;
+    el.style.setProperty('--status-color', color);
+  }
+
+  // Self-styled so it works on framed and full-document screens alike.
+  function showTombstone() {
+    if (tombstoneShown) return;
+    tombstoneShown = true;
+    const el = document.createElement('div');
+    el.id = 'bs-tombstone';
+    el.style.cssText = 'position:fixed;inset:0;z-index:99999;display:flex;' +
+      'align-items:center;justify-content:center;padding:2rem;text-align:center;' +
+      'background:rgba(20,20,22,0.92);color:#f5f5f7;font-family:system-ui,sans-serif';
+    el.innerHTML = '<div style="max-width:480px">' +
+      '<h2 style="margin:0 0 .5rem;font-weight:600">Companion paused</h2>' +
+      '<p style="margin:0;opacity:.85">This brainstorm companion has stopped. ' +
+      'Ask your coding agent to bring it back — this page reconnects automatically.</p></div>';
+    if (document.body) document.body.appendChild(el);
+  }

  function connect() {
+    if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; }
+    setStatus(everConnected ? 'reconnecting' : 'connecting');
    ws = new WebSocket(WS_URL);

    ws.onopen = () => {
+      const recovered = tombstoneShown;
+      everConnected = true;
+      disconnectedSince = null;
+      reconnectDelay = MIN_RECONNECT_MS;
+      tombstoneShown = false;
+      setStatus('connected');
      eventQueue.forEach(e => ws.send(JSON.stringify(e)));
      eventQueue = [];
+      // Recovered from a tombstoned outage (e.g. the server restarted on the same
+      // port) — reload to pick up the restarted server's current screen.
+      if (recovered) window.location.reload();
    };

    ws.onmessage = (msg) => {
-      const data = JSON.parse(msg.data);
-      if (data.type === 'reload') {
-        window.location.reload();
-      }
+      let data;
+      try { data = JSON.parse(msg.data); } catch (e) { return; }
+      if (data.type === 'reload') window.location.reload();
    };

    ws.onclose = () => {
-      setTimeout(connect, 1000);
+      ws = null;
+      if (disconnectedSince === null) disconnectedSince = Date.now();
+      if (Date.now() - disconnectedSince >= TOMBSTONE_AFTER_MS) {
+        setStatus('disconnected');
+        showTombstone();
+      } else {
+        setStatus('reconnecting');
+      }
+      reconnectTimer = setTimeout(connect, reconnectDelay);
+      reconnectDelay = nextReconnectDelay(reconnectDelay, MAX_RECONNECT_MS);
    };
+
+    // Let onclose own reconnection so we don't schedule it twice.
+    ws.onerror = () => { try { ws.close(); } catch (e) {} };
  }

  function sendEvent(event) {
--- a/skills/brainstorming/scripts/server.cjs
+++ b/skills/brainstorming/scripts/server.cjs
@@ -7,6 +7,7 @@ const path = require('path');

 const OPCODES = { TEXT: 0x01, CLOSE: 0x08, PING: 0x09, PONG: 0x0A };
 const WS_MAGIC = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';
+const MAX_FRAME_PAYLOAD_BYTES = 10 * 1024 * 1024;

 function computeAcceptKey(clientKey) {
  return crypto.createHash('sha1').update(clientKey + WS_MAGIC).digest('base64');
@@ -53,10 +54,18 @@ function decodeFrame(buffer) {
    offset = 4;
  } else if (payloadLen === 127) {
    if (buffer.length < 10) return null;
-    payloadLen = Number(buffer.readBigUInt64BE(2));
+    const extendedLen = buffer.readBigUInt64BE(2);
+    if (extendedLen > BigInt(MAX_FRAME_PAYLOAD_BYTES)) {
+      throw new Error('WebSocket frame payload exceeds maximum allowed size');
+    }
+    payloadLen = Number(extendedLen);
    offset = 10;
  }

+  if (payloadLen > MAX_FRAME_PAYLOAD_BYTES) {
+    throw new Error('WebSocket frame payload exceeds maximum allowed size');
+  }
+
  const maskOffset = offset;
  const dataOffset = offset + 4;
  const totalLen = dataOffset + payloadLen;
@@ -73,7 +82,21 @@ function decodeFrame(buffer) {

 // ========== Configuration ==========

-const PORT = process.env.BRAINSTORM_PORT || (49152 + Math.floor(Math.random() * 16383));
+const PORT_FILE = process.env.BRAINSTORM_PORT_FILE || null;
+const randomPort = () => 49152 + Math.floor(Math.random() * 16383);
+// Prefer an explicit port, else the port this session last bound (so a restart
+// reuses it and an already-open browser tab reconnects), else a random high port.
+function preferredPort() {
+  if (process.env.BRAINSTORM_PORT) return Number(process.env.BRAINSTORM_PORT);
+  if (PORT_FILE) {
+    try {
+      const p = Number(fs.readFileSync(PORT_FILE, 'utf-8').trim());
+      if (Number.isInteger(p) && p > 1023 && p < 65536) return p;
+    } catch (e) { /* no prior port recorded */ }
+  }
+  return randomPort();
+}
+let PORT = preferredPort();
 const HOST = process.env.BRAINSTORM_HOST || '127.0.0.1';
 const URL_HOST = process.env.BRAINSTORM_URL_HOST || (HOST === '127.0.0.1' ? 'localhost' : HOST);
 const SESSION_DIR = process.env.BRAINSTORM_DIR || '/tmp/brainstorm';
@@ -81,6 +104,27 @@ const CONTENT_DIR = path.join(SESSION_DIR, 'content');
 const STATE_DIR = path.join(SESSION_DIR, 'state');
 let ownerPid = process.env.BRAINSTORM_OWNER_PID ? Number(process.env.BRAINSTORM_OWNER_PID) : null;

+// Per-session secret key. The companion is reachable by any local browser tab
+// and, when bound to a non-loopback host, by any host that can route to it.
+// The key authenticates the real client uniformly across loopback, tunnel, and
+// remote binds — and defeats DNS rebinding — where a Host/Origin allowlist
+// cannot. It rides the served URL as ?key= and is mirrored into a cookie on
+// first load so same-origin subresources and the WebSocket carry it for free.
+// Persisted alongside the port (BRAINSTORM_TOKEN_FILE) so a restart keeps the
+// same key and an already-open tab's cookie still validates.
+const TOKEN_FILE = process.env.BRAINSTORM_TOKEN_FILE || null;
+const TOKEN = (() => {
+  if (process.env.BRAINSTORM_TOKEN) return process.env.BRAINSTORM_TOKEN;
+  if (TOKEN_FILE) {
+    try {
+      const t = fs.readFileSync(TOKEN_FILE, 'utf-8').trim();
+      if (/^[0-9a-f]{32,}$/i.test(t)) return t;
+    } catch (e) { /* no prior token recorded */ }
+  }
+  return crypto.randomBytes(32).toString('hex');
+})();
+let COOKIE_NAME = 'brainstorm-key-' + PORT; // refined to the actual bound port in onListen
+
 const MIME_TYPES = {
  '.html': 'text/html', '.css': 'text/css', '.js': 'application/javascript',
  '.json': 'application/json', '.png': 'image/png', '.jpg': 'image/jpeg',
@@ -98,6 +142,16 @@ h1 { color: #333; } p { color: #666; }</style>
 <body><h1>Brainstorm Companion</h1>
 <p>Waiting for the agent to push a screen...</p></body></html>`;

+const FORBIDDEN_PAGE = `<!DOCTYPE html>
+<html>
+<head><meta charset="utf-8"><title>Session key required</title>
+<style>body { font-family: system-ui, sans-serif; padding: 2rem; max-width: 800px; margin: 0 auto; }
+h1 { color: #333; } p { color: #666; } code { background: #f0f0f0; padding: 0.1em 0.3em; border-radius: 4px; }</style>
+</head>
+<body><h1>Session key required</h1>
+<p>This page needs the full URL your coding agent gave you, including the
+<code>?key=&hellip;</code> part. Copy the complete URL and open it again.</p></body></html>`;
+
 const frameTemplate = fs.readFileSync(path.join(__dirname, 'frame-template.html'), 'utf-8');
 const helperScript = fs.readFileSync(path.join(__dirname, 'helper.js'), 'utf-8');
 const helperInjection = '<script>\n' + helperScript + '\n</script>';
@@ -115,7 +169,7 @@ function wrapInFrame(content) {

 function getNewestScreen() {
  const files = fs.readdirSync(CONTENT_DIR)
-    .filter(f => f.endsWith('.html'))
+    .filter(f => !f.startsWith('.') && f.endsWith('.html'))
    .map(f => {
      const fp = path.join(CONTENT_DIR, f);
      return { path: fp, mtime: fs.statSync(fp).mtime.getTime() };
@@ -124,11 +178,63 @@ function getNewestScreen() {
  return files.length > 0 ? files[0].path : null;
 }

+// ========== Authentication ==========
+
+function timingSafeEqualStr(a, b) {
+  const ab = Buffer.from(String(a));
+  const bb = Buffer.from(String(b));
+  if (ab.length !== bb.length) return false;
+  return crypto.timingSafeEqual(ab, bb);
+}
+
+function parseCookies(header) {
+  const out = {};
+  if (!header) return out;
+  for (const part of header.split(';')) {
+    const eq = part.indexOf('=');
+    if (eq < 0) continue;
+    out[part.slice(0, eq).trim()] = part.slice(eq + 1).trim();
+  }
+  return out;
+}
+
+// A request is authorized if it carries the session key as ?key= or as the
+// session cookie. Both are compared in constant time.
+function isAuthorized(req) {
+  const q = req.url.indexOf('?');
+  if (q >= 0) {
+    const key = new URLSearchParams(req.url.slice(q + 1)).get('key');
+    if (key && timingSafeEqualStr(key, TOKEN)) return true;
+  }
+  const cookie = parseCookies(req.headers['cookie'])[COOKIE_NAME];
+  if (cookie && timingSafeEqualStr(cookie, TOKEN)) return true;
+  return false;
+}
+
+function pathnameOf(url) {
+  const q = url.indexOf('?');
+  return q >= 0 ? url.slice(0, q) : url;
+}
+
 // ========== HTTP Request Handler ==========

 function handleRequest(req, res) {
-  touchActivity();
-  if (req.method === 'GET' && req.url === '/') {
+  if (!isAuthorized(req)) {
+    res.writeHead(403, { 'Content-Type': 'text/html; charset=utf-8' });
+    res.end(FORBIDDEN_PAGE);
+    return;
+  }
+  touchActivity(); // only authorized requests count as activity
+
+  // Mirror the key into a cookie so same-origin subresources (/files/*) and the
+  // WebSocket handshake carry it automatically, whatever URL style the agent
+  // writes. SameSite=Strict: a cross-site page can neither read the key nor ride
+  // the cookie; HttpOnly: page scripts can't exfiltrate it.
+  res.setHeader('Set-Cookie',
+    COOKIE_NAME + '=' + TOKEN + '; HttpOnly; SameSite=Strict; Path=/');
+
+  const pathname = pathnameOf(req.url);
+  if (req.method === 'GET' && pathname === '/') {
    const screenFile = getNewestScreen();
    let html = screenFile
      ? (raw => isFullDocument(raw) ? raw : wrapInFrame(raw))(fs.readFileSync(screenFile, 'utf-8'))
@@ -142,10 +248,12 @@ function handleRequest(req, res) {

    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end(html);
-  } else if (req.method === 'GET' && req.url.startsWith('/files/')) {
-    const fileName = req.url.slice(7);
-    const filePath = path.join(CONTENT_DIR, path.basename(fileName));
-    if (!fs.existsSync(filePath)) {
+  } else if (req.method === 'GET' && pathname.startsWith('/files/')) {
+    const fileName = path.basename(pathname.slice(7));
+    const filePath = path.join(CONTENT_DIR, fileName);
+    // Reject empty/dotfile names and anything that isn't a regular file —
+    // `/files/` would otherwise resolve to CONTENT_DIR and crash readFileSync (EISDIR).
+    if (!fileName || fileName.startsWith('.') || !fs.existsSync(filePath) || !fs.statSync(filePath).isFile()) {
      res.writeHead(404);
      res.end('Not found');
      return;
@@ -165,6 +273,8 @@ function handleRequest(req, res) {
 const clients = new Set();

 function handleUpgrade(req, socket) {
+  if (!isAuthorized(req)) { socket.destroy(); return; }
+
  const key = req.headers['sec-websocket-key'];
  if (!key) { socket.destroy(); return; }

@@ -231,7 +341,7 @@ function handleMessage(text) {
  }
  touchActivity();
  console.log(JSON.stringify({ source: 'user-event', ...event }));
-  if (event.choice) {
+  if (event && event.choice) {
    const eventsFile = path.join(STATE_DIR, 'events');
    fs.appendFileSync(eventsFile, JSON.stringify(event) + '\n');
  }
@@ -244,9 +354,48 @@ function broadcast(msg) {
  }
 }

+// Best-effort: open the user's browser the first time a screen is actually ready
+// to show. Skips when disabled, on a non-loopback (remote) bind, or when a
+// browser is already connected. Override the launcher with BRAINSTORM_OPEN_CMD.
+let browserOpened = false;
+function maybeOpenBrowser() {
+  if (browserOpened) return;
+  browserOpened = true;
+  if (!process.env.BRAINSTORM_OPEN) return; // opt-in: only after the user approves the companion
+  if (HOST !== '127.0.0.1' && HOST !== 'localhost') return;
+  if (clients.size > 0) return; // the user already opened it
+  const url = 'http://' + URL_HOST + ':' + PORT + '/?key=' + TOKEN; // must carry the key or the gate 403s it
+  const cp = require('child_process');
+  // Operator-provided launcher: run as given (this env var is trusted operator input).
+  if (process.env.BRAINSTORM_OPEN_CMD) {
+    try { cp.exec(process.env.BRAINSTORM_OPEN_CMD + ' ' + JSON.stringify(url), () => {}); } catch (e) { /* best effort */ }
+    return;
+  }
+  // Platform launchers: pass the URL as an argv element via execFile (no shell),
+  // so a url-host containing shell metacharacters can't inject a command.
+  const isWSL = process.platform === 'linux' && /microsoft/i.test(require('os').release());
+  let bin, args;
+  if (process.platform === 'darwin') { bin = 'open'; args = [url]; }
+  else if (process.platform === 'win32' || isWSL) { bin = 'cmd.exe'; args = ['/c', 'start', '', url]; }
+  else if (process.env.DISPLAY || process.env.WAYLAND_DISPLAY) { bin = 'xdg-open'; args = [url]; }
+  else return; // headless: nothing to open
+  try { cp.execFile(bin, args, () => {}); } catch (e) { /* best effort */ }
+}
+
 // ========== Activity Tracking ==========

-const IDLE_TIMEOUT_MS = 30 * 60 * 1000; // 30 minutes
+// Idle timeout: shut down after this long with no activity. Default 4 hours;
+// override with BRAINSTORM_IDLE_TIMEOUT_MS (start-server.sh: --idle-timeout-minutes).
+const IDLE_TIMEOUT_MS = (() => {
+  const ms = Number(process.env.BRAINSTORM_IDLE_TIMEOUT_MS);
+  return Number.isFinite(ms) && ms > 0 ? ms : 4 * 60 * 60 * 1000;
+})();
+// How often the watchdog checks for owner-death / idleness. Configurable mainly
+// so tests can run fast; production default is 60s.
+const LIFECYCLE_CHECK_MS = (() => {
+  const ms = Number(process.env.BRAINSTORM_LIFECYCLE_CHECK_MS);
+  return Number.isFinite(ms) && ms > 0 ? ms : 60 * 1000;
+})();
 let lastActivity = Date.now();

 function touchActivity() {
@@ -267,14 +416,14 @@ function startServer() {
  // macOS fs.watch reports 'rename' for both new files and overwrites,
  // so we can't rely on eventType alone.
  const knownFiles = new Set(
-    fs.readdirSync(CONTENT_DIR).filter(f => f.endsWith('.html'))
+    fs.readdirSync(CONTENT_DIR).filter(f => !f.startsWith('.') && f.endsWith('.html'))
  );

  const server = http.createServer(handleRequest);
  server.on('upgrade', handleUpgrade);

  const watcher = fs.watch(CONTENT_DIR, (eventType, filename) => {
-    if (!filename || !filename.endsWith('.html')) return;
+    if (!filename || filename.startsWith('.') || !filename.endsWith('.html')) return;

    if (debounceTimers.has(filename)) clearTimeout(debounceTimers.get(filename));
    debounceTimers.set(filename, setTimeout(() => {
@@ -289,6 +438,7 @@ function startServer() {
        const eventsFile = path.join(STATE_DIR, 'events');
        if (fs.existsSync(eventsFile)) fs.unlinkSync(eventsFile);
        console.log(JSON.stringify({ type: 'screen-added', file: filePath }));
+        maybeOpenBrowser();
      } else {
        console.log(JSON.stringify({ type: 'screen-updated', file: filePath }));
      }
@@ -308,6 +458,11 @@ function startServer() {
    );
    watcher.close();
    clearInterval(lifecycleCheck);
+    // Close any upgraded WebSocket sockets so server.close() can complete and
+    // the process actually exits instead of lingering on an open connection.
+    for (const socket of clients) {
+      try { socket.destroy(); } catch (e) { /* already gone */ }
+    }
    server.close(() => process.exit(0));
  }

@@ -316,11 +471,11 @@ function startServer() {
    try { process.kill(ownerPid, 0); return true; } catch (e) { return e.code === 'EPERM'; }
  }

-  // Check every 60s: exit if owner process died or idle for 30 minutes
+  // Periodically exit if the owner process died or we've been idle too long.
  const lifecycleCheck = setInterval(() => {
    if (!ownerAlive()) shutdown('owner process exited');
    else if (Date.now() - lastActivity > IDLE_TIMEOUT_MS) shutdown('idle timeout');
-  }, 60 * 1000);
+  }, LIFECYCLE_CHECK_MS);
  lifecycleCheck.unref();

  // Validate owner PID at startup. If it's already dead, the PID resolution
@@ -336,19 +491,50 @@ function startServer() {
    }
  }

-  server.listen(PORT, HOST, () => {
+  // If the preferred port is already taken (e.g. a previous server is still
+  // alive), fall back to a random port once instead of failing.
+  let triedFallback = false;
+
+  function onListen() {
+    // Cookie name keys on the ACTUAL bound port (may differ from the preferred
+    // one after an EADDRINUSE fallback) so it can't collide with another server's
+    // cookie in the shared localhost jar.
+    COOKIE_NAME = 'brainstorm-key-' + PORT;
+    // Record the bound port AND token so the next restart of this session reuses
+    // them — but ONLY when we got our preferred port. On a fallback we bound a
+    // *different* port because someone else holds the preferred one; persisting
+    // would overwrite the shared files and strand that other session's open tab.
+    if (PORT_FILE && !triedFallback) {
+      try { fs.writeFileSync(PORT_FILE, String(PORT)); } catch (e) { /* best effort */ }
+      if (TOKEN_FILE) {
+        try { fs.writeFileSync(TOKEN_FILE, TOKEN, { mode: 0o600 }); } catch (e) { /* best effort */ }
+      }
+    }
    const info = JSON.stringify({
      type: 'server-started', port: Number(PORT), host: HOST,
-      url_host: URL_HOST, url: 'http://' + URL_HOST + ':' + PORT,
-      screen_dir: CONTENT_DIR, state_dir: STATE_DIR
+      url_host: URL_HOST, url: 'http://' + URL_HOST + ':' + PORT + '/?key=' + TOKEN,
+      screen_dir: CONTENT_DIR, state_dir: STATE_DIR, idle_timeout_ms: IDLE_TIMEOUT_MS
    });
    console.log(info);
-    fs.writeFileSync(path.join(STATE_DIR, 'server-info'), info + '\n');
+    // server-info embeds the key — keep it owner-only.
+    fs.writeFileSync(path.join(STATE_DIR, 'server-info'), info + '\n', { mode: 0o600 });
+  }
+
+  server.on('error', (err) => {
+    if (err.code === 'EADDRINUSE' && !triedFallback) {
+      triedFallback = true;
+      PORT = randomPort();
+      server.listen(PORT, HOST, onListen);
+    } else {
+      console.error('Server failed to bind:', err.message);
+      process.exit(1);
+    }
  });
+  server.listen(PORT, HOST, onListen);
 }

 if (require.main === module) {
  startServer();
 }

-module.exports = { computeAcceptKey, encodeFrame, decodeFrame, OPCODES };
+module.exports = { computeAcceptKey, encodeFrame, decodeFrame, OPCODES, MAX_FRAME_PAYLOAD_BYTES };
--- a/skills/brainstorming/scripts/start-server.sh
+++ b/skills/brainstorming/scripts/start-server.sh
@@ -11,6 +11,9 @@
 #   --host <bind-host>    Host/interface to bind (default: 127.0.0.1).
 #                         Use 0.0.0.0 in remote/containerized environments.
 #   --url-host <host>     Hostname shown in returned URL JSON.
+#   --idle-timeout-minutes <n>  Shut down after n minutes idle (default 240 = 4h).
+#   --open                Auto-open the browser on the first screen (use only
+#                         after the user approves the visual companion).
 #   --foreground          Run server in the current terminal (no backgrounding).
 #   --background          Force background mode (overrides Codex auto-foreground).

@@ -22,6 +25,7 @@ FOREGROUND="false"
 FORCE_BACKGROUND="false"
 BIND_HOST="127.0.0.1"
 URL_HOST=""
+IDLE_TIMEOUT_MINUTES=""
 while [[ $# -gt 0 ]]; do
  case "$1" in
    --project-dir)
@@ -36,6 +40,14 @@ while [[ $# -gt 0 ]]; do
      URL_HOST="$2"
      shift 2
      ;;
+    --idle-timeout-minutes)
+      IDLE_TIMEOUT_MINUTES="$2"
+      shift 2
+      ;;
+    --open)
+      export BRAINSTORM_OPEN=1
+      shift
+      ;;
    --foreground|--no-daemon)
      FOREGROUND="true"
      shift
@@ -59,6 +71,14 @@ if [[ -z "$URL_HOST" ]]; then
  fi
 fi

+if [[ -n "$IDLE_TIMEOUT_MINUTES" ]]; then
+  if ! [[ "$IDLE_TIMEOUT_MINUTES" =~ ^[1-9][0-9]*$ ]]; then
+    echo "{\"error\": \"--idle-timeout-minutes must be a positive integer\"}"
+    exit 1
+  fi
+  export BRAINSTORM_IDLE_TIMEOUT_MS=$(( IDLE_TIMEOUT_MINUTES * 60 * 1000 ))
+fi
+
 # Some environments reap detached/background processes. Auto-foreground when detected.
 if [[ -n "${CODEX_CI:-}" && "$FOREGROUND" != "true" && "$FORCE_BACKGROUND" != "true" ]]; then
  FOREGROUND="true"
@@ -74,11 +94,19 @@ if [[ "$FOREGROUND" != "true" && "$FORCE_BACKGROUND" != "true" ]]; then
  fi
 fi

+# Session files (server.log, server-info, .last-token) embed the session key —
+# keep everything this script and the server create owner-only.
+umask 077
+
 # Generate unique session directory
 SESSION_ID="$$-$(date +%s)"

 if [[ -n "$PROJECT_DIR" ]]; then
  SESSION_DIR="${PROJECT_DIR}/.superpowers/brainstorm/${SESSION_ID}"
+  # Persist the bound port and key per project so a restart reuses them and an
+  # already-open browser tab reconnects to the same URL with a valid cookie.
+  export BRAINSTORM_PORT_FILE="${PROJECT_DIR}/.superpowers/brainstorm/.last-port"
+  export BRAINSTORM_TOKEN_FILE="${PROJECT_DIR}/.superpowers/brainstorm/.last-token"
 else
  SESSION_DIR="/tmp/brainstorm-${SESSION_ID}"
 fi
@@ -107,10 +135,23 @@ if [[ -z "$OWNER_PID" || "$OWNER_PID" == "1" ]]; then
  OWNER_PID="$PPID"
 fi

+# Windows/MSYS2: Node.js cannot see POSIX PIDs from the MSYS2 namespace.
+# Passing a PID that node cannot verify causes the server to log
+# owner-pid-invalid and self-terminate at the 60-second lifecycle check. Clear
+# it so the watchdog is disabled and the idle timeout is the only shutdown trigger.
+case "${OSTYPE:-}" in
+  msys*|cygwin*|mingw*) OWNER_PID="" ;;
+esac
+if [[ -n "${MSYSTEM:-}" ]]; then
+  OWNER_PID=""
+fi
+
 # Foreground mode for environments that reap detached/background processes.
 if [[ "$FOREGROUND" == "true" ]]; then
-  echo "$$" > "$PID_FILE"
-  env BRAINSTORM_DIR="$SESSION_DIR" BRAINSTORM_HOST="$BIND_HOST" BRAINSTORM_URL_HOST="$URL_HOST" BRAINSTORM_OWNER_PID="$OWNER_PID" node server.cjs
+  env BRAINSTORM_DIR="$SESSION_DIR" BRAINSTORM_HOST="$BIND_HOST" BRAINSTORM_URL_HOST="$URL_HOST" BRAINSTORM_OWNER_PID="$OWNER_PID" node server.cjs &
+  SERVER_PID=$!
+  echo "$SERVER_PID" > "$PID_FILE"
+  wait "$SERVER_PID"
  exit $?
 fi

--- a/skills/brainstorming/scripts/stop-server.sh
+++ b/skills/brainstorming/scripts/stop-server.sh
@@ -16,9 +16,40 @@ fi
 STATE_DIR="${SESSION_DIR}/state"
 PID_FILE="${STATE_DIR}/server.pid"

+# Confirm a PID is actually our brainstorm server (node running server.cjs),
+# not a reused/unrelated process whose PID was recycled into a stale pid file.
+is_brainstorm_server() {
+  kill -0 "$1" 2>/dev/null || return 1
+  case "$(ps -p "$1" -o command= 2>/dev/null)" in
+    *node*server.cjs*) ;;
+    *) return 1 ;;
+  esac
+  # Stronger check: if we recorded the bound port and lsof is available, require
+  # the PID to be the process actually LISTENING on this session's port. This
+  # rules out an unrelated `node ... server.cjs` (another project, an editor task
+  # runner, a different session) that happened to recycle the stale PID.
+  local info="${STATE_DIR}/server-info"
+  if [[ -f "$info" ]] && command -v lsof >/dev/null 2>&1; then
+    local port
+    port=$(sed -n 's/.*"port":\([0-9][0-9]*\).*/\1/p' "$info" | head -1)
+    if [[ -n "$port" ]]; then
+      [[ "$(lsof -nP -iTCP:"$port" -sTCP:LISTEN -t 2>/dev/null | head -1)" == "$1" ]] || return 1
+    fi
+  fi
+  return 0
+}
+
 if [[ -f "$PID_FILE" ]]; then
  pid=$(cat "$PID_FILE")

+  # Refuse to signal a PID we can't prove is our server. A stale pid file may
+  # point at an unrelated process after a reboot/PID wraparound.
+  if ! is_brainstorm_server "$pid"; then
+    rm -f "$PID_FILE"
+    echo '{"status": "stale_pid"}'
+    exit 0
+  fi
+
  # Try to stop gracefully, fallback to force if still alive
  kill "$pid" 2>/dev/null || true

--- a/skills/brainstorming/visual-companion.md
+++ b/skills/brainstorming/visual-companion.md
@@ -33,15 +33,25 @@ The server watches a directory for HTML files and serves the newest one to the b
 ## Starting a Session

 ```bash
-# Start server with persistence (mockups saved to project)
-scripts/start-server.sh --project-dir /path/to/project
+# Start AFTER the user approves the companion. --open auto-opens their browser on
+# the first screen; --project-dir persists mockups and enables same-port restart.
+scripts/start-server.sh --project-dir /path/to/project --open

-# Returns: {"type":"server-started","port":52341,"url":"http://localhost:52341",
+# Returns: {"type":"server-started","port":52341,
+#           "url":"http://localhost:52341/?key=ab12…",
 #           "screen_dir":"/path/to/project/.superpowers/brainstorm/12345-1706000000/content",
 #           "state_dir":"/path/to/project/.superpowers/brainstorm/12345-1706000000/state"}
 ```

-Save `screen_dir` and `state_dir` from the response. Tell user to open the URL.
+Save `screen_dir` and `state_dir` from the response. With `--open`, the browser opens itself when you push the first screen — you don't need to ask the user to open it, but still share the URL as a fallback (headless/remote setups won't auto-open).
+
+**The URL contains a session key (`?key=…`).** The server rejects any request
+without it, so always give the user the **complete** URL from the `url` field —
+never strip the query string, and never hand out a bare `http://host:port`. The
+key gates HTTP and WebSocket access so a stray browser tab or another machine on
+the network can't read the screens or inject events. After the first load the
+browser remembers the key via a cookie, so reloads and `/files/*` assets work
+without repeating it.

 **Finding connection info:** The server writes its startup JSON to `$STATE_DIR/server-info`. If you launched the server in the background and didn't capture stdout, read that file to get the URL and port. When using `--project-dir`, check `<project>/.superpowers/brainstorm/` for the session directory.

@@ -95,7 +105,7 @@ Use `--url-host` to control what hostname is printed in the returned URL JSON.
 ## The Loop

 1. **Check server is alive**, then **write HTML** to a new file in `screen_dir`:
-   - Before each write, check that `$STATE_DIR/server-info` exists. If it doesn't (or `$STATE_DIR/server-stopped` exists), the server has shut down — restart it with `start-server.sh` before continuing. The server auto-exits after 30 minutes of inactivity.
+   - **Required: confirm the server is alive before referring to the URL or pushing a screen.** Check that `$STATE_DIR/server-info` exists and `$STATE_DIR/server-stopped` does not. If it has shut down, restart it with `start-server.sh` using the **same `--project-dir`** — it reuses the same port, so the user's open tab reconnects on its own (it shows a "paused" overlay while the server is down) and you don't need to send a new URL. The server auto-exits after 4 hours idle (configurable with `--idle-timeout-minutes`).
   - Use semantic filenames: `platform.html`, `visual-style.html`, `layout.html`
   - **Never reuse filenames** — each screen gets a fresh file
   - Use your file-creation tool — **never use cat/heredoc** (dumps noise into terminal)
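Review note: the liveness rule in the loop step above (`server-info` must exist and `server-stopped` must not) can be sketched as a small Node.js check. This is a minimal sketch, not code from this PR: the function names `isServerAlive` and `readServerInfo` are hypothetical, and only the two file names come from the document.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: the server counts as alive only when server-info
// exists AND server-stopped does not (the rule stated in visual-companion.md).
function isServerAlive(stateDir) {
  return fs.existsSync(path.join(stateDir, 'server-info')) &&
         !fs.existsSync(path.join(stateDir, 'server-stopped'));
}

// Hypothetical helper: read the startup JSON the server wrote to server-info.
// Returns null when the file is missing or is not valid JSON.
function readServerInfo(stateDir) {
  try {
    return JSON.parse(fs.readFileSync(path.join(stateDir, 'server-info'), 'utf8'));
  } catch (e) {
    return null;
  }
}
```

If `isServerAlive` returns false, the documented recovery is to rerun `start-server.sh` with the same `--project-dir`, so that the port and key are reused.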
--- a/skills/finishing-a-development-branch/SKILL.md
+++ b/skills/finishing-a-development-branch/SKILL.md
@@ -123,16 +123,6 @@ git branch -d <feature-branch>
 ```bash
 # Push branch
 git push -u origin <feature-branch>
-
-# Create PR
-gh pr create --title "<title>" --body "$(cat <<'EOF'
-## Summary
-<2-3 bullets of what changed>
-
-## Test Plan
-- [ ] <verification steps>
-EOF
-)"
 ```

 **Do NOT clean up worktree** — user needs it alive to iterate on PR feedback.
--- a/skills/subagent-driven-development/implementer-prompt.md
+++ b/skills/subagent-driven-development/implementer-prompt.md
@@ -103,6 +103,9 @@ Subagent (general-purpose):
    - **Status:** DONE | DONE_WITH_CONCERNS | BLOCKED | NEEDS_CONTEXT
    - What you implemented (or what you attempted, if blocked)
    - What you tested and test results
+    - **TDD Evidence** (if TDD was required for this task):
+      - RED: command run, relevant failing output before implementation, and why the failure was expected
+      - GREEN: command run and relevant passing output after implementation
    - Files changed
    - Self-review findings (if any)
    - Any issues or concerns
--- a/tests/brainstorm-server/auth.test.js
+++ b/tests/brainstorm-server/auth.test.js
@@ -0,0 +1,201 @@
+/**
+ * Security tests for the brainstorm server's per-session key.
+ *
+ * The companion server is reachable by any local browser tab (default loopback
+ * bind) and by any host that can route to it (remote `--host 0.0.0.0` bind).
+ * A per-session secret key gates every endpoint so that neither a browser
+ * confused-deputy nor a direct remote client can read screens/files or inject
+ * events into state/events (prompt injection into a live agent session).
+ *
+ * Auth = a valid `?key=<token>` query param OR a valid session cookie.
+ *
+ * Uses the `ws` npm package as a test client (test-only dependency).
+ */
+
+const { spawn } = require('child_process');
+const http = require('http');
+const WebSocket = require('ws');
+const fs = require('fs');
+const path = require('path');
+const assert = require('assert');
+
+const SERVER_PATH = path.join(__dirname, '../../skills/brainstorming/scripts/server.cjs');
+const TEST_PORT = 3335;
+const TEST_DIR = '/tmp/brainstorm-auth-test';
+const CONTENT_DIR = path.join(TEST_DIR, 'content');
+const TOKEN = 'testtoken-0123456789abcdef0123456789abcdef';
+const COOKIE_NAME = `brainstorm-key-${TEST_PORT}`;
+
+function cleanup() {
+  if (fs.existsSync(TEST_DIR)) fs.rmSync(TEST_DIR, { recursive: true });
+}
+
+async function sleep(ms) {
+  return new Promise(resolve => setTimeout(resolve, ms));
+}
+
+// Raw HTTP GET with optional key query and Cookie header.
+function get(pathname, { key, cookie } = {}) {
+  const url = `http://localhost:${TEST_PORT}${pathname}` + (key !== undefined ? `?key=${key}` : '');
+  const headers = {};
+  if (cookie) headers['Cookie'] = cookie;
+  return new Promise((resolve, reject) => {
+    http.get(url, { headers }, (res) => {
+      let data = '';
+      res.on('data', chunk => data += chunk);
+      res.on('end', () => resolve({ status: res.statusCode, headers: res.headers, body: data }));
+    }).on('error', reject);
+  });
+}
+
+// Try to open a WebSocket; resolve 'opened' or 'rejected'.
+function wsConnect({ key, cookie } = {}) {
+  const url = `ws://localhost:${TEST_PORT}/` + (key !== undefined ? `?key=${key}` : '');
+  const opts = cookie ? { headers: { Cookie: cookie } } : {};
+  const ws = new WebSocket(url, opts);
+  return new Promise((resolve) => {
+    let settled = false;
+    const done = (outcome) => { if (!settled) { settled = true; resolve({ outcome, ws }); } };
+    ws.on('open', () => done('opened'));
+    ws.on('error', () => done('rejected'));
+    ws.on('close', () => done('rejected'));
+    setTimeout(() => done('rejected'), 1500);
+  });
+}
+
+function startServer() {
+  return spawn('node', [SERVER_PATH], {
+    env: { ...process.env, BRAINSTORM_PORT: TEST_PORT, BRAINSTORM_DIR: TEST_DIR, BRAINSTORM_TOKEN: TOKEN }
+  });
+}
+
+async function waitForServer(server) {
+  let stdout = '', stderr = '';
+  return new Promise((resolve, reject) => {
+    server.stdout.on('data', (d) => {
+      stdout += d.toString();
+      if (stdout.includes('server-started')) resolve({ stdout });
+    });
+    server.stderr.on('data', (d) => { stderr += d.toString(); });
+    server.on('error', reject);
+    setTimeout(() => reject(new Error(`Server didn't start. stderr: ${stderr}`)), 5000);
+  });
+}
+
+async function runTests() {
+  cleanup();
+  fs.mkdirSync(CONTENT_DIR, { recursive: true });
+  fs.writeFileSync(path.join(CONTENT_DIR, 'screen.html'), '<h2>Secret screen</h2>');
+  fs.writeFileSync(path.join(CONTENT_DIR, 'asset.txt'), 'secret asset');
+
+  const server = startServer();
+  let stdoutAccum = '';
+  server.stdout.on('data', (d) => { stdoutAccum += d.toString(); });
+  const { stdout: initialStdout } = await waitForServer(server);
+
+  let passed = 0, failed = 0;
+  async function test(name, fn) {
+    try { await fn(); console.log(`  PASS: ${name}`); passed++; }
+    catch (e) { console.log(`  FAIL: ${name}`); console.log(`    ${e.message}`); failed++; }
+  }
+
+  try {
+    console.log('\n--- Startup URL ---');
+
+    await test('server-started url includes the session key', () => {
+      const msg = JSON.parse(initialStdout.trim());
+      assert(msg.url.includes(`key=${TOKEN}`), `url should carry the key, got: ${msg.url}`);
+    });
+
+    console.log('\n--- HTTP / gate ---');
+
+    await test('GET / without key is rejected with 403', async () => {
+      const res = await get('/');
+      assert.strictEqual(res.status, 403, 'no-key request must be 403');
+    });
+
+    await test('403 page names "coding agent" and the key', async () => {
+      const res = await get('/');
+      assert(/coding agent/i.test(res.body), '403 body should reference the coding agent');
+      assert(/key/i.test(res.body), '403 body should mention the key');
+    });
+
+    await test('GET / with wrong key is rejected with 403', async () => {
+      const res = await get('/', { key: 'wrong-token' });
+      assert.strictEqual(res.status, 403);
+    });
+
+    await test('GET / with valid key serves the screen', async () => {
+      const res = await get('/', { key: TOKEN });
+      assert.strictEqual(res.status, 200);
+      assert(res.body.includes('Secret screen'), 'should serve the screen content');
+    });
+
+    await test('valid key load sets an HttpOnly SameSite=Strict cookie', async () => {
+      const res = await get('/', { key: TOKEN });
+      const setCookie = (res.headers['set-cookie'] || []).join('; ');
+      assert(setCookie.includes(`${COOKIE_NAME}=${TOKEN}`), `should set ${COOKIE_NAME}`);
+      assert(/HttpOnly/i.test(setCookie), 'cookie should be HttpOnly');
+      assert(/SameSite=Strict/i.test(setCookie), 'cookie should be SameSite=Strict');
+    });
+
+    await test('GET / with valid cookie (no query key) serves the screen', async () => {
+      const res = await get('/', { cookie: `${COOKIE_NAME}=${TOKEN}` });
+      assert.strictEqual(res.status, 200);
+      assert(res.body.includes('Secret screen'));
+    });
+
+    console.log('\n--- HTTP /files gate ---');
+
+    await test('GET /files without key is rejected with 403', async () => {
+      const res = await get('/files/asset.txt');
+      assert.strictEqual(res.status, 403);
+    });
+
+    await test('GET /files with valid key serves the file', async () => {
+      const res = await get('/files/asset.txt', { key: TOKEN });
+      assert.strictEqual(res.status, 200);
+      assert(res.body.includes('secret asset'));
+    });
+
+    console.log('\n--- WebSocket gate ---');
+
+    await test('WS upgrade without key is rejected', async () => {
+      const { outcome, ws } = await wsConnect();
+      ws.close();
+      assert.strictEqual(outcome, 'rejected', 'unauthenticated WS must not open');
+    });
+
+    await test('WS upgrade with valid key opens', async () => {
+      const { outcome, ws } = await wsConnect({ key: TOKEN });
+      ws.close();
+      assert.strictEqual(outcome, 'opened');
+    });
+
+    await test('WS upgrade with valid cookie opens', async () => {
+      const { outcome, ws } = await wsConnect({ cookie: `${COOKIE_NAME}=${TOKEN}` });
+      ws.close();
+      assert.strictEqual(outcome, 'opened');
+    });
+
+    console.log('\n--- Robustness (A3) ---');
+
+    await test('null payload over an authed WS does not crash the server', async () => {
+      const { ws } = await wsConnect({ key: TOKEN });
+      ws.send('null');
+      await sleep(300);
+      const res = await get('/', { key: TOKEN });
+      assert.strictEqual(res.status, 200, 'server must still respond after null payload');
+      ws.close();
+    });
+
+    console.log(`\n--- Results: ${passed} passed, ${failed} failed ---`);
+    if (failed > 0) process.exitCode = 1;
+  } finally {
+    server.kill();
+    await sleep(100);
+    cleanup();
+  }
+}
+
+runTests().catch(err => { console.error('Test failed:', err); process.exit(1); });
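Review note: the tests above pin the rule "a valid `?key=<token>` query param OR a valid session cookie". A gate satisfying that rule might look like the sketch below. `server.cjs` is not shown in this part of the diff, so `isAuthorized`, `readCookie`, and `safeEqual` are hypothetical names, and the constant-time comparison is an assumption about good practice rather than something the diff confirms.

```javascript
const crypto = require('crypto');

// Constant-time comparison. Hashing both sides first yields equal-length
// buffers, which crypto.timingSafeEqual requires.
function safeEqual(a, b) {
  const ha = crypto.createHash('sha256').update(String(a)).digest();
  const hb = crypto.createHash('sha256').update(String(b)).digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Extract one cookie value from a raw Cookie header, or return null.
function readCookie(header, name) {
  for (const part of (header || '').split(';')) {
    const idx = part.indexOf('=');
    if (idx === -1) continue;
    if (part.slice(0, idx).trim() === name) return part.slice(idx + 1).trim();
  }
  return null;
}

// Hypothetical gate: accept a valid ?key= query param OR a valid cookie.
function isAuthorized(req, token, cookieName) {
  const url = new URL(req.url, 'http://localhost');
  const key = url.searchParams.get('key');
  if (key !== null && safeEqual(key, token)) return true;
  const cookie = readCookie(req.headers.cookie, cookieName);
  return cookie !== null && safeEqual(cookie, token);
}
```

For example, `isAuthorized({ url: '/?key=abc', headers: {} }, 'abc', 'brainstorm-key-3335')` returns `true`, while the same request without the query string and without a cookie returns `false`, matching the 200 and 403 cases the tests assert.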
--- a/tests/brainstorm-server/helper.test.js
+++ b/tests/brainstorm-server/helper.test.js
@@ -0,0 +1,163 @@
+/**
+ * Tests for the injected browser client (helper.js).
+ *
+ * helper.js runs in the browser, so its DOM behaviour is exercised live; here we
+ * unit-test the pure reconnect-backoff function it exports and assert that the
+ * reconnect / status / tombstone wiring is present.
+ */
+
+const assert = require('assert');
+const fs = require('fs');
+const path = require('path');
+
+const HELPER = path.join(__dirname, '../../skills/brainstorming/scripts/helper.js');
+
+const src = fs.readFileSync(HELPER, 'utf-8');
+
+// helper.js is browser code, and the repo is an ES module package, so a plain
+// require() won't surface its exports. Evaluate the source in a CommonJS sandbox
+// with no `window`, so only the exported pure helpers run (not the browser code).
+const moduleShim = { exports: {} };
+new Function('module', src)(moduleShim);
+const { nextReconnectDelay, MIN_RECONNECT_MS, MAX_RECONNECT_MS, TOMBSTONE_AFTER_MS } = moduleShim.exports;
+
+let passed = 0, failed = 0;
+function test(name, fn) {
+  try { fn(); console.log(`  PASS: ${name}`); passed++; }
+  catch (e) { console.log(`  FAIL: ${name}`); console.log(`    ${e.message}`); failed++; }
+}
+
+console.log('\n--- Backoff (pure) ---');
+
+test('doubles the delay each call', () => {
+  assert.strictEqual(nextReconnectDelay(500, 30000), 1000);
+  assert.strictEqual(nextReconnectDelay(1000, 30000), 2000);
+  assert.strictEqual(nextReconnectDelay(2000, 30000), 4000);
+});
+
+test('caps at the maximum', () => {
+  assert.strictEqual(nextReconnectDelay(20000, 30000), 30000);
+  assert.strictEqual(nextReconnectDelay(30000, 30000), 30000);
+});
+
+test('full progression from MIN caps at MAX and never exceeds it', () => {
+  const seq = [MIN_RECONNECT_MS];
+  let d = MIN_RECONNECT_MS;
+  for (let i = 0; i < 10; i++) { d = nextReconnectDelay(d, MAX_RECONNECT_MS); seq.push(d); }
+  assert.strictEqual(seq[0], 500);
+  assert.deepStrictEqual(seq.slice(0, 7), [500, 1000, 2000, 4000, 8000, 16000, 30000]);
+  assert(seq.every(v => v <= MAX_RECONNECT_MS), 'never exceeds max');
+  assert.strictEqual(seq[seq.length - 1], 30000, 'settles at the cap');
+});
+
+test('exposes sane constants', () => {
+  assert.strictEqual(MIN_RECONNECT_MS, 500);
+  assert.strictEqual(MAX_RECONNECT_MS, 30000);
+  assert(TOMBSTONE_AFTER_MS >= 5000, 'tombstone grace is at least a few seconds');
+});
+
+console.log('\n--- Wiring (source) ---');
+
+test('reflects all three connection states', () => {
+  assert(/Connected/.test(src) && /Reconnecting/.test(src) && /Disconnected/.test(src),
+    'should set Connected / Reconnecting / Disconnected status');
+  assert(src.includes("setProperty('--status-color'"), 'drives the status dot via --status-color');
+});
+
+test('renders a tombstone overlay when paused', () => {
+  assert(src.includes('bs-tombstone'), 'creates the tombstone element');
+  assert(/Companion paused/.test(src), 'tombstone explains the companion paused');
+});
+
+test('hardens reconnection (onerror, null socket, clears pending timer)', () => {
+  assert(src.includes('onerror'), 'handles onerror');
+  assert(/ws = null/.test(src), 'nulls the socket on close so sendEvent queues');
+  assert(src.includes('clearTimeout'), 'clears a pending reconnect before scheduling another');
+  assert(src.includes('nextReconnectDelay'), 'uses exponential backoff for reconnects');
+});
+
+test('reloads on recovery and on reload messages', () => {
+  assert(/location\.reload\(\)/.test(src), 'reloads to pick up restarted/updated content');
+});
+
+console.log('\n--- Reconnect state machine (mocked browser) ---');
+
+// Drive helper.js's browser code against mocked DOM/WebSocket/timers/clock so we
+// can exercise the actual reconnect/status/tombstone behaviour, not just grep it.
+function makeEnv() {
+  const state = { now: 1000, timers: [], reloads: 0, appended: [] };
+  const sockets = [];
+  const statusEl = { textContent: '', style: { setProperty() {} } };
+  class FakeWS {
+    constructor(url) { this.url = url; this.readyState = 0; this.onopen = this.onclose = this.onmessage = this.onerror = null; sockets.push(this); }
+    send() {}
+    close() { this.readyState = 3; if (this.onclose) this.onclose(); }
+    open() { this.readyState = 1; if (this.onopen) this.onopen(); }
+  }
+  FakeWS.OPEN = 1;
+  const env = {
+    module: { exports: {} },
+    window: { location: { host: 'localhost:7777', reload() { state.reloads++; } } },
+    document: {
+      querySelector: (s) => s === '.status' ? statusEl : null,
+      getElementById: () => null,
+      createElement: () => ({ style: {}, id: '' }),
+      addEventListener() {},
+      body: { appendChild: (el) => state.appended.push(el) }
+    },
+    WebSocket: FakeWS,
+    setTimeout: (fn, ms) => { state.timers.push({ fn, ms, fired: false, cleared: false }); return state.timers.length; },
+    clearTimeout: (id) => { if (state.timers[id - 1]) state.timers[id - 1].cleared = true; },
+    Date: { now: () => state.now },
+    console
+  };
+  return {
+    state, statusEl, sockets,
+    boot() { new Function(...Object.keys(env), src)(...Object.values(env)); },
+    advance(ms) { state.now += ms; },
+    last() { return sockets[sockets.length - 1]; },
+    fireReconnect() {
+      const t = [...state.timers].reverse().find(x => !x.fired && !x.cleared);
+      if (!t) throw new Error('no reconnect scheduled');
+      t.fired = true; t.fn();
+    }
+  };
+}
+
+test('on disconnect shows Reconnecting and schedules a 500ms reconnect', () => {
+  const e = makeEnv(); e.boot();
+  e.last().open();
+  assert.strictEqual(e.statusEl.textContent, 'Connected');
+  e.last().close();
+  assert.strictEqual(e.statusEl.textContent, 'Reconnecting…');
+  assert.strictEqual(e.state.timers[e.state.timers.length - 1].ms, 500);
+});
+
+test('reconnect delay backs off 500 -> 1000 -> 2000', () => {
+  const e = makeEnv(); e.boot();
+  e.last().open(); e.last().close();
+  e.fireReconnect(); e.last().close();
+  e.fireReconnect(); e.last().close();
+  assert.deepStrictEqual(e.state.timers.map(t => t.ms).slice(0, 3), [500, 1000, 2000]);
+});
+
+test('shows the tombstone and Disconnected after the grace period', () => {
+  const e = makeEnv(); e.boot();
+  e.last().open(); e.last().close();
+  e.advance(20000);          // past TOMBSTONE_AFTER_MS while still down
+  e.fireReconnect(); e.last().close();
+  assert.strictEqual(e.statusEl.textContent, 'Disconnected');
+  assert.strictEqual(e.state.appended.length, 1, 'tombstone appended exactly once');
+});
+
+test('reloads to recover when a tombstoned connection comes back', () => {
+  const e = makeEnv(); e.boot();
+  e.last().open(); e.last().close();
+  e.advance(20000); e.fireReconnect(); e.last().close(); // tombstone now shown
+  assert.strictEqual(e.state.reloads, 0);
+  e.fireReconnect(); e.last().open();                    // server back (e.g. same-port restart)
+  assert.strictEqual(e.state.reloads, 1, 'reloads once on recovery');
+});
+
+console.log(`\n--- Results: ${passed} passed, ${failed} failed ---`);
+if (failed > 0) process.exit(1);
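Review note: the backoff tests above fix the expected behaviour (double each time, cap at the maximum) without showing `helper.js` itself. One implementation consistent with those assertions is sketched below. The constant values are taken from the test expectations; the function body and the `backoffSequence` helper are assumptions, not the code under test.

```javascript
// Values asserted by the 'exposes sane constants' test.
const MIN_RECONNECT_MS = 500;
const MAX_RECONNECT_MS = 30000;

// Assumed implementation: double the current delay, clamped to the maximum.
function nextReconnectDelay(current, max) {
  return Math.min(current * 2, max);
}

// Hypothetical helper: list the delays produced from MIN_RECONNECT_MS onward.
function backoffSequence(steps) {
  const seq = [MIN_RECONNECT_MS];
  for (let i = 0; i < steps; i++) {
    seq.push(nextReconnectDelay(seq[seq.length - 1], MAX_RECONNECT_MS));
  }
  return seq;
}

console.log(backoffSequence(6).join(', '));
// prints "500, 1000, 2000, 4000, 8000, 16000, 30000"
```

The cap takes effect on the sixth step because 16000 doubled is 32000, which is clamped to 30000, the same progression the 'full progression' test checks.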
--- a/tests/brainstorm-server/lifecycle.test.js
+++ b/tests/brainstorm-server/lifecycle.test.js
@@ -0,0 +1,203 @@
+/**
+ * Tests for the brainstorm server's lifecycle (idle timeout + shutdown).
+ *
+ * - The idle timeout is configurable (default 4h) and reported in server-info.
+ * - Idle shutdown must close any open WebSocket so the process actually exits,
+ *   not hang on a lingering connection.
+ * - start-server.sh exposes the timeout via --idle-timeout-minutes.
+ *
+ * Uses the `ws` npm package as a test client (test-only dependency).
+ */
+
+const { spawn, execFileSync } = require('child_process');
+const WebSocket = require('ws');
+const fs = require('fs');
+const path = require('path');
+const assert = require('assert');
+
+const SERVER = path.join(__dirname, '../../skills/brainstorming/scripts/server.cjs');
+const START = path.join(__dirname, '../../skills/brainstorming/scripts/start-server.sh');
+const STOP = path.join(__dirname, '../../skills/brainstorming/scripts/stop-server.sh');
+const sleep = ms => new Promise(r => setTimeout(r, ms));
+
+function firstServerStarted(out) {
+  return JSON.parse(out.trim().split('\n').find(l => l.includes('server-started')));
+}
+
+async function runTests() {
+  let passed = 0, failed = 0;
+  async function test(name, fn) {
+    try { await fn(); console.log(`  PASS: ${name}`); passed++; }
+    catch (e) { console.log(`  FAIL: ${name}`); console.log(`    ${e.message}`); failed++; }
+  }
+
+  await test('server-info reports the configured idle_timeout_ms', async () => {
+    const dir = fs.mkdtempSync('/tmp/bs-life-');
+    const srv = spawn('node', [SERVER], { env: { ...process.env, BRAINSTORM_PORT: 3401, BRAINSTORM_DIR: dir, BRAINSTORM_IDLE_TIMEOUT_MS: 1234567 } });
+    let out = ''; srv.stdout.on('data', d => out += d.toString());
+    for (let i = 0; i < 60 && !out.includes('server-started'); i++) await sleep(50);
+    try {
+      const info = firstServerStarted(out);
+      assert.strictEqual(info.idle_timeout_ms, 1234567, 'idle_timeout_ms should reflect the env override');
+    } finally {
+      srv.kill(); await sleep(100); fs.rmSync(dir, { recursive: true, force: true });
+    }
+  });
+
+  await test('idle shutdown closes an open WebSocket and the process exits', async () => {
+    const dir = fs.mkdtempSync('/tmp/bs-life-');
+    const srv = spawn('node', [SERVER], { env: { ...process.env, BRAINSTORM_PORT: 3402, BRAINSTORM_DIR: dir, BRAINSTORM_TOKEN: 'lifetoken', BRAINSTORM_IDLE_TIMEOUT_MS: 200, BRAINSTORM_LIFECYCLE_CHECK_MS: 100 } });
+    let out = ''; srv.stdout.on('data', d => out += d.toString());
+    let exited = false, code = null; srv.on('exit', c => { exited = true; code = c; });
+    for (let i = 0; i < 60 && !out.includes('server-started'); i++) await sleep(50);
+
+    const ws = new WebSocket('ws://localhost:3402/?key=lifetoken');
+    await new Promise((res, rej) => { ws.on('open', res); ws.on('error', rej); });
+
+    // 200ms idle, checked every 100ms — should shut down and exit well within 4s,
+    // *despite* the open WS, only if shutdown() closes client sockets.
+    for (let i = 0; i < 40 && !exited; i++) await sleep(100);
+
+    try {
+      assert(exited, 'process must exit after idle shutdown even with an open WebSocket');
+      assert.strictEqual(code, 0, 'should exit cleanly (0)');
+      assert(fs.existsSync(path.join(dir, 'state', 'server-stopped')), 'should write server-stopped');
+    } finally {
+      try { ws.close(); } catch (e) {}
+      if (!exited) srv.kill();
+      fs.rmSync(dir, { recursive: true, force: true });
+    }
+  });
+
+  await test('start-server.sh --idle-timeout-minutes sets the timeout', async () => {
+    const dir = fs.mkdtempSync('/tmp/bs-life-');
+    let info;
+    const out = execFileSync('bash', [START, '--project-dir', dir, '--idle-timeout-minutes', '5'], { encoding: 'utf8' });
+    info = firstServerStarted(out);
+    try {
+      assert.strictEqual(info.idle_timeout_ms, 5 * 60 * 1000, '5 minutes -> 300000 ms');
+    } finally {
+      execFileSync('bash', [STOP, path.dirname(info.state_dir)], { stdio: 'ignore' });
+      fs.rmSync(dir, { recursive: true, force: true });
+    }
+  });
+
+  await test('persists the bound port AND key, and restores both on restart', async () => {
+    const dir = fs.mkdtempSync('/tmp/bs-port-');
+    const portFile = path.join(dir, '.last-port');
+    const tokenFile = path.join(dir, '.last-token');
+    const env = { ...process.env, BRAINSTORM_PORT_FILE: portFile, BRAINSTORM_TOKEN_FILE: tokenFile, BRAINSTORM_LIFECYCLE_CHECK_MS: 100000 };
+
+    const a = spawn('node', [SERVER], { env: { ...env, BRAINSTORM_DIR: path.join(dir, 's1') } });
+    let outA = ''; a.stdout.on('data', d => outA += d.toString());
+    for (let i = 0; i < 60 && !outA.includes('server-started'); i++) await sleep(50);
+    const infoA = firstServerStarted(outA);
+    const keyA = new URL(infoA.url).searchParams.get('key');
+    assert(fs.existsSync(portFile) && fs.existsSync(tokenFile), 'should write the port and token files');
+    a.kill(); await sleep(400); // free the port
+
+    const b = spawn('node', [SERVER], { env: { ...env, BRAINSTORM_DIR: path.join(dir, 's2') } });
+    let outB = ''; b.stdout.on('data', d => outB += d.toString());
+    for (let i = 0; i < 60 && !outB.includes('server-started'); i++) await sleep(50);
+    const infoB = firstServerStarted(outB);
+    const keyB = new URL(infoB.url).searchParams.get('key');
+    b.kill(); await sleep(100); fs.rmSync(dir, { recursive: true, force: true });
+
+    assert.strictEqual(infoB.port, infoA.port, 'restart should reuse the same port');
+    // Same key too — otherwise the open tab's cookie would 403 against the restart.
+    assert.strictEqual(keyB, keyA, 'restart should reuse the same session key');
+  });
+
+  await test('falls back to a random port when the preferred port is taken', async () => {
+    const dir = fs.mkdtempSync('/tmp/bs-port-');
+    const portFile = path.join(dir, '.last-port');
+
+    const a = spawn('node', [SERVER], { env: { ...process.env, BRAINSTORM_DIR: path.join(dir, 'a'), BRAINSTORM_PORT: 3415, BRAINSTORM_LIFECYCLE_CHECK_MS: 100000 } });
+    let outA = ''; a.stdout.on('data', d => outA += d.toString());
+    for (let i = 0; i < 60 && !outA.includes('server-started'); i++) await sleep(50);
+
+    fs.writeFileSync(portFile, '3415'); // preferred port, but it's taken by A
+    const b = spawn('node', [SERVER], { env: { ...process.env, BRAINSTORM_DIR: path.join(dir, 'b'), BRAINSTORM_PORT_FILE: portFile, BRAINSTORM_LIFECYCLE_CHECK_MS: 100000 } });
+    let outB = ''; b.stdout.on('data', d => outB += d.toString());
+    for (let i = 0; i < 60 && !outB.includes('server-started'); i++) await sleep(50);
+    const portB = firstServerStarted(outB).port;
+    const persisted = fs.readFileSync(portFile, 'utf8').trim();
+
+    a.kill(); b.kill(); await sleep(100); fs.rmSync(dir, { recursive: true, force: true });
+
+    assert.notStrictEqual(portB, 3415, 'must not bind the already-taken port');
+    assert(portB >= 49152, 'should fall back to a random high port');
+    // The fallback must NOT clobber the shared port file — A still owns 3415 and
+    // its open tab must keep reconnecting there.
+    assert.strictEqual(persisted, '3415', 'fallback must not overwrite .last-port');
+  });
+
+  await test('auto-opens the browser once, on the first screen', async () => {
+    const dir = fs.mkdtempSync('/tmp/bs-open-');
+    const marker = path.join(dir, 'opened.log');
+    const openCmd = `sh -c 'echo "$0" >> ${marker}'`; // capture the launch instead of opening a browser
+    const srv = spawn('node', [SERVER], { env: { ...process.env, BRAINSTORM_PORT: 3417, BRAINSTORM_DIR: dir, BRAINSTORM_OPEN: '1', BRAINSTORM_OPEN_CMD: openCmd, BRAINSTORM_LIFECYCLE_CHECK_MS: 100000 } });
+    let out = ''; srv.stdout.on('data', d => out += d.toString());
+    for (let i = 0; i < 60 && !out.includes('server-started'); i++) await sleep(50);
+
+    // First screen, with no browser connected -> should auto-open.
+    fs.writeFileSync(path.join(dir, 'content', 'first.html'), '<h2>First</h2>');
+    await sleep(700);
+    // Second screen -> must NOT open again.
+    fs.writeFileSync(path.join(dir, 'content', 'second.html'), '<h2>Second</h2>');
+    await sleep(700);
+
+    const lines = fs.existsSync(marker) ? fs.readFileSync(marker, 'utf8').trim().split('\n').filter(Boolean) : [];
+    // The opened URL must carry the key AND be reachable — a keyless URL hits 403.
+    let status = 0;
+    if (lines[0]) {
+      status = await new Promise(r => require('http').get(lines[0], res => { res.resume(); r(res.statusCode); }).on('error', () => r(0)));
+    }
+    srv.kill(); await sleep(100);
+    fs.rmSync(dir, { recursive: true, force: true });
+
+    assert.strictEqual(lines.length, 1, 'should open exactly once');
+    assert(lines[0].includes('3417'), `should open the server URL, got: ${lines[0]}`);
+    assert(/[?&]key=/.test(lines[0]), `opened URL must carry the session key, got: ${lines[0]}`);
+    assert.strictEqual(status, 200, 'the opened URL must be reachable (valid key), not the 403 page');
+  });
+
+  await test('does NOT auto-open unless approved (BRAINSTORM_OPEN unset)', async () => {
+    const dir = fs.mkdtempSync('/tmp/bs-open-');
+    const marker = path.join(dir, 'opened.log');
+    const openCmd = `sh -c 'echo "$0" >> ${marker}'`;
+    // BRAINSTORM_OPEN intentionally NOT set — auto-open must stay off.
+    const srv = spawn('node', [SERVER], { env: { ...process.env, BRAINSTORM_PORT: 3418, BRAINSTORM_DIR: dir, BRAINSTORM_OPEN_CMD: openCmd, BRAINSTORM_LIFECYCLE_CHECK_MS: 100000 } });
+    let out = ''; srv.stdout.on('data', d => out += d.toString());
+    for (let i = 0; i < 60 && !out.includes('server-started'); i++) await sleep(50);
+    fs.writeFileSync(path.join(dir, 'content', 'first.html'), '<h2>First</h2>');
+    await sleep(700);
+    srv.kill(); await sleep(100);
+    const opened = fs.existsSync(marker);
+    fs.rmSync(dir, { recursive: true, force: true });
+    assert(!opened, 'must not open the browser without explicit approval');
+  });
+
+  await test('unauthenticated requests do not defeat the idle timeout', async () => {
+    const dir = fs.mkdtempSync('/tmp/bs-life-');
+    const srv = spawn('node', [SERVER], { env: { ...process.env, BRAINSTORM_PORT: 3419, BRAINSTORM_DIR: dir, BRAINSTORM_TOKEN: 'authtok', BRAINSTORM_IDLE_TIMEOUT_MS: 400, BRAINSTORM_LIFECYCLE_CHECK_MS: 100 } });
+    let out = ''; srv.stdout.on('data', d => out += d.toString());
+    let exited = false; srv.on('exit', () => { exited = true; });
+    for (let i = 0; i < 60 && !out.includes('server-started'); i++) await sleep(50);
+
+    // Flood with UNAUTHENTICATED (keyless → 403) requests. These must NOT count
+    // as activity, so the idle timeout still fires and the process exits.
+    const hammer = setInterval(() => { require('http').get('http://localhost:3419/', r => r.resume()).on('error', () => {}); }, 60);
+    for (let i = 0; i < 40 && !exited; i++) await sleep(100);
+    clearInterval(hammer);
+    if (!exited) srv.kill();
+    fs.rmSync(dir, { recursive: true, force: true });
+
+    assert(exited, 'idle shutdown must still fire despite a flood of unauthenticated requests');
+  });
+
+  console.log(`\n--- Results: ${passed} passed, ${failed} failed ---`);
+  if (failed > 0) process.exit(1);
+}
+
+runTests().catch(err => { console.error('Test failed:', err); process.exit(1); });
--- a/tests/brainstorm-server/package.json
+++ b/tests/brainstorm-server/package.json
@@ -2,7 +2,7 @@
  "name": "brainstorm-server-tests",
  "version": "1.0.0",
  "scripts": {
-    "test": "node server.test.js"
+    "test": "node ws-protocol.test.js && node helper.test.js && node auth.test.js && node server.test.js && node lifecycle.test.js && bash stop-server.test.sh"
  },
  "dependencies": {
    "ws": "^8.19.0"
--- a/tests/brainstorm-server/server.test.js
+++ b/tests/brainstorm-server/server.test.js
@@ -20,6 +20,9 @@ const TEST_PORT = 3334;
 const TEST_DIR = '/tmp/brainstorm-test';
 const CONTENT_DIR = path.join(TEST_DIR, 'content');
 const STATE_DIR = path.join(TEST_DIR, 'state');
+// Fixed session key so the test client can authenticate (see auth.test.js for
+// the security behavior itself; here we just need authorized requests).
+const TOKEN = 'testtoken-server-0123456789abcdef';

 function cleanup() {
  if (fs.existsSync(TEST_DIR)) {
@@ -32,8 +35,9 @@ async function sleep(ms) {
 }

 async function fetch(url) {
+  const authed = url + (url.includes('?') ? '&' : '?') + 'key=' + TOKEN;
  return new Promise((resolve, reject) => {
-    http.get(url, (res) => {
+    http.get(authed, (res) => {
      let data = '';
      res.on('data', chunk => data += chunk);
      res.on('end', () => resolve({
@@ -47,7 +51,7 @@ async function fetch(url) {

 function startServer() {
  return spawn('node', [SERVER_PATH], {
-    env: { ...process.env, BRAINSTORM_PORT: TEST_PORT, BRAINSTORM_DIR: TEST_DIR }
+    env: { ...process.env, BRAINSTORM_PORT: TEST_PORT, BRAINSTORM_DIR: TEST_DIR, BRAINSTORM_TOKEN: TOKEN }
  });
 }

@@ -179,6 +183,33 @@ async function runTests() {
      assert(!res.body.includes('"not"'), 'Should not serve JSON');
    });

+    await test('ignores macOS resource-fork dotfiles (._*.html) when serving', async () => {
+      // On macOS/ExFAT/SMB, the OS writes ._name.html sidecar files holding
+      // binary metadata. They end with .html but must never be served as a screen.
+      fs.writeFileSync(path.join(CONTENT_DIR, 'real-screen.html'), '<h2>Real Screen Content</h2>');
+      await sleep(100);
+      fs.writeFileSync(path.join(CONTENT_DIR, '._real-screen.html'), 'Mac OS X resource fork garbage');
+      await sleep(300);
+
+      const res = await fetch(`http://localhost:${TEST_PORT}/`);
+      assert(res.body.includes('Real Screen Content'), 'should serve the real screen, not the newer ._ sidecar');
+      assert(!res.body.includes('resource fork garbage'), 'must not serve ._*.html dotfile content');
+    });
+
+    await test('does not serve dotfiles via /files/', async () => {
+      fs.writeFileSync(path.join(CONTENT_DIR, '._secret.html'), 'dotfile body should not be served');
+      const res = await fetch(`http://localhost:${TEST_PORT}/files/._secret.html`);
+      assert.strictEqual(res.status, 404, '/files/ must 404 on dotfiles');
+    });
+
+    await test('GET /files/ (empty name) returns 404 and does not crash the server', async () => {
+      const res = await fetch(`http://localhost:${TEST_PORT}/files/`);
+      assert.strictEqual(res.status, 404, '/files/ (the content dir) must 404, not EISDIR-crash');
+      // The server must still be alive afterward.
+      const alive = await fetch(`http://localhost:${TEST_PORT}/`);
+      assert.strictEqual(alive.status, 200, 'server must survive a /files/ request');
+    });
+
    await test('returns 404 for non-root paths', async () => {
      const res = await fetch(`http://localhost:${TEST_PORT}/other`);
      assert.strictEqual(res.status, 404);
@@ -188,7 +219,7 @@ async function runTests() {
    console.log('\n--- WebSocket Communication ---');

    await test('accepts WebSocket upgrade on /', async () => {
-      const ws = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await new Promise((resolve, reject) => {
        ws.on('open', resolve);
        ws.on('error', reject);
@@ -198,7 +229,7 @@ async function runTests() {

    await test('relays user events to stdout with source field', async () => {
      stdoutAccum = '';
-      const ws = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await new Promise(resolve => ws.on('open', resolve));

      ws.send(JSON.stringify({ type: 'click', text: 'Test Button' }));
@@ -214,7 +245,7 @@ async function runTests() {
      const eventsFile = path.join(STATE_DIR, 'events');
      if (fs.existsSync(eventsFile)) fs.unlinkSync(eventsFile);

-      const ws = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await new Promise(resolve => ws.on('open', resolve));

      ws.send(JSON.stringify({ type: 'click', choice: 'b', text: 'Option B' }));
@@ -232,7 +263,7 @@ async function runTests() {
      const eventsFile = path.join(STATE_DIR, 'events');
      if (fs.existsSync(eventsFile)) fs.unlinkSync(eventsFile);

-      const ws = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await new Promise(resolve => ws.on('open', resolve));

      ws.send(JSON.stringify({ type: 'hover', text: 'Something' }));
@@ -244,8 +275,8 @@ async function runTests() {
    });

    await test('handles multiple concurrent WebSocket clients', async () => {
-      const ws1 = new WebSocket(`ws://localhost:${TEST_PORT}`);
-      const ws2 = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws1 = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
+      const ws2 = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await Promise.all([
        new Promise(resolve => ws1.on('open', resolve)),
        new Promise(resolve => ws2.on('open', resolve))
@@ -270,7 +301,7 @@ async function runTests() {
    });

    await test('cleans up closed clients from broadcast list', async () => {
-      const ws1 = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws1 = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await new Promise(resolve => ws1.on('open', resolve));
      ws1.close();
      await sleep(100);
@@ -282,7 +313,7 @@ async function runTests() {
    });

    await test('handles malformed JSON from client gracefully', async () => {
-      const ws = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await new Promise(resolve => ws.on('open', resolve));

      // Send invalid JSON — server should not crash
@@ -299,7 +330,7 @@ async function runTests() {
    console.log('\n--- File Watching ---');

    await test('sends reload on new .html file', async () => {
-      const ws = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await new Promise(resolve => ws.on('open', resolve));

      let gotReload = false;
@@ -319,7 +350,7 @@ async function runTests() {
      fs.writeFileSync(filePath, '<h2>Original</h2>');
      await sleep(500);

-      const ws = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await new Promise(resolve => ws.on('open', resolve));

      let gotReload = false;
@@ -335,7 +366,7 @@ async function runTests() {
    });

    await test('does NOT send reload for non-.html files', async () => {
-      const ws = new WebSocket(`ws://localhost:${TEST_PORT}`);
+      const ws = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
      await new Promise(resolve => ws.on('open', resolve));

      let gotReload = false;
@@ -350,6 +381,22 @@ async function runTests() {
      ws.close();
    });

+    await test('does NOT send reload for ._*.html resource-fork dotfiles', async () => {
+      const ws = new WebSocket(`ws://localhost:${TEST_PORT}/?key=${TOKEN}`);
+      await new Promise(resolve => ws.on('open', resolve));
+
+      let gotReload = false;
+      ws.on('message', (data) => {
+        if (JSON.parse(data.toString()).type === 'reload') gotReload = true;
+      });
+
+      fs.writeFileSync(path.join(CONTENT_DIR, '._sidecar.html'), 'resource fork');
+      await sleep(500);
+
+      assert(!gotReload, 'a ._ dotfile appearing must not trigger a reload');
+      ws.close();
+    });
+
    await test('clears state/events on new screen', async () => {
      // Create an events file
      const eventsFile = path.join(STATE_DIR, 'events');
--- a/tests/brainstorm-server/stop-server.test.sh
+++ b/tests/brainstorm-server/stop-server.test.sh
@@ -0,0 +1,86 @@
+#!/usr/bin/env bash
+# Tests for stop-server.sh PID-ownership safety.
+#
+# A stale server.pid (e.g. after a reboot, when the kernel has recycled the PID)
+# can point at an unrelated, live process. stop-server.sh must verify the PID is
+# actually our brainstorm server before signalling it.
+
+set -u
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+STOP="$SCRIPT_DIR/../../skills/brainstorming/scripts/stop-server.sh"
+SERVER="$SCRIPT_DIR/../../skills/brainstorming/scripts/server.cjs"
+
+PASS=0; FAIL=0
+ok() { echo "  PASS: $1"; PASS=$((PASS + 1)); }
+bad() { echo "  FAIL: $1"; echo "    $2"; FAIL=$((FAIL + 1)); }
+
+# --- Test 1: an unrelated, reused PID must NOT be killed ---
+SESS="$(mktemp -d)"; mkdir -p "$SESS/state"
+sleep 600 &
+UNRELATED=$!
+echo "$UNRELATED" > "$SESS/state/server.pid"
+OUT="$("$STOP" "$SESS")"
+if kill -0 "$UNRELATED" 2>/dev/null; then
+  case "$OUT" in
+    *stale_pid*) ok "unrelated reused PID is left alone (stale_pid)" ;;
+    *) bad "unrelated PID survived but status was not stale_pid" "$OUT" ;;
+  esac
+else
+  bad "unrelated reused PID was KILLED" "$OUT"
+fi
+kill -9 "$UNRELATED" 2>/dev/null
+rm -rf "$SESS"
+
+# --- Test 2: a real brainstorm server IS stopped ---
+SESS="$(mktemp -d)"; mkdir -p "$SESS/content" "$SESS/state"
+BRAINSTORM_DIR="$SESS" BRAINSTORM_PORT=3399 node "$SERVER" > /dev/null 2>&1 &
+SRV=$!
+for _ in $(seq 1 40); do kill -0 "$SRV" 2>/dev/null && break; sleep 0.1; done
+sleep 0.4
+echo "$SRV" > "$SESS/state/server.pid"
+OUT="$("$STOP" "$SESS")"
+sleep 0.3
+if kill -0 "$SRV" 2>/dev/null; then
+  bad "real brainstorm server still running after stop" "$OUT"
+  kill -9 "$SRV" 2>/dev/null
+else
+  case "$OUT" in
+    *stopped*) ok "real brainstorm server is stopped" ;;
+    *) bad "server stopped but status was not 'stopped'" "$OUT" ;;
+  esac
+fi
+rm -rf "$SESS"
+
+# --- Test 3: no pid file ---
+SESS="$(mktemp -d)"; mkdir -p "$SESS/state"
+OUT="$("$STOP" "$SESS")"
+case "$OUT" in
+  *not_running*) ok "missing pid file reports not_running" ;;
+  *) bad "missing pid file: unexpected status" "$OUT" ;;
+esac
+rm -rf "$SESS"
+
+# --- Test 4: a `node server.cjs` impostor NOT listening on our port is spared ---
+if command -v lsof > /dev/null 2>&1; then
+  SESS="$(mktemp -d)"; mkdir -p "$SESS/state"
+  echo '{"type":"server-started","port":3499}' > "$SESS/state/server-info" # nothing listens on 3499
+  ( exec -a "node server.cjs" sleep 600 ) &
+  IMPOSTOR=$!
+  echo "$IMPOSTOR" > "$SESS/state/server.pid"
+  OUT="$("$STOP" "$SESS")"
+  if kill -0 "$IMPOSTOR" 2>/dev/null; then
+    case "$OUT" in
+      *stale_pid*) ok "a node server.cjs not listening on our port is left alone" ;;
+      *) bad "impostor survived but status was not stale_pid" "$OUT" ;;
+    esac
+  else
+    bad "killed a node server.cjs that was NOT on our recorded port" "$OUT"
+  fi
+  kill -9 "$IMPOSTOR" 2>/dev/null
+  rm -rf "$SESS"
+else
+  echo "  SKIP: lsof unavailable — port cross-check test"
+fi
+
+echo "--- Results: $PASS passed, $FAIL failed ---"
+[ "$FAIL" -eq 0 ] || exit 1
--- a/tests/brainstorm-server/ws-protocol.test.js
+++ b/tests/brainstorm-server/ws-protocol.test.js
@@ -329,6 +329,21 @@ function runTests() {
    assert.strictEqual(result.payload.length, 65536);
  });

+  test('rejects oversized 64-bit frames before payload allocation', () => {
+    const mask = Buffer.from([0x00, 0x00, 0x00, 0x00]);
+    const header = Buffer.alloc(14);
+    header[0] = 0x81; // FIN + TEXT
+    header[1] = 0x80 | 127; // masked, 64-bit length
+    header.writeBigUInt64BE(BigInt(ws.MAX_FRAME_PAYLOAD_BYTES) + 1n, 2);
+    mask.copy(header, 10);
+
+    assert.throws(
+      () => ws.decodeFrame(header),
+      /exceeds maximum allowed size/i,
+      'oversized advertised payload must be rejected from header alone'
+    );
+  });
+
  // ========== Close Frame with Status Code ==========
  console.log('\n--- Close Frame Details ---');

--- a/tests/codex-plugin-sync/test-sync-to-codex-plugin.sh
+++ b/tests/codex-plugin-sync/test-sync-to-codex-plugin.sh
@@ -175,6 +175,7 @@ write_upstream_fixture() {

    mkdir -p \
        "$repo/.codex-plugin" \
+        "$repo/.kimi-plugin" \
        "$repo/.private-journal" \
        "$repo/assets" \
        "$repo/evals/drill" \
@@ -210,6 +211,13 @@ EOF
  "name": "superpowers",
  "version": "$MANIFEST_VERSION"
 }
+EOF
+
+    cat > "$repo/.kimi-plugin/plugin.json" <<EOF
+{
+  "name": "superpowers",
+  "version": "$MANIFEST_VERSION"
+}
 EOF

    cat > "$repo/assets/superpowers-small.svg" <<'EOF'
@@ -267,6 +275,7 @@ EOF

    git -C "$repo" add \
        .codex-plugin/plugin.json \
+        .kimi-plugin/plugin.json \
        .gitignore \
        assets/app-icon.png \
        assets/superpowers-small.svg \
@@ -415,10 +424,15 @@ EOF
 write_stale_ignored_destination_fixture() {
    local repo="$1"

-    mkdir -p "$repo/plugins/superpowers/.private-journal"
+    mkdir -p \
+        "$repo/plugins/superpowers/.kimi-plugin" \
+        "$repo/plugins/superpowers/.private-journal"
    printf 'fixture keep\n' > "$repo/plugins/superpowers/.fixture-keep"
+    printf '{"name":"stale-kimi"}\n' > "$repo/plugins/superpowers/.kimi-plugin/plugin.json"
    printf 'stale ignored leak\n' > "$repo/plugins/superpowers/.private-journal/leak.txt"
-    git -C "$repo" add plugins/superpowers/.fixture-keep
+    git -C "$repo" add \
+        plugins/superpowers/.fixture-keep \
+        plugins/superpowers/.kimi-plugin/plugin.json

    commit_fixture "$repo" "Initial stale ignored destination fixture"
 }
@@ -618,6 +632,7 @@ main() {
    assert_contains "$preview_output" "Version:  $MANIFEST_VERSION" "Preview uses manifest version"
    assert_not_contains "$preview_output" "Version:  $PACKAGE_VERSION" "Preview does not use package.json version"
    assert_contains "$preview_section" ".codex-plugin/plugin.json" "Preview includes manifest path"
+    assert_not_contains "$preview_section" ".kimi-plugin/plugin.json" "Preview excludes Kimi manifest from Codex sync"
    assert_contains "$preview_section" "assets/superpowers-small.svg" "Preview includes SVG asset"
    assert_contains "$preview_section" "assets/app-icon.png" "Preview includes PNG asset"
    assert_contains "$preview_section" "hooks/hooks-codex.json" "Preview includes Codex hook manifest"
@@ -644,6 +659,7 @@ main() {
    echo ""
    echo "Convergence assertions..."
    assert_equals "$stale_preview_status" "0" "Stale ignored destination preview exits successfully"
+    assert_matches "$stale_preview_section" "\\*deleting +\\.kimi-plugin/plugin\\.json" "Preview deletes stale Kimi manifest from Codex plugin"
    assert_matches "$stale_preview_section" "\\*deleting +\\.private-journal/leak\\.txt" "Preview deletes stale ignored destination file"

    echo ""
--- a/tests/kimi/run-tests.sh
+++ b/tests/kimi/run-tests.sh
@@ -0,0 +1,6 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+
+bash "$SCRIPT_DIR/test-plugin-manifest.sh"
--- a/tests/kimi/test-plugin-manifest.sh
+++ b/tests/kimi/test-plugin-manifest.sh
@@ -0,0 +1,86 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
+MANIFEST="$REPO_ROOT/.kimi-plugin/plugin.json"
+
+python3 - "$MANIFEST" <<'PY'
+import json
+import sys
+from pathlib import Path
+
+manifest_path = Path(sys.argv[1])
+manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
+
+def assert_equal(actual, expected, label):
+    if actual != expected:
+        raise AssertionError(f"{label}: expected {expected!r}, got {actual!r}")
+
+def assert_present(text, needle, label):
+    if needle not in text:
+        raise AssertionError(f"{label}: missing {needle!r}")
+
+assert_equal(manifest.get("name"), "superpowers", "plugin name")
+assert_equal(manifest.get("skills"), "./skills/", "skills path")
+assert_equal(
+    manifest.get("sessionStart", {}).get("skill"),
+    "using-superpowers",
+    "sessionStart.skill",
+)
+
+instructions = manifest.get("skillInstructions")
+if not isinstance(instructions, str) or not instructions.strip():
+    raise AssertionError("skillInstructions must be a non-empty string")
+
+for token in [
+    "AskUserQuestion",
+    "TodoList",
+    "Agent",
+    "Skill",
+    "Read",
+    "Write",
+    "Edit",
+    "Bash",
+    "Grep",
+    "Glob",
+    "FetchURL",
+    "WebSearch",
+]:
+    assert_present(instructions, token, "skillInstructions")
+
+version_config = json.loads(
+    (manifest_path.parents[1] / ".version-bump.json").read_text(encoding="utf-8")
+)
+version_entries = version_config.get("files")
+if not isinstance(version_entries, list):
+    raise AssertionError(".version-bump.json must contain files list")
+
+if not any(
+    entry.get("path") == ".kimi-plugin/plugin.json" and entry.get("field") == "version"
+    for entry in version_entries
+    if isinstance(entry, dict)
+):
+    raise AssertionError(
+        ".version-bump.json must update .kimi-plugin/plugin.json version"
+    )
+
+unsupported_fields = [
+    "tools",
+    "commands",
+    "hooks",
+    "apps",
+    "inject",
+    "configFile",
+    "config_file",
+    "bootstrap",
+]
+present_unsupported = sorted(field for field in unsupported_fields if field in manifest)
+if present_unsupported:
+    raise AssertionError(
+        "unsupported Kimi runtime fields present: "
+        + ", ".join(present_unsupported)
+    )
+
+print("Kimi plugin manifest looks good")
+PY
--- a/tests/shell-lint/test-lint-shell.sh
+++ b/tests/shell-lint/test-lint-shell.sh
@@ -0,0 +1,179 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
+SCRIPT_UNDER_TEST="$REPO_ROOT/scripts/lint-shell.sh"
+
+FAILURES=0
+TEST_ROOT="$(mktemp -d)"
+
+cleanup() {
+  rm -rf "$TEST_ROOT"
+}
+trap cleanup EXIT
+
+pass() {
+  echo "  [PASS] $1"
+}
+
+fail() {
+  echo "  [FAIL] $1"
+  FAILURES=$((FAILURES + 1))
+}
+
+assert_contains() {
+  local haystack="$1"
+  local needle="$2"
+  local description="$3"
+
+  if printf '%s' "$haystack" | grep -Fq -- "$needle"; then
+    pass "$description"
+  else
+    fail "$description"
+    echo "    expected to find: $needle"
+    echo "    in:"
+    printf '%s\n' "$haystack" | sed 's/^/      /'
+  fi
+}
+
+assert_not_contains() {
+  local haystack="$1"
+  local needle="$2"
+  local description="$3"
+
+  if printf '%s' "$haystack" | grep -Fq -- "$needle"; then
+    fail "$description"
+    echo "    did not expect to find: $needle"
+    echo "    in:"
+    printf '%s\n' "$haystack" | sed 's/^/      /'
+  else
+    pass "$description"
+  fi
+}
+
+configure_git_identity() {
+  local repo="$1"
+
+  git -C "$repo" config user.name "Test Bot"
+  git -C "$repo" config user.email "test@example.com"
+}
+
+write_stub_tool() {
+  local path="$1"
+  local name="$2"
+
+  cat >"$path" <<EOF
+#!/usr/bin/env bash
+{
+  printf '${name}:'
+  for arg in "\$@"; do
+    printf ' <%s>' "\$arg"
+  done
+  printf '\n'
+} >> "\$SUPERPOWERS_SHELL_LINT_TEST_LOG"
+exit 0
+EOF
+  chmod +x "$path"
+}
+
+make_fixture_repo() {
+  local repo="$1"
+
+  git init -q -b main "$repo"
+  configure_git_identity "$repo"
+
+  mkdir -p "$repo/hooks"
+  cat >"$repo/tracked.sh" <<'EOF'
+#!/usr/bin/env bash
+echo "tracked"
+EOF
+  cat >"$repo/hooks/session-start" <<'EOF'
+#!/bin/sh
+echo "extensionless"
+EOF
+  cat >"$repo/README.md" <<'EOF'
+# Fixture
+
+```bash
+echo "not a shell script"
+```
+EOF
+  cat >"$repo/untracked.sh" <<'EOF'
+#!/usr/bin/env bash
+echo "untracked"
+EOF
+
+  git -C "$repo" add tracked.sh hooks/session-start README.md
+  git -C "$repo" commit -q -m "fixture"
+
+  printf '\necho "changed"\n' >>"$repo/tracked.sh"
+  printf '\necho "changed extensionless"\n' >>"$repo/hooks/session-start"
+}
+
+run_lint_shell() {
+  local repo="$1"
+  local fakebin="$2"
+  local log="$3"
+  shift 3
+
+  (
+    cd "$repo"
+    PATH="$fakebin:$PATH" \
+      SUPERPOWERS_SHELL_LINT_TEST_LOG="$log" \
+      bash "$SCRIPT_UNDER_TEST" "$@"
+  )
+}
+
+echo "Shell lint script tests"
+
+fixture="$TEST_ROOT/repo"
+fakebin="$TEST_ROOT/bin"
+log="$TEST_ROOT/tool.log"
+mkdir -p "$fixture" "$fakebin"
+: >"$log"
+write_stub_tool "$fakebin/shellcheck" "shellcheck"
+write_stub_tool "$fakebin/shfmt" "shfmt"
+make_fixture_repo "$fixture"
+
+if output="$(run_lint_shell "$fixture" "$fakebin" "$log" 2>&1)"; then
+  pass "lint-shell check mode exits successfully with stub tools"
+else
+  fail "lint-shell check mode exits successfully with stub tools"
+  printf '%s\n' "$output" | sed 's/^/      /'
+fi
+
+tool_log="$(cat "$log")"
+assert_contains "$output" "Linting 3 shell files" "reports changed shell file count"
+assert_not_contains "$tool_log" "shfmt:" "does not run shfmt in lint mode"
+assert_contains "$tool_log" "shellcheck:" "runs ShellCheck"
+assert_contains "$tool_log" "<--severity=warning>" "uses warning severity as the baseline"
+assert_contains "$tool_log" "<--external-sources>" "allows ShellCheck to follow sourced files"
+assert_contains "$tool_log" "<--source-path=SCRIPTDIR>" "resolves ShellCheck sources relative to each script"
+assert_contains "$tool_log" "<hooks/session-start>" "includes changed extensionless shell shebang file"
+assert_contains "$tool_log" "<tracked.sh>" "includes changed tracked .sh file"
+assert_contains "$tool_log" "<untracked.sh>" "includes untracked shell files by default"
+assert_not_contains "$tool_log" "README.md" "ignores Markdown with shell snippets"
+
+: >"$log"
+if output="$(run_lint_shell "$fixture" "$fakebin" "$log" --all --format 2>&1)"; then
+  pass "lint-shell --format exits successfully with stub tools"
+else
+  fail "lint-shell --format exits successfully with stub tools"
+  printf '%s\n' "$output" | sed 's/^/      /'
+fi
+
+tool_log="$(cat "$log")"
+assert_contains "$tool_log" "<-w>" "uses shfmt write mode with --format"
+assert_contains "$tool_log" "shellcheck:" "runs ShellCheck after --format"
+assert_contains "$tool_log" "<--severity=warning>" "keeps warning severity after --format"
+assert_contains "$tool_log" "<hooks/session-start>" "--all includes tracked extensionless shell shebang file"
+assert_contains "$tool_log" "<tracked.sh>" "--all includes tracked .sh file"
+assert_not_contains "$tool_log" "untracked.sh" "--all ignores untracked shell files"
+
+if [[ "$FAILURES" -eq 0 ]]; then
+  echo "All shell lint script tests passed"
+else
+  echo "$FAILURES shell lint script test(s) failed"
+  exit 1
+fi
Author	SHA1	Message	Date
Jesse Vincent	ff50f01ab2	chore(evals): bump submodule to companion just-in-time scenario Bump the evals submodule (ff3ee83 -> f1ac859) to include the brainstorming visual-companion just-in-time eval scenario that validates the SKILL.md consent-move in this PR (GREEN pass / RED fail on Quorum and drill). Scope: dev's recorded pointer predates the drill->Quorum migration, so this bump also carries that migration.	2026-06-09 19:55:22 -07:00
Jesse Vincent	b0fa0f2e36	fix(brainstorm-server): fix auth-integration bugs from full-branch review A second adversarial review of the merged branch found that combining the session-key auth with the feature work created real bugs the (vacuous) tests missed: - [Critical] GET /files/ (empty name) resolved to CONTENT_DIR and crashed the process with uncaught EISDIR — newly reachable because the query-stripping refactor turns /files/?key=... into /files/. Reject non-regular-file names. - [High] --open opened a KEYLESS url, which the auth gate 403s — the headline feature landed on the error page. Open the keyed url. - [High] Same-port restart regenerated the token (port persisted, token not), so the open tab's old cookie 403'd and never reconnected — contradicting the documented promise. Persist the token (BRAINSTORM_TOKEN_FILE / .last-token) alongside the port. - [Medium] Token sat in world-readable server-info/server.log (0644 in /tmp). umask 077 in start-server.sh + mode 0600 on server-info/.last-token. - [Medium] touchActivity() ran before the auth check, so unauthenticated requests defeated the idle timeout. Count activity only after authorization. - [Low] COOKIE_NAME embedded the pre-fallback port; derive it from the actual bound port (also prevents a cross-server cookie-jar collision on fallback). Tests added/strengthened (previously passed vacuously): /files/ no-crash; the auto-open url carries the key and is reachable (200); restart reuses the same key not just the port; unauthenticated requests don't reset the idle clock. Full suite green (ws-protocol 32, helper 12, auth 13, server 29, lifecycle 8, stop-server 4); restart smoke confirms same port+key and old URL -> 200.	2026-06-09 19:13:52 -07:00
Jesse Vincent	610e4d39f0	test(brainstorm-server): thread session key through tests after auth merge Integrating the per-session-key auth onto the same branch as the dotfile and lifecycle work: two tests added after the auth commit opened WebSockets without a key (server.test.js dotfile-reload, lifecycle.test.js idle-shutdown), which the auth gate now resets. Pass ?key=/BRAINSTORM_TOKEN in both. Full suite green: ws-protocol 32, helper 12, auth 13, server 28, lifecycle 7, stop-server 4.	2026-06-09 18:33:00 -07:00
Jesse Vincent	e3fe480b29	feat(brainstorm-server): gate every endpoint behind a per-session key The companion server is reachable by any local browser tab (default loopback bind) and by any host that can route to it (remote --host bind). It served screens, files, and accepted event-injecting WebSocket connections with no authentication, so a malicious browser tab or a direct remote client could read brainstorm content or inject events that the agent reads as the user's input (prompt injection into a live session). Generate a per-session secret token, carry it in the served URL as ?key=, and mirror it into an HttpOnly SameSite=Strict per-port cookie on first load so same-origin subresources and the WebSocket handshake authenticate automatically. Every HTTP request and WebSocket upgrade now requires a valid key (query or cookie, constant-time compared); unauthenticated requests get a friendly 403 explaining they need the full URL. A secret authenticates the client uniformly across loopback, tunnel, and remote binds and defeats DNS rebinding, which a Host/Origin allowlist cannot. Also guard handleMessage against a null JSON payload that crashed the process. Tests: new auth.test.js (13 cases) covering the key on /, /files/*, and WS plus cookie bootstrap and the null-payload guard; server.test.js threads the key; ws-protocol.test.js + auth.test.js wired into npm test. Closes #1014 Refs #1110, #1553, #1504	2026-06-09 18:29:49 -07:00
Jesse Vincent	3e3c10e671	docs(brainstorm): catalog visual companion issues; choose session-key for security Records the triage of open issues/PRs touching the brainstorm companion server and the decision to protect it with a per-session secret key (supersedes the Host/Origin allowlist approach) so remote-connected users are covered, not just loopback.	2026-06-09 18:27:43 -07:00
Jesse Vincent	843c473382	fix(brainstorm-server): tie stop-server PID check to the session's port The node+server.cjs command match (from the adversarial review) still matched any unrelated node process running a file named server.cjs. When we recorded the bound port (state/server-info) and lsof is available, additionally require the PID to be the process actually LISTENING on this session's port — which rules out a different project's server.cjs / editor task runner that recycled the stale PID. Falls back to the command match when the port or lsof isn't available. Test: a 'node server.cjs' process not listening on the recorded port is spared. Refs #1703	2026-06-09 17:27:30 -07:00
Jesse Vincent	f8f87ff43a	fix(brainstorm-server): address adversarial review findings From a two-reviewer adversarial pass: - [High] EADDRINUSE fallback clobbered the shared .last-port: onListen wrote the bound port unconditionally, so a fallback to a random port overwrote the preferred port another live session still owns — stranding that session's open tab forever. Now persist only when we bound the preferred port (not on fallback). The fallback test now asserts .last-port integrity (teeth-verified). - [Medium] maybeOpenBrowser ran the URL through a shell (exec + JSON.stringify), which does NOT neutralize $(...) in a url-host. Platform launchers now use execFile with the URL as an argv element (no shell). The operator-set BRAINSTORM_OPEN_CMD path stays shell-based (trusted input). - [Medium] --open was a silent no-op on native Windows (no win32 branch). Added. - [Medium] helper.js reconnect/status/tombstone had only substring-grep tests. Added behavioral tests driving the state machine against a mocked browser: Reconnecting+backoff (500->1000->2000), tombstone after the grace period, and reload-on-recovery. - [Low] status pill showed a false 'Connected' before the socket opened; now starts 'Connecting…' until onopen. Not changed (flagged): stop-server.sh's PID-ownership check still matches any 'node ... server.cjs' (narrow residual — a recycled PID onto an unrelated node server.cjs); robust fix needs fragile cross-platform process introspection.	2026-06-09 15:59:59 -07:00
Jesse Vincent	7b815ed8c8	feat(brainstorming): offer the visual companion just-in-time; harden lifecycle guidance Move the companion consent from an upfront, anticipatory offer to the first moment a question would genuinely be clearer shown than told. If no visual question ever arises, it's never offered. On approval the agent starts the server with --open, so the user's browser opens to the first screen — the pop is tied to that approval, never unsolicited. Also hardens visual-companion.md: confirming the server is alive (server-info present, server-stopped absent) before referring to the URL is now a required step; restart with the same --project-dir reuses the port so the open tab reconnects on its own (paused overlay while down); idle default corrected to 4h. NOTE: SKILL.md is behavior-shaping content — this flow change should be eval-tested (writing-skills adversarial pressure test) before merge. Refs #1237, #1037	2026-06-09 15:32:58 -07:00
Jesse Vincent	bccc41dffe	feat(brainstorm-server): opt-in auto-open of the browser on the first screen When the user approves the visual companion, open their browser automatically the first time a screen is actually ready to show — rather than at startup (just the waiting page) or making them open the URL by hand. Opt-in and gated on approval: off unless BRAINSTORM_OPEN is set (start-server.sh --open, which the agent passes only after the user agrees to use the companion). Even then it fires once, and is skipped if a browser is already connected, on a non-loopback/remote bind, or when headless. Launcher is the platform default (open / xdg-open / WSL cmd.exe) or BRAINSTORM_OPEN_CMD; best-effort, never fatal. lifecycle.test.js: opens once on the first screen when approved; does NOT open without approval. Closes #755 Refs #759	2026-06-09 15:30:25 -07:00
Jesse Vincent	b53c62eba8	feat(brainstorm-server): reuse the same port on session restart When the companion idle-shuts-down and the agent restarts it, a fresh random port meant the user's open browser tab pointed at a dead URL. Persist the bound port per project and prefer it on the next start, so the restarted server comes up on the same port and the open tab's reconnect just works. - start-server.sh exports BRAINSTORM_PORT_FILE=<project>/.superpowers/brainstorm/ .last-port for project sessions (not /tmp). - server.cjs prefers an explicit BRAINSTORM_PORT, else the recorded port, else random; writes the actually-bound port back; and on EADDRINUSE (preferred port still in use) falls back to a random port once instead of crashing. lifecycle.test.js: restart reuses the recorded port; a taken preferred port falls back to a random one without crashing. Refs #1237	2026-06-09 15:22:23 -07:00
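The port preference order described in this commit (explicit BRAINSTORM_PORT, else the recorded port, else random) can be sketched as a small pure function. This is an illustrative sketch only, not the code in server.cjs: the name `choosePort` is hypothetical, and returning 0 to mean "let the OS pick a free port" is an assumption about how the random fallback could be done.

```javascript
// Hypothetical helper, not taken from server.cjs.
// Returns the first valid candidate port, or 0 when none is usable.
// Assumption: 0 is handed to listen() so the OS assigns a free port.
function choosePort(explicitPort, recordedPort) {
  for (const candidate of [explicitPort, recordedPort]) {
    const text = String(candidate ?? '').trim();
    if (!/^\d+$/.test(text)) continue; // reject empty or non-numeric values
    const port = Number(text);
    if (port >= 1 && port <= 65535) return port;
  }
  return 0;
}

// Example: an explicit port wins over the recorded one.
console.log(choosePort('4321', '5000'));
console.log(choosePort(undefined, '5000'));
console.log(choosePort(undefined, undefined));
```

Keeping the choice in a pure function separates it from the EADDRINUSE handling, which needs a live socket to exercise.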
Jesse Vincent	e6cf11f68c	feat(brainstorm-companion): resilient reconnect, live status, paused overlay The injected client reconnected on a fixed 1s timer with no feedback: if the laptop slept or the server restarted, the page showed 'Connected' over a dead socket and silently queued events. And when the server stopped, the user got a bare connection-refused with no explanation. helper.js now: - reconnects with exponential backoff (500ms, doubling, capped at 30s; reset on open), with an onerror->close handler, nulls the socket on close, and clears a pending timer before scheduling another; - drives the frame status pill Connected/Reconnecting/Disconnected via a --status-color custom property (frame-template.html); - after ~15s disconnected, shows a self-styled 'Companion paused' overlay (tombstone) explaining the companion stopped and will reconnect automatically; - on recovery from a tombstoned outage (e.g. server restarted on the same port) reloads to pick up the restarted server's current screen. The reconnect-backoff is an exported pure function; helper.test.js unit-tests it (doubling + cap progression) and asserts the status/tombstone/reconnect wiring. DOM behaviour is verified live. Refs #856, #1237	2026-06-09 15:18:19 -07:00
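The backoff schedule named in this commit (500ms, doubling, capped at 30s) is simple enough to show as a pure function. The sketch below is an assumption about its shape, not the function exported from helper.js: the name `reconnectDelay` and the zero-based `attempt` parameter are hypothetical.

```javascript
// Hypothetical sketch of an exponential backoff delay, not copied from helper.js.
// attempt is zero-based: 0 is the first retry after a disconnect.
const BASE_DELAY_MS = 500;
const MAX_DELAY_MS = 30000;

function reconnectDelay(attempt) {
  const safeAttempt = Math.max(0, Math.floor(attempt));
  // Double the base delay per attempt, then clamp to the cap.
  return Math.min(BASE_DELAY_MS * 2 ** safeAttempt, MAX_DELAY_MS);
}

// Print the first few delays in the progression.
for (let attempt = 0; attempt < 8; attempt += 1) {
  console.log(attempt, reconnectDelay(attempt));
}
```

Resetting on open then amounts to setting the attempt counter back to 0 once the socket connects.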
Jesse Vincent	f057b4a30b	feat(brainstorm-server): 4h configurable idle timeout; close WS on shutdown The companion shut down after only 30 minutes idle — too short for real brainstorming, where a single question can sit far longer. And shutdown() never closed upgraded WebSocket sockets, so an open browser connection could keep the Node process alive after it was supposed to exit. - Default idle timeout raised to 4 hours, configurable via BRAINSTORM_IDLE_TIMEOUT_MS and start-server.sh --idle-timeout-minutes (validated positive integer). - Reported as idle_timeout_ms in the server-started JSON / server-info. - shutdown() now destroys all client sockets so the process exits even with an open WebSocket. - Watchdog check interval is configurable (BRAINSTORM_LIFECYCLE_CHECK_MS, default 60s) so the lifecycle can be tested without minute-long waits. Adds lifecycle.test.js (configured timeout reported; idle shutdown exits despite an open WS — teeth-verified; the start-server flag). Wires ws-protocol, lifecycle, and stop-server suites into npm test. Closes #1237 Refs #1689	2026-06-09 15:08:09 -07:00
Jesse Vincent	ddcb56c16e	fix(brainstorm-server): verify PID ownership before stopping stop-server.sh read server.pid and SIGKILL'd that PID with no checks. After a reboot or PID wraparound the pid file can point at an unrelated, live process — which we would then kill. Verify the PID is actually our server (a running 'node ... server.cjs') before signalling it. If ownership can't be proven, fail closed: remove the stale pid file and report {status: stale_pid} without killing anything. Real servers still stop ({status: stopped}); a missing pid file still reports not_running. Adds stop-server.test.sh covering: an unrelated reused PID is left alone, a real server is stopped, and a missing pid file. Refs #1703	2026-06-09 15:02:25 -07:00
Jesse Vincent	e0442fba00	fix(brainstorm-server): ignore macOS resource-fork dotfiles On macOS (and ExFAT/SMB volumes) the OS writes ._<name>.html sidecar files holding binary resource-fork metadata. These end with .html, so they passed the content filter and could be picked as the newest screen — serving binary garbage to the browser instead of the mockup — or fetched via /files/. Skip dotfiles (leading '.') at all four sites that list or serve content: getNewestScreen, the /files/ endpoint, the known-files seed, and the fs.watch handler. Tests cover serving (/ and /files/) and the watch path (a ._ file must not trigger a reload). Refs #950	2026-06-09 15:02:25 -07:00
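The dotfile rule from this commit can be illustrated with a single predicate shared by every site that lists or serves content. This is a sketch under stated assumptions: the name `isScreenFile` is hypothetical, and it assumes the existing content filter is a plain `.html` suffix check on a bare filename.

```javascript
// Hypothetical predicate, not taken from server.cjs.
// Assumption: names are bare filenames (no directory part).
function isScreenFile(name) {
  if (name.startsWith('.')) return false; // skips ._<name>.html sidecars and other dotfiles
  return name.endsWith('.html');
}

// Example: only the real mockup survives the filter.
const names = ['mockup.html', '._mockup.html', '.hidden.html', 'notes.txt'];
console.log(names.filter(isScreenFile));
```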
Jesse Vincent	f55642e0dd	Require contributors to disclose authoring environment and target dev Add a mandatory self-identification disclosure (model, harness, harness version, all installed plugins) to the PR template and all three issue templates, and document the requirement in the contributor guidelines. We weigh contributions differently depending on what produced them: content reasoned from documentation is held to a different bar than work grounded in a real session. Also state explicitly, in both CLAUDE.md and the PR template, that all PRs must target the dev branch rather than main.	2026-06-08 22:14:34 -07:00
Drew Ritter	ae1eefb7f9	chore(evals): bump submodule to --scenarios filter (ff3ee83) Adds `run-all --scenarios` for resuming a scenario subset across the Code Assist rate-limit windows. Follows the agy rate-limit fix (79f9963). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 22:46:00 -07:00
Drew Ritter	617168aff5	chore(evals): bump submodule to antigravity rate-limit fix (79f9963) Serialize antigravity against the Gemini Code Assist rate limit (max_concurrency=1), diagnose 429/RESOURCE_EXHAUSTED honestly instead of as auth, fail-fast on a latched window, and tolerant preflight OK match. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 16:27:35 -07:00
Rahul	d7c260a978	fix(brainstorming): cap websocket frame payloads	2026-06-02 11:24:02 -07:00
Drew Ritter	f3f0789c5c	Add shell lint script	2026-06-01 19:48:28 -07:00
Drew Ritter	16a1719988	Tighten Kimi plugin porting coverage	2026-06-01 19:41:58 -07:00
Drew Ritter	c74c22daa7	docs: restore Kimi direct install command	2026-06-01 19:41:58 -07:00
Drew Ritter	773bbf61d6	docs: simplify Kimi README install steps	2026-06-01 19:41:58 -07:00
Drew Ritter	6b76158550	fix: wire Kimi plugin into release metadata	2026-06-01 19:41:58 -07:00
Drew Ritter	7fec40bb55	fix: align Kimi manifest with supported fields	2026-06-01 19:41:58 -07:00
qer	2a8e54735b	feat: add Kimi Code plugin manifest	2026-06-01 19:41:58 -07:00
Matt Van Horn	f776394360	feat(subagent-dev): add TDD RED evidence to implementer report format Add a conditional TDD Evidence field to the implementer report format so controllers can verify RED and GREEN output when TDD was required. The field asks for the command run, relevant RED/GREEN output, and the expected RED failure reason rather than raw full logs. Fixes #994.	2026-06-01 16:15:05 -07:00
Drew Ritter	7301c81b4d	docs(windows): trim polyglot hook implementation copy	2026-06-01 16:07:01 -07:00
dev_Hakaze	9d3e68a5ad	docs(windows): update polyglot hook docs Rewrite the Windows polyglot hook documentation to match the current run-hook.cmd dispatcher and update the porting guide cross-reference. Fixes #1653.	2026-06-01 15:57:30 -07:00
nestorluiscamachopaz	81c3052416	fix: foreground mode saves node PID and clears OWNER_PID on Windows/MSYS2 Verified on real Windows Git Bash: lifecycle test passed 12/12, manual start/stop released the port, and no brainstorm node processes remained.	2026-06-01 14:26:22 -07:00
nawfal	c879454a0d	fix(finishing-a-development-branch): remove gh-specific PR creation instruction Per obra's guidance on #1609: remove the github-specific instruction rather than replacing it with a platform-detection table. Agents already know their forge tooling; the skill only needs to cover the push step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-01 13:58:22 -07:00
nawfal	ff213eb2cf	fix(finishing-a-development-branch): detect remote platform before creating PR/MR Replaces hardcoded `gh pr create` in Option 2 with a platform-neutral note: check `git remote get-url origin` first, then use gh (GitHub), glab (GitLab), or fall back to the compare URL for unknown platforms. Adds matching Red Flag entry so agents don't skip the detection step. Fixes #1609 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-01 13:58:22 -07:00
Jesse Vincent	da00e59958	feat: add Antigravity CLI (agy) support Antigravity (Google's `agy` CLI) installs the existing Superpowers plugin directly: agy plugin install https://github.com/obra/superpowers agy imports the bundled skills and runs the plugin's SessionStart hook, so using-superpowers bootstraps from the first message — verified on agy 1.0.3: a fresh session given "Let's make a react todo list" auto-triggers the brainstorming skill instead of writing code. agy discovers skills natively and, having no Skill tool, loads them by reading SKILL.md with view_file. No scaffold, installer, or generated context file is needed. This adds only: - README.md: an Antigravity install section + Quickstart link - skills/using-superpowers/SKILL.md: reference to the agy tool mapping - skills/using-superpowers/references/antigravity-tools.md: action->tool mapping for agy (view_file, write_to_file, invoke_subagent, manage_task, and skill loading via view_file on SKILL.md) - tests/antigravity/: structural test for the tool mapping, mirroring tests/pi/	2026-06-01 11:42:09 -07:00
Jesse Vincent	deceaec78d	docs: add 'Porting Superpowers to a New Harness' guide An evergreen guide for adding support for a new harness (IDE, CLI, or agent runner). Teaches the invariants — automatic session-start bootstrap, skill discovery/invocation, tool mapping, the acceptance test — and points at the closest reference integration shape (shell-hook, in-process plugin, instructions-file / declared context file) to copy. Covers discovery, build, local install, tmux-driven verification, distribution, and PR submission, with a live reference-integration index and a gotchas appendix. Two non-negotiable rules: (1) never edit skill bodies; (2) everything ships through the harness's own install mechanism — never edit the user's config. When a plugin installer strips undeclared files, declare the bootstrap as a recognized component (a manifest contextFileName-style context file the installer preserves and the harness loads every session), generated at install time from the live SKILL.md + tool mapping. Surfaced-skill-description bootstrap is the softer fallback. Hardened against real end-to-end ports (Antigravity CLI): shapes can compose; a fork doesn't inherit its parent's behavior; a hook system != a usable session-start event; verify @-includes AND context-file preservation with a marker; web-search the docs and study existing plugins; reverse-engineer undocumented harnesses; print/headless modes may hang; workspace-trust gates stall tmux; declared context files survive plugin install while undeclared files are stripped; skills-path registration is per-harness.	2026-06-01 10:07:38 -07:00