Bump version to 6.0.0

Add visual companion Prime Radiant branding
Draft Superpowers 6 release notes
2026-06-16 23:59:05 +08:00 · 2026-06-15 21:19:07 -07:00 · 2026-06-15 20:56:31 -07:00 · 2026-06-15 20:27:43 -07:00 · 2026-06-15 13:48:30 -07:00 · 2026-06-15 12:17:46 -07:00
19 changed files with 590 additions and 71 deletions
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -9,7 +9,7 @@
    {
      "name": "superpowers",
      "description": "Core skills library for Claude Code: TDD, debugging, collaboration patterns, and proven techniques",
-      "version": "5.1.0",
+      "version": "6.0.0",
      "source": "./",
      "author": {
        "name": "Jesse Vincent",
--- a/.claude-plugin/plugin.json
+++ b/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
 {
  "name": "superpowers",
  "description": "Core skills library for Claude Code: TDD, debugging, collaboration patterns, and proven techniques",
-  "version": "5.1.0",
+  "version": "6.0.0",
  "author": {
    "name": "Jesse Vincent",
    "email": "jesse@fsck.com"
--- a/.codex-plugin/plugin.json
+++ b/.codex-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
  "name": "superpowers",
-  "version": "5.1.0",
+  "version": "6.0.0",
  "description": "An agentic skills framework & software development methodology that works: planning, TDD, debugging, and collaboration workflows.",
  "author": {
    "name": "Jesse Vincent",
--- a/.cursor-plugin/plugin.json
+++ b/.cursor-plugin/plugin.json
@@ -2,7 +2,7 @@
  "name": "superpowers",
  "displayName": "Superpowers",
  "description": "Core skills library: TDD, debugging, collaboration patterns, and proven techniques",
-  "version": "5.1.0",
+  "version": "6.0.0",
  "author": {
    "name": "Jesse Vincent",
    "email": "jesse@fsck.com"
--- a/.kimi-plugin/plugin.json
+++ b/.kimi-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
  "name": "superpowers",
-  "version": "5.1.0",
+  "version": "6.0.0",
  "description": "An agentic skills framework and software development methodology.",
  "author": {
    "name": "Jesse Vincent",
--- a/README.md
+++ b/README.md
@@ -2,6 +2,13 @@
 Superpowers is a complete software development methodology for your coding agents, built on top of a set of composable skills and some initial instructions that make sure your agent uses them.
 ## We're Hiring!
 We're hiring someone to help out full time with Superpowers community and code work. 
 You can read about the job at https://primeradiant.com/jobs/superpowers-community-engineer/
 If this sounds like someone you know, definitely send them our way.
 ## Quickstart
 Give your agent Superpowers: [Claude Code](#claude-code), [Antigravity](#antigravity), [Codex App](#codex-app), [Codex CLI](#codex-cli), [Cursor](#cursor), [Factory Droid](#factory-droid), [Gemini CLI](#gemini-cli), [GitHub Copilot CLI](#github-copilot-cli), [Kimi Code](#kimi-code), [OpenCode](#opencode), [Pi](#pi).
@@ -18,15 +25,9 @@ Next up, once you say "go", it launches a *subagent-driven-development* process,
 There's a bunch more to it, but that's the core of the system. And because the skills trigger automatically, you don't need to do anything special. Your coding agent just has Superpowers.
 ## Commercial Services
-## Sponsorship
+If you're using Superpowers in enterprise and could benefit from commercial support, additional tooling, or managed spending, please don't hesitate to drop us a line at sales@primeradiant.com.
 If Superpowers has helped you do stuff that makes money and you are so inclined, I'd greatly appreciate it if you'd consider [sponsoring my opensource work](https://github.com/sponsors/obra).
 Thanks! 
 \- Jesse
 ## Installation
@@ -273,6 +274,10 @@ Superpowers updates are somewhat coding-agent dependent, but are often automatic
 MIT License - see LICENSE file for details
 ## Visual companion telemetry
 Because skills and plugins don't provide any feedback to creators, we have no idea how many of you are using Superpowers. By default, the Prime Radiant logo on brainstorming's optional visual companion feature is loaded from our website. It includes the version of Superpowers in use. It does not include any details about your project, prompt, or coding agent. We don't see your clicks or anything about what you're building. This helps us have a rough idea of how many folks are using Superpowers and which version of Superpowers they're using. It's 100% optional. To disable this, set the environment variable `SUPERPOWERS_DISABLE_TELEMETRY` to any true value. Superpowers also honors Claude Code's `DISABLE_TELEMETRY` and `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC` opt-outs.
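As a rough illustration of what "any true value" means here, the sketch below mirrors the `isTruthyEnv` check and the opt-out variable list that this change adds to `skills/brainstorming/scripts/server.cjs`. The wrapper `telemetryDisabled` is a hypothetical name introduced only for this example; the server computes the same result once at startup from `process.env`.

```javascript
// Same rule as isTruthyEnv in server.cjs: unset, empty, or whitespace-only
// values are false, as are "0", "false", "no", and "off" (case-insensitive).
// Every other value counts as true.
function isTruthyEnv(value) {
  if (!value) return false;
  const normalized = String(value).trim().toLowerCase();
  if (!normalized) return false;
  return !['0', 'false', 'no', 'off'].includes(normalized);
}

// The three opt-out variables named in the README paragraph above.
const OPT_OUT_VARS = [
  'SUPERPOWERS_DISABLE_TELEMETRY',
  'DISABLE_TELEMETRY',
  'CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC'
];

// Hypothetical wrapper for illustration: takes an env-like object so the
// rule can be tried without touching the real process environment.
function telemetryDisabled(env) {
  return OPT_OUT_VARS.some(name => isTruthyEnv(env[name]));
}

console.log(telemetryDisabled({ SUPERPOWERS_DISABLE_TELEMETRY: '1' })); // true
console.log(telemetryDisabled({ DISABLE_TELEMETRY: 'off' }));           // false
console.log(telemetryDisabled({}));                                     // false
```

Under this rule, setting a variable to `off` or `0` leaves telemetry enabled, so an explicit value such as `1` or `true` is the safer way to opt out.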
 ## Community
 Superpowers is built by [Jesse Vincent](https://blog.fsck.com) and the rest of the folks at [Prime Radiant](https://primeradiant.com).
--- a/RELEASE-NOTES.md
+++ b/RELEASE-NOTES.md
@@ -1,5 +1,103 @@
 # Superpowers Release Notes
 ## v6.0.0 (date TBD)
 Superpowers 6.0 is a big release. The headline is a rewrite of how `subagent-driven-development` reviews each task — cheaper, stricter, and harder to game. 
 While these numbers won't hold on every harness or every workload, in our evals Claude Code and Codex produce similarly high-quality results roughly twice as fast while spending almost 50% fewer tokens.
 It also adds three new harnesses (Kimi Code, Pi, and Antigravity), gives the brainstorming visual companion a better security model, and rewrites a number of skills' tool calls to be significantly more vendor-neutral.
 ### Visible Changes
 - **The two per-task reviewer prompts became one.** `spec-reviewer-prompt.md` and `code-quality-reviewer-prompt.md` are gone, replaced by a single `task-reviewer-prompt.md`. If you dispatch the old files directly, switch to the new one.
 - **The legacy global worktree directory is gone.** `using-git-worktrees` and `finishing-a-development-branch` no longer use `~/.config/superpowers/worktrees/`. Worktrees now land in the project — an existing `.worktrees/` or `worktrees/` if you have one, otherwise a fresh `.worktrees/` — unless you say otherwise.
 ### New Harness Support
 Superpowers now runs on three more harnesses. Each ships its own bootstrap, a tool-mapping reference, and tests, and each gets its own install section in the README.
 - **Kimi Code** — a plugin manifest, install docs, and manifest tests; install from Kimi's marketplace or straight from the repo. (initial manifest by @qer)
 - **Pi** — a session-start extension that registers the skills and injects the `using-superpowers` bootstrap. Pi has native skills, so it needs no compatibility shim.
 - **Antigravity (`agy`)** — installs the plugin directly and bootstraps from the first message; verified end-to-end against the standard "make a react todo list" acceptance test.
 ### Subagent-Driven Development
 A long run of cost-and-quality experiments on real projects reshaped how the controller reviews each task. The old flow ran two reviewers per task and leaned on the controller's judgment for model choice and severity, and both turned out to be expensive and easy to game. The new flow runs one reviewer per task, hands work off as files instead of pasted text, and takes several judgment calls away from the controller.
 - **One reviewer per task, two verdicts.** A single `task-reviewer-prompt.md` reads the task's diff once and returns both a spec-compliance verdict and a quality verdict, so one fix pass clears both. A new "can't verify from the diff" verdict flags requirements that live in untouched code, for the controller to check itself. (#1538, #1543)
 - **One broad review at the end.** The run finishes with a single whole-branch review on the most capable model, instead of re-reviewing everything task by task.
 - **Plans get a pre-flight read.** Before the first task, the controller checks the plan for internal conflicts — and for anything the plan asks for that a reviewer would flag as a defect — and raises it all at once, rather than stumbling into it mid-run.
 - **Diffs and task text move as files.** A pasted diff parks itself permanently in the most expensive context, and a reviewer without one rebuilds it by hand — the single biggest reviewer cost. Two new scripts, `task-brief` and `review-package`, write the task text and the review diff to files for the subagent to read.
 - **Every dispatch states its model.** Left to choose, controllers stopped naming a model at all — and an unnamed model quietly inherits the session's most expensive one, so one run put all 26 of its reviewers on the top tier. The templates now require a model, with guidance that reaches for cheaper tiers when the work allows.
 - **The controller can't tell a reviewer what to ignore.** Real runs caught controllers coaching reviewers to skip a finding or call it "Minor at most," and the flaw shipped. Suppressing findings and pre-rating severity are now banned outright, and a defect the plan itself mandates gets reported for you to decide on rather than waved through.
 - **Reviewers are read-only and skeptical of rationales.** Review no longer touches the working tree or branch — a reviewer running `git checkout` had been orphaning later commits — and an implementer's "I left this unabstracted on purpose" no longer talks a reviewer out of a real finding.
 - **Stronger evidence and reporting.** Reviewers back each answer with a file and line, the implementer's report moves to a file and carries red/green evidence when TDD applies, and a progress ledger lets a controller that loses its context resume instead of redoing finished work. (#994)
 ### Writing Plans
 Plans now carry the structure the controller and reviewers used to re-derive on every dispatch.
 - **A Global Constraints block** lists the rules that bind every task — version floors, dependency limits, naming and copy, exact values — copied in verbatim, so they actually reach the implementers and reviewers downstream.
 - **A per-task Interfaces block** names exactly what each task consumes and produces, so an implementer who sees only its own task still knows its neighbors' contracts.
 - **Right-sizing guidance** keeps a task at the size that earns its own test cycle and a reviewer's pass, folding setup, config, and docs into the task that needs them. In testing, a plan written this way needed one round of fixes where the control needed two to four — and the control shipped a real bug.
 ### Brainstorming Visual Companion
 The visual companion is a small web server the agent opens alongside the conversation. It had no authentication at all, so on a shared or remote machine anyone who could reach the port could read your brainstorm — or inject events the agent treats as your input. This release gives it a real security model and makes it survive restarts and dropped connections.
 - **A per-session key now guards everything.** The agent's URL carries a one-time key, the browser tucks it into a tab-scoped cookie, and every request and WebSocket connection has to present it. This closes the door to stray local tabs and routable remote hosts alike, including the DNS-rebinding case an origin allowlist can't catch. (Closes #1014)
 - **The file server stays in its sandbox.** It refuses symlinks, dotfiles, and any path that climbs out of the content directory, ignores macOS resource-fork files, and sends the usual no-store and deny-framing headers. Files that hold the session key are written owner-only.
 - **The companion is offered only when it helps.** The skill raises it the first time a question would read better shown than told, as its own message, and lets a decline stand. Accepting opens your browser to the first screen. (Closes #755)
 - **It survives restarts and flaky connections.** Given a project directory, the server keeps the same port and key across restarts, so an open tab simply reconnects. The page reconnects on its own, shows a live status pill, and raises a "paused" overlay while the server is down.
 - **Longer idle life, safer shutdown.** The idle timeout went from 30 minutes to 4 hours, and `stop-server.sh` now confirms it owns the right process before signaling, so it never kills an unrelated `node` after a reboot. (#1703)
 - **Windows launch hardening** — consolidated shell detection, and Windows now relies on the idle timeout for shutdown, since Node can't track POSIX process ownership across MSYS2.
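The sandbox rule in the file-server bullet above (no dotfiles, no path that climbs out of the content directory) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `server.cjs`: the helper name `resolveInsideContentDir` is hypothetical, it uses POSIX path semantics only, and it leaves out the symlink check, URL decoding, and the macOS resource-fork handling that the release notes also mention.

```javascript
const path = require('node:path');

// Hypothetical helper for illustration. Returns the absolute file path for
// a request path, or null when the request should be refused.
function resolveInsideContentDir(contentDir, requestPath) {
  const segments = requestPath.split('/').filter(Boolean);

  // Refuse dotfiles. This also rejects "." and ".." segments outright.
  if (segments.some(segment => segment.startsWith('.'))) return null;

  const root = path.posix.resolve(contentDir);
  const resolved = path.posix.resolve(root, ...segments);

  // Belt and braces: the resolved path must sit strictly below the root.
  const relative = path.posix.relative(root, resolved);
  if (relative === '' || relative.startsWith('..') || path.posix.isAbsolute(relative)) {
    return null;
  }
  return resolved;
}

const root = '/tmp/brainstorm/content';
console.log(resolveInsideContentDir(root, '/screen.html'));   // /tmp/brainstorm/content/screen.html
console.log(resolveInsideContentDir(root, '/../state/key'));  // null
console.log(resolveInsideContentDir(root, '/.hidden'));       // null
```

Checking the relative path after resolution, rather than only scanning for `..`, keeps the guard in place if the segment filter is ever loosened.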
 ### Existing Harness Updates
 - **Codex** now bootstraps through its own SessionStart hook rather than shared wiring, and the Codex App gained an install section and fuller tool docs (web search, `AGENTS.md`, personal skills). (#1540)
 - **OpenCode** got an action-based tool mapping across its plugin, install doc, and README, plus a bootstrap-caching test.
 - **Cursor**'s manifest dropped its `agents` and `commands` entries, since those directories no longer exist.
 ### One Set of Skills, Every Harness
 The skills used to speak Claude Code's dialect — "use the Task tool," "put it in CLAUDE.md." This release rewrites that vocabulary in terms of what you're actually doing ("dispatch a subagent," "your instructions file") and adds a per-harness reference that maps each action to the right tool, checked against each runtime. Prose that named "Claude" now says "your agent."
 - **A tool reference per harness** at `skills/using-superpowers/references/`, covering Claude Code, Codex, Copilot, Gemini, Pi, and Antigravity.
 - **`finishing-a-development-branch` went forge-neutral** — it no longer hardcodes `gh pr create`, so agents push with whatever forge tooling they have. (#1609)
 - **One rename:** "Claude Search Optimization" is now "Skill Discovery Optimization," since the technique isn't Claude-specific.
 ### Writing Skills
 Two additions for skill authors.
 - **Match the Form to the Failure** — a short table for picking the right kind of guidance. A flat "don't do X" works for discipline slips but backfires when the problem is the *shape* of an output, where a worked example does better. The table, and a tighter scope on the existing rationalization section, steer authors to the form that actually helps.
 - **Micro-Test Wording** — a cheap way to check a phrasing before committing to it: sample it a handful of times against a no-guidance control and read every result by hand, treating run-to-run variance as a warning sign.
 ### Testing
 Skill-behavior testing moved out of `tests/` into a new `evals/` submodule built on "drill," which runs real Claude Code, Codex, and Gemini sessions and judges them with an LLM. Several in-tree bash suites retired once a stricter drill scenario covered them; the few with no equivalent stayed. From here on, `tests/` holds plugin-code tests and `evals/` holds skill-behavior tests, and `docs/testing.md` explains the split. New backends reach Antigravity, Pi, and more models, and new shell-lint and pre-commit checks guard the harness. (#1541)
 ### Bug Fixes
 - **systematic-debugging no longer forces every session into extended thinking.** One bullet held the exact keyword Claude Code scans for, quietly tripping the switch on every session that loaded the skill. A hyphen breaks the keyword; the text still reads. (#1283, by @Nick Galatis)
 - **The Windows SessionStart hook stopped printing a write error every session** — each `printf` now routes through `cat` to absorb the broken pipe, and the output is otherwise unchanged. (#1612, reported by @silvertakana)
 - **Windows foreground mode** tracks the right process and clears its owner PID on MSYS2. (by @nestorluiscamachopaz)
 - **The `using-superpowers` bootstrap** no longer lists a nonexistent "debugging" skill. (reported by @mhat)
 - **The TDD skill** links the testing anti-patterns reference. (#1532, #1529; link fix #1474 by @Stable Genius)
 - **`using-git-worktrees`** fixes its step numbering and drops stale Cursor references. (#1522, and by @fuleinist)
 - **The Codex review skill** swaps a private in-joke for plain guidance. (#1531)
 ### Documentation & Contributor Guidelines
 - **A guide to porting Superpowers to a new harness** (`docs/porting-to-a-new-harness.md`) lays out the three pieces every integration needs and the one rule that makes or breaks it: load the bootstrap at session start.
 - **Every PR and issue now discloses how it was made** — model, harness, version, and installed plugins, or a note that it was written by hand. We weigh a contribution differently depending on what produced it. PRs also target `dev`, not `main`. The PR template, all three issue templates, and a new platform-support template carry this.
 ### Contributors
 Thanks to @mattvanhorn, @nawfal, @Nick Galatis, @silvertakana, @nestorluiscamachopaz, @qer, @mhat, @Stable Genius, @fuleinist, @dev_Hakaze, @robotsnh, Rahul, and @arittr.
 ## v5.1.0 (2026-04-30)
 ### Removals
--- a/docs/superpowers/specs/2026-06-10-strict-cost-sdd-design.md
+++ b/docs/superpowers/specs/2026-06-10-strict-cost-sdd-design.md
@@ -133,8 +133,29 @@ opus controller flagged it 5/5. Cheap controllers handle explicit
 escalation; they absorb implicit authority-vs-quality adjudication.
 A possible L2b (discrete rule: "a reviewer finding that conflicts with
 the plan's text is the human's decision — escalate it") would route the
-failing judgment through the escalation behavior that held; untested.
+failing judgment through the escalation behavior that held.
-Original recon notes follow.
+
 **L2b tested 2026-06-11 (E35/E36, evals
 `docs/experiments/2026-06-11-build-loop-autoresearch.md`): improves the
 opus stack, does NOT rescue the sonnet rung.** Two rules: a reviewer
 tripwire (a plan-mandated defect IS a finding — Important, labeled
 plan-mandated; the human decides) and a controller escalation rule
 (plan-mandated findings go to the human like any plan contradiction).
 Micro on frozen sonnet-composed inputs: 0/6 → 6/6 labeled findings.
 Full battery: opus controllers 2/2 internalized the rule, caught their
 reviewer's miss as self-described backstop, and escalated for a
 sanctioned fix (the 4241 ad-hoc behavior made structural); escalation
 sanity 2/2 unbroken. Sonnet controllers: 1/5 full pass — paraphrase
 drops the tripwire from dispatches (2/5 transmitted), transmission
 alone doesn't fire it live (read-once dilution across the reviewer's
 tool reads; placement within the dispatch refuted as the variable),
 and no sonnet controller showed backstop behavior; 1/5 shipped the
 defect. The L2b rules are a candidate commit for the opus stack.
 A future L2c for the sonnet rung would pair the SKILL.md
 constraints-recipe (the one channel sonnet transmits verbatim) with a
 mandatory output-format slot for plan-mandated findings (the skeleton
 survives every observed paraphrase and is consulted at composition
 time); untested. Original recon notes follow.
 **Recon (superseded):**
 Sonnet-controller runs (claude-sonnet coding-agent): all gates green at
--- a/gemini-extension.json
+++ b/gemini-extension.json
@@ -1,6 +1,6 @@
 {
  "name": "superpowers",
  "description": "Core skills library: TDD, debugging, collaboration patterns, and proven techniques",
-  "version": "5.1.0",
+  "version": "6.0.0",
  "contextFileName": "GEMINI.md"
 }
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "superpowers",
-  "version": "5.1.0",
+  "version": "6.0.0",
  "description": "Superpowers skills and runtime bootstrap for coding agents",
  "type": "module",
  "main": ".opencode/plugins/superpowers.js",
--- a/skills/brainstorming/scripts/frame-template.html
+++ b/skills/brainstorming/scripts/frame-template.html
@@ -9,7 +9,7 @@
     *
     * This template provides a consistent frame with:
     * - OS-aware light/dark theming
-     * - Fixed header and selection indicator bar
+     * - Header branding and connection status
     * - Scrollable main content area
     * - CSS helpers for common UI patterns
     *
@@ -63,34 +63,37 @@
    }
    /* ===== FRAME STRUCTURE ===== */
-    .header {
+    .brand { display: flex; align-items: center; min-width: 0; overflow: hidden; color: var(--text-secondary); line-height: 1; }
-      background: var(--bg-secondary);
+    .brand a { color: inherit; text-decoration: none; display: flex; align-items: center; gap: 0.5rem; min-width: 0; max-width: 100%; line-height: 1; }
-      padding: 0.5rem 1.5rem;
+    .brand-copy { display: block; min-width: 0; overflow: hidden; text-overflow: ellipsis; white-space: nowrap; line-height: 1; transform: translateY(-1px); }
-      display: flex;
+    .brand-logo { display: block; height: 1em; width: auto; max-width: 180px; flex-shrink: 0; filter: invert(1); }
-      justify-content: space-between;
+    @media (prefers-color-scheme: dark) {
-      align-items: center;
+      .brand-logo { filter: none; }
      border-bottom: 1px solid var(--border);
      flex-shrink: 0;
    }
-    .header h1 { font-size: 0.85rem; font-weight: 500; color: var(--text-secondary); }
+    .status { font-size: 0.7rem; color: var(--status-color, var(--success)); display: flex; align-items: center; gap: 0.4rem; justify-self: end; white-space: nowrap; line-height: 1; }
-    .header .status { font-size: 0.7rem; color: var(--status-color, var(--success)); display: flex; align-items: center; gap: 0.4rem; }
+    .status::before { content: ''; width: 6px; height: 6px; background: var(--status-color, var(--success)); border-radius: 50%; }
    .header .status::before { content: ''; width: 6px; height: 6px; background: var(--status-color, var(--success)); border-radius: 50%; }
    .main { flex: 1; overflow-y: auto; }
    #frame-content { padding: 2rem; min-height: 100%; }
-    .indicator-bar {
+    .header {
      background: var(--bg-secondary);
-      border-top: 1px solid var(--border);
+      border-bottom: 1px solid var(--border);
      padding: 0.5rem 1.5rem;
      flex-shrink: 0;
-      text-align: center;
+      display: grid;
      grid-template-columns: minmax(0, 1fr) auto;
      align-items: center;
      gap: 1rem;
      min-height: 42px;
    }
-    .indicator-bar span {
+    .header .brand { justify-self: start; width: 100%; font-size: 0.75rem; line-height: 1; }
    .header .status { grid-column: 2; line-height: 1; }
    .header span {
      font-size: 0.75rem;
      color: var(--text-secondary);
    }
-    .indicator-bar .selected-text {
+    .header .selected-text {
      color: var(--accent);
      font-weight: 500;
    }
@@ -196,7 +199,7 @@
 </head>
 <body>
  <div class="header">
-    <h1><a href="https://github.com/obra/superpowers" style="color: inherit; text-decoration: none;">Superpowers Brainstorming</a></h1>
+    <!-- BRANDING -->
    <div class="status">Connecting…</div>
  </div>
@@ -206,9 +209,5 @@
    </div>
  </div>
  <div class="indicator-bar">
    <span id="indicator-text">Click an option above, then return to the terminal</span>
  </div>
 </body>
 </html>
--- a/skills/brainstorming/scripts/helper.js
+++ b/skills/brainstorming/scripts/helper.js
@@ -138,21 +138,6 @@
      id: target.id || null
    });
    // Update indicator bar (defer so toggleSelect runs first)
    setTimeout(() => {
      const indicator = document.getElementById('indicator-text');
      if (!indicator) return;
      const container = target.closest('.options') || target.closest('.cards');
      const selected = container ? container.querySelectorAll('.selected') : [];
      if (selected.length === 0) {
        indicator.textContent = 'Click an option above, then return to the terminal';
      } else if (selected.length === 1) {
        const label = selected[0].querySelector('h3, .content h3, .card-body h3')?.textContent?.trim() || selected[0].dataset.choice;
        indicator.innerHTML = '<span class="selected-text">' + label + ' selected</span> — return to terminal to continue';
      } else {
        indicator.innerHTML = '<span class="selected-text">' + selected.length + ' selected</span> — return to terminal to continue';
      }
    }, 0);
  });
  // Frame UI: selection tracking
--- a/skills/brainstorming/scripts/server.cjs
+++ b/skills/brainstorming/scripts/server.cjs
@@ -102,6 +102,14 @@ const URL_HOST = process.env.BRAINSTORM_URL_HOST || (HOST === '127.0.0.1' ? 'loc
 const SESSION_DIR = process.env.BRAINSTORM_DIR || '/tmp/brainstorm';
 const CONTENT_DIR = path.join(SESSION_DIR, 'content');
 const STATE_DIR = path.join(SESSION_DIR, 'state');
 const SUPERPOWERS_VERSION = readSuperpowersVersion();
 const SUPERPOWERS_BRAND_IMAGE_URL = 'https://primeradiant.com/brand/superpowers-visual-brainstorming-logo.png';
 const TELEMETRY_DISABLE_ENV_VARS = [
  'SUPERPOWERS_DISABLE_TELEMETRY',
  'DISABLE_TELEMETRY',
  'CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC'
 ];
 const SUPERPOWERS_TELEMETRY_DISABLED = TELEMETRY_DISABLE_ENV_VARS.some(name => isTruthyEnv(process.env[name]));
 let ownerPid = process.env.BRAINSTORM_OWNER_PID ? Number(process.env.BRAINSTORM_OWNER_PID) : null;
 // Per-session secret key. The companion is reachable by any local browser tab
@@ -150,14 +158,22 @@ const MIME_TYPES = {
 // ========== Templates and Constants ==========
-const WAITING_PAGE = `<!DOCTYPE html>
+function waitingPage() {
  return renderBranding(`<!DOCTYPE html>
 <html>
 <head><meta charset="utf-8"><title>Brainstorm Companion</title>
-<style>body { font-family: system-ui, sans-serif; padding: 2rem; max-width: 800px; margin: 0 auto; }
+<style>
-h1 { color: #333; } p { color: #666; }</style>
+body { font-family: system-ui, sans-serif; padding: 2rem; max-width: 800px; margin: 0 auto; }
 h1 { color: #333; } p { color: #666; }
 .brand { display: flex; align-items: center; min-width: 0; overflow: hidden; margin-bottom: 1.5rem; color: #666; font-size: 0.9rem; line-height: 1; }
 .brand a { color: inherit; text-decoration: none; display: flex; align-items: center; gap: 0.5rem; min-width: 0; max-width: 100%; line-height: 1; }
 .brand-copy { display: block; min-width: 0; overflow: hidden; text-overflow: ellipsis; white-space: nowrap; line-height: 1; transform: translateY(-1px); }
 .brand-logo { display: block; height: 1em; width: auto; max-width: 180px; filter: invert(1); }
 </style>
 </head>
-<body><h1>Brainstorm Companion</h1>
+<body><!-- BRANDING --><h1>Brainstorm Companion</h1>
-<p>Waiting for the agent to push a screen...</p></body></html>`;
+<p>Waiting for the agent to push a screen...</p></body></html>`);
 }
 const FORBIDDEN_PAGE = `<!DOCTYPE html>
 <html>
@@ -189,13 +205,55 @@ const helperInjection = '<script>\n' + helperScript + '\n</script>';
 // ========== Helper Functions ==========
 function readSuperpowersVersion() {
  try {
    const packageJson = JSON.parse(
      fs.readFileSync(path.join(__dirname, '../../..', 'package.json'), 'utf-8')
    );
    return String(packageJson.version || 'unknown');
  } catch (e) {
    return 'unknown';
  }
 }
 function isTruthyEnv(value) {
  if (!value) return false;
  const normalized = String(value).trim().toLowerCase();
  if (!normalized) return false;
  return !['0', 'false', 'no', 'off'].includes(normalized);
 }
 function escapeHtmlText(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
 }
 function brandMarkup() {
  const version = escapeHtmlText(SUPERPOWERS_VERSION);
  const text = SUPERPOWERS_TELEMETRY_DISABLED
    ? 'Prime Radiant Superpowers v' + version
    : 'Superpowers v' + version;
  const logo = SUPERPOWERS_TELEMETRY_DISABLED
    ? ''
    : '<img class="brand-logo" src="' + SUPERPOWERS_BRAND_IMAGE_URL + '?v=' + encodeURIComponent(SUPERPOWERS_VERSION) + '" alt="Prime Radiant" referrerpolicy="no-referrer" decoding="async">';
  return '<div class="brand"><a href="https://github.com/obra/superpowers">' + logo + '<span class="brand-copy">' + text + '</span></a></div>';
 }
 function renderBranding(html) {
  return html.split('<!-- BRANDING -->').join(brandMarkup());
 }
 function isFullDocument(html) {
  const trimmed = html.trimStart().toLowerCase();
  return trimmed.startsWith('<!doctype') || trimmed.startsWith('<html');
 }
 function wrapInFrame(content) {
-  return frameTemplate.replace('<!-- CONTENT -->', content);
+  return renderBranding(frameTemplate).replace('<!-- CONTENT -->', content);
 }
 function getNewestScreen() {
@@ -341,7 +399,7 @@ function handleRequest(req, res) {
    const screenFile = getNewestScreen();
    let html = screenFile
      ? (raw => isFullDocument(raw) ? raw : wrapInFrame(raw))(fs.readFileSync(screenFile, 'utf-8'))
-      : WAITING_PAGE;
+      : waitingPage();
    if (html.includes('</body>')) {
      html = html.replace('</body>', helperInjection + '\n</body>');
--- a/skills/brainstorming/visual-companion.md
+++ b/skills/brainstorming/visual-companion.md
@@ -28,7 +28,7 @@ A question *about* a UI topic is not automatically a visual question. "What kind
 The server watches a directory for HTML files and serves the newest one to the browser. You write HTML content to `screen_dir`, the user sees it in their browser and can click to select options. Selections are recorded to `state_dir/events` that you read on your next turn.
-**Content fragments vs full documents:** If your HTML file starts with `<!DOCTYPE` or `<html`, the server serves it as-is (just injects the helper script). Otherwise, the server automatically wraps your content in the frame template — adding the header, CSS theme, selection indicator, and all interactive infrastructure. **Write content fragments by default.** Only write full documents when you need complete control over the page.
+**Content fragments vs full documents:** If your HTML file starts with `<!DOCTYPE` or `<html`, the server serves it as-is (just injects the helper script). Otherwise, the server automatically wraps your content in the frame template — adding the header, CSS theme, connection status, and all interactive infrastructure. **Write content fragments by default.** Only write full documents when you need complete control over the page.
 ## Starting a Session
@@ -138,7 +138,7 @@ Use `--url-host` to control what hostname is printed in the returned URL JSON.
 ## Writing Content Fragments
-Write just the content that goes inside the page. The server wraps it in the frame template automatically (header, theme CSS, selection indicator, and all interactive infrastructure).
+Write just the content that goes inside the page. The server wraps it in the frame template automatically (header, theme CSS, connection status, and all interactive infrastructure).
 **Minimal example:**
@@ -184,7 +184,7 @@ The frame template provides these CSS classes for your content:
 </div>
 ```
-**Multi-select:** Add `data-multiselect` to the container to let users select multiple options. Each click toggles the item. The indicator bar shows the count.
+**Multi-select:** Add `data-multiselect` to the container to let users select multiple options. Each click toggles the item's selected styling.
 ```html
 <div class="options" data-multiselect>
--- a/skills/subagent-driven-development/SKILL.md
+++ b/skills/subagent-driven-development/SKILL.md
@@ -11,6 +11,9 @@ Execute plan by dispatching a fresh implementer subagent per task, a task review
 **Core principle:** Fresh subagent per task + task review (spec + quality) + broad final review = high quality, fast iteration
 **Narration:** between tool calls, narrate at most one short line — the
 ledger and the tool results carry the record.
 **Continuous execution:** Do not pause to check in with your human partner between tasks. Execute all tasks from the plan without stopping. The only reasons to stop are: BLOCKED status you cannot resolve, ambiguity that genuinely prevents progress, or all tasks complete. "Should I continue?" prompts and progress summaries waste their time — they asked you to execute the plan, so execute it.
 ## When to Use
@@ -79,6 +82,20 @@ digraph process {
 }
 ```
 ## Pre-Flight Plan Review
 Before dispatching Task 1, scan the plan once for conflicts:
 - tasks that contradict each other or the plan's Global Constraints
 - anything the plan explicitly mandates that the review rubric treats as a
  defect (a test that asserts nothing, verbatim duplication of a logic block)
 Present everything you find to your human partner as one batched question —
 each finding beside the plan text that mandates it, asking which governs —
 before execution begins, not one interrupt per discovery mid-plan. If the
 scan is clean, proceed without comment. The review loop remains the net for
 conflicts that only emerge from implementation.
 ## Model Selection
 Use the least powerful model that can handle each role to conserve cost and increase speed.
@@ -88,6 +105,8 @@ Use the least powerful model that can handle each role to conserve cost and incr
 **Integration and judgment tasks** (multi-file coordination, pattern matching, debugging): use a standard model.
 **Architecture and design tasks**: use the most capable available model.
 The final whole-branch review is one of these — dispatch it on the most
 capable available model, not the session default.
 **Review tasks**: choose the model with the same judgment, scaled to the
 diff's size, complexity, and risk. A small mechanical diff does not need the
@@ -100,8 +119,10 @@ most expensive — which silently defeats this section.
 **Turn count beats token price.** Wall-clock and context cost scale with how
 many turns a subagent takes, and the cheapest models routinely take 2-3× the
 turns on multi-step work — costing more overall. Use a mid-tier model as the
-floor for implementers and reviewers; reserve the cheapest tier for
-single-file mechanical fixes.
+floor for reviewers and for implementers working from prose descriptions.
+When the task's plan text contains the complete code to write, the
+implementation is transcription plus testing: use the cheapest tier for
+that implementer. Single-file mechanical fixes also take the cheapest tier.
 **Task complexity signals (implementation tasks):**
 - Touches 1-2 files with a complete spec → cheap model
@@ -174,6 +195,11 @@ final whole-branch review. When you fill a reviewer template:
  findings in the progress ledger as you go, and point the final
  whole-branch review at that list so it can triage which must be fixed
  before merge. A roll-up nobody reads is a silent discard.
 - A finding labeled plan-mandated — or any finding that conflicts with
  what the plan's text requires — is the human's decision, like any plan
  contradiction: present the finding and the plan text, ask which governs.
  Do not dismiss the finding because the plan mandates it, and do not
  dispatch a fix that contradicts the plan without asking.
 - The final whole-branch review gets a package too: run
  `scripts/review-package MERGE_BASE HEAD` (MERGE_BASE = the commit the
  branch started from, e.g. `git merge-base main HEAD`) and include the
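Reviewer note: the model-selection rule changed in the SKILL.md hunks above can be read as a small decision function. This is a minimal sketch, not code from the repository. The `chooseTier` name, the task fields (`role`, `planHasCompleteCode`, `singleFileMechanicalFix`), and the tier labels are all hypothetical. Review tasks are simplified here to the mid-tier floor, although the skill text says to scale them with the diff's size, complexity, and risk.

```javascript
// Hypothetical sketch of the tiering rule described in the SKILL.md diff.
// None of these names exist in the repository.
const TIERS = Object.freeze({
  CHEAPEST: 'cheapest',
  MID: 'mid',
  TOP: 'most-capable',
});

function chooseTier(task) {
  // The final whole-branch review is pinned to the most capable model,
  // not the session default.
  if (task.role === 'final-review') return TIERS.TOP;

  if (task.role === 'implementer') {
    // Plan text that already contains the complete code makes the job
    // transcription plus testing; single-file mechanical fixes are
    // treated the same way.
    if (task.planHasCompleteCode || task.singleFileMechanicalFix) {
      return TIERS.CHEAPEST;
    }
    // Implementers working from prose descriptions get the mid-tier floor.
    return TIERS.MID;
  }

  // Task reviewers: mid tier is the floor (simplification, see lead-in).
  return TIERS.MID;
}
```

Keeping the rule in one function like this would make the "turn count beats token price" floor explicit: only the two transcription-like cases ever drop below the mid tier.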
--- a/skills/subagent-driven-development/task-reviewer-prompt.md
+++ b/skills/subagent-driven-development/task-reviewer-prompt.md
@@ -115,6 +115,11 @@ Subagent (general-purpose):
    "yes." A tight report that cites lines gives the controller everything
    it needs.
    Your final message is the report itself: begin directly with the
    spec-compliance verdict. Every line is a verdict, a finding with
    file:line, or a check you ran — no preamble, no process narration,
    no closing summary.
    ## Calibration
    Categorize issues by actual severity. Not everything is Critical.
@@ -123,6 +128,11 @@ Subagent (general-purpose):
    would block a merge over — verbatim duplication of a logic block,
    swallowed errors, tests that assert nothing. "Coverage could be broader"
    and polish suggestions are Minor.
    If the plan or brief explicitly mandates something this rubric calls a
    defect (a test that asserts nothing, verbatim duplication of a logic
    block), that IS a finding — report it as Important, labeled
    plan-mandated. The plan's authorship does not grade its own work; the
    human decides.
    Acknowledge what was done well before listing issues — accurate praise
    helps the implementer trust the rest of the feedback.
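Reviewer note: the plan-mandated rule added to the reviewer prompt pairs with the controller rule in SKILL.md, where such findings go to the human instead of being dismissed or fixed silently. The sketch below shows one way a controller could split a report, under assumptions: the finding shape (`label`, `conflictsWithPlan`, `summary`) and the `triageFindings` name are hypothetical and do not come from the repository.

```javascript
// Hypothetical sketch: split reviewer findings into those the human must
// adjudicate and those the controller may dispatch a fix for.
function triageFindings(findings) {
  const escalate = [];
  const dispatch = [];
  for (const finding of findings) {
    // A finding labeled plan-mandated, or any finding that conflicts with
    // the plan's text, is the human's decision.
    const needsHuman =
      finding.label === 'plan-mandated' || finding.conflictsWithPlan === true;
    (needsHuman ? escalate : dispatch).push(finding);
  }
  return { escalate, dispatch };
}

// Example input with made-up findings.
const report = [
  { severity: 'Important', label: 'plan-mandated', summary: 'test asserts nothing' },
  { severity: 'Minor', summary: 'coverage could be broader' },
  { severity: 'Important', conflictsWithPlan: true, summary: 'fix would contradict the plan' },
];
const triaged = triageFindings(report);
console.log(triaged.escalate.length, triaged.dispatch.length);
```

Neither bucket drops a finding, which mirrors the rule that a plan-mandated defect is neither dismissed nor fixed without asking.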
--- a/tests/brainstorm-server/branding.test.js
+++ b/tests/brainstorm-server/branding.test.js
@@ -0,0 +1,309 @@
 /**
 * Tests for the visual companion's Superpowers/Prime Radiant branding.
 */
 const { spawn } = require('child_process');
 const http = require('http');
 const fs = require('fs');
 const path = require('path');
 const assert = require('assert');
 const REPO_ROOT = path.join(__dirname, '../..');
 const SERVER_PATH = path.join(REPO_ROOT, 'skills/brainstorming/scripts/server.cjs');
 const PACKAGE_VERSION = JSON.parse(
  fs.readFileSync(path.join(REPO_ROOT, 'package.json'), 'utf-8')
 ).version;
 const TOKEN = 'testtoken-branding-0123456789abcdef';
 const ASSET_URL = 'https://primeradiant.com/brand/superpowers-visual-brainstorming-logo.png';
 function cleanup(dir) {
  if (fs.existsSync(dir)) {
    fs.rmSync(dir, { recursive: true });
  }
 }
 function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
 }
 function startServer({ port, dir, env = {} }) {
  cleanup(dir);
  return spawn('node', [SERVER_PATH], {
    env: {
      ...process.env,
      BRAINSTORM_PORT: String(port),
      BRAINSTORM_DIR: dir,
      BRAINSTORM_TOKEN: TOKEN,
      ...env
    }
  });
 }
 function waitForServer(server) {
  let stdout = '';
  let stderr = '';
  return new Promise((resolve, reject) => {
    const timeout = setTimeout(() => reject(new Error(`Server did not start. stderr: ${stderr}`)), 5000);
    server.stdout.on('data', (data) => {
      stdout += data.toString();
      if (stdout.includes('server-started')) {
        clearTimeout(timeout);
        resolve();
      }
    });
    server.stderr.on('data', (data) => { stderr += data.toString(); });
    server.on('error', reject);
  });
 }
 function fetchHtml(port) {
  return new Promise((resolve, reject) => {
    const headers = { Cookie: `brainstorm-key-${port}=${TOKEN}` };
    http.get(`http://localhost:${port}/`, { headers }, (res) => {
      let body = '';
      res.on('data', chunk => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
 }
 function writeFragment(dir) {
  const contentDir = path.join(dir, 'content');
  fs.mkdirSync(contentDir, { recursive: true });
  fs.writeFileSync(path.join(contentDir, 'screen.html'), '<h2>Pick a layout</h2>');
 }
 async function withServer(options, fn) {
  const server = startServer(options);
  try {
    await waitForServer(server);
    await fn();
  } finally {
    if (server.exitCode === null && server.signalCode === null) {
      server.kill();
      await new Promise(resolve => server.once('exit', resolve));
    }
    await sleep(100);
    cleanup(options.dir);
  }
 }
 let passed = 0;
 let failed = 0;
 async function test(name, fn) {
  try {
    await fn();
    console.log(`  PASS: ${name}`);
    passed++;
  } catch (e) {
    console.log(`  FAIL: ${name}`);
    console.log(`    ${e.message}`);
    failed++;
  }
 }
 function assertBrandedWithLogo(html) {
  assert(
    html.includes(`Superpowers v${PACKAGE_VERSION}`),
    'branding text should include dynamic package version'
  );
  assert(
    !html.includes(`Superpowers v${PACKAGE_VERSION} by`),
    'branding text should not include "by" when the logo is visible'
  );
  assert(
    /<img class="brand-logo"[^>]*>\s*<span class="brand-copy">Superpowers v/.test(html),
    'visible logo should appear before the Superpowers version text'
  );
  assert(
    /\.brand a\s*\{[^}]*line-height:\s*1/i.test(html),
    'brand row should align the logo and version text by their visual height'
  );
  assert(
    /\.brand a\s*\{[^}]*gap:\s*0\.5rem/i.test(html),
    'brand row should keep the logo and version text close together'
  );
  assert(
    /\.brand a\s*\{[^}]*max-width:\s*100%/i.test(html),
    'brand link should be constrained so it cannot overlap the status column'
  );
  assert(
    /\.brand\s*\{[^}]*line-height:\s*1/i.test(html),
    'brand wrapper should not inherit the page line height'
  );
  assert(
    /\.brand\s*\{[^}]*overflow:\s*hidden/i.test(html),
    'brand wrapper should clip before it reaches the status column'
  );
 }
 function assertBrandedFallbackText(html) {
  assert(
    html.includes(`Prime Radiant Superpowers v${PACKAGE_VERSION}`),
    'disabled telemetry should keep plain text Prime Radiant/Superpowers branding'
  );
 }
 function assertTelemetryImage(html) {
  const expectedUrl = `${ASSET_URL}?v=${encodeURIComponent(PACKAGE_VERSION)}`;
  assert(html.includes(`src="${expectedUrl}"`), 'remote image should use the dedicated main-domain asset with only v=');
  assert(!html.includes('event='), 'remote image URL must not include event=');
  assert(!html.includes('surface='), 'remote image URL must not include surface=');
  assert(!html.includes('launch_id='), 'remote image URL must not include launch_id=');
  assert(!html.includes('lid='), 'remote image URL must not include lid=');
 }
 function assertLogoKeepsTransparentBackground(html) {
  assert(
    /\.brand-logo\s*\{[^}]*height:\s*1em/i.test(html),
    'logo should match the surrounding brand text size'
  );
  assert(
    /\.brand-logo\s*\{[^}]*display:\s*block/i.test(html),
    'logo should not reserve inline-image descender space'
  );
  assert(
    /\.brand-copy\s*\{[^}]*line-height:\s*1/i.test(html),
    'version text should use the same compact line height as the logo'
  );
  assert(
    /\.brand-copy\s*\{[^}]*min-width:\s*0/i.test(html),
    'version text should be allowed to shrink inside the brand row'
  );
  assert(
    /\.brand-copy\s*\{[^}]*transform:\s*translateY\(-1px\)/i.test(html),
    'version text should compensate for bottom padding inside the logo asset'
  );
  assert(
    /\.brand-logo\s*\{[^}]*filter:\s*invert\(1\)/i.test(html),
    'white logo asset should invert on light backgrounds'
  );
  assert(
    !/\.brand-logo\s*\{[^}]*background:/i.test(html),
    'logo should keep its transparent background'
  );
  assert(
    !/\.brand-logo\s*\{[^}]*padding:/i.test(html),
    'logo should not rely on a padded backing'
  );
 }
 function assertFramedLogoSupportsDarkTheme(html) {
  assert(
    /@media\s*\(prefers-color-scheme:\s*dark\)[\s\S]*\.brand-logo\s*\{[^}]*filter:\s*none/i.test(html),
    'framed screens should leave the white logo unfiltered in dark mode'
  );
 }
 function assertFramedScreenUsesBrandHeader(html) {
  const logoCount = (html.match(/class="brand-logo"/g) || []).length;
  assert.strictEqual(logoCount, 1, 'framed screens should render the logo only in the header');
  assert(!html.includes('<div class="indicator-bar">'), 'framed screens should not render footer chrome');
  assert(
    /<div class="header">[\s\S]*<div class="brand">[\s\S]*<div class="status">Connecting…<\/div>/.test(html),
    'header should contain branding and connection status'
  );
  assert(!html.includes('id="indicator-text"'), 'header should not render the selection indicator text');
  assert(!html.includes('Click an option above'), 'header should not render the selection instruction');
 }
 function assertHeaderAvoidsNarrowOverlap(html) {
  assert(
    /grid-template-columns:\s*minmax\(0,\s*1fr\)\s*auto/i.test(html),
    'header should allocate shrinkable space to branding before the status column'
  );
  assert(
    /\.header \.status\s*\{[^}]*grid-column:\s*2/i.test(html),
    'status should live in the final fixed-width grid column'
  );
  assert(
    /\.header \.brand\s*\{[^}]*width:\s*100%/i.test(html),
    'header brand should fill its grid track so overflow clipping prevents overlap'
  );
 }
 async function main() {
  console.log('\n--- Visual Companion Branding ---');
  await test('framed screens render versioned Prime Radiant logo by default', async () => {
    const port = 3451;
    const dir = '/tmp/brainstorm-branding-default';
    await withServer({ port, dir }, async () => {
      writeFragment(dir);
      await sleep(300);
      const html = await fetchHtml(port);
      assertBrandedWithLogo(html);
      assertTelemetryImage(html);
      assertLogoKeepsTransparentBackground(html);
      assertFramedLogoSupportsDarkTheme(html);
      assertFramedScreenUsesBrandHeader(html);
      assertHeaderAvoidsNarrowOverlap(html);
    });
  });
  await test('waiting screen renders versioned Prime Radiant logo by default', async () => {
    const port = 3452;
    const dir = '/tmp/brainstorm-branding-waiting';
    await withServer({ port, dir }, async () => {
      const html = await fetchHtml(port);
      assert(html.includes('Waiting for the agent'), 'waiting page should still render');
      assertBrandedWithLogo(html);
      assertTelemetryImage(html);
      assertLogoKeepsTransparentBackground(html);
    });
  });
  await test('SUPERPOWERS_DISABLE_TELEMETRY=true omits remote image but keeps local branding', async () => {
    const port = 3453;
    const dir = '/tmp/brainstorm-branding-disabled';
    await withServer({ port, dir, env: { SUPERPOWERS_DISABLE_TELEMETRY: 'true' } }, async () => {
      writeFragment(dir);
      await sleep(300);
      const html = await fetchHtml(port);
      assertBrandedFallbackText(html);
      assert(!html.includes(ASSET_URL), 'disabled telemetry should omit the remote image');
    });
  });
  await test('SUPERPOWERS_DISABLE_TELEMETRY=yes also omits the remote image on the waiting screen', async () => {
    const port = 3454;
    const dir = '/tmp/brainstorm-branding-disabled-waiting';
    await withServer({ port, dir, env: { SUPERPOWERS_DISABLE_TELEMETRY: 'yes' } }, async () => {
      const html = await fetchHtml(port);
      assertBrandedFallbackText(html);
      assert(!html.includes(ASSET_URL), 'disabled telemetry should omit the remote image');
    });
  });
  await test('DISABLE_TELEMETRY=true omits remote image for Claude Code telemetry opt-out', async () => {
    const port = 3455;
    const dir = '/tmp/brainstorm-branding-claude-disable-telemetry';
    await withServer({ port, dir, env: { DISABLE_TELEMETRY: 'true' } }, async () => {
      writeFragment(dir);
      await sleep(300);
      const html = await fetchHtml(port);
      assertBrandedFallbackText(html);
      assert(!html.includes(ASSET_URL), 'Claude Code telemetry opt-out should omit the remote image');
    });
  });
  await test('CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 omits remote image for Claude Code traffic opt-out', async () => {
    const port = 3456;
    const dir = '/tmp/brainstorm-branding-claude-disable-nonessential';
    await withServer({ port, dir, env: { CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC: '1' } }, async () => {
      const html = await fetchHtml(port);
      assertBrandedFallbackText(html);
      assert(!html.includes(ASSET_URL), 'Claude Code non-essential traffic opt-out should omit the remote image');
    });
  });
  console.log(`\n--- Results: ${passed} passed, ${failed} failed ---`);
  if (failed > 0) process.exitCode = 1;
 }
 main().catch((err) => {
  console.error('Test failed:', err);
  process.exit(1);
 });
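Reviewer note: taken together, the assertions in this test file imply a simple server-side rule: emit the remote logo with only a `v=` query parameter unless one of three opt-out variables is set, in which case fall back to plain text. The sketch below illustrates that rule and is not the code in `server.cjs`. The function names, the `alt` text, and the set of accepted truthy values (`1`, `true`, `yes`, case-insensitive) are assumptions; the tests only exercise `true`, `yes`, and `1` on specific variables.

```javascript
// Hypothetical sketch of the behavior branding.test.js checks for.
const ASSET_URL =
  'https://primeradiant.com/brand/superpowers-visual-brainstorming-logo.png';

// Environment variables the tests treat as telemetry opt-outs.
const OPT_OUT_VARS = [
  'SUPERPOWERS_DISABLE_TELEMETRY',
  'DISABLE_TELEMETRY',
  'CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC',
];

// Assumption: which values count as "set". The real server may differ.
const TRUTHY = new Set(['1', 'true', 'yes']);

function telemetryDisabled(env) {
  return OPT_OUT_VARS.some((name) =>
    TRUTHY.has(String(env[name] ?? '').trim().toLowerCase())
  );
}

function renderBranding(version, env) {
  if (telemetryDisabled(env)) {
    // No remote request at all: plain text branding only.
    return `<span class="brand-copy">Prime Radiant Superpowers v${version}</span>`;
  }
  // The only query parameter is the version, with no event or launch identifiers.
  const src = `${ASSET_URL}?v=${encodeURIComponent(version)}`;
  return (
    `<img class="brand-logo" src="${src}" alt="Prime Radiant">` +
    `<span class="brand-copy">Superpowers v${version}</span>`
  );
}
```

For example, `renderBranding('6.0.0', { DISABLE_TELEMETRY: 'true' })` returns only the text span, while `renderBranding('6.0.0', {})` includes the image with `?v=6.0.0`. The version is interpolated without HTML escaping, which is acceptable for a sketch that assumes a semver string.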
--- a/tests/brainstorm-server/package.json
+++ b/tests/brainstorm-server/package.json
@@ -2,7 +2,7 @@
  "name": "brainstorm-server-tests",
  "version": "1.0.0",
  "scripts": {
-    "test": "node ws-protocol.test.js && node helper.test.js && node browser-launcher.test.js && node auth.test.js && node server.test.js && node lifecycle.test.js && bash start-server.test.sh && bash stop-server.test.sh"
+    "test": "node ws-protocol.test.js && node helper.test.js && node browser-launcher.test.js && node auth.test.js && node branding.test.js && node server.test.js && node lifecycle.test.js && bash start-server.test.sh && bash stop-server.test.sh"
  },
  "dependencies": {
    "ws": "^8.19.0"
--- a/tests/brainstorm-server/server.test.js
+++ b/tests/brainstorm-server/server.test.js
@@ -196,7 +196,7 @@ async function runTests() {
      const res = await fetch(`http://localhost:${TEST_PORT}/`);
      assert(res.body.includes('<h1>Custom Page</h1>'), 'Should contain original content');
      assert(res.body.includes('WebSocket'), 'Should still inject helper.js');
-      assert(!res.body.includes('indicator-bar'), 'Should NOT wrap in frame template');
+      assert(!res.body.includes('<div class="header">'), 'Should NOT wrap in frame template');
    });
    await test('wraps content fragments in frame template', async () => {
@@ -205,7 +205,7 @@ async function runTests() {
      await sleep(300);
      const res = await fetch(`http://localhost:${TEST_PORT}/`);
-      assert(res.body.includes('indicator-bar'), 'Fragment should get indicator bar');
+      assert(res.body.includes('<div class="header">'), 'Fragment should get header chrome');
      assert(!res.body.includes('<!-- CONTENT -->'), 'Placeholder should be replaced');
      assert(res.body.includes('Pick a layout'), 'Fragment content should be present');
      assert(res.body.includes('data-choice="a"'), 'Fragment interactive elements intact');
@@ -560,8 +560,16 @@ async function runTests() {
      const template = fs.readFileSync(
        path.join(__dirname, '../../skills/brainstorming/scripts/frame-template.html'), 'utf-8'
      );
-      assert(template.includes('indicator-bar'), 'Should have indicator bar');
-      assert(template.includes('indicator-text'), 'Should have indicator text');
+      assert(template.includes('<div class="header">'), 'Should have top header markup');
+      assert(!template.includes('indicator-bar'), 'Should not have footer chrome');
+      assert(!template.includes('indicator-text'), 'Header should not render selection indicator text');
+      assert(template.includes('<!-- BRANDING -->'), 'Should have branding placeholder');
+      assert(template.includes('<div class="status">Connecting…</div>'), 'Header should include connection status');
+      assert(template.includes('grid-template-columns: minmax(0, 1fr) auto;'), 'Header should let brand text shrink before the status column');
+      assert(template.includes('padding: 0.5rem 1.5rem;'), 'Header should keep equal left and right edge padding');
+      assert(template.includes('.header .brand { justify-self: start; width: 100%; font-size: 0.75rem; line-height: 1; }'), 'Header brand should align left, fill its grid track, and match header text size');
+      assert(template.includes('.header .status { grid-column: 2; line-height: 1; }'), 'Header status should sit in the right column');
+      assert(!template.includes('<div></div>'), 'Header should not use an empty spacer before branding');
      assert(template.includes('<!-- CONTENT -->'), 'Should have content placeholder');
      assert(template.includes('frame-content'), 'Should have content container');
      return Promise.resolve();
Author	SHA1	Message	Date
Jesse Vincent	9171439d2b	Bump version to 6.0.0	2026-06-15 21:19:07 -07:00
Drew Ritter	529e192c32	Add visual companion Prime Radiant branding	2026-06-15 20:56:31 -07:00
Jesse Vincent	985434ddb0	Draft Superpowers 6 release notes	2026-06-15 20:27:43 -07:00
Jesse Vincent	3e20a04ae5	Job posting	2026-06-15 13:48:30 -07:00
Jesse Vincent	71489c8160	E37: pre-flight plan review — surface plan conflicts as one batched question before Task 1	2026-06-15 12:17:46 -07:00
Jesse Vincent	97c9ea3f7d	Spec: L2b tested — opus structural win, sonnet transmission+attention gap (E35/E36); bump evals to 9919b27	2026-06-15 12:17:46 -07:00
Jesse Vincent	afecfcd239	L2b: plan-mandated defects are findings the human adjudicates Reviewer tripwire (Calibration): a plan-mandated defect IS a finding, reported as Important and labeled plan-mandated — the plan's authorship does not grade its own work. Controller rule (review loop): a plan-mandated finding, or any finding conflicting with the plan's text, escalates to the human like any plan contradiction — never dismissed because the plan mandates it. E35 micro (frozen 0a98 replay, sonnet reviewer, 6v6): without the tripwire 0/6 reports give the controller anything to escalate on (all Approved, defect endorsed as spec-required); with it 6/6 report the defect as a labeled finding.	2026-06-15 12:17:46 -07:00
Jesse Vincent	2989810931	E27 stack: conditional impl tier + final-review tier pin + narration recipe + terse reviewer contract	2026-06-15 12:17:46 -07:00
Jesse Vincent	1588b949f2	E03: cheapest-tier implementers when plan carries complete code (transcription hypothesis)	2026-06-15 12:17:46 -07:00