Commit Graph

  • a2e67bbd9b Harden brainstorm companion auth regressions Drew Ritter 2026-06-10 14:58:16 -07:00
  • fe812c418f Document visual companion auth hardening plan Drew Ritter 2026-06-10 14:14:15 -07:00
  • f4d1788ffb fix(brainstorm-server): fix auth-integration bugs from full-branch review Jesse Vincent 2026-06-09 19:13:52 -07:00
  • 4341c3f4d5 test(brainstorm-server): thread session key through tests after auth merge Jesse Vincent 2026-06-09 18:33:00 -07:00
  • c64c4ea6f4 feat(brainstorm-server): gate every endpoint behind a per-session key Jesse Vincent 2026-06-09 12:22:53 -07:00
  • de05e020d8 docs(brainstorm): catalog visual companion issues; choose session-key for security Jesse Vincent 2026-06-09 12:13:54 -07:00
  • eee4f87471 fix(brainstorm-server): tie stop-server PID check to the session's port Jesse Vincent 2026-06-09 17:27:30 -07:00
  • bac46a5dcb fix(brainstorm-server): address adversarial review findings Jesse Vincent 2026-06-09 15:59:59 -07:00
  • daa41c0670 feat(brainstorming): offer the visual companion just-in-time; harden lifecycle guidance Jesse Vincent 2026-06-09 15:32:58 -07:00
  • 0d37ff6505 feat(brainstorm-server): opt-in auto-open of the browser on the first screen Jesse Vincent 2026-06-09 15:26:19 -07:00
  • 13da997ac7 feat(brainstorm-server): reuse the same port on session restart Jesse Vincent 2026-06-09 15:22:23 -07:00
  • 31a0de857b feat(brainstorm-companion): resilient reconnect, live status, paused overlay Jesse Vincent 2026-06-09 15:18:19 -07:00
  • c292421627 feat(brainstorm-server): 4h configurable idle timeout; close WS on shutdown Jesse Vincent 2026-06-09 15:08:09 -07:00
  • 9b00cc298d fix(brainstorm-server): verify PID ownership before stopping Jesse Vincent 2026-06-09 14:57:44 -07:00
  • 88fe1e7e15 fix(brainstorm-server): ignore macOS resource-fork dotfiles Jesse Vincent 2026-06-09 14:53:48 -07:00
  • e6c983888f chore(evals): bump submodule to SUP-333 boundary + plumbing scenarios (7f8e80c) Drew Ritter 2026-06-11 13:42:58 -07:00
  • 1280585826 chore(evals): bump submodule to SUP-333 boundary + plumbing scenarios (7f8e80c) drew/bump-evals-boundary-scenarios Drew Ritter 2026-06-11 13:42:58 -07:00
  • 35464d67c0 E27 stack: conditional impl tier + final-review tier pin + narration recipe + terse reviewer contract sdd-e27-stack Jesse Vincent 2026-06-10 23:34:18 -07:00
  • 90b5433f59 E03: cheapest-tier implementers when plan carries complete code (transcription hypothesis) Jesse Vincent 2026-06-10 22:13:18 -07:00
  • 420c234a2c Bump evals submodule: E29-E34 quality investigation + L2 gate results (af05326) Jesse Vincent 2026-06-11 13:17:09 -07:00
  • d1fcc9889a Strict-cost spec: L2 final — died at gates; explicit escalation holds at sonnet, implicit adjudication does not Jesse Vincent 2026-06-11 13:11:32 -07:00
  • 74f85a7709 fix(writing-skills): hang backfire mechanism on the separated prohibition-vs-recipe comparison (NEW-4); control comparison stated as trend Jesse Vincent 2026-06-11 11:30:31 -07:00
  • b148b648eb fix(writing-skills): scope empirical claims, honest noise reporting, conditionalize micro-test checklist line Jesse Vincent 2026-06-11 11:10:33 -07:00
  • 3e565ca2ad feat(writing-skills): form-selection table + micro-test wording method Jesse Vincent 2026-06-11 10:20:24 -07:00
  • ac11700642 Bump evals submodule: L1 elicitation + autoresearch scenarios and logs (649b1f8) Jesse Vincent 2026-06-11 11:37:41 -07:00
  • 710f031ad0 writing-plans: task right-sizing, Global Constraints header, per-task Interfaces blocks Jesse Vincent 2026-06-10 20:44:48 -07:00
  • 72cb21b82c Constraints block is the reviewer's attention lens: copy spec verbatim, never improvise process rules Jesse Vincent 2026-06-11 10:31:48 -07:00
  • 9887c17b31 fix(writing-skills): hang backfire mechanism on the separated prohibition-vs-recipe comparison (NEW-4); control comparison stated as trend writing-skills-composition-doctrine Jesse Vincent 2026-06-11 11:30:31 -07:00
  • 391c33edf8 fix(writing-skills): scope empirical claims, honest noise reporting, conditionalize micro-test checklist line Jesse Vincent 2026-06-11 11:10:33 -07:00
  • ab25b904fe Fix companion stop metadata and token permissions brainstorming-companion Drew Ritter 2026-06-11 10:25:19 -07:00
  • 95795c2e7b feat(writing-skills): form-selection table + micro-test wording method Jesse Vincent 2026-06-11 10:20:24 -07:00
  • 5c3af5f195 fix(skills): brainstorming gate exempts nothing-to-design requests; description exceptions are authoritative (SUP-333 C) drew/sup-333-3-brainstorming-triviality-gate Drew Ritter 2026-06-10 23:48:44 -07:00
  • f9d11b3c2f fix(skills): SDD review fanout scales with the change (SUP-333 B) drew/sup-333-2-sdd-proportionality Drew Ritter 2026-06-10 23:47:45 -07:00
  • e5f337b89e fix(skills): plans reference the spec instead of restating it — end to end (SUP-333 A) drew/sup-333-1-plans-reference-spec Drew Ritter 2026-06-10 23:45:30 -07:00
  • de1d35e5e7 Strict-cost spec: L1 final — cost win re-attributed to complete-code plans; guidance owns fidelity/variance Jesse Vincent 2026-06-10 21:44:23 -07:00
  • ff46d10754 Harden Windows browser launcher Drew Ritter 2026-06-10 20:33:56 -07:00
  • 6cd45481c6 Fix Windows lifecycle validation Drew Ritter 2026-06-10 20:09:55 -07:00
  • 0c87353a6d Align visual companion docs with shipped scope Drew Ritter 2026-06-10 19:41:28 -07:00
  • ec014e7a7f Bump evals submodule to merged superpowers-evals main (ac264b1) Jesse Vincent 2026-06-10 19:39:02 -07:00
  • fa0b76b1f2 Fix companion test cleanup and argv assertions Drew Ritter 2026-06-10 19:37:30 -07:00
  • 3f047bdb64 Harden companion platform tests Drew Ritter 2026-06-10 19:26:53 -07:00
  • 84217fba50 Fix companion lifecycle test ownership metadata Drew Ritter 2026-06-10 19:12:17 -07:00
  • e3955d1cfa Harden companion stop ownership proof Drew Ritter 2026-06-10 18:49:38 -07:00
  • d02a9fb55c Isolate companion fallback tokens Drew Ritter 2026-06-10 18:39:37 -07:00
  • e82641d1e5 Fix server test fallback cleanup Drew Ritter 2026-06-10 18:33:38 -07:00
  • 9e0ec508c6 Harden root screen containment Drew Ritter 2026-06-10 18:25:03 -07:00
  • ab8e65c356 Plan visual companion final hardening fixup Drew Ritter 2026-06-10 18:19:31 -07:00
  • dc3b43b4d6 Tighten visual companion hardening spec Drew Ritter 2026-06-10 18:13:18 -07:00
  • e0d991c26c Document visual companion final hardening fixup Drew Ritter 2026-06-10 18:05:55 -07:00
  • d47489f9d0 Harden companion Windows lifecycle coverage Drew Ritter 2026-06-10 16:23:13 -07:00
  • 7b757c6de6 Harden brainstorm companion auth regressions Drew Ritter 2026-06-10 14:58:16 -07:00
  • f656a28e6e Document visual companion auth hardening plan Drew Ritter 2026-06-10 14:14:15 -07:00
  • 08219eb495 fix(brainstorm-server): fix auth-integration bugs from full-branch review Jesse Vincent 2026-06-09 19:13:52 -07:00
  • 6654e76be2 test(brainstorm-server): thread session key through tests after auth merge Jesse Vincent 2026-06-09 18:33:00 -07:00
  • 069cec421a feat(brainstorm-server): gate every endpoint behind a per-session key Jesse Vincent 2026-06-09 12:22:53 -07:00
  • 712ff4431f docs(brainstorm): catalog visual companion issues; choose session-key for security Jesse Vincent 2026-06-09 12:13:54 -07:00
  • 8ca01594b6 fix(brainstorm-server): tie stop-server PID check to the session's port Jesse Vincent 2026-06-09 17:27:30 -07:00
  • bfe7b80536 fix(brainstorm-server): address adversarial review findings Jesse Vincent 2026-06-09 15:59:59 -07:00
  • 3b4f91b57d feat(brainstorming): offer the visual companion just-in-time; harden lifecycle guidance Jesse Vincent 2026-06-09 15:32:58 -07:00
  • 8ff9d079b5 feat(brainstorm-server): opt-in auto-open of the browser on the first screen Jesse Vincent 2026-06-09 15:26:19 -07:00
  • d8996abf8d feat(brainstorm-server): reuse the same port on session restart Jesse Vincent 2026-06-09 15:22:23 -07:00
  • 67574367d3 feat(brainstorm-companion): resilient reconnect, live status, paused overlay Jesse Vincent 2026-06-09 15:18:19 -07:00
  • f77742281e feat(brainstorm-server): 4h configurable idle timeout; close WS on shutdown Jesse Vincent 2026-06-09 15:08:09 -07:00
  • e357baca01 fix(brainstorm-server): verify PID ownership before stopping Jesse Vincent 2026-06-09 14:57:44 -07:00
  • 2aff60ba3f fix(brainstorm-server): ignore macOS resource-fork dotfiles Jesse Vincent 2026-06-09 14:53:48 -07:00
  • eba16f6b91 Strict-cost spec: L2 recon n=2 (sonnet controller $6.68/$8.05, judgment clean, escalation points unstressed) Jesse Vincent 2026-06-10 17:11:26 -07:00
  • 27788fdef9 Strict-cost spec: record batch A-E rung verdicts (L1 validated, L2 recon positive, L3 dead) Jesse Vincent 2026-06-10 16:59:43 -07:00
  • 0cb1960068 chore(evals): bump submodule for Claude Haiku target Drew Ritter 2026-06-10 16:13:55 -07:00
  • 5cd1a9d5f2 chore(evals): bump submodule for Claude Haiku target codex/pri-2158-bump-evals-submodule Drew Ritter 2026-06-10 16:13:55 -07:00
  • 9a25a75bac Spec: strict-cost SDD experiment ladder — judgment as co-invariant, plan-side crispness first Jesse Vincent 2026-06-10 14:35:00 -07:00
  • 60fa4f6fc4 Record writing-plans micro-test result: resolved, no change needed Jesse Vincent 2026-06-10 14:31:50 -07:00
  • 43a6ee23f7 Spec: record iterations 4-5 (variance honesty, structural fixes, final validated ranges) Jesse Vincent 2026-06-10 13:08:40 -07:00
  • fe90d6c469 Adopt audited positive phrasings: evidence rule leads positive; fix-report completeness as checklist Jesse Vincent 2026-06-10 13:08:19 -07:00
  • b81f35bb1e Land eval-tuned combo: file handoffs, progress ledger, final-review package, REQUIRED model lines, reviewer risk budget Jesse Vincent 2026-06-10 13:08:06 -07:00
  • 926096a1d7 Spec: positive-instruction redesign — audit results, micro-test method, writing-plans variants Jesse Vincent 2026-06-10 12:32:06 -07:00
  • a995af2e24 Shared: unique review-package collateral names Jesse Vincent 2026-06-10 09:39:21 -07:00
  • d4dbf44162 Add review-package script; close fix-dispatch test gap Jesse Vincent 2026-06-10 08:51:16 -07:00
  • 2434ef7f35 Describe the review design as current state, not as a delta Jesse Vincent 2026-06-10 08:28:28 -07:00
  • 7cf78437e2 Spec: record iterations 2-3 results and final frozen-config matrix Jesse Vincent 2026-06-10 05:06:59 -07:00
  • e355795625 Hand reviewers the diff as a file, not a paste Jesse Vincent 2026-06-10 03:44:19 -07:00
  • 29ee4e8e44 Reviewer skepticism covers the implementer's design rationales Jesse Vincent 2026-06-10 02:20:28 -07:00
  • 28498a5cde Make diff-pasting non-optional for task reviewer dispatch Jesse Vincent 2026-06-10 02:10:34 -07:00
  • 5e2907fc4f Close the Minor-severity escape hatch Jesse Vincent 2026-06-10 02:09:10 -07:00
  • e532f24df7 Spec: document cost iterations and the per-task review consolidation Jesse Vincent 2026-06-09 23:59:22 -07:00
  • e3c74fc1c9 Merge per-task reviews into one task reviewer (iteration 2) Jesse Vincent 2026-06-09 23:58:28 -07:00
  • 3e3e1e701e Cut review-cost drivers: turn-aware models, inline diffs, scoped evidence Jesse Vincent 2026-06-09 22:42:54 -07:00
  • 853396e3ae Add phrase-level pre-judging triggers to reviewer prompt rule Jesse Vincent 2026-06-09 21:49:51 -07:00
  • 83d54f7ddd Red Flags: never tell a reviewer what not to flag or pre-rate severity Jesse Vincent 2026-06-09 21:47:41 -07:00
  • c7900f1698 Close three review blind spots found by defect tracing Jesse Vincent 2026-06-09 21:19:08 -07:00
  • 5cfdb75b94 Require explicit model on subagent dispatch Jesse Vincent 2026-06-09 21:11:45 -07:00
  • 87825ff193 Forbid controllers pre-judging reviewer findings Jesse Vincent 2026-06-09 18:28:24 -07:00
  • 09cb4d7361 Sync plan: escaped pre() pattern in Task 5 checks block Jesse Vincent 2026-06-09 18:19:00 -07:00
  • b3bb9a68d7 Fix plan doc: correct Task 1 grep expectation; sync Task 5 story block Jesse Vincent 2026-06-09 17:21:06 -07:00
  • 71dc271a08 Sync plan's Task 5 blocks with review fixes Jesse Vincent 2026-06-09 17:13:03 -07:00
  • 5aea3dca31 SDD controller: reviewer prompt budgets, ⚠️ handling, final-review pointer, model judgment Jesse Vincent 2026-06-09 16:59:05 -07:00
  • b3281c0227 Implementer prompt: re-run covering tests after fixing review findings Jesse Vincent 2026-06-09 16:56:28 -07:00
  • c14c1de552 Scope spec reviewer's Your Job wording to the diff Jesse Vincent 2026-06-09 16:55:28 -07:00
  • be8a6269c4 Spec reviewer: judge from the diff, grounded skepticism, ⚠️ verdict channel Jesse Vincent 2026-06-09 16:53:30 -07:00
  • da41209243 Use bare placeholder names in quality reviewer prompt body Jesse Vincent 2026-06-09 16:51:54 -07:00
  • 2cc449b6d4 Make per-task quality reviewer prompt self-contained and task-scoped Jesse Vincent 2026-06-09 16:47:27 -07:00