Strict-cost spec: L2 recon n=2 (sonnet controller $6.68/$8.05, judgment clean, escalation points unstressed)

This commit is contained in:
Jesse Vincent
2026-06-10 17:11:26 -07:00
parent 27788fdef9
commit eba16f6b91

View File

@@ -110,16 +110,19 @@ target economics and ambiguity, not placeholder hygiene.
### L2 — Controller tier (est. $4-5/run; the biggest single lever, gated hardest) ### L2 — Controller tier (est. $4-5/run; the biggest single lever, gated hardest)
**Status 2026-06-11: recon positive, gates still owed.** Sonnet-controller **Status 2026-06-11: recon positive (n=2), gates still owed.**
run 1 (claude-sonnet coding-agent): all gates green at **$6.68** / 31 min Sonnet-controller runs (claude-sonnet coding-agent): all gates green at
(combo band $11.67-14.84), 26/26 dispatches model-explicit, review loops **$6.68 and $8.05** / 31-41 min (combo band $11.67-14.84), tokens inside
and omnibus-fixer rules followed, and the controller caught a fixer the combo band — no cheap-controller turn inflation. 26/26 and 31/31
side-effect (`go mod tidy` removed cobra) before re-review — real dispatches model-explicit, with heavier (and sane) haiku tiering than
adjudication, not silent absorption. But the run surfaced zero opus controllers showed; review loops, per-task Important→fix→re-review,
BLOCKED/⚠️ events (the escalation points were never stressed) and the and omnibus-fixer rules followed in both runs; the run-1 controller
final review ran on sonnet rather than the most capable tier. The N=5 caught a fixer side-effect (`go mod tidy` removed cobra) before
quality gates + full judgment audit below remain mandatory before any re-review — real adjudication, not silent absorption. But neither run
skill change. surfaced a BLOCKED/⚠️ event (the escalation points were never stressed)
and final reviews ran on sonnet rather than the most capable tier. The
N=5 quality gates + full judgment audit below remain mandatory before
any skill change.
The controller is half the dollars solely because it inherits the session The controller is half the dollars solely because it inherits the session
model. Its turn floor is prompt-immune, so the lever is the rate per turn — model. Its turn floor is prompt-immune, so the lever is the rate per turn —