Strict-cost spec: L1 final — cost win re-attributed to complete-code plans; guidance owns fidelity/variance

Jesse Vincent
2026-06-10 21:44:23 -07:00
parent ec014e7a7f
commit de1d35e5e7


@@ -65,13 +65,21 @@ fewer, better-sized tasks, SDD still runs one fresh subagent per task.
 ### L1 — Plan-side crispness (writing-plans changes; est. $1.5-3/run, plus variance reduction)
-**Status 2026-06-11: validated in effect.** A hand-crisped fractals plan
-(10 → 7 tasks, `## Global Constraints` header, per-task `Interfaces:`
-lines — scenario `sdd-go-fractals-crisp`) ran 3/3 green at $9.51-12.65
-(mean $11.60 vs combo band $11.67-14.84), 20-24 dispatches vs 28, fix
-waves flat. What remains is elicitation: getting writing-plans guidance
-to *produce* such plans (micro-test per the doctrine, then the follow-up
-PR). See the experiments log, Batch A-E.
+**Status 2026-06-11 (final): elicitation tested end-to-end; claims
+re-attributed.** Micro-tests: constraints header and Interfaces blocks
+elicit deterministically (0→5/5, 0→100% of tasks, exact values);
+right-sizing is modest and scale-dependent (9.4→8.4 tasks at svelte
+scale, nothing to move at fractals scale). Full runs: an elicited plan
+executed at $6.34/$8.49 — but the no-guidance control (opus plan,
+complete code) hit $7.59/$7.73, inside that range. **The cost win
+belongs to opus-written complete-code plans; the hand-written prose
+fixture plans all prior numbers used are unrepresentative and ~2×
+costlier to execute.** The guidance owns fidelity and variance instead:
+deterministic constraints propagation (the one elicited-run fix was a
+version-floor catch), exact cross-task interfaces, fix waves 1 vs 2-4
+(the control plan shipped a real Sierpinski bug both runs had to fix).
+The writing-plans PR claims those grounds, not dollars. Draft at
+/tmp/sdd-exp/writing-plans-l1 (branch writing-plans-crisp).
 The plan is upstream of every cost: task count sets dispatch count; plan
 ambiguity sets review-loop count; plan completeness sets implementer