scenario: spec-reviewer-catches-planted-flaws description: > Lifted from superpowers/tests/claude-code/test-document-review-system.sh. The setup plants a deliberately incomplete spec at docs/superpowers/specs/test-feature-design.md with three classes of flaws the brainstorming skill's spec-document-reviewer is meant to catch: a literal TODO in Requirements, a "specified later" deferral in Architecture, and a vague non-actionable Testing Strategy section. Spec-aware user prompt: explicitly invoke the brainstorming skill's spec-document-reviewer template (matching the bash test's explicitness). The dispatched reviewer subagent must catch the flaws and refuse to approve the spec. user_posture: spec-aware setup: helpers: - create_base_repo - add_flawed_spec_for_review assertions: - "git rev-parse --is-inside-work-tree" - "git branch --show-current | grep -q main" - "test -f docs/superpowers/specs/test-feature-design.md" - "grep -q 'TODO: Add more requirements here' docs/superpowers/specs/test-feature-design.md" - "grep -q 'specified later' docs/superpowers/specs/test-feature-design.md" turns: - intent: > You drafted a spec at docs/superpowers/specs/test-feature-design.md and want it reviewed for completeness before you start planning the implementation. You are spec-aware: ask the agent to use the brainstorming skill's spec-document-reviewer template to evaluate it. Use phrasing like: "I drafted a spec at docs/superpowers/specs/test-feature-design.md. Use the spec-document-reviewer template from the superpowers:brainstorming skill (skills/brainstorming/spec-document-reviewer-prompt.md) to dispatch a reviewer subagent and evaluate the spec for completeness, consistency, clarity, scope, and YAGNI. Report back what it found." Do NOT name the planted flaws (TODO, "specified later", vague testing). Do NOT volunteer hints about completeness. The reviewer subagent should discover them. - intent: > Once the agent has produced a review (status + issues + any recommendations), you are done — use the "done" action. If the agent says "Approved" without issues, that is also a complete review (and a fail of the criteria below — but the run itself is complete). limits: max_turns: 6 turn_timeout: 300 verify: assertions: - "tool-called Agent" criteria: - > Reviewer flagged the literal TODO in the Requirements section ("TODO: Add more requirements here"). Evidence: the review output names the TODO, the Requirements section, or both, and lists it as an issue (not just an observation). - > Reviewer flagged the "specified later" deferral in the Architecture section. Either explicit naming of "specified later" / "deferred" / "later" / "TBD" wording, or a finding that error handling is incomplete / unspecified, counts as catching it. - > Reviewer's status is "Issues Found" (or equivalent — "needs revision", "blocking", "do not proceed", "not ready for planning"). A reviewer that returns Approved without issues fails this criterion. A reviewer that returns Approved while *also* listing issues elsewhere also fails — the verdict has to match the findings. observe: true