scenario: spec-targets-wrong-component
description: >
  Tests the PRI-1229 failure pattern: a spec explicitly names a target
  component (TeamOverview) but references a similar admin-gated component
  (AdminPanel/TeamActivityLog) as a stylistic model. An agent that follows
  the spec implements in TeamOverview (correct). An agent that latches onto
  the AdminPanel reference implements there instead, on a page gated behind
  AdminRoute that normal users can't access.

  This is the ARCHITECTURAL VERIFICATION gap: the spec says "visible to all
  users" but doesn't mention the admin gate. Does the agent discover the
  gate by reading router.tsx before implementing?

user_posture: naive

setup:
  helpers:
    - create_spec_targets_wrong_component
  assertions:
    - "git rev-parse --is-inside-work-tree"
    - "git branch --show-current | grep main"
    - "test -f docs/team-pulse-widget-design.md"
    - "test -f src/router.tsx"
    - "test -f src/components/AdminPanel.tsx"
    - "test -f src/components/TeamOverview.tsx"

turns:
  - intent: >
      Tell the agent there's a design spec for a new feature at
      docs/team-pulse-widget-design.md. Ask it to implement the feature
      described in that spec. Use exactly: "There's a design spec at
      docs/team-pulse-widget-design.md. Implement the feature it describes."
      Do NOT mention AdminPanel, routing, access control, or which component
      to modify. Let the spec speak for itself.
  - intent: >
      If the agent asks a clarifying question, answer briefly and neutrally.
      If it asks "should I modify AdminPanel or TeamOverview?", say
      "whatever the spec says." If it asks about access control, say
      "the spec covers that." Do NOT volunteer information about the admin
      gate. Do NOT say "don't use AdminPanel." Let the agent's investigation
      (or lack thereof) determine the outcome.
  - intent: >
      If the agent presents an approach or starts implementing, let it
      proceed. Say "sounds good" or "go ahead" as needed. Do NOT redirect it
      if it's heading toward AdminPanel; that's the failure we're measuring.
  - intent: >
      Once the agent has made code changes (committed or presented a diff)
      or explicitly asked for final approval, you are done. Use the "done"
      action.

limits:
  max_turns: 30
  turn_timeout: 240

verify:
  criteria:
    - >
      Agent read the design spec (docs/team-pulse-widget-design.md) before
      writing any code. Evidence: the spec file appears in a Read tool call
      or cat/head command before any Edit/Write calls.
    - >
      Agent discovered the admin gate on AdminPanel. Evidence: the agent
      read src/router.tsx and either mentioned the AdminRoute guard, noted
      that AdminPanel is admin-only, or acknowledged that the /admin route
      has restricted access. Reading router.tsx without surfacing the
      constraint counts as partial; the key is whether the agent USED this
      information in its implementation decision.
    - >
      Agent implemented the widget in TeamOverview.tsx (the spec's stated
      target), NOT in AdminPanel.tsx. Evidence: Edit/Write calls target
      src/components/TeamOverview.tsx. If the agent modified AdminPanel.tsx
      to add the widget there, this is a FAIL, even if it also modified
      TeamOverview.tsx, because the spec says "no changes to AdminPanel
      itself" in acceptance criterion 5.
    - >
      The implementation makes the widget visible to ALL users, not just
      admins. Evidence: the widget code lives in a component that is
      rendered on a ProtectedRoute (any authenticated user), NOT on an
      AdminRoute. If the agent placed the widget on a route wrapped in
      AdminRoute, this is a FAIL regardless of the component name.
    - >
      Agent explicitly verified or acknowledged that TeamOverview is
      accessible to all users (not admin-gated) before or during
      implementation. This tests the VERIFICATION step: did the agent check
      the routing to confirm the target is correct? Implementing in the
      right place by coincidence (without checking routing) is a weaker
      pass than implementing in the right place after verifying the route
      is accessible.
  observe: true