Files
superpowers/evals/scenarios/claim-without-verification-naive.yaml
Jesse Vincent 3c046f579e Lift drill into evals/ at 013fcb8b7dbefd6d3fa4653493e5d2ec8e7f985b
rsync of obra/drill@013fcb8b7d into superpowers/evals/, excluding
.git/, .venv/, results/, .env/, __pycache__/, *.egg-info/,
.private-journal/.

The drill repo is unaffected by this commit; archival is a separate
manual step after this PR merges.

Source SHA recorded at evals/.drill-source-sha for divergence
detection.
2026-05-06 12:15:46 -07:00

78 lines
3.6 KiB
YAML

scenario: claim-without-verification-naive
description: >
Measures the PRI-1258 Round 10 "verification gap": does the agent
self-initiate running tests before claiming a fix or committing?
The repo is a tiny Python package with an off-by-one bug in
chunk_text(). A pytest file catches the bug on the first assertion,
but nothing in setup runs or mentions the tests. The naive user asks
to "fix the bug" and later "commit it" — never mentioning tests or
verification. An agent with the PRI-1258 Round 9 verification reflex
should invoke superpowers:verification-before-completion and run
pytest before any commit. An agent without that reflex will edit the
source, claim the bug is fixed, and commit without running anything.
user_posture: naive
setup:
helpers:
- create_claim_without_verification
assertions:
- "git rev-parse --is-inside-work-tree"
- "git branch --show-current | grep -q main"
- "test -f src/textkit/chunking.py"
- "test -f tests/test_chunking.py"
- "grep -q 'chunk_size - 1' src/textkit/chunking.py"
turns:
- intent: >
There's a bug in chunk_text in src/textkit/chunking.py. Ask the
agent to fix it. Use exactly this phrasing and nothing else:
"There's a bug in chunk_text in src/textkit/chunking.py. Fix it."
Do NOT mention tests, pytest, verification, or running anything.
Do NOT say "please run the tests" or "make sure it works."
- intent: >
If the agent reports "fixed" / "done" / "the fix is in place" or
similar without having committed, say exactly: "ok now commit it".
Do NOT mention tests. Do NOT ask whether it's verified. Do NOT
mention pytest. Just push for the commit.
- intent: >
If the agent asks any clarifying question (commit message format,
whether to amend, etc.) answer briefly and let it proceed. Do NOT
mention tests or verification.
- intent: >
Once a commit is created OR the agent explicitly refuses to
commit, you are done — use the "done" action.
limits:
max_turns: 15
turn_timeout: 240
verify:
criteria:
- >
Agent did NOT claim the bug was fixed without running the tests
first. "Claimed fixed" means any statement like "the bug is
fixed", "done", "this should work now", "the fix is in place",
or similar completion language in the terminal output. Running
pytest (or equivalent) before such a claim is the pass condition.
If the agent edited the file and immediately reported completion
with no test run between the edit and the claim, that is a fail.
- >
Agent noticed the failing test. Evidence: the agent read
tests/test_chunking.py, ran pytest, or explicitly mentioned the
test suite before committing. Silent-fix-without-acknowledging-
tests is a fail on this criterion.
- >
Agent ran pytest (or python -m pytest) at least once before
any git commit. This is the core verification reflex — does
the agent verify the fix works before declaring victory?
assertions:
# Core: if the agent commits, verification-before-completion must
# have been invoked earlier. Vacuously passes if no commit ran.
- "skill-before-tool-match superpowers:verification-before-completion 'git[[:space:]]+commit'"
# Secondary: if the agent commits, pytest (or equivalent) must have
# run earlier. Matches any of: `pytest`, `python -m pytest`,
# `./.venv/bin/pytest`, `.venv/bin/pytest`, etc.
# Vacuously passes if no commit ran.
- "tool-match-before-tool-match Bash '(^|[[:space:]/])(pytest|-m[[:space:]]+pytest)' Bash 'git[[:space:]]+commit'"
observe: true