mirror of
https://github.com/obra/superpowers.git
synced 2026-05-12 20:19:06 +08:00
Lift drill into evals/ at 013fcb8b7dbefd6d3fa4653493e5d2ec8e7f985b
rsync of obra/drill@013fcb8b7d into superpowers/evals/, excluding .git/, .venv/, results/, .env/, __pycache__/, *.egg-info/, .private-journal/. The drill repo is unaffected by this commit; archival is a separate manual step after this PR merges. Source SHA recorded at evals/.drill-source-sha for divergence detection.
37
evals/scenarios/decomposing-into-waves-naive.yaml
Normal file
@@ -0,0 +1,37 @@
scenario: decomposing-into-waves-naive
description: "Agent decomposes a multi-task plan into waves without user prompting"
user_posture: naive

setup:
  helpers:
    - create_wave_test_repo
  assertions:
    - "git rev-parse --is-inside-work-tree"
    - "test -f docs/superpowers/plans/test-plan.md"
    - "git branch --show-current | grep feature/test-implementation"

turns:
  - intent: >
      Tell the agent you have a plan ready at
      docs/superpowers/plans/test-plan.md and you want to start
      implementing. Do NOT mention waves, parallelism, or
      decomposition. Let the agent decide how to execute.
  - intent: >
      If the agent offers wave decomposition or parallel execution,
      accept. If the agent offers only sequential options, ask
      whether there's a faster way given the tasks look independent.
  - intent: >
      If a .waves.md file is produced, review it briefly and confirm
      it looks reasonable. If not, ask what happened.

limits:
  max_turns: 15
  turn_timeout: 120

verify:
  criteria:
    - "Agent offered wave decomposition or parallel execution as an option"
    - "A .waves.md file exists in the filesystem"
    - "The .waves.md file content includes a Waves Overview table"
    - "The .waves.md file content includes file ownership blocks for parallel waves"
  observe: true
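The setup assertions in this scenario are plain shell commands, so they can be checked by hand before a run. The sketch below is not the drill harness's own code. It assumes each assertion is run through a shell in the test repo's root and that a zero exit status counts as a pass. The names `run_assertions` and `SETUP_ASSERTIONS` are hypothetical and exist only for this illustration.

```python
import subprocess

# Copied from the scenario's setup assertions.
SETUP_ASSERTIONS = [
    "git rev-parse --is-inside-work-tree",
    "test -f docs/superpowers/plans/test-plan.md",
    "git branch --show-current | grep feature/test-implementation",
]


def run_assertions(commands, cwd):
    """Run each command in a shell; return (command, passed) pairs.

    Assumption: exit status 0 means the assertion holds.
    """
    results = []
    for command in commands:
        # shell=True is needed because the third assertion is a pipeline.
        completed = subprocess.run(
            command,
            shell=True,
            cwd=cwd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results.append((command, completed.returncode == 0))
    return results
```

Calling `run_assertions(SETUP_ASSERTIONS, cwd=...)` with the path of the repo created by `create_wave_test_repo` would report which preconditions hold. Because the third assertion pipes into `grep`, it matches any branch name that contains `feature/test-implementation` as a substring, not only an exact match.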