docs: introduce evals/ as the canonical skill-behavior eval harness

- docs/testing.md split into Plugin tests + Skill behavior evals.
  Plugin tests section enumerates the bash tests that survive
  (kept by drill-coverage analysis or as describe-skill tests).
- CLAUDE.md adds Eval harness section pointing at evals/.
- README.md Contributing section mentions evals/ alongside tests/.
- .gitignore adds evals/{results,.venv,.env} as belt-and-suspenders
  (evals/.gitignore covers these locally; root-level entries help
  tooling that does not recurse into nested ignore files).
This commit is contained in:
Jesse Vincent
2026-05-06 12:33:10 -07:00
parent b43d14f87f
commit d545612825
4 changed files with 34 additions and 291 deletions

View File

@@ -214,6 +214,8 @@ The general contribution process for Superpowers is below. Keep in mind that we
4. Follow the `writing-skills` skill for creating and testing new and modified skills
5. Submit a PR, being sure to fill in the pull request template.
Skill-behavior tests use the eval harness at `evals/`. See `evals/README.md` for setup. Plugin-infrastructure tests live at `tests/` and run via the relevant `run-*.sh` or `npm test`.
See `skills/writing-skills/SKILL.md` for the complete guide.
## Updating