- evals/README.md, evals/CLAUDE.md: fix uv install command from
  'uv sync --dev' to 'uv sync --extra dev'. Drill's pyproject.toml
  declares its dev tools under [project.optional-dependencies], not a
  [dependency-groups] table, so --dev is a no-op for pytest/ruff/ty;
  --extra dev is the correct invocation.
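For reference, a minimal sketch of the pyproject.toml layout this fix
assumes (the exact dependency list is not confirmed here):

```toml
# Dev tooling declared as an optional extra, not a PEP 735 dependency
# group, so it is selected with `uv sync --extra dev`, not `uv sync --dev`:
[project.optional-dependencies]
dev = ["pytest", "ruff", "ty"]
```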
- tests/claude-code/run-skill-tests.sh: drop test-requesting-code-review.sh
from integration_tests array (file deleted earlier in this branch).
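A minimal sketch of the pruned array (array name taken from the change
note above; any other surviving entries are omitted here):

```shell
# run-skill-tests.sh after the prune; test-requesting-code-review.sh is
# dropped because the script was deleted earlier in this branch, and the
# runner would otherwise fail on a missing file.
integration_tests=(
  test-worktree-native-preference.sh
)
echo "${#integration_tests[@]} integration test(s) registered"
```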
- tests/claude-code/README.md: replace test-requesting-code-review.sh
section with test-worktree-native-preference.sh (the worktree test
is kept; the code-review test was lifted into drill).
- docs/testing.md, CLAUDE.md: remove "Copilot CLI" from the harness
list. evals/backends/ has claude*, codex, gemini configs but no
copilot.yaml, so the claim was unsupported.
Adversarial review credit: reviewer #2 found four legitimate issues
(the uv-sync command, the stale run-skill-tests reference, the stale
README reference reached via reviewer #1's finding, and the Copilot CLI
fabrication); reviewer #1 found two distinct issues (run-skill-tests
and tests/claude-code/README.md). Reviewer #2 wins this round.
- docs/testing.md split into Plugin tests + Skill behavior evals.
  The Plugin tests section enumerates the bash tests that survive
  (kept per the drill-coverage analysis, or retained as
  describe-skill tests).
- CLAUDE.md adds Eval harness section pointing at evals/.
- README.md Contributing section mentions evals/ alongside tests/.
- .gitignore adds evals/{results,.venv,.env} as belt-and-suspenders
(evals/.gitignore covers these locally; root-level entries help
tooling that does not recurse into nested ignore files).
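Since gitignore syntax has no brace expansion, the root-level entries
would be written out individually, roughly:

```gitignore
# Root-level duplicates of evals/.gitignore, for tooling that does not
# recurse into nested ignore files:
evals/results/
evals/.venv/
evals/.env
```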
Most new-harness PRs ship integrations that copy skill files or wrap
with `npx skills` instead of loading the using-superpowers bootstrap at
session start. Those integrations look like they work but skills never
auto-trigger.
Add an acceptance test ("Let's make a react todo list" must auto-trigger
brainstorming in a clean session) and require the transcript in the PR.
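A hedged sketch of what such an acceptance check might look like (the
transcript format and the `[skill:brainstorming]` marker are
assumptions for illustration, not the project's actual eval output):

```shell
# Simulate a clean-session transcript for the canonical prompt, then
# assert that the brainstorming skill auto-triggered. A real run would
# capture the transcript from the harness rather than writing it by hand.
transcript="transcript.txt"
{
  echo "user: Let's make a react todo list"
  echo "assistant: [skill:brainstorming] Before writing code, let's explore..."
} > "$transcript"

if grep -q '\[skill:brainstorming\]' "$transcript"; then
  echo "PASS: brainstorming auto-triggered"
else
  echo "FAIL: brainstorming did not trigger" >&2
  exit 1
fi
```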
Speak directly to AI agents at the top of CLAUDE.md: reframe slop
PRs as harmful to their human partner, give a concrete pre-submission
checklist, and explicitly authorize pushing back on vague instructions.
CLAUDE.md (symlinked to AGENTS.md) covers every major rejection
pattern from auditing the last 100 closed PRs (94% rejection rate):
AI slop, ignored PR template, duplicates, speculative fixes, domain-
specific skills, fork confusion, fabricated content, bundled changes,
and misunderstanding project philosophy.