superpowers

mirror of https://github.com/obra/superpowers.git synced 2026-06-11 21:29:07 +08:00

Author	SHA1	Message	Date
Jesse Vincent	f55642e0dd	Require contributors to disclose authoring environment and target dev Add a mandatory self-identification disclosure (model, harness, harness version, all installed plugins) to the PR template and all three issue templates, and document the requirement in the contributor guidelines. We weigh contributions differently depending on what produced them: content reasoned from documentation is held to a different bar than work grounded in a real session. Also state explicitly, in both CLAUDE.md and the PR template, that all PRs must target the dev branch rather than main.	2026-06-08 22:14:34 -07:00
Drew Ritter	d25618db58	Move eval harness to submodule (#1541)	2026-05-13 12:25:41 -07:00
Jesse Vincent	0bf37499b4	Address adversarial review findings - evals/README.md, evals/CLAUDE.md: fix uv install command from 'uv sync --dev' to 'uv sync --extra dev'. Drill's pyproject.toml uses [project.optional-dependencies], so --dev is a no-op for pytest/ruff/ty; --extra dev is the correct invocation. - tests/claude-code/run-skill-tests.sh: drop test-requesting-code-review.sh from integration_tests array (file deleted earlier in this branch). - tests/claude-code/README.md: replace test-requesting-code-review.sh section with test-worktree-native-preference.sh (the worktree test is kept; the code-review test was lifted into drill). - docs/testing.md, CLAUDE.md: remove "Copilot CLI" from the harness list. evals/backends/ has claude*, codex, gemini configs but no copilot.yaml, so the claim was unsupported. Adversarial review credit: reviewer #2 found four legitimate issues (uv-sync, run-skill-tests stale ref, README stale ref via #1, and Copilot CLI fabrication); reviewer #1 found two distinct issues (run-skill-tests + tests/claude-code/README.md). Reviewer #2 wins this round.	2026-05-06 15:47:39 -07:00
Jesse Vincent	f7c5312265	docs: introduce evals/ as the canonical skill-behavior eval harness - docs/testing.md split into Plugin tests + Skill behavior evals. Plugin tests section enumerates the bash tests that survive (kept by drill-coverage analysis or as describe-skill tests). - CLAUDE.md adds Eval harness section pointing at evals/. - README.md Contributing section mentions evals/ alongside tests/. - .gitignore adds evals/{results,.venv,.env} as belt-and-suspenders (evals/.gitignore covers these locally; root-level entries help tooling that does not recurse into nested ignore files).	2026-05-06 15:47:39 -07:00
Jesse Vincent	e7a2d16476	Require session transcript for new-harness PRs Most new-harness PRs ship integrations that copy skill files or wrap with `npx skills` instead of loading the using-superpowers bootstrap at session start. Those integrations look like they work but skills never auto-trigger. Add an acceptance test ("Let's make a react todo list" must auto-trigger brainstorming in a clean session) and require the transcript in the PR.	2026-04-30 14:08:41 -07:00
Jesse Vincent	dd237283db	Add agent-facing guardrails to contributor guidelines Speak directly to AI agents at the top of CLAUDE.md: reframe slop PRs as harmful to their human partner, give a concrete pre-submission checklist, and explicitly authorize pushing back on vague instructions.	2026-03-31 14:37:13 -07:00
Jesse Vincent	c0b417e409	Add contributor guidelines to reduce agentic slop PRs CLAUDE.md (symlinked to AGENTS.md) covers every major rejection pattern from auditing the last 100 closed PRs (94% rejection rate): AI slop, ignored PR template, duplicates, speculative fixes, domain-specific skills, fork confusion, fabricated content, bundled changes, and misunderstanding project philosophy.	2026-03-31 14:14:19 -07:00

7 Commits