mirror of
https://github.com/obra/superpowers.git
synced 2026-04-21 00:49:06 +08:00
Add Claude Code skills test framework
Created automated test suite for testing superpowers skills using Claude Code CLI in headless mode. New files: - tests/claude-code/run-skill-tests.sh - Main test runner - tests/claude-code/test-helpers.sh - Helper functions for testing - tests/claude-code/test-subagent-driven-development.sh - First test - tests/claude-code/README.md - Documentation Test framework features: - Run Claude Code with prompts and capture output - Assertion helpers (contains, not_contains, count, order) - Test project creation helpers - Timeout support (default 5 minutes) - Verbose mode for debugging - Specific test selection First test verifies subagent-driven-development skill: - Skill loading - Workflow ordering (spec compliance before code quality) - Self-review requirements - Plan reading efficiency (read once) - Spec compliance reviewer skepticism - Review loops - Task context provision Run with: cd tests/claude-code && ./run-skill-tests.sh
This commit is contained in:
128
tests/claude-code/README.md
Normal file
128
tests/claude-code/README.md
Normal file
@@ -0,0 +1,128 @@
|
||||
# Claude Code Skills Tests
|
||||
|
||||
Automated tests for superpowers skills using Claude Code CLI.
|
||||
|
||||
## Overview
|
||||
|
||||
This test suite verifies that skills are loaded correctly and Claude follows them as expected. Tests invoke Claude Code in headless mode (`claude -p`) and verify the behavior.
|
||||
|
||||
## Requirements
|
||||
|
||||
- Claude Code CLI installed and in PATH (`claude --version` should work)
|
||||
- Local superpowers plugin installed (see main README for installation)
|
||||
|
||||
## Running Tests
|
||||
|
||||
### Run all tests:
|
||||
```bash
|
||||
./run-skill-tests.sh
|
||||
```
|
||||
|
||||
### Run specific test:
|
||||
```bash
|
||||
./run-skill-tests.sh --test test-subagent-driven-development.sh
|
||||
```
|
||||
|
||||
### Run with verbose output:
|
||||
```bash
|
||||
./run-skill-tests.sh --verbose
|
||||
```
|
||||
|
||||
### Set custom timeout (default 300s):
|
||||
```bash
|
||||
./run-skill-tests.sh --timeout 600
|
||||
```
|
||||
|
||||
## Test Structure
|
||||
|
||||
### test-helpers.sh
|
||||
Common functions for skills testing:
|
||||
- `run_claude "prompt" [timeout]` - Run Claude with prompt
|
||||
- `assert_contains output pattern name` - Verify pattern exists
|
||||
- `assert_not_contains output pattern name` - Verify pattern absent
|
||||
- `assert_count output pattern count name` - Verify exact count
|
||||
- `assert_order output pattern_a pattern_b name` - Verify order
|
||||
- `create_test_project` - Create temp test directory
|
||||
- `create_test_plan project_dir` - Create sample plan file
|
||||
|
||||
### Test Files
|
||||
|
||||
Each test file:
|
||||
1. Sources `test-helpers.sh`
|
||||
2. Runs Claude Code with specific prompts
|
||||
3. Verifies expected behavior using assertions
|
||||
4. Returns 0 on success, non-zero on failure
|
||||
|
||||
## Example Test
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
||||
source "$SCRIPT_DIR/test-helpers.sh"
|
||||
|
||||
echo "=== Test: My Skill ==="
|
||||
|
||||
# Ask Claude about the skill
|
||||
output=$(run_claude "What does the my-skill skill do?" 30)
|
||||
|
||||
# Verify response
|
||||
assert_contains "$output" "expected behavior" "Skill describes behavior"
|
||||
|
||||
echo "=== All tests passed ==="
|
||||
```
|
||||
|
||||
## Current Tests
|
||||
|
||||
### test-subagent-driven-development.sh
|
||||
Tests the updated subagent-driven-development workflow:
|
||||
- Skill loading
|
||||
- Workflow ordering (spec compliance before code quality)
|
||||
- Self-review requirements
|
||||
- Plan reading efficiency (read once)
|
||||
- Spec compliance reviewer skepticism
|
||||
- Review loops
|
||||
- Task context provision (full text, not file reading)
|
||||
|
||||
## Adding New Tests
|
||||
|
||||
1. Create new test file: `test-<skill-name>.sh`
|
||||
2. Source test-helpers.sh
|
||||
3. Write tests using `run_claude` and assertions
|
||||
4. Add to test list in `run-skill-tests.sh`
|
||||
5. Make executable: `chmod +x test-<skill-name>.sh`
|
||||
|
||||
## Timeout Considerations
|
||||
|
||||
- Default timeout: 5 minutes per test
|
||||
- Claude Code may take time to respond
|
||||
- Adjust with `--timeout` if needed
|
||||
- Tests should be focused to avoid long runs
|
||||
|
||||
## Debugging Failed Tests
|
||||
|
||||
With `--verbose`, you'll see full Claude output:
|
||||
```bash
|
||||
./run-skill-tests.sh --verbose --test test-subagent-driven-development.sh
|
||||
```
|
||||
|
||||
Without verbose, only failures show output.
|
||||
|
||||
## CI/CD Integration
|
||||
|
||||
To run in CI:
|
||||
```bash
|
||||
# Run with explicit timeout for CI environments
|
||||
./run-skill-tests.sh --timeout 900
|
||||
|
||||
# Exit code 0 = success, non-zero = failure
|
||||
```
|
||||
|
||||
## Notes
|
||||
|
||||
- Tests verify skill *instructions*, not full execution
|
||||
- Full workflow tests would be very slow
|
||||
- Focus on verifying key skill requirements
|
||||
- Tests should be deterministic
|
||||
- Avoid testing implementation details
|
||||
Reference in New Issue
Block a user