Add integration test for subagent-driven-development

Created full end-to-end integration test that executes a real plan and verifies the new workflow improvements actually work.

New test: test-subagent-driven-development-integration.sh
- Creates real Node.js test project
- Generates implementation plan (2 tasks)
- Executes using subagent-driven-development skill
- Verifies 8 key behaviors:
  1. Plan read once at beginning (not per task)
  2. Full task text provided to subagents (not file reading)
  3. Subagents perform self-review
  4. Spec compliance review before code quality
  5. Spec reviewer reads code independently
  6. Working implementation produced
  7. Tests pass
  8. No extra features added (spec compliance)

Integration tests are opt-in (--integration flag) due to 10-30 min runtime.

Updated run-skill-tests.sh:
- Added --integration flag
- Separates fast tests from integration tests
- Shows note when integration tests skipped

Updated README with integration test documentation.

Run with:
  ./run-skill-tests.sh                # Fast tests only
  ./run-skill-tests.sh --integration  # Include integration tests
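
The changes to run-skill-tests.sh are not shown in this view, so the following is only a minimal sketch of how an opt-in `--integration` flag could separate fast tests from integration tests. The function name `select_tests` and the two array names are hypothetical; only the flag and the two test file names come from the commit message.

```bash
#!/usr/bin/env bash
# Hypothetical sketch, not the actual contents of run-skill-tests.sh.
# Assumption: tests are split into two lists, and only the fast list runs by default.

fast_tests=("test-subagent-driven-development.sh")
integration_tests=("test-subagent-driven-development-integration.sh")

# select_tests prints the test files to run, one per line, for the given CLI arguments.
select_tests() {
  local run_integration=false arg
  for arg in "$@"; do
    case "$arg" in
      --integration) run_integration=true ;;
    esac
  done

  printf '%s\n' "${fast_tests[@]}"
  if [ "$run_integration" = true ]; then
    printf '%s\n' "${integration_tests[@]}"
  else
    # The note goes to stderr so it does not mix with the test list on stdout.
    echo "Note: integration tests skipped (use --integration to include them)" >&2
  fi
}
```

Under these assumptions, `select_tests` with no arguments lists only the fast test and prints the skip note, while `select_tests --integration` lists both tests.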
2026-06-12 13:49:05 +08:00 · 2025-11-28 15:06:10 -08:00
parent 51a171cd14
commit fa946ae465
3 changed files with 410 additions and 17 deletions
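
One of the verified behaviors is an ordering constraint: spec compliance review must happen before code quality review. The integration test script itself is not shown here, so the sketch below only illustrates one way such an ordering check could be written against a plain-text session log. The helper names `first_match_line` and `appears_before`, and the assumption that the log has one event per line, are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: assumes the session is captured as a text log, one event per line.

# first_match_line PATTERN FILE
# Prints the line number of the first line matching PATTERN, or nothing if there is no match.
first_match_line() {
  grep -n -e "$1" "$2" | head -n 1 | cut -d: -f1
}

# appears_before PATTERN_A PATTERN_B FILE
# Succeeds only if both patterns occur and the first match of A precedes the first match of B.
appears_before() {
  local first second
  first="$(first_match_line "$1" "$3")"
  second="$(first_match_line "$2" "$3")"
  [ -n "$first" ] && [ -n "$second" ] && [ "$first" -lt "$second" ]
}
```

A test could then call something like `appears_before "spec compliance" "code quality" "$log_file"` and fail when the command returns non-zero, where `$log_file` stands for wherever the session output was saved.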
--- a/tests/claude-code/README.md
+++ b/tests/claude-code/README.md
@@ -13,11 +13,16 @@ This test suite verifies that skills are loaded correctly and Claude follows the

 ## Running Tests

-### Run all tests:
+### Run all fast tests (recommended):
 ```bash
 ./run-skill-tests.sh
 ```

+### Run integration tests (slow, 10-30 minutes):
+```bash
+./run-skill-tests.sh --integration
+```
+
 ### Run specific test:
 ```bash
 ./run-skill-tests.sh --test test-subagent-driven-development.sh
@@ -28,9 +33,9 @@ This test suite verifies that skills are loaded correctly and Claude follows the
 ./run-skill-tests.sh --verbose
 ```

-### Set custom timeout (default 300s):
+### Set custom timeout:
 ```bash
-./run-skill-tests.sh --timeout 600
+./run-skill-tests.sh --timeout 1800  # 30 minutes for integration tests
 ```

 ## Test Structure
@@ -75,15 +80,40 @@ echo "=== All tests passed ==="

 ## Current Tests

-### test-subagent-driven-development.sh
-Tests the updated subagent-driven-development workflow:
- Skill loading
+### Fast Tests (run by default)
+
+#### test-subagent-driven-development.sh
+Tests skill content and requirements (~2 minutes):
+- Skill loading and accessibility
 - Workflow ordering (spec compliance before code quality)
- Self-review requirements
- Plan reading efficiency (read once)
- Spec compliance reviewer skepticism
- Review loops
- Task context provision (full text, not file reading)
+- Self-review requirements documented
+- Plan reading efficiency documented
+- Spec compliance reviewer skepticism documented
+- Review loops documented
+- Task context provision documented
+
+### Integration Tests (use --integration flag)
+
+#### test-subagent-driven-development-integration.sh
+Full workflow execution test (~10-30 minutes):
+- Creates real test project with Node.js setup
+- Creates implementation plan with 2 tasks
+- Executes plan using subagent-driven-development
+- Verifies actual behaviors:
+  - Plan read once at start (not per task)
+  - Full task text provided in subagent prompts
+  - Subagents perform self-review before reporting
+  - Spec compliance review happens before code quality
+  - Spec reviewer reads code independently
+  - Working implementation is produced
+  - Tests pass
+  - Proper git commits created
+
+**What it tests:**
+- The workflow actually works end-to-end
+- Our improvements are actually applied
+- Subagents follow the skill correctly
+- Final code is functional and tested

 ## Adding New Tests