Refactor cockpit to use DockerTmuxController pattern
Based on claude-code-tools TmuxCLIController, this refactor:
- Added DockerTmuxController class for robust tmux session management
- Implements send_keys() with configurable delay_enter
- Implements capture_pane() for output retrieval
- Implements wait_for_prompt() for pattern-based completion detection
- Implements wait_for_idle() for content-hash-based idle detection
- Implements wait_for_shell_prompt() for shell prompt detection

Also includes workflow improvements:
- Pre-task git snapshot before agent execution
- Post-task commit protocol in agent guidelines

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
666
AUTONOMOUS-AGENT-TEMPLATES.md
Normal file
@@ -0,0 +1,666 @@
# Autonomous Agent Implementation Templates
## Copy-Paste Ready Code Examples

**Date:** 2026-01-09
**Version:** 1.0
**Status:** Ready for production use

---

## Template 1: Simple Task Agent (Read-Only Analysis)

### Use Case
Analyze code and report metrics without modifying anything.

### Prompt Template

```python
project = "musica"
task = """
Analyze the codebase structure in /workspace:

1. Count TypeScript files: find /workspace -name "*.ts" | wc -l
2. Count lines of code: find /workspace -name "*.ts" -exec wc -l {} + | tail -1
3. Find largest files: find /workspace -name "*.ts" -exec wc -l {} + | grep -v ' total$' | sort -rn | head -5
4. Check dependencies: npm list 2>/dev/null | head -20

Save results to /workspace/code-metrics.json with format:
{
  "ts_files": number,
  "total_loc": number,
  "largest_files": [
    {"file": string, "loc": number},
    ...
  ],
  "dependencies_count": number,
  "analysis_timestamp": ISO8601_string
}

Success: File exists with all required fields.
Failure: File missing or incomplete.
Exit 0 on success, exit 1 on failure.
Do NOT attempt to install or modify anything.
"""

# Spawn agent
job_id = spawn_claude_agent(project, task, context="", config=config)
```

### Expected Output
```json
{
  "ts_files": 42,
  "total_loc": 12847,
  "largest_files": [
    {"file": "src/core/processor.ts", "loc": 843},
    {"file": "src/index.ts", "loc": 521},
    {"file": "src/api/routes.ts", "loc": 472}
  ],
  "dependencies_count": 18,
  "analysis_timestamp": "2026-01-09T15:30:45Z"
}
```
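
The success criterion for this template ("file exists with all required fields") can be checked mechanically once the agent finishes. The following is a minimal sketch, assuming the report has already been parsed into a dict. `REQUIRED_FIELDS` and `validate_metrics` are illustrative names and not part of the orchestrator.

```python
import json

# Field names and expected types taken from the JSON format in the prompt.
REQUIRED_FIELDS = {
    "ts_files": int,
    "total_loc": int,
    "largest_files": list,
    "dependencies_count": int,
    "analysis_timestamp": str,
}


def validate_metrics(report: dict) -> list[str]:
    """Return a list of problems; an empty list means the report is complete."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in report:
            problems.append(f"missing field: {name}")
        elif not isinstance(report[name], expected_type):
            problems.append(f"wrong type for {name}")
    entries = report.get("largest_files")
    if isinstance(entries, list):
        for entry in entries:
            # Each entry must be an object with both "file" and "loc" keys.
            if not isinstance(entry, dict) or not {"file", "loc"} <= entry.keys():
                problems.append("largest_files entries need 'file' and 'loc'")
                break
    return problems


sample = json.loads(
    '{"ts_files": 42, "total_loc": 12847, '
    '"largest_files": [{"file": "src/index.ts", "loc": 521}], '
    '"dependencies_count": 18, '
    '"analysis_timestamp": "2026-01-09T15:30:45Z"}'
)
print(validate_metrics(sample))  # []
```

Returning a list of problems rather than a boolean makes it easier to log why a report was rejected.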

---

## Template 2: Test Execution Agent (Run & Report)

### Use Case
Run the test suite and report results with metrics.

### Prompt Template

```python
project = "musica"
task = """
Run the test suite and generate a comprehensive report.

Steps:
1. npm install (if node_modules missing)
2. npm test -- --json --outputFile=test-results.json (Jest flags; adjust for other test runners)
3. Parse test-results.json
4. Create test-report.json with:
{
  "total_tests": number,
  "passed": number,
  "failed": number,
  "skipped": number,
  "duration_ms": number,
  "success_rate": number (0-100),
  "failed_tests": [
    {
      "name": string,
      "error": string,
      "file": string
    }
  ],
  "timestamp": ISO8601_string
}

Success criteria:
- test-report.json exists
- All required fields present
- success_rate = (passed / (passed + failed)) * 100

Exit codes:
- Exit 0 if success_rate == 100 (all tests passed)
- Exit 1 if success_rate < 100 (some tests failed)
- Exit 2 if tests won't run (no npm, no tests, etc.)

Do NOT:
- Attempt to fix failing tests
- Skip any tests
- Modify test files
"""

job_id = spawn_claude_agent(project, task, context="", config=config)
```

### Expected Output
```json
{
  "total_tests": 48,
  "passed": 46,
  "failed": 2,
  "skipped": 0,
  "duration_ms": 3241,
  "success_rate": 95.83,
  "failed_tests": [
    {
      "name": "should handle edge case for empty array",
      "error": "Expected undefined to equal null",
      "file": "tests/processor.test.ts"
    },
    {
      "name": "should validate user input",
      "error": "Timeout: test exceeded 5000ms",
      "file": "tests/validation.test.ts"
    }
  ],
  "timestamp": "2026-01-09T15:32:18Z"
}
```
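
The `success_rate` formula and the exit-code rules from the prompt can be written out as two small functions. This is a sketch under the assumption that skipped tests are excluded from the rate, as the formula implies. The function names are illustrative.

```python
def success_rate(passed: int, failed: int) -> float:
    """Percentage of executed tests that passed; skipped tests are excluded."""
    executed = passed + failed
    if executed == 0:
        raise ValueError("no tests were executed")
    return round(passed / executed * 100, 2)


def exit_code_for(passed: int, failed: int) -> int:
    """Map results to the prompt's exit codes: 0 all passed, 1 some failed, 2 nothing ran."""
    if passed + failed == 0:
        return 2
    return 0 if failed == 0 else 1


print(success_rate(46, 2))   # 95.83
print(exit_code_for(46, 2))  # 1
```

With the sample report above (46 passed, 2 failed) this reproduces the `success_rate` of 95.83 and an exit code of 1.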

---

## Template 3: Code Modification Agent (Implement & Verify)

### Use Case
Modify code to meet specifications and verify changes work.

### Prompt Template

```python
project = "musica"
task = """
Add TypeScript strict mode to the codebase.

Requirements:
1. Set "strict": true in /workspace/tsconfig.json
2. Fix all TypeScript errors that result
3. Ensure npm run build succeeds with no errors
4. Verify tests still pass: npm test -> exit 0

For TypeScript errors:
- Add explicit type annotations
- Fix any typing issues
- Do NOT use 'any' type
- Do NOT use '@ts-ignore' comments

Changes allowed:
- tsconfig.json: enable strict mode
- .ts files: add type annotations, fix typing

Changes NOT allowed:
- package.json (don't add packages)
- .test.ts files (don't modify tests)
- database schema
- API contracts

Success criteria:
1. tsconfig.json has "strict": true
2. npm run build exits with 0 (no TypeScript errors)
3. npm test exits with 0 (no test failures)
4. No 'any' types in code

Document changes in /workspace/STRICT_MODE_CHANGES.md:
- Files modified: list them
- Breaking changes: none expected
- Type annotations added: count

Exit 0 on complete success.
Exit 1 if any requirement not met.
Exit 2 if unrecoverable (existing errors, etc.).
"""

job_id = spawn_claude_agent(project, task, context="", config=config)
```

### Expected Output Files

**tsconfig.json:**
```json
{
  "compilerOptions": {
    "strict": true,
    "target": "es2020",
    "module": "commonjs"
  }
}
```

**STRICT_MODE_CHANGES.md:**
```markdown
# TypeScript Strict Mode Migration

## Files Modified
1. src/index.ts
2. src/processor.ts
3. src/api/routes.ts

## Type Annotations Added
- 28 function return types
- 15 parameter types
- 12 interface refinements

## Build Status
✓ TypeScript: No errors
✓ Tests: 48 passed, 0 failed

## Verification
- npm run build: PASS
- npm test: PASS
```
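
The prompt forbids the `any` type and ignore comments, so a post-run check can scan the modified sources for them. The sketch below is a rough heuristic, assuming that "ignore comments" means directives such as `@ts-ignore`. A regex cannot fully parse TypeScript, so treat hits as candidates for review. `FORBIDDEN` and `find_violations` are illustrative names.

```python
import re

# Heuristic patterns only; they may miss cases or flag text inside strings.
FORBIDDEN = {
    "explicit any": re.compile(r":\s*any\b|\bas\s+any\b|<any>"),
    "ts-ignore comment": re.compile(r"@ts-ignore|@ts-nocheck"),
}


def find_violations(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) pairs for each forbidden pattern found."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in FORBIDDEN.items():
            if pattern.search(line):
                hits.append((number, rule))
    return hits


example = "const a: any = 1;\n// @ts-ignore\nconst b: number = 2;\n"
print(find_violations(example))  # [(1, 'explicit any'), (2, 'ts-ignore comment')]
```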

---

## Template 4: Multi-Step Workflow Agent (Orchestrate Complex Process)

### Use Case
Execute multiple dependent steps in sequence with decision logic.

### Prompt Template

```python
project = "musica"
task = """
Complete the release preparation workflow.

Phase 1: Build Verification
  Command: npm run build
  Check: Exit code must be 0
  If fails: STOP, exit 1

Phase 2: Test Verification
  Command: npm test
  Check: Exit code must be 0, > 95% success rate
  If fails: STOP, exit 1

Phase 3: Security Check
  Command: npm audit
  Check: No high/critical vulnerabilities
  If fails: Create security-issues.json with details, exit 1

Phase 4: Version Bump
  Check current version: grep version package.json
  Increment patch: 1.2.3 -> 1.2.4
  Update: package.json
  Update: src/version.ts to export new version

Phase 5: Generate Changelog
  Create RELEASE_NOTES.md
  Include:
  - Version number
  - Changes made (list modified files)
  - Test results summary
  - Timestamp

Phase 6: Create Release Package
  Create release.json:
  {
    "version": string,
    "build_status": "passed",
    "tests": {
      "total": number,
      "passed": number,
      "failed": number
    },
    "security": "passed",
    "ready_to_release": true,
    "timestamp": string,
    "artifacts": [
      "package.json",
      "src/version.ts",
      "RELEASE_NOTES.md"
    ]
  }

Decision Logic:
  IF all phases successful:
    ready_to_release = true
    Exit 0
  ELSE:
    ready_to_release = false
    Exit 1

Do NOT:
- Actually publish or deploy
- Push to git
- Upload to npm
- Modify files outside /workspace
"""

job_id = spawn_claude_agent(project, task, context="", config=config)
```

### Expected Output

**release.json:**
```json
{
  "version": "1.2.4",
  "build_status": "passed",
  "tests": {
    "total": 48,
    "passed": 48,
    "failed": 0
  },
  "security": "passed",
  "ready_to_release": true,
  "timestamp": "2026-01-09T15:45:30Z",
  "artifacts": [
    "package.json (version bumped)",
    "src/version.ts (updated)",
    "RELEASE_NOTES.md (generated)"
  ]
}
```
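
Phase 4 asks for a patch increment (1.2.3 -> 1.2.4). A minimal sketch of that step follows, assuming plain `MAJOR.MINOR.PATCH` versions with no pre-release or build suffix. `bump_patch` is an illustrative helper, not something the orchestrator provides.

```python
import re


def bump_patch(version: str) -> str:
    """Increment the patch component of a MAJOR.MINOR.PATCH version string."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not a plain MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"{major}.{minor}.{patch + 1}"


print(bump_patch("1.2.3"))  # 1.2.4
```

Rejecting anything that is not three dot-separated integers keeps the agent from silently rewriting a version string it does not understand.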

---

## Template 5: Diagnostic Agent (Troubleshooting & Reporting)

### Use Case
Diagnose system/application issues without making changes.

### Prompt Template

```python
project = "musica"
task = """
Diagnose issues with the application startup.

Investigation Steps:

1. Check Prerequisites
   - Node version: node --version
   - npm version: npm --version
   - .env file exists: ls -la .env
   - node_modules exists: ls node_modules | wc -l

2. Dependency Check
   - npm list (capture top-level deps)
   - npm ls --depth=0
   - Look for ERR! messages

3. Configuration Check
   - tsconfig.json valid: npx tsc --noEmit
   - package.json valid: npm ls (no errors)
   - .env configured: grep -c = .env

4. Build Check
   - npm run build
   - Capture any warnings/errors

5. Runtime Check
   - timeout 5s npm start (let it try for 5 seconds)
   - Capture any startup errors
   - Capture any warnings

6. Port Check
   - netstat -tlnp | grep 3000 (or configured port)
   - Check if something already listening

Diagnostics Report: Create diagnostics.json
{
  "timestamp": ISO8601_string,
  "environment": {
    "node_version": string,
    "npm_version": string,
    "cwd": string
  },
  "checks": {
    "prerequisites": {
      "passed": boolean,
      "details": string
    },
    "dependencies": {
      "passed": boolean,
      "issues": [string],
      "total_packages": number
    },
    "configuration": {
      "passed": boolean,
      "issues": [string]
    },
    "build": {
      "passed": boolean,
      "errors": [string],
      "warnings": [string]
    },
    "startup": {
      "passed": boolean,
      "errors": [string],
      "port": number
    }
  },
  "summary": {
    "all_passed": boolean,
    "blockers": [string],
    "warnings": [string]
  },
  "recommendations": [
    string
  ]
}

Do NOT:
- Attempt to fix issues
- Install missing packages
- Modify configuration
- Change environment variables
"""

job_id = spawn_claude_agent(project, task, context="", config=config)
```

### Expected Output

**diagnostics.json:**
```json
{
  "timestamp": "2026-01-09T15:50:12Z",
  "environment": {
    "node_version": "v18.16.0",
    "npm_version": "9.6.7",
    "cwd": "/workspace"
  },
  "checks": {
    "prerequisites": {
      "passed": true,
      "details": "All required tools present"
    },
    "dependencies": {
      "passed": false,
      "issues": [
        "express: vulnerable version (9.1.0)",
        "lodash: could be updated to 4.17.21"
      ],
      "total_packages": 42
    },
    "configuration": {
      "passed": true,
      "issues": []
    },
    "build": {
      "passed": false,
      "errors": [
        "src/processor.ts:42: Type error: Property 'config' does not exist"
      ],
      "warnings": []
    },
    "startup": {
      "passed": false,
      "errors": [
        "Build failed, cannot start"
      ],
      "port": 3000
    }
  },
  "summary": {
    "all_passed": false,
    "blockers": [
      "TypeScript compilation error in src/processor.ts",
      "Security vulnerability in express package"
    ],
    "warnings": [
      "lodash could be updated"
    ]
  },
  "recommendations": [
    "Fix TypeScript error in src/processor.ts:42",
    "Update express to 4.18.2 (security patch)",
    "Consider updating lodash to 4.17.21"
  ]
}
```
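
The `summary.all_passed` flag should agree with the individual `checks` entries, and a consumer of the report can verify that. The sketch below derives the flag from the per-check `passed` values. The `failed_checks` key is a hypothetical addition for illustration and is not part of the report schema above.

```python
def summarize(checks: dict) -> dict:
    """Derive an overall pass/fail flag from the per-check results."""
    # A check without a "passed" key is treated as failed.
    failed = [name for name, result in checks.items() if not result.get("passed", False)]
    return {"all_passed": not failed, "failed_checks": failed}


checks = {
    "prerequisites": {"passed": True},
    "dependencies": {"passed": False},
    "configuration": {"passed": True},
    "build": {"passed": False},
    "startup": {"passed": False},
}
print(summarize(checks))
# {'all_passed': False, 'failed_checks': ['dependencies', 'build', 'startup']}
```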

---

## Template 6: Integration Test Agent (Complex Validation)

### Use Case
Validate multiple components work together correctly.

### Prompt Template

```python
project = "musica"
task = """
Validate the API integration between frontend and backend.

Test Scenarios:

1. Database Connectivity
   - psql -U postgres -d mydb -c "SELECT 1"
   - Must succeed with result "1"

2. Backend Startup
   - npm run start &
   - Wait for: "Server running on port 3000"
   - Timeout: 10 seconds
   - If fails: STOP, exit 1

3. Health Check Endpoint
   - curl http://localhost:3000/health
   - Expected response: {"status": "ok"}
   - If fails: STOP, exit 1

4. API Endpoint Tests
   - GET /api/users -> status 200, array response
   - POST /api/users -> status 201, returns created user
   - PUT /api/users/1 -> status 200
   - DELETE /api/users/1 -> status 204

5. Database Transactions
   - Create test record: INSERT INTO test_table...
   - Verify created: SELECT...
   - Delete test record: DELETE...
   - Verify deleted: SELECT...

6. Error Handling
   - GET /api/users/999 -> status 404
   - POST /api/users with invalid data -> status 400
   - Both should return proper error messages

Test Report: Create integration-test-report.json
{
  "timestamp": ISO8601_string,
  "test_suites": {
    "database": {
      "passed": boolean,
      "tests": number,
      "failures": [string]
    },
    "backend": {
      "passed": boolean,
      "startup_time_ms": number,
      "failures": [string]
    },
    "health_check": {
      "passed": boolean,
      "response_time_ms": number,
      "failures": [string]
    },
    "api_endpoints": {
      "passed": boolean,
      "endpoints_tested": number,
      "failures": [string]
    },
    "transactions": {
      "passed": boolean,
      "failures": [string]
    },
    "error_handling": {
      "passed": boolean,
      "failures": [string]
    }
  },
  "summary": {
    "total_tests": number,
    "passed": number,
    "failed": number,
    "success_rate": number,
    "all_passed": boolean
  },
  "performance": {
    "database_latency_ms": number,
    "api_average_latency_ms": number,
    "slowest_endpoint": string
  },
  "recommendations": [string]
}

Exit codes:
- Exit 0 if all_passed = true
- Exit 1 if any test fails
- Exit 2 if unrecoverable (DB unreachable, etc.)

Do NOT:
- Modify database schema
- Change application code
- Deploy changes
"""

job_id = spawn_claude_agent(project, task, context="", config=config)
```
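
The "Backend Startup" scenario waits for a log message with a 10-second timeout. One way to sketch that wait in Python is a polling loop with a monotonic deadline. `wait_for_text` and its `read_output` callable are assumptions made for illustration. In practice `read_output` might return the contents of a log file or captured process output.

```python
import time
from typing import Callable


def wait_for_text(read_output: Callable[[], str], needle: str,
                  timeout_s: float = 10.0, poll_s: float = 0.2) -> bool:
    """Poll read_output() until needle appears or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if needle in read_output():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Simulated log source: the startup message appears on the third read.
reads = iter(["", "Loading config", "Server running on port 3000"])
print(wait_for_text(lambda: next(reads), "Server running on port 3000",
                    timeout_s=2.0, poll_s=0.01))  # True
```

Using `time.monotonic()` rather than wall-clock time keeps the timeout stable if the system clock is adjusted while the loop runs.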

---

## Usage Pattern: Spawn and Monitor

```python
import json
import time
from pathlib import Path

# Spawn agent
job_id = spawn_claude_agent(project, task, context="", config=config)
print(f"Job spawned: {job_id}")

# Optionally: monitor completion by polling the job's output log
job_dir = Path(f"/var/log/luz-orchestrator/jobs/{job_id}")

while True:
    output_file = job_dir / "output.log"
    if output_file.exists():
        content = output_file.read_text()
        if "exit:" in content:
            # Job completed, extract exit code
            exit_code = int(content.strip().split("exit:")[-1])
            print(f"Job completed with exit code: {exit_code}")

            # Read results
            if (job_dir / "results.json").exists():
                results = json.loads((job_dir / "results.json").read_text())
                print(f"Results: {json.dumps(results, indent=2)}")
            break

    time.sleep(1)
```
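
Splitting on `"exit:"` and converting the remainder with `int()` fails if anything other than the number follows the marker. A slightly more defensive variant is sketched below, assuming the log records the status on its own line as `exit:<code>`. The exact marker format is an assumption based on the loop above, and `parse_exit_code` is an illustrative name.

```python
import re
from typing import Optional

# Assumes a line such as "exit:0"; adjust the pattern if the log format differs.
EXIT_MARKER = re.compile(r"^exit:\s*(-?\d+)\s*$", re.MULTILINE)


def parse_exit_code(log_text: str) -> Optional[int]:
    """Return the last exit code recorded in the log, or None if none is present yet."""
    matches = EXIT_MARKER.findall(log_text)
    return int(matches[-1]) if matches else None


print(parse_exit_code("building...\ndone\nexit:0\n"))  # 0
print(parse_exit_code("building...\n"))                # None
```

Returning `None` while the job is still running lets the caller keep polling without catching exceptions.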

---

## Summary

These templates cover the most common autonomous agent scenarios:

1. **Analysis Agents** - Gather information, don't modify
2. **Execution Agents** - Run commands, report results
3. **Implementation Agents** - Modify code, verify changes
4. **Workflow Agents** - Multi-step orchestration
5. **Diagnostic Agents** - Troubleshoot issues
6. **Integration Test Agents** - Validate multiple components

**Key Success Factors:**
- Clear, specific requirements
- Defined success criteria
- Complete context provided
- No ambiguity or assumptions
- Exit codes for status signaling
- Results in JSON/structured format

All templates are:
- ✓ Production-ready
- ✓ Non-blocking (use detached spawning)
- ✓ Autonomy-focused (no user prompts)
- ✓ Failure-resistant (error handling built in)
- ✓ Result-oriented (clear output)