CLI Agent Patterns and Prompt Design
Practical Guide for Building Non-Blocking Agents
Date: 2026-01-09
Version: 1.0
Audience: Agent developers, prompt engineers
Quick Reference: 5 Critical Patterns
1. Detached Spawning (Never Block)
# ✅ CORRECT: Agent runs in background
os.system('nohup script.sh >/dev/null 2>&1 &')
job_id = generate_uuid()
return job_id # Return immediately
# ❌ WRONG: Parent waits for agent to finish
result = subprocess.run(['claude', ...])  # subprocess.run() waits by default
# CLI blocked until agent completes!
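A fuller sketch of the same pattern using only the Python standard library; the job-directory layout, function name, and command handling are illustrative assumptions, not a fixed API:
# Sketch: spawn an agent detached from the CLI process, assuming a per-job directory.
import os, subprocess, uuid
def spawn_agent(cmd: list[str], job_dir: str) -> str:
    job_id = str(uuid.uuid4())
    os.makedirs(job_dir, exist_ok=True)
    with open(os.path.join(job_dir, "output.log"), "w") as log:
        # start_new_session detaches the child from the CLI's session, so the CLI
        # returns immediately and can even exit without killing the agent.
        subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT,
                         stdin=subprocess.DEVNULL, start_new_session=True)
    return job_id  # never wait(); status is read later from output.log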
2. Permission Bypass (No Approval Dialogs)
# ✅ CORRECT: Agents don't ask for tool approval
claude --permission-mode bypassPermissions --dangerously-skip-permissions ...
# ❌ WRONG: Default mode asks for confirmation on tool use
claude ...
# Blocks waiting for user to approve: "This command has high privileges. Approve? [Y/n]"
3. File-Based I/O (No stdin/stdout)
# ✅ CORRECT: All I/O via files
with open(f"{job_dir}/prompt.txt", "w") as f:
f.write(full_prompt)
# Agent reads prompt from file
# Agent writes output to log file
# Status checked by reading exit code from file
# ❌ WRONG: Trying to use stdin/stdout
process = subprocess.Popen(..., stdin=subprocess.PIPE, stdout=subprocess.PIPE)
process.stdin.write(prompt) # What if backgrounded? stdin unavailable!
result = process.stdout.read() # Parent blocked waiting!
4. Exit Code Signaling (Async Status)
# ✅ CORRECT: Append exit code to output
command...
exit_code=$?
echo "exit:$exit_code" >> output.log
# Later, check status without process
grep "^exit:" output.log # Returns immediately
# ❌ WRONG: Only store in memory
# Process exits, exit code lost
# Can't determine status later
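On the caller side, checking status later is a small file parse. A minimal sketch, assuming the exit:$exit_code convention above and one output.log per job (the helper name is illustrative):
# Sketch: poll job status by parsing the "exit:<code>" line appended to output.log.
def job_status(log_path: str):
    """Return the agent's exit code if finished, else None (still running)."""
    try:
        with open(log_path) as f:
            for line in f:
                if line.startswith("exit:"):
                    return int(line.split(":", 1)[1])  # finished
    except FileNotFoundError:
        return None  # job has not started writing yet
    return None  # no exit line yet: still running (or blocked)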
5. Context-First Prompts (Minimize Questions)
# ✅ CORRECT: Specific, complete, unambiguous
You are running as user: musica
Working directory: /workspace
You have permission to read/write files here.
Task: Run pytest in /workspace/tests and save results to results.json
Success criteria: File contains {passed: int, failed: int, skipped: int}
Exit code: 0 if all tests pass, 1 if any fail
Do NOT ask for clarification. You have all needed information.
# ❌ WRONG: Vague, requires interpretation
Fix the test suite.
(What needs fixing? Which tests? Agent will need to ask!)
Prompt Patterns for Autonomy
Pattern 1: Analysis Task (Read-Only)
Goal: Agent analyzes code without modifying anything
## Task
Analyze the TypeScript codebase in /workspace/src for:
1. Total files
2. Total lines of code (excluding comments/blanks)
3. Number of functions
4. Number of classes
5. Average cyclomatic complexity per function
6. Top 3 most complex files
## Success Criteria
Save results to /workspace/analysis.json with structure:
{
"total_files": number,
"total_loc": number,
"functions": number,
"classes": number,
"avg_complexity": number,
"hotspots": [
{"file": string, "complexity": number, "functions": number}
]
}
## Exit Codes
- Exit 0: Success, file created with all fields
- Exit 1: File not created or missing fields
- Exit 2: Unrecoverable error (no TypeScript found, etc)
## Autonomy
You have all information needed. Do NOT:
- Ask which files to analyze
- Ask which metrics matter
- Request clarification on format
Pattern 2: Execution Task (Run & Report)
Goal: Agent runs command and reports results
## Task
Run the test suite in /workspace/tests with the following requirements:
1. Use pytest with JSON output
2. Run: pytest tests/ --json=results.json
3. Capture exit code
4. Create summary.json with:
- Total tests run
- Passed count
- Failed count
- Skipped count
- Exit code from pytest
## Success Criteria
Both results.json (from pytest) and summary.json (created by you) must exist.
Exit 0 if pytest exit code is 0 (all passed)
Exit 1 if pytest exit code is non-zero (failures)
## What to Do If Tests Fail
1. Create summary.json anyway with failure counts
2. Exit with code 1 (not 2, this is expected)
3. Do NOT try to fix tests yourself
## Autonomy
You know what to do. Do NOT:
- Ask which tests to run
- Ask about test configuration
- Request approval before running tests
Pattern 3: Implementation Task (Read + Modify)
Goal: Agent modifies code based on specification
## Task
Add error handling to /workspace/src/database.ts
Requirements:
1. All database calls must have try/catch
2. Catch blocks must log to console.error
3. Catch blocks must return null (not throw)
4. Add TypeScript types for error parameter
## Success Criteria
File modified without syntax errors (verify with: npm run build)
All database functions protected (search file for db\. calls)
## Exit Codes
- Exit 0: All database calls wrapped, no TypeScript errors
- Exit 1: Some database calls not wrapped, OR TypeScript errors exist
- Exit 2: File not found or unrecoverable
## Verification
After modifications:
npm run build # Must succeed with no errors
## Autonomy
You have specific requirements. Do NOT:
- Ask which functions need wrapping
- Ask about error logging format
- Request confirmation before modifying
Pattern 4: Multi-Phase Task (Sequential Steps)
Goal: Agent completes multiple dependent steps
## Task
Complete this CI/CD pipeline step:
Phase 1: Build
- npm install
- npm run build
- Check: no errors in output
Phase 2: Test
- npm run test
- Check: exit code 0
- If exit code 1: STOP, exit 1 from this task
Phase 3: Report
- Create build-report.json with:
{
"build": {success: true, timestamp: string},
"tests": {success: true, count: number, failed: number},
"status": "ready_for_deploy"
}
## Success Criteria
All three phases complete AND exit codes from npm are 0
build-report.json created with all fields
Overall exit code: 0 (success) or 1 (failure at any phase)
## Autonomy
Execute phases in order. Do NOT:
- Ask whether to skip phases
- Ask about error handling
- Request approval between phases
Pattern 5: Decision Task (Branch Logic)
Goal: Agent makes decisions based on conditions
## Task
Decide whether to deploy based on build status.
Steps:
1. Read build-report.json (created by previous task)
2. Check: all phases successful
3. If successful:
a. Create deployment-plan.json
b. Exit 0
4. If not successful:
a. Create failure-report.json
b. Exit 1
## Decision Logic
IF (build.success AND tests.success AND no_syntax_errors):
Deploy ready
ELSE:
Cannot deploy
## Success Criteria
One of these files exists:
- deployment-plan.json (exit 0)
- failure-report.json (exit 1)
## Autonomy
You have criteria. Do NOT:
- Ask whether to deploy
- Request confirmation
- Ask about deployment process
Anti-Patterns: What NOT to Do
❌ Anti-Pattern 1: Ambiguous Tasks
WRONG: "Improve the code"
- What needs improvement?
- Which files?
- What metrics?
AGENT WILL ASK: "Can you clarify what you mean by improve?"
FIX:
CORRECT: "Reduce cyclomatic complexity in src/processor.ts"
- Identify functions with complexity > 5
- Refactor to reduce to < 5
- Run tests to verify no regression
❌ Anti-Pattern 2: Vague Success Criteria
WRONG: "Make sure it works"
- What is "it"?
- How do we verify it works?
AGENT WILL ASK: "How should I know when the task is complete?"
FIX:
CORRECT: "Task complete when:"
- All tests pass (pytest exit 0)
- No TypeScript errors (npm run build succeeds)
- Code coverage > 80% (check coverage report)
❌ Anti-Pattern 3: Implicit Constraints
WRONG: "Add this feature to the codebase"
- What files can be modified?
- What can't be changed?
AGENT WILL ASK: "Can I modify the database schema?"
FIX:
CORRECT: "Add feature to src/features/auth.ts:"
- This file ONLY
- Don't modify: database schema, config, types
- Do maintain: existing function signatures
❌ Anti-Pattern 4: Interactive Questions in Prompts
WRONG:
"Do you think we should refactor this?
Try a few approaches and tell me which is best."
AGENT WILL ASK: "What criteria for 'best'? Performance? Readability?"
FIX:
CORRECT:
"Refactor for readability:"
- Break functions > 20 lines into smaller functions
- Add clear variable names (no x, y, temp)
- Check: ESLint passes, no new warnings
❌ Anti-Pattern 5: Requiring User Approval
WRONG:
"I'm about to deploy. Is this okay? [Y/n]"
BLOCKS: Waiting for user input via stdin (won't work in background!)
FIX:
CORRECT:
"Validate deployment prerequisites and create deployment-plan.json"
(No approval request. User runs separately: cat deployment-plan.json)
(If satisfied, user can execute deployment)
Handling Edge Cases Without Blocking
Case 1: File Not Found
## If /workspace/config.json doesn't exist:
1. Log to output: "Config file not found"
2. Create default config
3. Continue with default values
4. Do NOT ask user: "Should I create a default?"
## If error occurs during execution:
1. Log full error to output.log
2. Include: what failed, why, what was attempted
3. Exit with code 1
4. Do NOT ask: "What should I do?"
Case 2: Ambiguous State
## If multiple versions of file exist:
1. Document all versions found
2. Choose: most recent by timestamp
3. Continue
4. Log choice to output.log
5. Do NOT ask: "Which one should I use?"
## If task instructions conflict:
1. Document the conflict
2. Follow: primary instruction (first mentioned)
3. Log reasoning to output.log
4. Do NOT ask: "Which should I follow?"
Case 3: Partial Success
## If some tests pass, some fail:
1. Report both: {passed: 45, failed: 3}
2. Exit with code 1 (not 0, even though some passed)
3. Include in output: which tests failed
4. Do NOT ask: "Should I count partial success?"
Prompt Template for Maximum Autonomy
# Agent Task Template
## Role & Context
You are a {project_name} project agent.
Working directory: {absolute_path}
Running as user: {username}
Permissions: Full read/write in working directory
## Task Specification
{SPECIFIC task description}
Success looks like:
- {Specific deliverable 1}
- {Specific deliverable 2}
- {Specific output file/format}
## Execution Environment
Tools available: Read, Write, Edit, Bash, Glob, Grep
Directories accessible: {list specific paths}
Commands available: {list specific commands}
Constraints: {List what cannot be done}
## Exit Codes
- 0: Success (all success criteria met)
- 1: Failure (some success criteria not met, but not unrecoverable)
- 2: Error (unrecoverable, cannot continue)
## If Something Goes Wrong
1. Log the error to output
2. Try once to recover
3. If recovery fails, exit with appropriate code
4. Do NOT ask for help or clarification
## Do NOT
- Ask any clarifying questions
- Request approval for any action
- Wait for user input
- Modify files outside {working directory}
- Use tools not listed above
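One way an orchestrator might fill this template, sketched with str.format; the values and the {task} field are illustrative stand-ins for the placeholders above, not a fixed API:
# Sketch: render the autonomy template and hand it to the agent via a prompt file (Pattern 3).
TEMPLATE = """\
You are a {project_name} project agent.
Working directory: {absolute_path}
Running as user: {username}
Permissions: Full read/write in working directory

## Task Specification
{task}

Do NOT ask clarifying questions, request approval, or wait for user input."""
prompt = TEMPLATE.format(
    project_name="luzia",
    absolute_path="/workspace",
    username="musica",
    task="Run pytest in /workspace/tests and save results to results.json",
)
job_dir = "/var/log/luz-orchestrator/jobs/example-job"  # illustrative job path
with open(f"{job_dir}/prompt.txt", "w") as f:  # file-based I/O, not stdin
    f.write(prompt)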
Real-World Examples
Example 1: Code Quality Scan (Read-Only)
Prompt:
Analyze code quality in /workspace/src using:
1. ESLint (npm run lint) - capture all warnings
2. TypeScript compiler (npm run build) - capture all errors
3. Count lines of code per file
Save to quality-report.json:
{
"eslint": {
"errors": number,
"warnings": number,
"rules_violated": [string]
},
"typescript": {
"errors": number,
"errors_list": [string]
},
"code_metrics": {
"total_loc": number,
"total_files": number,
"avg_loc_per_file": number
}
}
Exit 0 if both eslint and typescript succeeded.
Exit 1 if either had errors.
Do NOT try to fix errors, just report.
Expected Agent Behavior:
- Runs linters (no approval needed)
- Collects metrics
- Creates JSON file
- Exits with appropriate code
- No questions asked ✓
Example 2: Database Migration (Modify + Verify)
Prompt:
Apply database migration /workspace/migrations/001_add_users_table.sql
Steps:
1. Read migration file
2. Run: psql -U postgres -d mydb -f migrations/001_add_users_table.sql
3. If success: psql ... -c "SELECT COUNT(*) FROM users;" to verify
4. Save results to migration-log.json
Success criteria:
- Migration file executed without errors
- New table exists
- migration-log.json contains:
{
"timestamp": string,
"migration": "001_add_users_table.sql",
"status": "success" | "failed",
"error": string | null
}
Exit 0 on success.
Exit 1 on any database error.
Do NOT manually create table if migration fails.
Expected Agent Behavior:
- Executes SQL (no approval needed)
- Verifies results
- Logs to JSON
- Exits appropriately
- No questions asked ✓
Example 3: Deployment Check (Decision Logic)
Prompt:
Verify deployment readiness:
Checks:
1. All tests passing: npm test -> exit 0
2. Build succeeds: npm run build -> exit 0
3. No security warnings: npm audit -> moderate/high = 0
4. Environment configured: .env file exists
Create deployment-readiness.json:
{
"ready": boolean,
"checks": {
"tests": boolean,
"build": boolean,
"security": boolean,
"config": boolean
},
"blockers": [string],
"timestamp": string
}
If all checks pass: ready = true, exit 0
If any check fails: ready = false, exit 1
Do NOT try to fix blockers. Only report.
Expected Agent Behavior:
- Runs all checks
- Documents results
- No fixes attempted
- Clear decision output
- No questions asked ✓
Debugging: When Agents DO Ask Questions
How to Detect Blocking Questions
# Check agent output for clarification questions
grep -i "should i\|would you\|can you\|do you want\|clarif" \
/var/log/luz-orchestrator/jobs/{job_id}/output.log
# Check for approval prompts
grep -i "approve\|confirm\|permission\|y/n" \
/var/log/luz-orchestrator/jobs/{job_id}/output.log
# Agent blocked = exit code not in output.log
tail -5 /var/log/luz-orchestrator/jobs/{job_id}/output.log
# If last line is NOT "exit:{code}", agent is blocked
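The same checks can be rolled into one helper. A minimal sketch combining the question and approval patterns from the grep commands above; the classification labels are illustrative:
# Sketch: classify a job as finished, blocked on a question/approval, or still running.
import re
QUESTION = re.compile(
    r"should i|would you|can you|do you want|clarif|approve|confirm|permission|y/n",
    re.IGNORECASE,
)
def diagnose(log_path: str) -> str:
    with open(log_path) as f:
        text = f.read()
    if re.search(r"^exit:\d+", text, re.MULTILINE):
        return "finished"
    if QUESTION.search(text):
        return "likely blocked on a question or approval prompt"
    return "still running (or silently blocked)"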
How to Fix
- Identify the question: what is the agent asking?
- Redesign the prompt: provide the answer upfront
- Be more specific: remove ambiguity
- Retry the job:
luzia retry {job_id}
Checklist: Autonomous Prompt Quality
- Task is specific (not "improve" or "fix")
- Success criteria defined (what success looks like)
- Output format specified (JSON, file, etc)
- Exit codes documented (0=success, 1=failure)
- Constraints listed (what can't be changed)
- No ambiguous language
- No requests for clarification
- No approval prompts
- No "if you think..." or "do you want to..."
- All context provided upfront
- Agent runs as a limited user (not root)
- Task scope limited to project directory
Summary
The Core Rule:
Autonomous agents don't ask questions because they don't need to.
Well-designed prompts provide:
- Clear objectives
- Specific success criteria
- Complete context
- Defined boundaries
- No ambiguity
When these are present, agents execute autonomously. When they're missing, agents ask clarifying questions, causing blocking.
For Luzia agents: use the 5 patterns (detached spawning, permission bypass, file-based I/O, exit code signaling, context-first prompting) and avoid the anti-patterns listed above.