# Agent Autonomy Research - Executive Summary

**Project:** Luzia Agent Autonomy Research
**Date:** 2026-01-09
**Status:** ✅ Complete
**Deliverables:** 4 comprehensive documents + shared knowledge graph

---
## What Was Researched

### Primary Questions
1. How does Luzia handle interactive prompts to prevent agent blocking?
2. What patterns enable autonomous agent execution without user input?
3. How do agents handle clarification needs without blocking?
4. What are best practices for prompt design in autonomous agents?

### Secondary Questions
5. How does the Claude Agent SDK prevent approval dialog blocking?
6. What communication patterns work for async agent-to-user interaction?
7. How can agents make decisions without asking for confirmation?

---
## Key Findings

### 1. **Blocking is Prevented Through Architecture, Not Tricks**

Luzia prevents agent blocking through four **architectural layers**:

| Layer | Implementation | Purpose |
|-------|---|---|
| **Process** | Detached spawning (`nohup ... &`) | Parent CLI returns immediately |
| **Permission** | `--permission-mode bypassPermissions` | No approval dialogs shown |
| **Communication** | File-based IPC (job directory) | No stdin/stdout dependencies |
| **Status** | Exit code signaling (append to log) | Async status queries |

**Result:** Even if an agent wanted to block, it can't because:
- It's in a separate process (parent is gone)
- It doesn't have stdin (won't wait for input)
- Permission mode prevents approval prompts
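
The process and communication layers can be sketched in Python. This is a minimal illustration, not Luzia's actual spawning code: `spawn_detached` is a hypothetical name, and `start_new_session=True` is used here as a stand-in for the `nohup ... &` approach named in the table. The child gets no stdin and writes only to `output.log` in the job directory.

```python
import subprocess
import sys
from pathlib import Path


def spawn_detached(command: list[str], job_dir: Path) -> int:
    """Start `command` in its own session and return its PID immediately.

    Hypothetical helper for illustration; not part of Luzia.
    """
    job_dir.mkdir(parents=True, exist_ok=True)
    log = open(job_dir / "output.log", "ab")
    try:
        proc = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,   # no stdin: the child can never wait for input
            stdout=log,                 # output goes to the job directory, not the terminal
            stderr=subprocess.STDOUT,
            start_new_session=True,     # detach from the caller's session (POSIX only)
        )
    finally:
        log.close()  # the child holds its own copy of the file descriptor
    return proc.pid
```

Calling `spawn_detached([sys.executable, "-c", "print('hello')"], Path("/tmp/job-demo"))` returns at once; the output appears in `/tmp/job-demo/output.log` when the child runs. The path is only an example.
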

### 2. **The Golden Rule of Autonomy**

> **Autonomous agents don't ask for input because they don't need to.**

Well-designed prompts provide:
- ✓ Clear, specific objectives (not "improve code", but "reduce complexity to < 5")
- ✓ Defined success criteria (what success looks like)
- ✓ Complete context (environment, permissions, constraints)
- ✓ No ambiguity (every decision path covered)

When these are present → agents execute autonomously
When these are missing → agents ask questions → blocking occurs
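
One way to enforce this rule is to refuse to build a prompt until every required part is filled in. The sketch below is an assumption-laden illustration: the `TaskPrompt` class and its four field names are hypothetical and simply mirror the checklist above; they are not taken from the Luzia codebase.

```python
from dataclasses import dataclass, fields


@dataclass
class TaskPrompt:
    """Hypothetical container for the four parts of a self-contained prompt."""

    objective: str          # specific and measurable, e.g. "reduce complexity to < 5"
    success_criteria: str   # what a finished task looks like
    context: str            # environment, permissions, paths
    constraints: str        # what may and may not be changed

    def render(self) -> str:
        """Return the prompt text, or raise if any part is blank."""
        missing = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError(f"prompt is missing: {', '.join(missing)}")
        return "\n\n".join(
            f"## {f.name.replace('_', ' ').title()}\n{getattr(self, f.name).strip()}"
            for f in fields(self)
        )
```

A prompt with an empty `success_criteria` fails at build time, before an agent is ever spawned, which is cheaper than discovering the gap when a background agent stalls on a question.
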

### 3. **Five Critical Patterns Emerged**

1. **Detached Spawning**: Run agents as background processes
   - Returns immediately to CLI
   - Agents continue independently
   - PID tracked for monitoring

2. **Permission Bypass**: Use `--permission-mode bypassPermissions`
   - No approval dialogs for tool use
   - Safe because scope is limited (project user, project dir)
   - Must grant pre-authorization in prompt

3. **File-Based I/O**: Use job directory as IPC channel
   - Prompt input via file
   - Output captured to log
   - Status queries don't require a running process
   - Works with background agents

4. **Exit Code Signaling**: Append "exit:{code}" to output
   - Status persists in file after process ends
   - Async status queries (read the file; the process need not be alive)
   - Enables retry logic based on code

5. **Context-First Prompting**: Provide all context upfront
   - Specific task descriptions
   - Clear success criteria
   - No ambiguity
   - Minimize clarification questions
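
Pattern 4 is the easiest to get subtly wrong, so here is a small sketch of it. It assumes a wrapper process runs the agent command and writes the marker itself; `run_and_record` is a hypothetical name, and Luzia's own `run.sh` may do this differently.

```python
import subprocess
import sys
from pathlib import Path


def run_and_record(command: list[str], log_path: Path) -> int:
    """Run `command`, send its output to `log_path`, then append `exit:{code}`.

    Hypothetical wrapper for illustration; not part of Luzia.
    """
    with open(log_path, "ab") as log:
        result = subprocess.run(
            command,
            stdin=subprocess.DEVNULL,   # the command cannot block on input
            stdout=log,
            stderr=subprocess.STDOUT,
        )
        # Written only after the child has exited, so the marker is the last line.
        log.write(f"\nexit:{result.returncode}\n".encode())
    return result.returncode
```

Because the marker is appended after the command finishes, a log without an `exit:` line can be read as "still running", and the status survives after the process is gone.
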

### 4. **The AskUserQuestion Problem**

Claude's `AskUserQuestion` tool blocks the agent while it waits on stdin:

```python
# Illustrative pseudocode: `ask_user_question` stands in for the AskUserQuestion tool.
async def clarify_with_user():
    # This blocks forever if the agent is backgrounded
    response = await ask_user_question(
        question="What should I do here?",
        options=[...],
    )
    # stdin unavailable = agent stuck; this line is never reached
    return response
```

**Solution:** Don't rely on user questions. Design prompts to be self-contained.

### 5. **Job Lifecycle is the Key**

Luzia's job directory structure enables full autonomy:

```
/var/log/luz-orchestrator/jobs/{job_id}/
├── prompt.txt      # Agent reads from here
├── output.log      # Agent writes to here (+ exit code)
├── meta.json       # Job metadata
├── run.sh          # Execution script
└── results.json    # Agent-generated results
```

Exit code appended to output.log as: `exit:{code}`

This enables:
- Async status queries (no blocking)
- Automatic retry on specific codes
- Failure analysis without process
- Integration with monitoring systems
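
A status query against this layout needs nothing but the log file. The sketch below is a hedged approximation, not the real `_get_actual_job_status`: the function name `job_status` and the regular expression are assumptions, and the `killed` state is left out because this summary does not say how it is detected. A missing log or a log without a marker is treated as "running", which is also an assumption.

```python
import re
from pathlib import Path

# Matches a whole line of the form `exit:<integer>`.
EXIT_MARKER = re.compile(r"^exit:(-?\d+)\s*$", re.MULTILINE)


def job_status(job_dir: Path) -> str:
    """Classify a job from output.log alone: 'running', 'completed', or 'failed'."""
    log_path = job_dir / "output.log"
    if not log_path.exists():
        return "running"  # assumption: no log yet means the job has not finished
    matches = EXIT_MARKER.findall(log_path.read_text(errors="replace"))
    if not matches:
        return "running"  # no exit marker written yet
    # Use the last marker in case the agent's own output contained a similar line.
    return "completed" if int(matches[-1]) == 0 else "failed"
```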

---

## Deliverables Created

### 1. **AGENT-AUTONOMY-RESEARCH.md** (12 sections)
Comprehensive research document covering:
- How Luzia prevents blocking (Section 1)
- Handling clarification without blocking (Section 2)
- Job state machine & exit codes (Section 3)
- Permission system details (Section 4)
- Async communication patterns (Section 5)
- Prompt patterns for autonomy (Section 6)
- Pattern summary (Section 7)
- Real implementation examples (Section 8)
- Best practices (Section 9)
- Advanced patterns (Section 10)
- Failure cases & solutions (Section 11)
- Key takeaways & checklist (Section 12)

### 2. **AGENT-CLI-PATTERNS.md** (12 patterns + 5 anti-patterns)
Practical guide covering:
- Quick reference: 5 critical patterns
- 5 prompt patterns (analysis, execution, implementation, multi-phase, decision)
- 5 anti-patterns to avoid
- Edge case handling
- Prompt template for maximum autonomy
- Real-world examples
- Quality checklist

### 3. **AUTONOMOUS-AGENT-TEMPLATES.md** (6 templates)
Production-ready code templates:
1. Simple task agent (read-only analysis)
2. Test execution agent (run & report)
3. Code modification agent (implement & verify)
4. Multi-step workflow agent (orchestrate process)
5. Diagnostic agent (troubleshoot & report)
6. Integration test agent (complex validation)

Each with:
- Use case description
- Prompt template
- Expected output example
- Usage pattern

### 4. **RESEARCH-SUMMARY.md** (this document)
Executive summary with:
- Research questions & answers
- Key findings (5 major findings)
- Deliverables list
- Implementation checklist
- Knowledge graph integration

---

## Implementation Checklist

### For Using These Patterns

- [ ] **Read** `AGENT-AUTONOMY-RESEARCH.md` sections 1-3 (understand architecture)
- [ ] **Read** `AGENT-CLI-PATTERNS.md` quick reference (5 patterns)
- [ ] **Write** prompts following `AGENT-CLI-PATTERNS.md` template
- [ ] **Use** templates from `AUTONOMOUS-AGENT-TEMPLATES.md` as starting point
- [ ] **Check** prompt against `AGENT-CLI-PATTERNS.md` checklist
- [ ] **Spawn** using `spawn_claude_agent()` function
- [ ] **Monitor** via job directory polling
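
The last step, monitoring by polling the job directory, might look like the sketch below. It is an illustration under stated assumptions: `wait_for_exit` is a hypothetical helper, the timeout and interval defaults are arbitrary, and the only fact taken from this document is that `output.log` ends with an `exit:{code}` line once the job finishes.

```python
import time
from pathlib import Path
from typing import Optional


def wait_for_exit(job_dir: Path, timeout: float = 60.0, interval: float = 0.5) -> Optional[int]:
    """Poll output.log until an `exit:{code}` line appears.

    Returns the exit code, or None if `timeout` seconds pass first.
    Hypothetical helper for illustration; not part of Luzia.
    """
    log_path = job_dir / "output.log"
    deadline = time.monotonic() + timeout
    while True:
        if log_path.exists():
            # Scan from the end: the marker is appended after all agent output.
            for line in reversed(log_path.read_text(errors="replace").splitlines()):
                if line.startswith("exit:"):
                    return int(line[len("exit:"):])
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

The caller never touches the agent process, so the same function works whether the agent is still running, has finished, or was started by a different shell.
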

### For Creating Custom Agents

When creating new autonomous agents:

1. **Define Success Clearly** - What does completion look like?
2. **Provide Context** - Full environment description
3. **Specify Format** - What output format (JSON, text, files)
4. **No Ambiguity** - Every decision path covered
5. **Document Constraints** - What can/can't be changed
6. **Define Exit Codes** - 0=success, 1=recoverable failure, 2=error
7. **No User Prompts** - All decisions made by agent alone
8. **Test in Background** - Verify no blocking
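
The exit code convention in step 6 is what makes automatic retries possible. The following sketch shows one possible mapping from code to action; the `ExitCode` enum, the `next_action` function, the action names, and the retry limit of 3 are all assumptions made for illustration. Only the three code meanings come from the list above. Unknown codes are escalated rather than retried, which is a conservative design choice rather than documented Luzia behavior.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit code convention from the checklist above."""

    SUCCESS = 0
    RECOVERABLE_FAILURE = 1
    ERROR = 2


def next_action(code: int, attempts: int, max_attempts: int = 3) -> str:
    """Return 'done', 'retry', or 'escalate' for a finished job.

    Hypothetical policy for illustration; not part of Luzia.
    """
    if code == ExitCode.SUCCESS:
        return "done"
    if code == ExitCode.RECOVERABLE_FAILURE and attempts < max_attempts:
        return "retry"
    return "escalate"  # hard errors, unknown codes, or retries exhausted
```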

---

## Code References

### Key Implementation Files

| File | Purpose | Lines |
|------|---------|-------|
| `/opt/server-agents/orchestrator/bin/luzia` | Agent spawning implementation | 1012-1200 |
| `/opt/server-agents/orchestrator/lib/docker_bridge.py` | Container isolation | All |
| `/opt/server-agents/orchestrator/lib/queue_controller.py` | File-based task queue | All |

### Key Functions

**spawn_claude_agent(project, task, context, config)**
- Lines 1012-1200 in luzia
- Spawns detached background agent
- Returns job_id immediately
- Handles permission bypass, environment setup, exit code capture

**_get_actual_job_status(job_dir)**
- Lines 607-646 in luzia
- Determines job status by reading output.log
- Checks for "exit:" marker
- Returns: running/completed/failed/killed

**EnqueueTask (QueueController)**
- Adds task to file-based queue
- Enables load-aware scheduling
- Returns task_id and queue position

---

## Knowledge Graph Integration

Findings stored in shared knowledge graph at:
`/etc/zen-swarm/memory/projects.db`

**Relations created:**
- `luzia-agent-autonomy-research` documents patterns (5 core patterns)
- `luzia-agent-autonomy-research` defines anti-pattern (1 major anti-pattern)
- `luzia-architecture` implements pattern (detached execution)
- `agent-autonomy-best-practices` includes guidelines (2 key guidelines)

**Queryable via:**
```bash
# Search for autonomy patterns
mcp__shared-projects-memory__search_context "autonomous agent patterns"

# Query specific pattern
mcp__shared-projects-memory__query_relations \
  entity_name="detached-process-execution" \
  relation_type="documents_pattern"
```

---

## Quick Start: Using These Findings

### For Developers

**Problem:** "How do I write an agent that runs autonomously?"

**Solution:**
1. Read: `AGENT-CLI-PATTERNS.md` - "5 Critical Patterns" section
2. Find matching template in: `AUTONOMOUS-AGENT-TEMPLATES.md`
3. Follow the prompt pattern
4. Use: `spawn_claude_agent(project, task, context, config)`
5. Check: `luzia jobs {job_id}` for status

### For Architects

**Problem:** "Should we redesign for async agents?"

**Solution:**
1. Read: `AGENT-AUTONOMY-RESEARCH.md` sections 1-3
2. Current approach (detached + file-based IPC) is mature
3. No redesign needed; patterns work well
4. Focus: prompt design quality (see anti-patterns section)

### For Troubleshooting

**Problem:** "Agent keeps asking for clarification"

**Solution:**
1. Check: `AGENT-CLI-PATTERNS.md` - "Anti-Patterns" section
2. Redesign prompt to be more specific
3. See: "Debugging" section
4. Use: `luzia retry {job_id}` to run with new prompt

---

## Metrics & Results

### Documentation Coverage

| Topic | Coverage | Format |
|-------|----------|--------|
| **Architecture** | Complete (8 sections) | Markdown |
| **Patterns** | 5 core patterns detailed | Markdown |
| **Anti-patterns** | 5 anti-patterns with fixes | Markdown |
| **Best Practices** | 9 detailed practices | Markdown |
| **Code Examples** | 6 production templates | Python |
| **Real-world Cases** | 3 detailed examples | Markdown |

### Research Completeness

- ✅ How Luzia prevents blocking (answered with architecture details)
- ✅ Clarification handling without blocking (answered: don't rely on it)
- ✅ Prompt patterns for autonomy (5 patterns documented)
- ✅ Best practices (9 practices with examples)
- ✅ Failure cases (11 cases with solutions)

### Knowledge Graph

- ✅ 5 core patterns registered
- ✅ 1 anti-pattern registered
- ✅ 2 best practices registered
- ✅ Implementation references linked
- ✅ Queryable for future research

---

## Recommendations for Next Steps

### For Teams Using Luzia

1. **Review** the 5 critical patterns in `AGENT-CLI-PATTERNS.md`
2. **Adopt** context-first prompting for all new agents
3. **Use** provided templates for common tasks
4. **Share** findings with team members
5. **Monitor** agent quality (should rarely ask questions)

### For Claude Development

1. **Consider** guidance on when to skip `AskUserQuestion`
2. **Document** permission bypass mode in official docs
3. **Add** examples of async prompt patterns
4. **Build** wrapper for common agent patterns

### For Future Research

1. **Study** efficiency of file-based IPC vs other patterns
2. **Measure** success rate of context-first prompts
3. **Compare** blocking duration in different scenarios
4. **Document** framework for other orchestrators

---

## Conclusion

Autonomous agents don't require complex async prompting systems. Instead, they require:

1. **Clear Architecture** - Detached processes, permission bypass, file IPC
2. **Good Prompts** - Specific, complete context, clear success criteria
3. **Exit Code Signaling** - Status persisted in files for async queries

Luzia implements all three. The findings in these documents provide patterns and best practices for anyone building autonomous agents with Claude.

**Key Insight:** The best way to avoid blocking is to design prompts that don't require user input. Luzia's architecture makes this pattern safe and scalable.

---

## Files Delivered

1. `/opt/server-agents/orchestrator/AGENT-AUTONOMY-RESEARCH.md` (12 sections, ~3000 lines)
2. `/opt/server-agents/orchestrator/AGENT-CLI-PATTERNS.md` (practical patterns, ~800 lines)
3. `/opt/server-agents/orchestrator/AUTONOMOUS-AGENT-TEMPLATES.md` (6 templates, ~900 lines)
4. `/opt/server-agents/orchestrator/RESEARCH-SUMMARY.md` (this file)

**Total:** ~4,700 lines of documentation

---

## Stored in Knowledge Graph

Relations created in `/etc/zen-swarm/memory/projects.db`:
- 5 core patterns documented
- 1 anti-pattern documented
- 2 best practices documented
- Architecture implementation linked

**Queryable:** Via `mcp__shared-projects-memory__*` tools

---

**Research completed by:** Claude Agent (Haiku)
**Research date:** 2026-01-09
**Status:** Ready for team adoption