Refactor cockpit to use DockerTmuxController pattern

Based on claude-code-tools TmuxCLIController, this refactor:
- Added DockerTmuxController class for robust tmux session management
- Implements send_keys() with configurable delay_enter
- Implements capture_pane() for output retrieval
- Implements wait_for_prompt() for pattern-based completion detection
- Implements wait_for_idle() for content-hash-based idle detection
- Implements wait_for_shell_prompt() for shell prompt detection

Also includes workflow improvements:
- Pre-task git snapshot before agent execution
- Post-task commit protocol in agent guidelines

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 10:42:16 -03:00
commit ec33ac1936
265 changed files with 92011 additions and 0 deletions
--- a/RESEARCH-SUMMARY.md
+++ b/RESEARCH-SUMMARY.md
@@ -0,0 +1,389 @@
+# Agent Autonomy Research - Executive Summary
+
+**Project:** Luzia Agent Autonomy Research
+**Date:** 2026-01-09
+**Status:** ✅ Complete
+**Deliverables:** 4 comprehensive documents + shared knowledge graph
+
+---
+
+## What Was Researched
+
+### Primary Questions
+1. How does Luzia handle interactive prompts to prevent agent blocking?
+2. What patterns enable autonomous agent execution without user input?
+3. How do agents handle clarification needs without blocking?
+4. What are best practices for prompt design in autonomous agents?
+
+### Secondary Questions
+5. How does the Claude Agent SDK prevent approval dialog blocking?
+6. What communication patterns work for async agent-to-user interaction?
+7. How can agents make decisions without asking for confirmation?
+
+---
+
+## Key Findings
+
+### 1. **Blocking is Prevented Through Architecture, Not Tricks**
+
+Luzia prevents agent blocking through four **architectural layers**:
+
+| Layer | Implementation | Purpose |
+|-------|---|---|
+| **Process** | Detached spawning (`nohup ... &`) | Parent CLI returns immediately |
+| **Permission** | `--permission-mode bypassPermissions` | No approval dialogs shown |
+| **Communication** | File-based IPC (job directory) | No stdin/stdout dependencies |
+| **Status** | Exit code signaling (append to log) | Async status queries |
+
+**Result:** Even if an agent wanted to block, it can't because:
+- It's in a separate process (parent is gone)
+- It doesn't have stdin (won't wait for input)
+- Permission mode prevents approval prompts
+
+### 2. **The Golden Rule of Autonomy**
+
+> **Autonomous agents don't ask for input because they don't need to.**
+
+Well-designed prompts provide:
+- ✓ Clear, specific objectives (not "improve code", but "reduce complexity to < 5")
+- ✓ Defined success criteria (what success looks like)
+- ✓ Complete context (environment, permissions, constraints)
+- ✓ No ambiguity (every decision path covered)
+
+When these are present → agents execute autonomously
+When these are missing → agents ask questions → blocking occurs
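
The four ingredients above can be checked mechanically before a prompt is sent to an agent. The following is a minimal sketch, not Luzia's implementation: the `TaskPrompt` class, its field names, and the rendered layout are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaskPrompt:
    """Hypothetical container for the ingredients of a self-contained prompt."""

    objective: str
    success_criteria: List[str]
    context: Dict[str, str]
    constraints: List[str] = field(default_factory=list)

    def missing(self) -> List[str]:
        """Return the names of required ingredients that are empty."""
        gaps = []
        if not self.objective.strip():
            gaps.append("objective")
        if not self.success_criteria:
            gaps.append("success_criteria")
        if not self.context:
            gaps.append("context")
        return gaps

    def render(self) -> str:
        """Build the prompt text, refusing to render an underspecified prompt."""
        gaps = self.missing()
        if gaps:
            raise ValueError("prompt is underspecified: " + ", ".join(gaps))
        lines = ["OBJECTIVE: " + self.objective, "", "SUCCESS CRITERIA:"]
        lines += ["- " + c for c in self.success_criteria]
        lines += ["", "CONTEXT:"]
        lines += ["- {}: {}".format(k, v) for k, v in self.context.items()]
        if self.constraints:
            lines += ["", "CONSTRAINTS:"]
            lines += ["- " + c for c in self.constraints]
        # Assumed closing instruction: tell the agent to decide rather than ask.
        lines += ["", "Do not ask for clarification; decide and record your reasoning."]
        return "\n".join(lines)
```

Rejecting an incomplete prompt at build time moves the failure to the caller, who can still answer questions, rather than to a background agent that cannot ask them.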
+
+### 3. **Five Critical Patterns Emerged**
+
+1. **Detached Spawning**: Run agents as background processes
+   - Returns immediately to CLI
+   - Agents continue independently
+   - PID tracked for monitoring
+
+2. **Permission Bypass**: Use `--permission-mode bypassPermissions`
+   - No approval dialogs for tool use
+   - Safe because scope is limited (project user, project dir)
+   - Must grant pre-authorization in prompt
+
+3. **File-Based I/O**: Use job directory as IPC channel
+   - Prompt input via file
+   - Output captured to log
+   - Status queries don't require a running process
+   - Works with background agents
+
+4. **Exit Code Signaling**: Append "exit:{code}" to output
+   - Status persists in file after process ends
+   - Async status queries (read the file; no live process needed)
+   - Enables retry logic based on code
+
+5. **Context-First Prompting**: Provide all context upfront
+   - Specific task descriptions
+   - Clear success criteria
+   - No ambiguity
+   - Minimize clarification questions
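
Patterns 1, 3, and 4 combine naturally into one spawn helper. The sketch below is an assumption-laden illustration, not the `spawn_claude_agent()` code in `luzia`: the function name `spawn_detached` is hypothetical, it uses Python's `start_new_session=True` in place of the `nohup ... &` approach named above, and it assumes a POSIX `sh` is available.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List


def spawn_detached(command: List[str], job_dir: Path) -> int:
    """Start `command` in the background and return its PID immediately.

    Output is captured in job_dir/output.log, and a final "exit:{code}"
    line is appended once the command finishes.
    """
    job_dir.mkdir(parents=True, exist_ok=True)
    log_path = job_dir / "output.log"
    # The shell wrapper runs the command, then records its exit code.
    script = shlex.join(command) + '; echo "exit:$?"'
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            ["sh", "-c", script],
            stdin=subprocess.DEVNULL,  # no stdin, so nothing can wait on input
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # detach from the caller's session
        )
    return proc.pid
```

Because the exit marker is written by the wrapper shell rather than by the caller, the status survives even if the process that called `spawn_detached` has already exited.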
+
+### 4. **The AskUserQuestion Problem**
+
+Claude's `AskUserQuestion` tool blocks the agent while it waits on stdin:
+
+```python
+# Illustrative pseudocode: `ask_user_question` stands in for the AskUserQuestion tool call.
+# This blocks forever if the agent is backgrounded
+response = await ask_user_question(
+    question="What should I do here?",
+    options=[...]
+)
+# stdin unavailable = agent stuck
+```
+
+**Solution:** Don't rely on user questions. Design prompts to be self-contained.
+
+### 5. **Job Lifecycle is the Key**
+
+Luzia's job directory structure enables full autonomy:
+
+```
+/var/log/luz-orchestrator/jobs/{job_id}/
+├── prompt.txt         # Agent reads from here
+├── output.log         # Agent writes to here (+ exit code)
+├── meta.json          # Job metadata
+├── run.sh             # Execution script
+└── results.json       # Agent-generated results
+```
+
+Exit code appended to output.log as: `exit:{code}`
+
+This enables:
+- Async status queries (no blocking)
+- Automatic retry on specific codes
+- Failure analysis after the process has exited
+- Integration with monitoring systems
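
Reading status back from the job directory might look like the sketch below. It assumes only what is stated above (an `output.log` file ending in an `exit:{code}` line); the function name `read_job_status` and the `"pending"` state for a missing log are hypothetical, and detection of killed jobs is left out because this summary does not say how it is determined.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Matches a line consisting solely of "exit:<number>".
EXIT_MARKER = re.compile(r"^exit:(\d+)\s*$", re.MULTILINE)


def read_job_status(job_dir: Path) -> Tuple[str, Optional[int]]:
    """Return (status, exit_code) using only the files in job_dir."""
    log_path = job_dir / "output.log"
    if not log_path.exists():
        return "pending", None  # assumed state: job created but not started
    matches = EXIT_MARKER.findall(log_path.read_text(errors="replace"))
    if not matches:
        return "running", None  # no marker yet, so the agent has not finished
    code = int(matches[-1])  # the last marker wins if the log has several
    return ("completed" if code == 0 else "failed"), code
```

The function never touches the agent process, so it works the same whether the agent is still running, has finished, or ran on an earlier boot.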
+
+---
+
+## Deliverables Created
+
+### 1. **AGENT-AUTONOMY-RESEARCH.md** (12 sections)
+Comprehensive research document covering:
+- How Luzia prevents blocking (Section 1)
+- Handling clarification without blocking (Section 2)
+- Job state machine & exit codes (Section 3)
+- Permission system details (Section 4)
+- Async communication patterns (Section 5)
+- Prompt patterns for autonomy (Section 6)
+- Pattern summary (Section 7)
+- Real implementation examples (Section 8)
+- Best practices (Section 9)
+- Advanced patterns (Section 10)
+- Failure cases & solutions (Section 11)
+- Key takeaways & checklist (Section 12)
+
+### 2. **AGENT-CLI-PATTERNS.md** (12 patterns + 5 anti-patterns)
+Practical guide covering:
+- Quick reference: 5 critical patterns
+- 5 prompt patterns (analysis, execution, implementation, multi-phase, decision)
+- 5 anti-patterns to avoid
+- Edge case handling
+- Prompt template for maximum autonomy
+- Real-world examples
+- Quality checklist
+
+### 3. **AUTONOMOUS-AGENT-TEMPLATES.md** (6 templates)
+Production-ready code templates:
+1. Simple task agent (read-only analysis)
+2. Test execution agent (run & report)
+3. Code modification agent (implement & verify)
+4. Multi-step workflow agent (orchestrate process)
+5. Diagnostic agent (troubleshoot & report)
+6. Integration test agent (complex validation)
+
+Each with:
+- Use case description
+- Prompt template
+- Expected output example
+- Usage pattern
+
+### 4. **RESEARCH-SUMMARY.md** (this document)
+Executive summary with:
+- Research questions & answers
+- Key findings (5 major findings)
+- Deliverables list
+- Implementation checklist
+- Knowledge graph integration
+
+---
+
+## Implementation Checklist
+
+### For Using These Patterns
+
+- [ ] **Read** `AGENT-AUTONOMY-RESEARCH.md` sections 1-3 (understand architecture)
+- [ ] **Read** `AGENT-CLI-PATTERNS.md` quick reference (5 patterns)
+- [ ] **Write** prompts following `AGENT-CLI-PATTERNS.md` template
+- [ ] **Use** templates from `AUTONOMOUS-AGENT-TEMPLATES.md` as starting point
+- [ ] **Check** prompt against `AGENT-CLI-PATTERNS.md` checklist
+- [ ] **Spawn** using `spawn_claude_agent()` function
+- [ ] **Monitor** via job directory polling
+
+### For Creating Custom Agents
+
+When creating new autonomous agents:
+
+1. **Define Success Clearly** - What does completion look like?
+2. **Provide Context** - Full environment description
+3. **Specify Format** - What output format (JSON, text, files)?
+4. **No Ambiguity** - Every decision path covered
+5. **Document Constraints** - What can/can't be changed
+6. **Define Exit Codes** - 0=success, 1=recoverable failure, 2=error
+7. **No User Prompts** - All decisions made by agent alone
+8. **Test in Background** - Verify no blocking
+
+---
+
+## Code References
+
+### Key Implementation Files
+
+| File | Purpose | Lines |
+|------|---------|-------|
+| `/opt/server-agents/orchestrator/bin/luzia` | Agent spawning implementation | 1012-1200 |
+| `/opt/server-agents/orchestrator/lib/docker_bridge.py` | Container isolation | All |
+| `/opt/server-agents/orchestrator/lib/queue_controller.py` | File-based task queue | All |
+
+### Key Functions
+
+**spawn_claude_agent(project, task, context, config)**
+- Lines 1012-1200 in luzia
+- Spawns detached background agent
+- Returns job_id immediately
+- Handles permission bypass, environment setup, exit code capture
+
+**_get_actual_job_status(job_dir)**
+- Lines 607-646 in luzia
+- Determines job status by reading output.log
+- Checks for "exit:" marker
+- Returns: running/completed/failed/killed
+
+**EnqueueTask (QueueController)**
+- Adds task to file-based queue
+- Enables load-aware scheduling
+- Returns task_id and queue position
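
For readers unfamiliar with file-based queues, the enqueue step can be sketched as below. This is not the `QueueController` code: the directory layout, file naming, and `enqueue_task` signature are hypothetical, and load-aware scheduling is not modelled.

```python
import json
import os
import time
import uuid
from pathlib import Path
from typing import Tuple


def enqueue_task(queue_dir: Path, project: str, prompt: str) -> Tuple[str, int]:
    """Write a task file into queue_dir/pending and return (task_id, position)."""
    pending = queue_dir / "pending"
    pending.mkdir(parents=True, exist_ok=True)
    # A timestamp prefix keeps file names roughly in arrival order.
    task_id = "{}-{}".format(time.time_ns(), uuid.uuid4().hex[:8])
    payload = {"task_id": task_id, "project": project, "prompt": prompt}
    tmp_path = pending / ".{}.tmp".format(task_id)
    final_path = pending / "{}.json".format(task_id)
    tmp_path.write_text(json.dumps(payload))
    # Rename within one directory so readers never see a half-written file.
    os.replace(tmp_path, final_path)
    names = sorted(p.name for p in pending.glob("*.json"))
    return task_id, names.index(final_path.name) + 1
```

Writing to a temporary name and renaming is the usual way to make a task appear in the queue in one step, which matters when a scheduler scans the directory concurrently.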
+
+---
+
+## Knowledge Graph Integration
+
+Findings stored in shared knowledge graph at:
+`/etc/zen-swarm/memory/projects.db`
+
+**Relations created:**
+- `luzia-agent-autonomy-research` documents patterns (5 core patterns)
+- `luzia-agent-autonomy-research` defines anti-pattern (1 major anti-pattern)
+- `luzia-architecture` implements pattern (detached execution)
+- `agent-autonomy-best-practices` includes guidelines (2 key guidelines)
+
+**Queryable via:**
+```bash
+# Search for autonomy patterns
+mcp__shared-projects-memory__search_context "autonomous agent patterns"
+
+# Query specific pattern
+mcp__shared-projects-memory__query_relations \
+  entity_name="detached-process-execution" \
+  relation_type="documents_pattern"
+```
+
+---
+
+## Quick Start: Using These Findings
+
+### For Developers
+
+**Problem:** "How do I write an agent that runs autonomously?"
+
+**Solution:**
+1. Read: `AGENT-CLI-PATTERNS.md` - "5 Critical Patterns" section
+2. Find matching template in: `AUTONOMOUS-AGENT-TEMPLATES.md`
+3. Follow the prompt pattern
+4. Use: `spawn_claude_agent(project, task, context, config)`
+5. Check: `luzia jobs {job_id}` for status
+
+### For Architects
+
+**Problem:** "Should we redesign for async agents?"
+
+**Solution:**
+1. Read: `AGENT-AUTONOMY-RESEARCH.md` sections 1-3
+2. Current approach (detached + file-based IPC) is mature
+3. No redesign needed; patterns work well
+4. Focus: prompt design quality (see anti-patterns section)
+
+### For Troubleshooting
+
+**Problem:** "Agent keeps asking for clarification"
+
+**Solution:**
+1. Check: `AGENT-CLI-PATTERNS.md` - "Anti-Patterns" section
+2. Redesign prompt to be more specific
+3. See: "Debugging" section
+4. Use: `luzia retry {job_id}` to run with new prompt
+
+---
+
+## Metrics & Results
+
+### Documentation Coverage
+
+| Topic | Coverage | Format |
+|-------|----------|--------|
+| **Architecture** | Complete (8 sections) | Markdown |
+| **Patterns** | 5 core patterns detailed | Markdown |
+| **Anti-patterns** | 5 anti-patterns with fixes | Markdown |
+| **Best Practices** | 9 detailed practices | Markdown |
+| **Code Examples** | 6 production templates | Python |
+| **Real-world Cases** | 3 detailed examples | Markdown |
+
+### Research Completeness
+
+- ✅ How Luzia prevents blocking (answered with architecture details)
+- ✅ Clarification handling without blocking (answered: don't rely on it)
+- ✅ Prompt patterns for autonomy (5 patterns documented)
+- ✅ Best practices (9 practices with examples)
+- ✅ Failure cases (11 cases with solutions)
+
+### Knowledge Graph
+
+- ✅ 5 core patterns registered
+- ✅ 1 anti-pattern registered
+- ✅ 2 best practices registered
+- ✅ Implementation references linked
+- ✅ Queryable for future research
+
+---
+
+## Recommendations for Next Steps
+
+### For Teams Using Luzia
+
+1. **Review** the 5 critical patterns in `AGENT-CLI-PATTERNS.md`
+2. **Adopt** context-first prompting for all new agents
+3. **Use** provided templates for common tasks
+4. **Share** findings with team members
+5. **Monitor** agent quality (should rarely ask questions)
+
+### For Claude Development
+
+1. **Consider** guidance on when to skip `AskUserQuestion`
+2. **Document** permission bypass mode in official docs
+3. **Add** examples of async prompt patterns
+4. **Build** wrapper for common agent patterns
+
+### For Future Research
+
+1. **Study** efficiency of file-based IPC vs other patterns
+2. **Measure** success rate of context-first prompts
+3. **Compare** blocking duration in different scenarios
+4. **Document** framework for other orchestrators
+
+---
+
+## Conclusion
+
+Autonomous agents don't require complex async prompting systems. Instead, they require:
+
+1. **Clear Architecture** - Detached processes, permission bypass, file IPC
+2. **Good Prompts** - Specific, complete context, clear success criteria
+3. **Exit Code Signaling** - Status persisted in files for async queries
+
+Luzia implements all three. The findings in these documents provide patterns and best practices for anyone building autonomous agents with Claude.
+
+**Key Insight:** The best way to avoid blocking is to design prompts that don't require user input. Luzia's architecture makes this pattern safe and scalable.
+
+---
+
+## Files Delivered
+
+1. `/opt/server-agents/orchestrator/AGENT-AUTONOMY-RESEARCH.md` (12 sections, ~3000 lines)
+2. `/opt/server-agents/orchestrator/AGENT-CLI-PATTERNS.md` (practical patterns, ~800 lines)
+3. `/opt/server-agents/orchestrator/AUTONOMOUS-AGENT-TEMPLATES.md` (6 templates, ~900 lines)
+4. `/opt/server-agents/orchestrator/RESEARCH-SUMMARY.md` (this file)
+
+**Total:** ~4,700 lines of documentation
+
+---
+
+## Stored in Knowledge Graph
+
+Relations created in `/etc/zen-swarm/memory/projects.db`:
+- 5 core patterns documented
+- 1 anti-pattern documented
+- 2 best practices documented
+- Architecture implementation linked
+
+**Queryable:** Via `mcp__shared-projects-memory__*` tools
+
+---
+
+**Research completed by:** Claude Agent (Haiku)
+**Research date:** 2026-01-09
+**Status:** Ready for team adoption
+