Refactor cockpit to use DockerTmuxController pattern

Based on claude-code-tools TmuxCLIController, this refactor: - Added DockerTmuxController class for robust tmux session management - Implements send_keys() with configurable delay_enter - Implements capture_pane() for output retrieval - Implements wait_for_prompt() for pattern-based completion detection - Implements wait_for_idle() for content-hash-based idle detection - Implements wait_for_shell_prompt() for shell prompt detection Also includes workflow improvements: - Pre-task git snapshot before agent execution - Post-task commit protocol in agent guidelines Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 10:42:16 -03:00
commit ec33ac1936
265 changed files with 92011 additions and 0 deletions
--- a/docs/CLAUDE-DISPATCH-ANALYSIS.md
+++ b/docs/CLAUDE-DISPATCH-ANALYSIS.md
@@ -0,0 +1,398 @@
+# Claude Dispatch and Monitor Flow Analysis
+
+**Date:** 2026-01-11
+**Author:** Luzia Research Agent
+**Status:** Complete
+
+---
+
+## Executive Summary
+
+Luzia dispatches Claude tasks using a **fully autonomous, non-blocking pattern**. The current architecture intentionally **does not support human-in-the-loop interaction** for background agents. This analysis documents the current flow, identifies pain points, and researches potential improvements for scenarios where user input is needed.
+
+---
+
+## Part 1: Current Dispatch Flow
+
+### 1.1 Task Dispatch Mechanism
+
+**Entry Point:** `spawn_claude_agent()` in `/opt/server-agents/orchestrator/bin/luzia:1102-1401`
+
+**Flow:**
+```
+User runs: luzia <project> <task>
+                ↓
+1. Permission check (project access validation)
+                ↓
+2. QA Preflight checks (optional, validates task)
+                ↓
+3. Job directory created: /var/log/luz-orchestrator/jobs/{job_id}/
+   ├── prompt.txt     (task + context)
+   ├── run.sh         (shell script with env setup)
+   ├── meta.json      (job metadata)
+   └── output.log     (will capture output)
+                ↓
+4. Shell script generated with:
+   - User-specific TMPDIR to avoid /tmp collisions
+   - HOME set to target user
+   - stdbuf for unbuffered output
+   - tee to capture to output.log
+                ↓
+5. Launched via: os.system(f'nohup "{script_file}" >/dev/null 2>&1 &')
+                ↓
+6. Control returns IMMEDIATELY to CLI (job_id returned)
+                ↓
+7. Agent runs in background, detached from parent process
+```
+
+### 1.2 Claude CLI Invocation
+
+**Command Line Built:**
+```bash
+claude --dangerously-skip-permissions \
+       --permission-mode bypassPermissions \
+       --add-dir "{project_path}" \
+       --add-dir /opt/server-agents \
+       --print \
+       --verbose \
+       -p  # Reads prompt from stdin
+```
+
+**Critical Flags:**
+| Flag | Purpose |
+|------|---------|
+| `--dangerously-skip-permissions` | Required to use bypassPermissions mode |
+| `--permission-mode bypassPermissions` | Skip ALL interactive prompts |
+| `--print` | Non-interactive output mode |
+| `--verbose` | Progress visibility in logs |
+| `-p` | Read prompt from stdin (piped from prompt.txt) |
+
+### 1.3 Output Capture
+
+**Run script template:**
+```bash
+#!/bin/bash
+echo $$ > "{pid_file}"
+
+# Environment setup
+export TMPDIR="{user_tmp_dir}"
+export HOME="{user_home}"
+
+# Execute with unbuffered output capture
+sudo -u {user} bash -c '... cd "{project_path}" && cat "{prompt_file}" | stdbuf -oL -eL {claude_cmd}' 2>&1 | tee "{output_file}"
+
+exit_code=${PIPESTATUS[0]}
+echo "" >> "{output_file}"
+echo "exit:$exit_code" >> "{output_file}"
+```
+
+---
+
+## Part 2: Output Monitoring Flow
+
+### 2.1 Status Checking
+
+**Function:** `get_job_status()` at line 1404
+
+Status is determined by:
+1. Reading `output.log` for `exit:` line at end
+2. Checking if process is still running (via PID)
+3. Updating `meta.json` with completion time metrics
+
+**Status Values:**
+- `running` - No exit code yet, PID may still be active
+- `completed` - `exit:0` found
+- `failed` - `exit:non-zero` found
+- `killed` - `exit:-9` or manual kill detected
+
+### 2.2 User Monitoring Commands
+
+```bash
+# List all jobs
+luzia jobs
+
+# Show specific job status
+luzia jobs {job_id}
+
+# View job output
+luzia logs {job_id}
+
+# Show with timing details
+luzia jobs --timing
+```
+
+### 2.3 Notification Flow
+
+On completion, the run script appends to notification log:
+```bash
+echo "[$(date +%H:%M:%S)] Agent {job_id} finished (exit $exit_code)" >> /var/log/luz-orchestrator/notifications.log
+```
+
+This allows external monitoring via:
+```bash
+tail -f /var/log/luz-orchestrator/notifications.log
+```
+
+---
+
+## Part 3: Current User Interaction Handling
+
+### 3.1 The Problem: No Interaction Supported
+
+**Current Design:** Background agents **cannot** receive user input.
+
+**Why:**
+1. `nohup` detaches from terminal - stdin unavailable
+2. `--permission-mode bypassPermissions` skips prompts
+3. No mechanism exists to pause agent and wait for input
+4. Output is captured to file, not interactive terminal
+
+### 3.2 When Claude Would Ask Questions
+
+Claude's `AskUserQuestion` tool would block waiting for stdin, which isn't available. Current mitigations:
+
+1. **Context-First Design** - Prompts include all necessary context
+2. **Pre-Authorization** - Permissions granted upfront
+3. **Structured Tasks** - Clear success criteria reduce ambiguity
+4. **Exit Code Signaling** - Agent exits with code 1 if unable to proceed
+
+### 3.3 Current Pain Points
+
+| Pain Point | Impact | Current Workaround |
+|------------|--------|-------------------|
+| Agent can't ask clarifying questions | May proceed with wrong assumptions | Write detailed prompts |
+| User can't provide mid-task guidance | Task might fail when adjustments needed | Retry with modified task |
+| No approval workflow for risky actions | Security relies on upfront authorization | Careful permission scoping |
+| Long tasks give no progress updates | User doesn't know if task is stuck | Check output.log manually |
+| AskUserQuestion blocks indefinitely | Agent hangs, appears as "running" forever | Must kill and retry |
+
+---
+
+## Part 4: Research on Interaction Improvements
+
+### 4.1 Pattern: File-Based Clarification Queue
+
+**Concept:** Agent writes questions to file, waits for answer file.
+
+```
+/var/log/luz-orchestrator/jobs/{job_id}/
+├── clarification.json     # Agent writes question
+├── response.json          # User writes answer
+└── output.log             # Agent logs waiting status
+```
+
+**Agent Behavior:**
+```python
+# Agent encounters ambiguity
+question = {
+    "type": "choice",
+    "question": "Which database: production or staging?",
+    "options": ["production", "staging"],
+    "timeout_minutes": 30,
+    "default_if_timeout": "staging"
+}
+Path("clarification.json").write_text(json.dumps(question))
+
+# Wait for response (polling)
+for _ in range(timeout * 60):
+    if Path("response.json").exists():
+        response = json.loads(Path("response.json").read_text())
+        return response["choice"]
+    time.sleep(1)
+
+# Timeout - use default
+return question["default_if_timeout"]
+```
+
+**User Side:**
+```bash
+# List pending questions
+luzia questions
+
+# Answer a question
+luzia answer {job_id} staging
+```
+
+### 4.2 Pattern: WebSocket Status Bridge
+
+**Concept:** Real-time bidirectional communication via WebSocket.
+
+```
+User Browser ←→ Luzia Status Server ←→ Agent Process
+                     ↑
+              /var/lib/luzia/status.sock
+```
+
+**Implementation in Existing Code:**
+`lib/luzia_status_integration.py` already has a status publisher framework that could be extended.
+
+**Flow:**
+1. Agent publishes status updates to socket
+2. Status server broadcasts to connected clients
+3. When question arises, server notifies all clients
+4. User responds via web UI or CLI
+5. Response routed back to agent
+
+### 4.3 Pattern: Telegram/Chat Integration
+
+**Existing:** `/opt/server-agents/mcp-servers/assistant-channel/` provides Telegram integration.
+
+**Extended for Agent Questions:**
+```python
+# Agent needs input
+channel_query(
+    sender=f"agent-{job_id}",
+    question="Should I update production database?",
+    context="Running migration task for musica project"
+)
+
+# Bruno responds via Telegram
+# Response delivered to agent via file or status channel
+```
+
+### 4.4 Pattern: Approval Gates
+
+**Concept:** Pre-define checkpoints where agent must wait for approval.
+
+```python
+# In task prompt
+"""
+## Approval Gates
+- Before running migrations: await approval
+- Before deleting files: await approval
+- Before modifying production config: await approval
+
+Write to approval.json when reaching a gate. Wait for approved.json.
+"""
+```
+
+**Gate File:**
+```json
+{
+    "gate": "database_migration",
+    "description": "About to run 3 migrations on staging DB",
+    "awaiting_since": "2026-01-11T14:30:00Z",
+    "auto_approve_after_minutes": null
+}
+```
+
+### 4.5 Pattern: Interactive Mode Flag
+
+**Concept:** Allow foreground execution when user is present.
+
+```bash
+# Background (current default)
+luzia musica "run tests"
+
+# Foreground/Interactive
+luzia musica "run tests" --fg
+
+# Interactive session (already exists)
+luzia work on musica
+```
+
+The `--fg` flag already exists but doesn't fully support interactive Q&A. Enhancement needed:
+- Don't detach process
+- Keep stdin connected
+- Allow Claude's AskUserQuestion to work normally
+
+---
+
+## Part 5: Recommendations
+
+### 5.1 Short-Term (Quick Wins)
+
+1. **Better Exit Code Semantics**
+   - Exit 100 = "needs clarification" (new code)
+   - Capture the question in `clarification.json`
+   - `luzia questions` command to list pending
+
+2. **Enhanced `--fg` Mode**
+   - Don't background the process
+   - Keep stdin/stdout connected
+   - Allow normal interactive Claude session
+
+3. **Progress Streaming**
+   - Add `luzia watch {job_id}` for `tail -f` on output.log
+   - Color-coded output for better readability
+
+### 5.2 Medium-Term (New Features)
+
+4. **File-Based Clarification System**
+   - Agent writes to `clarification.json`
+   - Luzia CLI watches for pending questions
+   - `luzia answer {job_id} <response>` writes `response.json`
+   - Agent polls and continues
+
+5. **Telegram/Chat Bridge for Questions**
+   - Extend assistant-channel for agent questions
+   - Push notification when agent needs input
+   - Reply via chat, response routed to agent
+
+6. **Status Dashboard**
+   - Web UI showing all running agents
+   - Real-time output streaming
+   - Question/response interface
+
+### 5.3 Long-Term (Architecture Evolution)
+
+7. **Approval Workflows**
+   - Define approval gates in task specification
+   - Configurable auto-approve timeouts
+   - Audit log of approvals
+
+8. **Agent Orchestration Layer**
+   - Queue of pending questions across agents
+   - Priority handling for urgent questions
+   - SLA tracking for response times
+
+9. **Hybrid Execution Mode**
+   - Start background, attach to foreground if question arises
+   - Agent sends signal when needing input
+   - CLI can "attach" to running agent
+
+---
+
+## Part 6: Implementation Priority
+
+| Priority | Feature | Effort | Impact |
+|----------|---------|--------|--------|
+| **P0** | Better `--fg` mode | Low | High - enables immediate interactive use |
+| **P0** | Exit code 100 for clarification | Low | Medium - better failure understanding |
+| **P1** | `luzia watch {job_id}` | Low | Medium - easier monitoring |
+| **P1** | File-based clarification | Medium | High - enables async Q&A |
+| **P2** | Telegram question bridge | Medium | Medium - mobile notification |
+| **P2** | Status dashboard | High | High - visual monitoring |
+| **P3** | Approval workflows | High | Medium - enterprise feature |
+
+---
+
+## Conclusion
+
+The current Luzia dispatch architecture is optimized for **fully autonomous** agent execution. This is the right default for background tasks. However, there's a gap for scenarios where:
+- Tasks are inherently ambiguous
+- User guidance is needed mid-task
+- High-stakes actions require approval
+
+The recommended path forward is:
+1. **Improve `--fg` mode** for true interactive sessions
+2. **Add file-based clarification** for async Q&A on background tasks
+3. **Integrate with Telegram** for push notifications on questions
+4. **Build status dashboard** for visual monitoring and interaction
+
+These improvements maintain the autonomous-by-default philosophy while enabling human-in-the-loop interaction when needed.
+
+---
+
+## Appendix: Key File Locations
+
+| File | Purpose |
+|------|---------|
+| `/opt/server-agents/orchestrator/bin/luzia:1102-1401` | `spawn_claude_agent()` - main dispatch |
+| `/opt/server-agents/orchestrator/bin/luzia:1404-1449` | `get_job_status()` - status checking |
+| `/opt/server-agents/orchestrator/bin/luzia:4000-4042` | `route_logs()` - log viewing |
+| `/opt/server-agents/orchestrator/lib/responsive_dispatcher.py` | Async dispatch patterns |
+| `/opt/server-agents/orchestrator/lib/cli_feedback.py` | CLI output formatting |
+| `/opt/server-agents/orchestrator/AGENT-AUTONOMY-RESEARCH.md` | Prior research on autonomy |
+| `/var/log/luz-orchestrator/jobs/` | Job directories |
+| `/var/log/luz-orchestrator/notifications.log` | Completion notifications |