
Luzia Agent Autonomy Research

Interactive Prompts and Autonomous Agent Patterns

Date: 2026-01-09 Version: 1.0 Status: Complete


Executive Summary

This research documents how Luzia and the Claude Agent SDK enable autonomous agents to handle interactive scenarios without blocking. The key insight is that blocking is prevented through architectural choices, not technical tricks:

  1. Detached Execution - Agents run in background processes, not waiting for input
  2. Non-Interactive Mode - Permission mode set to bypassPermissions to avoid approval dialogs
  3. Async Communication - Results delivered via files and notification logs, not stdin/stdout
  4. Failure Recovery - Exit codes captured for retry logic without agent restart
  5. Context-First Design - All necessary context provided upfront in prompts

Part 1: How Luzia Prevents Agent Blocking

1.1 The Core Pattern: Detached Spawning

File: /opt/server-agents/orchestrator/bin/luzia (lines 1012-1200)

# Agents run detached with nohup, not waiting for completion
os.system(f'nohup "{script_file}" >/dev/null 2>&1 &')

Key Design Decisions:

| Aspect | Implementation | Why It Works |
|---|---|---|
| Process isolation | `nohup ... &` spawns a detached process | Parent doesn't block; agent runs independently |
| Permission mode | `--permission-mode bypassPermissions` | No permission dialogs to pause the agent |
| PID tracking | Job directory captures PID at startup | Can monitor/kill if needed without blocking |
| Output capture | `tee` pipes output to a log file | Results captured even if the agent is backgrounded |
| Status tracking | Exit code appended to `output.log` | Job status determined post-execution |
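The detached-spawn pattern can also be expressed without `os.system`, using `subprocess.Popen` with `start_new_session=True`, which achieves the same detachment as `nohup ... &`. This is a minimal sketch, not Luzia's actual implementation; the `spawn_detached` name and the assumption of a prepared job directory containing `run.sh` are illustrative.

```python
import subprocess
from pathlib import Path

def spawn_detached(job_dir: Path) -> int:
    """Launch run.sh detached so the caller returns immediately.

    Equivalent in spirit to `nohup run.sh >/dev/null 2>&1 &`:
    start_new_session=True puts the child in its own session, so it
    receives no SIGHUP when the parent CLI exits.
    """
    script = job_dir / "run.sh"
    proc = subprocess.Popen(
        ["/bin/bash", str(script)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        stdin=subprocess.DEVNULL,
        start_new_session=True,  # detach from the controlling terminal
        cwd=job_dir,
    )
    # Record the PID so the job can be monitored or killed later
    (job_dir / "pid").write_text(str(proc.pid))
    return proc.pid
```

The call returns as soon as the process is forked; the caller never waits on the agent.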

1.2 The Full Agent Spawn Flow

Complete lifecycle (simplified):

1. spawn_claude_agent() called
   ↓
2. Job directory created: /var/log/luz-orchestrator/jobs/{job_id}/
   ├── prompt.txt (full context + task)
   ├── run.sh (executable shell script)
   ├── meta.json (job metadata)
   └── output.log (will capture all output)
   ↓
3. Script written with all environment setup
   ├── TMPDIR set to user's home (prevent /tmp collisions)
   ├── HOME set to project user
   ├── Current directory: project path
   └── Claude CLI invoked with full prompt
   ↓
4. Execution via nohup (detached)
   os.system(f'nohup "{script_file}" >/dev/null 2>&1 &')
   ↓
5. Control returns immediately to CLI
   ↓
6. Agent continues in background:
   ├── Reads prompt from file
   ├── Executes task (reads/writes files)
   ├── All output captured to output.log
   ├── Exit code captured: "exit:{code}"
   └── Completion logged to notifications.log

1.3 Permission Bypass Strategy

Critical Flag: --permission-mode bypassPermissions

# From spawn_claude_agent()
claude_cmd = f'claude --dangerously-skip-permissions --permission-mode bypassPermissions ...'

Why This Works:

  • No User Prompts: Claude doesn't ask for approval on tool use
  • Full Autonomy: Agent makes all decisions without waiting
  • Pre-Authorization: All permissions granted upfront in job spawning
  • Isolation: Each agent runs as project user in their own space

When This is Safe:

  • All agents have limited scope (project directory)
  • Running as restricted user (not root)
  • Task fully specified in prompt (no ambiguity)
  • Agent context includes execution environment details

Part 2: Handling Clarification Without Blocking

2.1 The AskUserQuestion Problem

When agents need clarification, Claude's AskUserQuestion tool blocks the agent process waiting for stdin input. For background agents, this is problematic.

Solutions in Luzia:

Solution 1: Context-First Design

Provide all necessary context upfront so agents rarely need to ask:

prompt = f"""
You are a project agent working on the **{project}** project.

## Your Task
{task}

## Execution Environment
- You are running as user: {run_as_user}
- Working directory: {project_path}
- All file operations are pre-authorized
- Complete the task autonomously

## Guidelines
- Complete the task autonomously
- If you encounter errors, debug and fix them
- Store important findings in the shared knowledge graph
"""

Solution 2: Structured Task Format

Use specific, unambiguous task descriptions:

GOOD:   "Run tests in /workspace/tests and report pass/fail count"
BAD:    "Fix the test suite"  (unclear what 'fix' means)

GOOD:   "Analyze src/index.ts for complexity metrics"
BAD:    "Improve code quality" (needs clarification on what to improve)

Solution 3: Async Fallback Mechanism

If clarification is truly needed, agents can:

  1. Create a hold file in job directory
  2. Log the question to a status file
  3. Return exit code 1 (needs input)
  4. Await resolution via file modification
# Example pattern for agent code:
# (Not yet implemented in Luzia, but the pattern is documented)

import json
import sys
from datetime import datetime
from pathlib import Path

job_dir = Path(f"/var/log/luz-orchestrator/jobs/{job_id}")

# Agent encounters ambiguity
question = "Should I update production database or staging?"

# Write question to file
clarification = {
    "status": "awaiting_clarification",
    "question": question,
    "options": ["production", "staging"],
    "agent_paused_at": datetime.now().isoformat(),
}
(job_dir / "clarification.json").write_text(json.dumps(clarification))

# Exit with code 1 to signal "needs input"
sys.exit(1)

Then externally:

# Operator provides input
echo '{"choice": "staging"}' > /var/log/luz-orchestrator/jobs/{job_id}/clarification.json

# Restart agent (automatic retry system)
luzia retry {job_id}

2.2 Why AskUserQuestion Doesn't Work for Async Agents

| Scenario | Issue | Solution |
|---|---|---|
| User runs `luzia project task` | User might close the terminal | Store prompt in a file, not stdin |
| Agent backgrounded | stdin not available | All input delivered via files |
| Multiple agents running | stdin interference | Use file-based IPC instead |
| Agent on a remote machine | stdin tunneling is complex | All I/O via files or HTTP |

Part 3: Job State Machine and Exit Codes

3.1 Job Lifecycle States

Defined in: /opt/server-agents/orchestrator/bin/luzia (lines 607-646)

def _get_actual_job_status(job_dir: Path) -> str:
    """Determine actual job status by checking output.log"""

    # Status values:
    # - "running" (process still active)
    # - "completed" (exit:0)
    # - "failed" (exit:non-zero)
    # - "killed" (exit:-9)
    # - "unknown" (no status info)

State Transitions:

Job Created
  ↓
[meta.json: status="running", output.log: empty]
  ↓
Agent Executes (captured in output.log)
  ↓
Agent Completes/Exits
  ↓
output.log appended with "exit:{code}"
  ↓
Status determined:
  ├─ exit:0     → "completed"
  ├─ exit:non-0 → "failed"
  ├─ exit:-9    → "killed"
  └─ no exit    → "running" (still active)
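The state machine above can be implemented as a small parser over `output.log`. This is a sketch of the logic `_get_actual_job_status` describes, not its actual code; the `job_status` name is illustrative.

```python
from pathlib import Path

def job_status(job_dir: Path) -> str:
    """Determine job status from the exit marker in output.log.

    The last "exit:{code}" line, if present, is authoritative;
    its absence means the agent is presumed still running.
    """
    log = job_dir / "output.log"
    if not log.exists():
        return "unknown"
    exit_lines = [line for line in log.read_text().splitlines()
                  if line.startswith("exit:")]
    if not exit_lines:
        return "running"
    code = int(exit_lines[-1].split(":", 1)[1])
    if code == 0:
        return "completed"
    if code == -9:
        return "killed"
    return "failed"
```

Because the status lives entirely in the log file, this function works long after the agent process has exited.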

3.2 The Critical Line: Capturing Exit Code

From run.sh template (lines 1148-1173):

# Command with output capture
stdbuf ... {claude_cmd} 2>&1 | tee "{output_file}"
exit_code=${PIPESTATUS[0]}

# CRITICAL: Append exit code to log
echo "" >> "{output_file}"
echo "exit:$exit_code" >> "{output_file}"

# Notify completion
{notify_cmd}

Why This Matters:

  • Job status determined by examining log file, not process exit
  • Exit code persists in file even after process terminates
  • Allows status queries without spawning process
  • Enables automatic retry logic based on exit code

Part 4: Handling Approval Prompts in Background

4.1 How Claude Code Approval Prompts Work

Claude Code tools can ask for permission before executing risky operations:

⚠️ This command has high privilege level. Approve?
[Y/n] _

In interactive mode: the user can respond.
In background mode: the command blocks indefinitely waiting for stdin.

4.2 Luzia's Prevention Mechanism

Three-layer approach:

  1. CLI Flag: --permission-mode bypassPermissions

    • Tells Claude CLI to skip permission dialogs
    • Requires --dangerously-skip-permissions flag
  2. Environment Setup: User runs as project user, not root

    • Limited scope prevents catastrophic damage
    • Job runs in isolated directory
    • File ownership is correct by default
  3. Process Isolation: Agent runs detached

    • Even if blocked, parent CLI returns immediately
    • Job continues in background
    • Can be monitored/killed separately

Example: Safe Bash Execution

# This command would normally require approval in interactive mode
command = "rm -rf /opt/sensitive-data"

# But in agent context:
# 1. Agent running as limited user (not root)
# 2. Project path restricted (can't access /opt from project user)
# 3. Permission flags bypass confirmation dialog
# 4. Agent detached (blocking doesn't affect CLI)

# Result: Command executes without interactive prompt

Part 5: Async Communication Patterns

5.1 File-Based Job Queue

Implemented in: /opt/server-agents/orchestrator/lib/queue_controller.py

Pattern:

User provides task → Enqueue to file-based queue → Status logged to disk
                                                       ↓
                                          Load-aware scheduler polls queue
                                                       ↓
                                          Task spawned as background agent
                                                       ↓
                                          Agent writes to output.log
                                                       ↓
                                          User queries status via filesystem

Queue Structure:

/var/lib/luzia/queue/
├── pending/
│   ├── high/
│   │   └── {priority}_{timestamp}_{project}_{task_id}.json
│   └── normal/
│       └── {priority}_{timestamp}_{project}_{task_id}.json
├── config.json
└── capacity.json

Task File Format:

{
  "id": "a1b2c3d4",
  "project": "musica",
  "priority": 5,
  "prompt": "Run tests in /workspace/tests",
  "skill_match": "test-runner",
  "enqueued_at": "2026-01-09T15:30:45Z",
  "enqueued_by": "admin",
  "status": "pending"
}
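An enqueue operation following this layout might look like the sketch below. It is not `queue_controller.py`'s actual code: the `enqueue_task` name and the priority threshold for the `high` lane are assumptions. The write-then-rename step ensures the scheduler never reads a half-written JSON file.

```python
import json
import os
import time
import uuid
from pathlib import Path

def enqueue_task(project: str, prompt: str, priority: int = 5,
                 queue_root: Path = Path("/var/lib/luzia/queue")) -> Path:
    """Write a task file into the pending queue atomically.

    Lane choice (priority >= 8 -> "high") is an illustrative
    assumption, not Luzia's documented policy.
    """
    task_id = uuid.uuid4().hex[:8]
    lane = "high" if priority >= 8 else "normal"
    task = {
        "id": task_id,
        "project": project,
        "priority": priority,
        "prompt": prompt,
        "enqueued_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "status": "pending",
    }
    lane_dir = queue_root / "pending" / lane
    lane_dir.mkdir(parents=True, exist_ok=True)
    name = f"{priority}_{int(time.time())}_{project}_{task_id}.json"
    tmp = lane_dir / (name + ".tmp")
    tmp.write_text(json.dumps(task, indent=2))
    final = lane_dir / name
    os.rename(tmp, final)  # atomic within the same filesystem
    return final
```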

5.2 Notification Log Pattern

Location: /var/log/luz-orchestrator/notifications.log

Pattern:

[14:23:15] Agent 142315-a1b2 finished (exit 0)
[14:24:03] Agent 142403-c3d4 finished (exit 1)
[14:25:12] Agent 142512-e5f6 finished (exit 0)

Usage:

# Script can tail this file to await completion
# without polling job directories
tail -f /var/log/luz-orchestrator/notifications.log | \
  grep "Agent {job_id}"

5.3 Job Directory as IPC Channel

Location: /var/log/luz-orchestrator/jobs/{job_id}/

Files Used for Communication:

| File | Purpose | Direction |
|---|---|---|
| `prompt.txt` | Task definition & context | Input (before agent starts) |
| `output.log` | Agent's stdout/stderr + exit code | Output (written during execution) |
| `meta.json` | Job metadata & status | Both (initial + final) |
| `clarification.json` | Awaiting user input (pattern) | Bidirectional |
| `run.sh` | Execution script | Input |
| `pid` | Process ID | Output |

Example: Monitoring Job Completion

#!/bin/bash
job_id="142315-a1b2"
job_dir="/var/log/luz-orchestrator/jobs/$job_id"

# Poll for completion
while true; do
  if grep -q "^exit:" "$job_dir/output.log"; then
    exit_code=$(grep "^exit:" "$job_dir/output.log" | tail -1 | cut -d: -f2)
    echo "Job completed with exit code: $exit_code"
    break
  fi
  sleep 1
done

Part 6: Prompt Patterns for Agent Autonomy

6.1 The Ideal Autonomous Agent Prompt

Pattern:

1. Identity & Context
   - What role is the agent playing?
   - What project/domain?

2. Task Specification
   - What needs to be done?
   - What are success criteria?
   - What are the constraints?

3. Execution Environment
   - What tools are available?
   - What directories can be accessed?
   - What permissions are granted?

4. Decision Autonomy
   - What decisions can the agent make alone?
   - When should it ask for clarification? (ideally: never)
   - What should it do if ambiguous?

5. Communication
   - Where should results be written?
   - What format (JSON, text, files)?
   - When should it report progress?

6. Failure Handling
   - What to do if task fails?
   - Should it retry? How many times?
   - What exit codes to use?

6.2 Good vs Bad Prompts for Autonomy

BAD - Requires Clarification:

"Help me improve the code"
- Ambiguous: which files? What metrics?
- No success criteria
- Agent likely to ask questions

"Fix the bug"
- Which bug? What symptoms?
- Agent needs to investigate then ask

GOOD - Autonomous:

"Run tests in /workspace/tests and report:
- Total test count
- Passed count
- Failed count
- Exit code (0 if all pass, 1 if any fail)"

"Analyze src/index.ts for:
- Lines of code
- Number of functions
- Max function complexity
- Save results to analysis.json"

6.3 Prompt Template for Autonomous Agents

Used in Luzia: /opt/server-agents/orchestrator/bin/luzia (lines 1053-1079)

prompt_template = """You are a project agent working on the **{project}** project.

{context}

## Your Task
{task}

## Execution Environment
- You are running as user: {run_as_user}
- You are running directly in the project directory: {project_path}
- You have FULL permission to read, write, and execute files in this directory
- Use standard Claude tools (Read, Write, Edit, Bash) directly
- All file operations are pre-authorized - proceed without asking for permission

## Knowledge Graph - IMPORTANT
Use the **shared/global knowledge graph** for storing knowledge:
- Use `mcp__shared-projects-memory__store_fact` to store facts
- Use `mcp__shared-projects-memory__query_relations` to query
- Use `mcp__shared-projects-memory__search_context` to search

## Guidelines
- Complete the task autonomously
- If you encounter errors, debug and fix them
- Store important findings in the shared knowledge graph
- Provide a summary of what was done when complete
"""

Key Autonomy Features:

  • No "ask for help" - pre-authorization is explicit
  • Clear environment details - no guessing about paths/permissions
  • Knowledge graph integration - preserve learnings across runs
  • Exit code expectations - clear success/failure criteria

Part 7: Pattern Summary - Building Autonomous Agents

7.1 The Five Patterns

| Pattern | When to Use | Implementation |
|---|---|---|
| Detached spawning | Background tasks that shouldn't block the CLI | `nohup ... &` with PID tracking |
| Permission bypass | Autonomous execution without prompts | `--permission-mode bypassPermissions` |
| File-based IPC | Async communication with agents | Job directory as channel |
| Exit code signaling | Status determination without polling | Append `exit:{code}` to output |
| Context-first prompts | Avoid clarification questions | Detailed spec + success criteria |

7.2 Comparison: Interactive vs Autonomous Patterns

| Aspect | Interactive Agent | Autonomous Agent |
|---|---|---|
| Execution | Runs in foreground, blocks | Detached process, returns immediately |
| Prompts | Can use AskUserQuestion | Must provide all context upfront |
| Approval | Can request tool permission | Uses `--permission-mode bypassPermissions` |
| I/O | stdin/stdout with user | Files, logs, notification channels |
| Failure | User responds to errors | Agent handles/reports via exit code |
| Monitoring | User watches output | Query filesystem for status |

7.3 When to Use Each Pattern

Use Interactive Agents When:

  • User is present and waiting
  • Task requires user input/decisions
  • Working in development/exploration mode
  • Real-time feedback is valuable

Use Autonomous Agents When:

  • Running background maintenance tasks
  • Multiple parallel operations needed
  • No user available to respond to prompts
  • Results needed asynchronously

Part 8: Real Implementation Examples

8.1 Example: Running Tests Autonomously

Task:

Run pytest in /workspace/tests and report results as JSON

Luzia Command:

luzia musica "Run pytest in /workspace/tests and save results to tests.json with {passed: int, failed: int, errors: int}"

What Happens:

  1. Job directory created with UUID
  2. Prompt written with full context
  3. Script prepared with environment setup
  4. Launched via nohup
  5. Immediately returns job_id to user
  6. Agent runs in background:
    cd /workspace
    pytest tests/ --json > tests.json
    # Results saved to file
    
  7. Exit code captured (0 if all pass, 1 if failures)
  8. Output logged to output.log
  9. Completion notification sent

User Monitor:

luzia jobs {job_id}
# Status: running/completed/failed
# Exit code: 0/1
# Output preview: last 10 lines

8.2 Example: Code Analysis Autonomously

Task:

Analyze the codebase structure and save metrics to analysis.json

Agent Does (no prompts needed):

  1. Reads prompt from job directory
  2. Scans project structure
  3. Collects metrics (LOC, functions, classes, complexity)
  4. Writes results to analysis.json
  5. Stores findings in knowledge graph
  6. Exits with 0

Success Criteria (in prompt):

Results saved to analysis.json with:
- total_files: int
- total_lines: int
- total_functions: int
- total_classes: int
- average_complexity: float

Part 9: Best Practices

9.1 Prompt Design for Autonomy

  1. Be Specific

    • What files? What directories?
    • What exact metrics/outputs?
    • What format (JSON, CSV, text)?
  2. Provide Success Criteria

    • What makes this task complete?
    • What should the output look like?
    • What exit code for success/failure?
  3. Include Error Handling

    • What if file doesn't exist?
    • What if command fails?
    • Should it retry or report and exit?
  4. Minimize Ambiguity

    • Don't say "improve code quality" - say what to measure
    • Don't say "fix bugs" - specify which bugs or how to find them
    • Don't say "optimize" - specify what metric to optimize

9.2 Environment Setup for Autonomy

  1. Pre-authorize Everything

    • Set correct user/group
    • Ensure file permissions allow operations
    • Document what's accessible
  2. Provide Full Context

    • Include CLAUDE.md or similar
    • Document project structure
    • Explain architectural decisions
  3. Set Clear Boundaries

    • Which directories can be modified?
    • Which operations are allowed?
    • What can't be changed?

9.3 Failure Recovery for Autonomy

  1. Use Exit Codes Meaningfully

    • 0 = success
    • 1 = recoverable failure
    • 2 = unrecoverable failure
    • -9 = killed/timeout
  2. Log Failures Comprehensively

    • What was attempted?
    • What failed and why?
    • What was tried to recover?
  3. Enable Automatic Retry

    • Retry on exit code 1 (optional)
    • Don't retry on exit code 2 (unrecoverable)
    • Track retry count to prevent infinite loops
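The exit-code convention above reduces the retry decision to a few lines. This is a sketch; the `should_retry` name and the `max_attempts` default are illustrative, not Luzia's actual API.

```python
def should_retry(exit_code: int, attempts: int, max_attempts: int = 3) -> bool:
    """Retry policy matching the exit-code convention above.

    1  = recoverable failure -> retry up to max_attempts
    0  = success, 2 = unrecoverable, -9 = killed -> never retry
    """
    if exit_code != 1:
        return False
    return attempts < max_attempts
```

A scheduler would call this after reading the `exit:{code}` marker from `output.log` and re-enqueue the job only when it returns `True`.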

Part 10: Advanced Patterns

10.1 Multi-Phase Autonomous Tasks

For complex tasks requiring multiple steps:

# Phase 1: Context gathering
# Phase 2: Analysis
# Phase 3: Implementation
# Phase 4: Verification
# Phase 5: Reporting

# All phases defined upfront in prompt
# Agent proceeds through all without asking
# Exit code reflects overall success
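A multi-phase prompt can be composed mechanically so the agent sees every step upfront and never stops to ask what comes next. The `build_phased_prompt` helper below is a hypothetical sketch, not part of Luzia.

```python
def build_phased_prompt(task: str, phases: list[str]) -> str:
    """Compose one prompt that walks the agent through all phases.

    Listing every phase upfront removes the need for mid-task
    clarification; the exit-code contract is stated explicitly.
    """
    lines = [
        f"Complete the following task autonomously: {task}",
        "",
        "Work through these phases in order:",
    ]
    for i, phase in enumerate(phases, 1):
        lines.append(f"{i}. {phase}")
    lines.append("")
    lines.append("Exit 0 only if every phase succeeded; exit 1 otherwise.")
    return "\n".join(lines)
```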

10.2 Knowledge Graph Integration

Agents store findings persistently:

# MCP tools are invoked by the agent at runtime, not imported as
# Python modules; the call below is shown schematically.

# After analysis, store findings for future agents via the
# mcp__shared-projects-memory__store_fact tool:
store_fact(
    entity_source_name="musica-project",
    relation="has_complexity_metrics",
    entity_target_name="analysis-2026-01-09",
    context={
        "avg_complexity": 3.2,
        "hotspots": ["index.ts", "processor.ts"]
    }
)

10.3 Cross-Agent Coordination

Use shared state file for coordination:

# /opt/server-agents/state/cross-agent-todos.json
# Agents read/update this to coordinate

{
  "current_tasks": [
    {
      "id": "analyze-musica",
      "project": "musica",
      "status": "in_progress",
      "assigned_to": "agent-a1b2",
      "started": "2026-01-09T14:23:00Z"
    }
  ],
  "completed_archive": { ... }
}
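Because several agents may read and update this file concurrently, each read-modify-write must be serialized. A sketch of one way to do that with `fcntl.flock` is shown below; the `claim_task` helper is hypothetical, not part of Luzia.

```python
import fcntl
import json
from pathlib import Path

def claim_task(state_file: Path, task_id: str, agent_id: str) -> bool:
    """Atomically claim a task in the shared state file.

    fcntl.flock serializes read-modify-write across agents on the
    same host; returns False if the task is missing or already
    claimed. The lock is released when the file is closed.
    """
    with open(state_file, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        state = json.load(f)
        for task in state.get("current_tasks", []):
            if task["id"] == task_id:
                if task.get("assigned_to"):
                    return False  # another agent got there first
                task["assigned_to"] = agent_id
                task["status"] = "in_progress"
                f.seek(0)
                json.dump(state, f, indent=2)
                f.truncate()
                return True
        return False
```

Note that `flock` only coordinates processes on the same host; cross-machine coordination would need a different mechanism.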

Part 11: Failure Cases and Solutions

11.1 Common Blocking Issues

| Issue | Cause | Solution |
|---|---|---|
| Agent pauses on permission prompt | Tool permission check enabled | Use `--permission-mode bypassPermissions` |
| Agent blocks on AskUserQuestion | Prompt leaves room for clarification | Redesign prompt with full context |
| stdin unavailable | Agent backgrounded | Use file-based IPC for input |
| Exit code not recorded | Script exits before writing exit code | Ensure `exit:{code}` in output.log |
| Job marked "running" forever | Process dies but exit code not appended | Use `tee` and explicit exit-code capture |

11.2 Debugging Blocking Agents

# Check if agent is actually running
ps aux | grep {job_id}

# Check job output (should show what it's doing)
tail -f /var/log/luz-orchestrator/jobs/{job_id}/output.log

# Check if waiting on stdin (strace writes to stderr; fd 0 is stdin)
strace -p {pid} 2>&1 | grep "read(0"

# Look for approval prompts in output
grep -i "approve\|confirm\|permission" /var/log/luz-orchestrator/jobs/{job_id}/output.log

# Check if exit code was written
tail -5 /var/log/luz-orchestrator/jobs/{job_id}/output.log

Part 12: Conclusion - Key Takeaways

12.1 The Core Principle

Autonomous agents don't ask for input because they don't need to.

Rather than implementing complex async prompting, the better approach is:

  1. Specify tasks completely - No ambiguity
  2. Provide full context - No guessing required
  3. Set clear boundaries - Know what's allowed
  4. Detach execution - Run independent of CLI
  5. Capture results - File-based communication

12.2 Implementation Checklist

  • Prompt includes all necessary context
  • Task has clear success criteria
  • Environment fully described (user, directory, permissions)
  • No ambiguous language in prompt
  • Exit codes defined (0=success, 1=failure, 2=error)
  • Output format specified (JSON, text, files)
  • Job runs as appropriate user
  • Results captured to files/logs
  • Notification system tracks completion
  • Status queryable without blocking

12.3 When Blocks Still Occur

  1. Rare: Well-designed prompts rarely need clarification
  2. Detectable: Agent exits with code 1 and logs to output.log
  3. Recoverable: Can retry or modify task and re-queue
  4. Monitorable: Parent CLI never blocks, can watch from elsewhere

Appendix A: Key Code Locations

| Location | Purpose |
|---|---|
| `/opt/server-agents/orchestrator/bin/luzia` (lines 1012-1200) | `spawn_claude_agent()` - core autonomous agent spawning |
| `/opt/server-agents/orchestrator/lib/docker_bridge.py` | Container isolation for project agents |
| `/opt/server-agents/orchestrator/lib/queue_controller.py` | File-based task queue with load awareness |
| `/var/log/luz-orchestrator/jobs/` | Job directory structure and IPC |
| `/opt/server-agents/orchestrator/CLAUDE.md` | Embedded instructions for agents |

Appendix B: Environment Variables

Agents have these set automatically:

TMPDIR="/home/{user}/.tmp"      # Prevent /tmp collisions
TEMP="/home/{user}/.tmp"        # Same
TMP="/home/{user}/.tmp"         # Same
HOME="/home/{user}"             # User's home directory
PWD="/path/to/project"          # Working directory

Appendix C: File Formats Reference

Job Directory Files

meta.json:

{
  "id": "142315-a1b2",
  "project": "musica",
  "task": "Run tests and report results",
  "type": "agent",
  "user": "musica",
  "pid": "12847",
  "started": "2026-01-09T14:23:15Z",
  "status": "running",
  "debug": false
}

output.log:

[14:23:15] Starting agent...
[14:23:16] Reading prompt from file
[14:23:17] Executing task...
[14:23:18] Running tests...
PASSED: test_1
PASSED: test_2
...
[14:23:25] Task complete

exit:0

References

  • Luzia CLI: /opt/server-agents/orchestrator/bin/luzia
  • Agent SDK: Claude Agent SDK (Anthropic)
  • Docker Bridge: Container isolation for agent execution
  • Queue Controller: File-based task queue implementation
  • Bot Orchestration Protocol: /opt/server-agents/BOT-ORCHESTRATION-PROTOCOL.md