
Luzia Agent Autonomy Research

Interactive Prompts and Autonomous Agent Patterns

Date: 2026-01-09 Version: 1.0 Status: Complete


Executive Summary

This research documents how Luzia and the Claude Agent SDK enable autonomous agents to handle interactive scenarios without blocking. The key insight is that blocking is prevented through architectural choices, not technical tricks:

  1. Detached Execution - Agents run in background processes, not waiting for input
  2. Non-Interactive Mode - Permission mode set to bypassPermissions to avoid approval dialogs
  3. Async Communication - Results delivered via files and notification logs, not stdin/stdout
  4. Failure Recovery - Exit codes captured for retry logic without agent restart
  5. Context-First Design - All necessary context provided upfront in prompts

Part 1: How Luzia Prevents Agent Blocking

1.1 The Core Pattern: Detached Spawning

File: /opt/server-agents/orchestrator/bin/luzia (lines 1012-1200)

# Agents run detached with nohup, not waiting for completion
os.system(f'nohup "{script_file}" >/dev/null 2>&1 &')

Key Design Decisions:

| Aspect | Implementation | Why It Works |
|---|---|---|
| Process isolation | `nohup ... &` spawns a detached process | Parent doesn't block; agent runs independently |
| Permission mode | `--permission-mode bypassPermissions` | No permission dialogs to pause the agent |
| PID tracking | Job directory captures PID at startup | Can monitor/kill if needed without blocking |
| Output capture | `tee` pipes output to a log file | Results captured even if the agent is backgrounded |
| Status tracking | Exit code appended to `output.log` | Job status determined post-execution |
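The detached-spawn pattern can also be expressed without `os.system`, using `subprocess.Popen` with `start_new_session=True`, which achieves the same detachment as `nohup ... &`. This is a minimal sketch, not Luzia's actual implementation; the `spawn_detached` name and the assumption of a prepared job directory containing `run.sh` are illustrative.

```python
import subprocess
from pathlib import Path

def spawn_detached(job_dir: Path) -> int:
    """Launch run.sh detached so the caller returns immediately.

    Equivalent in spirit to `nohup run.sh >/dev/null 2>&1 &`:
    start_new_session=True puts the child in its own session, so it
    receives no SIGHUP when the parent CLI exits.
    """
    script = job_dir / "run.sh"
    proc = subprocess.Popen(
        ["/bin/bash", str(script)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        stdin=subprocess.DEVNULL,
        start_new_session=True,  # detach from the controlling terminal
        cwd=job_dir,
    )
    # Record the PID so the job can be monitored or killed later
    (job_dir / "pid").write_text(str(proc.pid))
    return proc.pid
```

The call returns as soon as the process is forked; the caller never waits on the agent.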

1.2 The Full Agent Spawn Flow

Complete lifecycle (simplified):

1. spawn_claude_agent() called
   ↓
2. Job directory created: /var/log/luz-orchestrator/jobs/{job_id}/
   ├── prompt.txt (full context + task)
   ├── run.sh (executable shell script)
   ├── meta.json (job metadata)
   └── output.log (will capture all output)
   ↓
3. Script written with all environment setup
   ├── TMPDIR set to user's home (prevent /tmp collisions)
   ├── HOME set to project user
   ├── Current directory: project path
   └── Claude CLI invoked with full prompt
   ↓
4. Execution via nohup (detached)
   os.system(f'nohup "{script_file}" >/dev/null 2>&1 &')
   ↓
5. Control returns immediately to CLI
   ↓
6. Agent continues in background:
   ├── Reads prompt from file
   ├── Executes task (reads/writes files)
   ├── All output captured to output.log
   ├── Exit code captured: "exit:{code}"
   └── Completion logged to notifications.log

1.3 Permission Bypass Strategy

Critical Flag: --permission-mode bypassPermissions

# From spawn_claude_agent()
claude_cmd = f'claude --dangerously-skip-permissions --permission-mode bypassPermissions ...'

Why This Works:

  • No User Prompts: Claude doesn't ask for approval on tool use
  • Full Autonomy: Agent makes all decisions without waiting
  • Pre-Authorization: All permissions granted upfront in job spawning
  • Isolation: Each agent runs as project user in their own space

When This is Safe:

  • All agents have limited scope (project directory)
  • Running as restricted user (not root)
  • Task fully specified in prompt (no ambiguity)
  • Agent context includes execution environment details

Part 2: Handling Clarification Without Blocking

2.1 The AskUserQuestion Problem

When agents need clarification, Claude's AskUserQuestion tool blocks the agent process waiting for stdin input. For background agents, this is problematic.

Solutions in Luzia:

Solution 1: Context-First Design

Provide all necessary context upfront so agents rarely need to ask:

prompt = f"""
You are a project agent working on the **{project}** project.

## Your Task
{task}

## Execution Environment
- You are running as user: {run_as_user}
- Working directory: {project_path}
- All file operations are pre-authorized
- Complete the task autonomously

## Guidelines
- Complete the task autonomously
- If you encounter errors, debug and fix them
- Store important findings in the shared knowledge graph
"""

Solution 2: Structured Task Format

Use specific, unambiguous task descriptions:

GOOD:   "Run tests in /workspace/tests and report pass/fail count"
BAD:    "Fix the test suite"  (unclear what 'fix' means)

GOOD:   "Analyze src/index.ts for complexity metrics"
BAD:    "Improve code quality" (needs clarification on what to improve)

Solution 3: Async Fallback Mechanism

If clarification is truly needed, agents can:

  1. Create a hold file in job directory
  2. Log the question to a status file
  3. Return exit code 1 (needs input)
  4. Await resolution via file modification
# Example pattern for agent code:
# (Not yet implemented in Luzia, but the pattern is documented)

import json
import sys
from datetime import datetime
from pathlib import Path

job_dir = Path(f"/var/log/luz-orchestrator/jobs/{job_id}")

# Agent encounters ambiguity
question = "Should I update production database or staging?"

# Write question to file
clarification = {
    "status": "awaiting_clarification",
    "question": question,
    "options": ["production", "staging"],
    "agent_paused_at": datetime.now().isoformat(),
}
(job_dir / "clarification.json").write_text(json.dumps(clarification))

# Exit with code 1 to signal "needs input"
sys.exit(1)

Then externally:

# Operator provides input
echo '{"choice": "staging"}' > /var/log/luz-orchestrator/jobs/{job_id}/clarification.json

# Restart agent (automatic retry system)
luzia retry {job_id}

2.2 Why AskUserQuestion Doesn't Work for Async Agents

| Scenario | Issue | Solution |
|---|---|---|
| User runs `luzia project task` | User might close the terminal | Store prompt in a file, not stdin |
| Agent backgrounded | stdin not available | All input delivered via files |
| Multiple agents running | stdin interference | Use file-based IPC instead |
| Agent on a remote machine | stdin tunneling is complex | All I/O via files or HTTP |

Part 3: Job State Machine and Exit Codes

3.1 Job Lifecycle States

Defined in: /opt/server-agents/orchestrator/bin/luzia (lines 607-646)

def _get_actual_job_status(job_dir: Path) -> str:
    """Determine actual job status by checking output.log"""

    # Status values:
    # - "running" (process still active)
    # - "completed" (exit:0)
    # - "failed" (exit:non-zero)
    # - "killed" (exit:-9)
    # - "unknown" (no status info)

State Transitions:

Job Created
  ↓
[meta.json: status="running", output.log: empty]
  ↓
Agent Executes (captured in output.log)
  ↓
Agent Completes/Exits
  ↓
output.log appended with "exit:{code}"
  ↓
Status determined:
  ├─ exit:0     → "completed"
  ├─ exit:non-0 → "failed"
  ├─ exit:-9    → "killed"
  └─ no exit    → "running" (still active)
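The state machine above can be implemented as a small parser over `output.log`. This is a sketch of the logic `_get_actual_job_status` describes, not its actual code; the `job_status` name is illustrative.

```python
from pathlib import Path

def job_status(job_dir: Path) -> str:
    """Determine job status from the exit marker in output.log.

    The last "exit:{code}" line, if present, is authoritative;
    its absence means the agent is presumed still running.
    """
    log = job_dir / "output.log"
    if not log.exists():
        return "unknown"
    exit_lines = [line for line in log.read_text().splitlines()
                  if line.startswith("exit:")]
    if not exit_lines:
        return "running"
    code = int(exit_lines[-1].split(":", 1)[1])
    if code == 0:
        return "completed"
    if code == -9:
        return "killed"
    return "failed"
```

Because the status lives entirely in the log file, this function works long after the agent process has exited.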

3.2 The Critical Line: Capturing Exit Code

From run.sh template (lines 1148-1173):

# Command with output capture
stdbuf ... {claude_cmd} 2>&1 | tee "{output_file}"
exit_code=${PIPESTATUS[0]}

# CRITICAL: Append exit code to log
echo "" >> "{output_file}"
echo "exit:$exit_code" >> "{output_file}"

# Notify completion
{notify_cmd}

Why This Matters:

  • Job status determined by examining log file, not process exit
  • Exit code persists in file even after process terminates
  • Allows status queries without spawning process
  • Enables automatic retry logic based on exit code

Part 4: Handling Approval Prompts in Background

4.1 How Claude Code Approval Prompts Work

Claude Code tools can ask for permission before executing risky operations:

⚠️ This command has high privilege level. Approve?
[Y/n] _

In interactive mode: the user can respond.
In background mode: the command blocks indefinitely waiting for stdin.

4.2 Luzia's Prevention Mechanism

Three-layer approach:

  1. CLI Flag: --permission-mode bypassPermissions

    • Tells Claude CLI to skip permission dialogs
    • Requires --dangerously-skip-permissions flag
  2. Environment Setup: User runs as project user, not root

    • Limited scope prevents catastrophic damage
    • Job runs in isolated directory
    • File ownership is correct by default
  3. Process Isolation: Agent runs detached

    • Even if blocked, parent CLI returns immediately
    • Job continues in background
    • Can be monitored/killed separately

Example: Safe Bash Execution

# This command would normally require approval in interactive mode
command = "rm -rf /opt/sensitive-data"

# But in agent context:
# 1. Agent running as limited user (not root)
# 2. Project path restricted (can't access /opt from project user)
# 3. Permission flags bypass confirmation dialog
# 4. Agent detached (blocking doesn't affect CLI)

# Result: Command executes without interactive prompt

Part 5: Async Communication Patterns

5.1 File-Based Job Queue

Implemented in: /opt/server-agents/orchestrator/lib/queue_controller.py

Pattern:

User provides task → Enqueue to file-based queue → Status logged to disk
                                                       ↓
                                          Load-aware scheduler polls queue
                                                       ↓
                                          Task spawned as background agent
                                                       ↓
                                          Agent writes to output.log
                                                       ↓
                                          User queries status via filesystem

Queue Structure:

/var/lib/luzia/queue/
├── pending/
│   ├── high/
│   │   └── {priority}_{timestamp}_{project}_{task_id}.json
│   └── normal/
│       └── {priority}_{timestamp}_{project}_{task_id}.json
├── config.json
└── capacity.json

Task File Format:

{
  "id": "a1b2c3d4",
  "project": "musica",
  "priority": 5,
  "prompt": "Run tests in /workspace/tests",
  "skill_match": "test-runner",
  "enqueued_at": "2026-01-09T15:30:45Z",
  "enqueued_by": "admin",
  "status": "pending"
}
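An enqueue operation following this layout might look like the sketch below. It is not `queue_controller.py`'s actual code: the `enqueue_task` name and the priority threshold for the `high` lane are assumptions. The write-then-rename step ensures the scheduler never reads a half-written JSON file.

```python
import json
import os
import time
import uuid
from pathlib import Path

def enqueue_task(project: str, prompt: str, priority: int = 5,
                 queue_root: Path = Path("/var/lib/luzia/queue")) -> Path:
    """Write a task file into the pending queue atomically.

    Lane choice (priority >= 8 -> "high") is an illustrative
    assumption, not Luzia's documented policy.
    """
    task_id = uuid.uuid4().hex[:8]
    lane = "high" if priority >= 8 else "normal"
    task = {
        "id": task_id,
        "project": project,
        "priority": priority,
        "prompt": prompt,
        "enqueued_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "status": "pending",
    }
    lane_dir = queue_root / "pending" / lane
    lane_dir.mkdir(parents=True, exist_ok=True)
    name = f"{priority}_{int(time.time())}_{project}_{task_id}.json"
    tmp = lane_dir / (name + ".tmp")
    tmp.write_text(json.dumps(task, indent=2))
    final = lane_dir / name
    os.rename(tmp, final)  # atomic within the same filesystem
    return final
```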

5.2 Notification Log Pattern

Location: /var/log/luz-orchestrator/notifications.log

Pattern:

[14:23:15] Agent 142315-a1b2 finished (exit 0)
[14:24:03] Agent 142403-c3d4 finished (exit 1)
[14:25:12] Agent 142512-e5f6 finished (exit 0)

Usage:

# Script can tail this file to await completion
# without polling job directories
tail -f /var/log/luz-orchestrator/notifications.log | \
  grep "Agent {job_id}"

5.3 Job Directory as IPC Channel

Location: /var/log/luz-orchestrator/jobs/{job_id}/

Files Used for Communication:

| File | Purpose | Direction |
|---|---|---|
| `prompt.txt` | Task definition & context | Input (before agent starts) |
| `output.log` | Agent's stdout/stderr + exit code | Output (written during execution) |
| `meta.json` | Job metadata & status | Both (initial + final) |
| `clarification.json` | Awaiting user input (pattern) | Bidirectional |
| `run.sh` | Execution script | Input |
| `pid` | Process ID | Output |

Example: Monitoring Job Completion

#!/bin/bash
job_id="142315-a1b2"
job_dir="/var/log/luz-orchestrator/jobs/$job_id"

# Poll for completion
while true; do
  if grep -q "^exit:" "$job_dir/output.log"; then
    exit_code=$(grep "^exit:" "$job_dir/output.log" | tail -1 | cut -d: -f2)
    echo "Job completed with exit code: $exit_code"
    break
  fi
  sleep 1
done

Part 6: Prompt Patterns for Agent Autonomy

6.1 The Ideal Autonomous Agent Prompt

Pattern:

1. Identity & Context
   - What role is the agent playing?
   - What project/domain?

2. Task Specification
   - What needs to be done?
   - What are success criteria?
   - What are the constraints?

3. Execution Environment
   - What tools are available?
   - What directories can be accessed?
   - What permissions are granted?

4. Decision Autonomy
   - What decisions can the agent make alone?
   - When should it ask for clarification? (ideally: never)
   - What should it do if ambiguous?

5. Communication
   - Where should results be written?
   - What format (JSON, text, files)?
   - When should it report progress?

6. Failure Handling
   - What to do if task fails?
   - Should it retry? How many times?
   - What exit codes to use?

6.2 Good vs Bad Prompts for Autonomy

BAD - Requires Clarification:

"Help me improve the code"
- Ambiguous: which files? What metrics?
- No success criteria
- Agent likely to ask questions

"Fix the bug"
- Which bug? What symptoms?
- Agent needs to investigate then ask

GOOD - Autonomous:

"Run tests in /workspace/tests and report:
- Total test count
- Passed count
- Failed count
- Exit code (0 if all pass, 1 if any fail)"

"Analyze src/index.ts for:
- Lines of code
- Number of functions
- Max function complexity
- Save results to analysis.json"

6.3 Prompt Template for Autonomous Agents

Used in Luzia: /opt/server-agents/orchestrator/bin/luzia (lines 1053-1079)

prompt_template = """You are a project agent working on the **{project}** project.

{context}

## Your Task
{task}

## Execution Environment
- You are running as user: {run_as_user}
- You are running directly in the project directory: {project_path}
- You have FULL permission to read, write, and execute files in this directory
- Use standard Claude tools (Read, Write, Edit, Bash) directly
- All file operations are pre-authorized - proceed without asking for permission

## Knowledge Graph - IMPORTANT
Use the **shared/global knowledge graph** for storing knowledge:
- Use `mcp__shared-projects-memory__store_fact` to store facts
- Use `mcp__shared-projects-memory__query_relations` to query
- Use `mcp__shared-projects-memory__search_context` to search

## Guidelines
- Complete the task autonomously
- If you encounter errors, debug and fix them
- Store important findings in the shared knowledge graph
- Provide a summary of what was done when complete
"""

Key Autonomy Features:

  • No "ask for help" - pre-authorization is explicit
  • Clear environment details - no guessing about paths/permissions
  • Knowledge graph integration - preserve learnings across runs
  • Exit code expectations - clear success/failure criteria

Part 7: Pattern Summary - Building Autonomous Agents

7.1 The Five Patterns

| Pattern | When to Use | Implementation |
|---|---|---|
| Detached spawning | Background tasks that shouldn't block the CLI | `nohup ... &` with PID tracking |
| Permission bypass | Autonomous execution without prompts | `--permission-mode bypassPermissions` |
| File-based IPC | Async communication with agents | Job directory as channel |
| Exit code signaling | Status determination without polling | Append `exit:{code}` to output |
| Context-first prompts | Avoid clarification questions | Detailed spec + success criteria |

7.2 Comparison: Interactive vs Autonomous Patterns

| Aspect | Interactive Agent | Autonomous Agent |
|---|---|---|
| Execution | Runs in foreground, blocks | Detached process, returns immediately |
| Prompts | Can use AskUserQuestion | Must provide all context upfront |
| Approval | Can request tool permission | Uses `--permission-mode bypassPermissions` |
| I/O | stdin/stdout with user | Files, logs, notification channels |
| Failure | User responds to errors | Agent handles/reports via exit code |
| Monitoring | User watches output | Query filesystem for status |

7.3 When to Use Each Pattern

Use Interactive Agents When:

  • User is present and waiting
  • Task requires user input/decisions
  • Working in development/exploration mode
  • Real-time feedback is valuable

Use Autonomous Agents When:

  • Running background maintenance tasks
  • Multiple parallel operations needed
  • No user available to respond to prompts
  • Results needed asynchronously

Part 8: Real Implementation Examples

8.1 Example: Running Tests Autonomously

Task:

Run pytest in /workspace/tests and report results as JSON

Luzia Command:

luzia musica "Run pytest in /workspace/tests and save results to tests.json with {passed: int, failed: int, errors: int}"

What Happens:

  1. Job directory created with UUID
  2. Prompt written with full context
  3. Script prepared with environment setup
  4. Launched via nohup
  5. Immediately returns job_id to user
  6. Agent runs in background:
    cd /workspace
    pytest tests/ --json > tests.json
    # Results saved to file
    
  7. Exit code captured (0 if all pass, 1 if failures)
  8. Output logged to output.log
  9. Completion notification sent

User Monitor:

luzia jobs {job_id}
# Status: running/completed/failed
# Exit code: 0/1
# Output preview: last 10 lines

8.2 Example: Code Analysis Autonomously

Task:

Analyze the codebase structure and save metrics to analysis.json

Agent Does (no prompts needed):

  1. Reads prompt from job directory
  2. Scans project structure
  3. Collects metrics (LOC, functions, classes, complexity)
  4. Writes results to analysis.json
  5. Stores findings in knowledge graph
  6. Exits with 0

Success Criteria (in prompt):

Results saved to analysis.json with:
- total_files: int
- total_lines: int
- total_functions: int
- total_classes: int
- average_complexity: float

Part 9: Best Practices

9.1 Prompt Design for Autonomy

  1. Be Specific

    • What files? What directories?
    • What exact metrics/outputs?
    • What format (JSON, CSV, text)?
  2. Provide Success Criteria

    • What makes this task complete?
    • What should the output look like?
    • What exit code for success/failure?
  3. Include Error Handling

    • What if file doesn't exist?
    • What if command fails?
    • Should it retry or report and exit?
  4. Minimize Ambiguity

    • Don't say "improve code quality" - say what to measure
    • Don't say "fix bugs" - specify which bugs or how to find them
    • Don't say "optimize" - specify what metric to optimize

9.2 Environment Setup for Autonomy

  1. Pre-authorize Everything

    • Set correct user/group
    • Ensure file permissions allow operations
    • Document what's accessible
  2. Provide Full Context

    • Include CLAUDE.md or similar
    • Document project structure
    • Explain architectural decisions
  3. Set Clear Boundaries

    • Which directories can be modified?
    • Which operations are allowed?
    • What can't be changed?

9.3 Failure Recovery for Autonomy

  1. Use Exit Codes Meaningfully

    • 0 = success
    • 1 = recoverable failure
    • 2 = unrecoverable failure
    • -9 = killed/timeout
  2. Log Failures Comprehensively

    • What was attempted?
    • What failed and why?
    • What was tried to recover?
  3. Enable Automatic Retry

    • Retry on exit code 1 (optional)
    • Don't retry on exit code 2 (unrecoverable)
    • Track retry count to prevent infinite loops
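The exit-code convention above reduces the retry decision to a few lines. This is a sketch; the `should_retry` name and the `max_attempts` default are illustrative, not Luzia's actual API.

```python
def should_retry(exit_code: int, attempts: int, max_attempts: int = 3) -> bool:
    """Retry policy matching the exit-code convention above.

    1  = recoverable failure -> retry up to max_attempts
    0  = success, 2 = unrecoverable, -9 = killed -> never retry
    """
    if exit_code != 1:
        return False
    return attempts < max_attempts
```

A scheduler would call this after reading the `exit:{code}` marker from `output.log` and re-enqueue the job only when it returns `True`.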

Part 10: Advanced Patterns

10.1 Multi-Phase Autonomous Tasks

For complex tasks requiring multiple steps:

# Phase 1: Context gathering
# Phase 2: Analysis
# Phase 3: Implementation
# Phase 4: Verification
# Phase 5: Reporting

# All phases defined upfront in prompt
# Agent proceeds through all without asking
# Exit code reflects overall success
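A multi-phase prompt can be composed mechanically so the agent sees every step upfront and never stops to ask what comes next. The `build_phased_prompt` helper below is a hypothetical sketch, not part of Luzia.

```python
def build_phased_prompt(task: str, phases: list[str]) -> str:
    """Compose one prompt that walks the agent through all phases.

    Listing every phase upfront removes the need for mid-task
    clarification; the exit-code contract is stated explicitly.
    """
    lines = [
        f"Complete the following task autonomously: {task}",
        "",
        "Work through these phases in order:",
    ]
    for i, phase in enumerate(phases, 1):
        lines.append(f"{i}. {phase}")
    lines.append("")
    lines.append("Exit 0 only if every phase succeeded; exit 1 otherwise.")
    return "\n".join(lines)
```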

10.2 Knowledge Graph Integration

Agents store findings persistently:

# MCP tools are invoked by the agent at runtime, not imported as
# Python modules; the call below is shown schematically.

# After analysis, store findings for future agents via the
# mcp__shared-projects-memory__store_fact tool:
store_fact(
    entity_source_name="musica-project",
    relation="has_complexity_metrics",
    entity_target_name="analysis-2026-01-09",
    context={
        "avg_complexity": 3.2,
        "hotspots": ["index.ts", "processor.ts"]
    }
)

10.3 Cross-Agent Coordination

Use shared state file for coordination:

# /opt/server-agents/state/cross-agent-todos.json
# Agents read/update this to coordinate

{
  "current_tasks": [
    {
      "id": "analyze-musica",
      "project": "musica",
      "status": "in_progress",
      "assigned_to": "agent-a1b2",
      "started": "2026-01-09T14:23:00Z"
    }
  ],
  "completed_archive": { ... }
}
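Because several agents may read and update this file concurrently, each read-modify-write must be serialized. A sketch of one way to do that with `fcntl.flock` is shown below; the `claim_task` helper is hypothetical, not part of Luzia.

```python
import fcntl
import json
from pathlib import Path

def claim_task(state_file: Path, task_id: str, agent_id: str) -> bool:
    """Atomically claim a task in the shared state file.

    fcntl.flock serializes read-modify-write across agents on the
    same host; returns False if the task is missing or already
    claimed. The lock is released when the file is closed.
    """
    with open(state_file, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        state = json.load(f)
        for task in state.get("current_tasks", []):
            if task["id"] == task_id:
                if task.get("assigned_to"):
                    return False  # another agent got there first
                task["assigned_to"] = agent_id
                task["status"] = "in_progress"
                f.seek(0)
                json.dump(state, f, indent=2)
                f.truncate()
                return True
        return False
```

Note that `flock` only coordinates processes on the same host; cross-machine coordination would need a different mechanism.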

Part 11: Failure Cases and Solutions

11.1 Common Blocking Issues

| Issue | Cause | Solution |
|---|---|---|
| Agent pauses on permission prompt | Tool permission check enabled | Use `--permission-mode bypassPermissions` |
| Agent blocks on AskUserQuestion | Prompt leaves room for clarification | Redesign prompt with full context |
| stdin unavailable | Agent backgrounded | Use file-based IPC for input |
| Exit code not recorded | Script exits before writing exit code | Ensure `exit:{code}` in output.log |
| Job marked "running" forever | Process dies but exit code not appended | Use `tee` and explicit exit-code capture |

11.2 Debugging Blocking Agents

# Check if agent is actually running
ps aux | grep {job_id}

# Check job output (should show what it's doing)
tail -f /var/log/luz-orchestrator/jobs/{job_id}/output.log

# Check if waiting on stdin (strace writes to stderr; fd 0 is stdin)
strace -p {pid} 2>&1 | grep "read(0"

# Look for approval prompts in output
grep -i "approve\|confirm\|permission" /var/log/luz-orchestrator/jobs/{job_id}/output.log

# Check if exit code was written
tail -5 /var/log/luz-orchestrator/jobs/{job_id}/output.log

Part 12: Conclusion - Key Takeaways

12.1 The Core Principle

Autonomous agents don't ask for input because they don't need to.

Rather than implementing complex async prompting, the better approach is:

  1. Specify tasks completely - No ambiguity
  2. Provide full context - No guessing required
  3. Set clear boundaries - Know what's allowed
  4. Detach execution - Run independent of CLI
  5. Capture results - File-based communication

12.2 Implementation Checklist

  • Prompt includes all necessary context
  • Task has clear success criteria
  • Environment fully described (user, directory, permissions)
  • No ambiguous language in prompt
  • Exit codes defined (0=success, 1=failure, 2=error)
  • Output format specified (JSON, text, files)
  • Job runs as appropriate user
  • Results captured to files/logs
  • Notification system tracks completion
  • Status queryable without blocking

12.3 When Blocks Still Occur

  1. Rare: Well-designed prompts rarely need clarification
  2. Detectable: Agent exits with code 1 and logs to output.log
  3. Recoverable: Can retry or modify task and re-queue
  4. Monitorable: Parent CLI never blocks, can watch from elsewhere

Appendix A: Key Code Locations

| Location | Purpose |
|---|---|
| `/opt/server-agents/orchestrator/bin/luzia` (lines 1012-1200) | `spawn_claude_agent()` - core autonomous agent spawning |
| `/opt/server-agents/orchestrator/lib/docker_bridge.py` | Container isolation for project agents |
| `/opt/server-agents/orchestrator/lib/queue_controller.py` | File-based task queue with load awareness |
| `/var/log/luz-orchestrator/jobs/` | Job directory structure and IPC |
| `/opt/server-agents/orchestrator/CLAUDE.md` | Embedded instructions for agents |

Appendix B: Environment Variables

Agents have these set automatically:

TMPDIR="/home/{user}/.tmp"      # Prevent /tmp collisions
TEMP="/home/{user}/.tmp"        # Same
TMP="/home/{user}/.tmp"         # Same
HOME="/home/{user}"             # User's home directory
PWD="/path/to/project"          # Working directory

Appendix C: File Formats Reference

Job Directory Files

meta.json:

{
  "id": "142315-a1b2",
  "project": "musica",
  "task": "Run tests and report results",
  "type": "agent",
  "user": "musica",
  "pid": "12847",
  "started": "2026-01-09T14:23:15Z",
  "status": "running",
  "debug": false
}

output.log:

[14:23:15] Starting agent...
[14:23:16] Reading prompt from file
[14:23:17] Executing task...
[14:23:18] Running tests...
PASSED: test_1
PASSED: test_2
...
[14:23:25] Task complete

exit:0

References

  • Luzia CLI: /opt/server-agents/orchestrator/bin/luzia
  • Agent SDK: Claude Agent SDK (Anthropic)
  • Docker Bridge: Container isolation for agent execution
  • Queue Controller: File-based task queue implementation
  • Bot Orchestration Protocol: /opt/server-agents/BOT-ORCHESTRATION-PROTOCOL.md