luzia/docs/RESPONSIVE-DISPATCHER.md

# Responsive Dispatcher - Non-blocking Task Dispatch
## Overview
The Responsive Dispatcher is a new subsystem in Luzia that enables **non-blocking task dispatch** with **immediate job_id return** and **live status tracking**. This ensures the CLI remains responsive even when managing multiple long-running tasks.
### Key Features
1. **Immediate Return**: Task dispatch returns a job_id within milliseconds
2. **Background Processing**: All job monitoring happens asynchronously
3. **Status Polling**: Check job status without blocking the main CLI
4. **Concurrent Management**: Track multiple concurrent tasks independently
5. **Live Feedback**: Pretty-printed status updates with progress indicators
6. **Status Caching**: Fast status retrieval with intelligent cache invalidation
## Architecture
### Components
```
┌─────────────────────┐
│ CLI (Luzia)         │
│ "luzia <proj>..."   │
└──────────┬──────────┘
           ▼
┌─────────────────────────────────────────┐
│ EnhancedDispatcher                      │
│ - dispatch_and_report()                 │
│ - get_status_and_display()              │
│ - show_jobs_summary()                   │
└──────────┬──────────────────────────────┘
      ┌────┴────┐
      ▼         ▼
┌──────────┐ ┌──────────────────────┐
│Response  │ │ Background Monitor   │
│Dispatcher│ │ (Thread)             │
└──────────┘ │ - Polls job status   │
             │ - Updates status.json│
             │ - Detects completion │
             └──────────────────────┘

Job Status (persisted):
- /var/lib/luzia/jobs/<job_id>/
  ├── status.json (updated by monitor)
  ├── output.log  (agent output)
  ├── meta.json   (job metadata)
  └── progress.md (progress tracking)
```
### Task Dispatch Flow
```
1. User: luzia project "natural language task"
2. CLI: route_project_task()
3. Enhanced Dispatcher: dispatch_and_report()
   ├─ Create job directory (/var/lib/luzia/jobs/<job_id>/)
   ├─ Write initial status.json (dispatched)
   ├─ Queue job for background monitoring
   └─ Return job_id immediately (<100ms)
4. CLI Output: "agent:project:job_id"
5. Background (async):
   ├─ Monitor waits for agent to start
   ├─ Polls output.log for progress
   ├─ Updates status.json with live info
   └─ Detects completion and exit code
6. User: luzia jobs job_id (anytime)
7. CLI: display current status
   └─ No waiting, instant feedback
```
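Step 3 of the flow above can be sketched roughly as follows. This is a minimal illustration, not Luzia's actual implementation: the `dispatch` function, the module-level `monitor_queue`, and the job ID format (HHMMSS plus four hex characters, guessed from the examples in this document) are all assumptions. A temporary directory stands in for `/var/lib/luzia/jobs/`.

```python
import json
import queue
import tempfile
import time
import uuid
from datetime import datetime
from pathlib import Path

# Hypothetical stand-in for the monitor's work queue.
monitor_queue = queue.Queue()


def dispatch(jobs_dir: Path, project: str, task: str) -> str:
    """Create the job directory, write the initial status, queue the job, return."""
    # Assumed ID format: HHMMSS-xxxx (not confirmed by the source).
    job_id = f"{time.strftime('%H%M%S')}-{uuid.uuid4().hex[:4]}"
    job_dir = jobs_dir / job_id
    job_dir.mkdir(parents=True)

    status = {
        "id": job_id,
        "project": project,
        "task": task,
        "status": "dispatched",
        "progress": 0,
        "dispatched_at": datetime.now().isoformat(),
        "exit_code": None,
    }
    (job_dir / "status.json").write_text(json.dumps(status, indent=2))

    # No waiting here: the background monitor picks the job up later.
    monitor_queue.put(job_id)
    return job_id


# Usage: dispatch into a throwaway directory and read the status back.
with tempfile.TemporaryDirectory() as tmp:
    job_id = dispatch(Path(tmp), "overbits", "fix the login button")
    saved = json.loads((Path(tmp) / job_id / "status.json").read_text())
    print(f"agent:{saved['project']}:{job_id}")
```

The only work on the caller's path is one directory creation and one small file write, which is what keeps the dispatch step fast.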
## Usage Guide
### Dispatching Tasks
Dispatching a task now returns immediately:
```bash
$ luzia overbits "fix the login button"
✓ Dispatched
  Job ID:  113754-a2f5
  Project: overbits
  Use: luzia jobs              to view status
       luzia jobs 113754-a2f5  for details
```
The job runs in the background while you continue using the CLI.
### Checking Job Status
View a specific job:
```bash
$ luzia jobs 113754-a2f5
113754-a2f5  running  42%  overbits  Building solution...
Details:
  Job ID:   113754-a2f5
  Project:  overbits
  Status:   running
  Progress: 42%
  Message:  Building solution...
  Created:  2025-01-09T10:23:45.123456
  Updated:  2025-01-09T10:24:12.456789
```
### List All Jobs
See all recent jobs:
```bash
$ luzia jobs
Recent Jobs:
Job ID       Status     Prog  Project    Message
----------------------------------------------------------------------------------------------------
113754-a2f5  running     42%  overbits   Building solution...
113754-8e4b  running     65%  musica     Analyzing audio...
113754-7f2d  completed  100%  dss        Task completed
113754-5c9a  failed      50%  librechat  Connection error
```
### Monitor Specific Job (Interactive)
Watch a job's progress in real-time:
```bash
$ luzia jobs 113754-a2f5 --watch
Monitoring job: 113754-a2f5
starting  [░░░░░░░░░░░░░░░░░░░░]   5% Agent initialization
running   [██████░░░░░░░░░░░░░░]  30% Installing dependencies
running   [████████████░░░░░░░░]  65% Building project
running   [██████████████████░░]  95% Running tests
completed [████████████████████] 100% Task completed
Final Status:
Details:
  Job ID:    113754-a2f5
  Project:   overbits
  Status:    completed
  Progress:  100%
  Message:   Task completed
  Exit Code: 0
```
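A fixed-width progress bar like the one in the watch output can be rendered with a few lines of string arithmetic. The sketch below is illustrative only: `render_bar` and `format_watch_line` are hypothetical names, and the rounding rule (floor) is an assumption, so individual bars may differ slightly from what Luzia prints.

```python
def render_bar(progress: int, width: int = 20) -> str:
    """Render a percentage as a bar of `width` cells, flooring partial cells."""
    progress = max(0, min(100, progress))  # clamp out-of-range values
    filled = progress * width // 100
    return "[" + "█" * filled + "░" * (width - filled) + "]"


def format_watch_line(status: str, progress: int, message: str) -> str:
    """Combine status, bar, percentage, and message into one display line."""
    return f"{status:<9} {render_bar(progress)} {progress:>3}% {message}"


print(format_watch_line("running", 30, "Installing dependencies"))
print(format_watch_line("completed", 100, "Task completed"))
```

Clamping the input keeps the bar the same width even if a status file briefly reports a value outside 0 to 100.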
### Multiple Concurrent Tasks
Dispatch multiple tasks at once:
```bash
$ luzia overbits "fix button"
agent:overbits:113754-a2f5
$ luzia musica "analyze audio"
agent:musica:113754-8e4b
$ luzia dss "verify signature"
agent:dss:113754-9f3c
$ luzia jobs
Task Summary:
  Running: 3
  Pending: 0
  Completed: 0
  Failed: 0
Currently Running:
  113754-a2f5  running   42%  overbits  Building...
  113754-8e4b  running   65%  musica    Analyzing...
  113754-9f3c  starting   5%  dss       Initializing...
```
All tasks run concurrently without blocking each other!
## Implementation Details
### Status File Format
Each job has a `status.json` that tracks its state:
```json
{
  "id": "113754-a2f5",
  "project": "overbits",
  "task": "fix the login button",
  "status": "running",
  "priority": 5,
  "progress": 42,
  "message": "Building solution...",
  "dispatched_at": "2025-01-09T10:23:45.123456",
  "updated_at": "2025-01-09T10:24:12.456789",
  "exit_code": null
}
```
Status transitions:
- `dispatched` → `starting` → `running` → `completed`
- `running` → `failed` (if exit code != 0)
- `running` → `stalled` (if no output for 30+ seconds)
- Any state → `killed` (if manually killed)
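The transitions listed above can be encoded as a small lookup table. This is a sketch under assumptions, not the dispatcher's actual validation logic: the `ALLOWED` table and `can_transition` helper are hypothetical, and recovery from `stalled` back to `running` is deliberately left out because the list above does not say whether it is allowed.

```python
# Transitions taken directly from the list above.
ALLOWED = {
    "dispatched": {"starting"},
    "starting": {"running"},
    "running": {"completed", "failed", "stalled"},
}


def can_transition(current: str, new: str) -> bool:
    """Return True if moving from `current` to `new` matches the documented rules."""
    if new == "killed":
        # "Any state -> killed" is read literally here.
        return True
    return new in ALLOWED.get(current, set())


print(can_transition("dispatched", "starting"))   # documented happy path
print(can_transition("dispatched", "completed"))  # skips intermediate states
```

A table like this makes it cheap to reject an out-of-order update before it is written to `status.json`.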
### Background Monitor
The responsive dispatcher starts a background monitor thread that:
1. Polls job queues for new tasks
2. Waits for agents to start (checks output.log / meta.json)
3. Monitors execution (reads output.log size, parses exit codes)
4. Updates status.json atomically
5. Detects stalled jobs (no output for 30 seconds)
6. Maintains job completion history
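Two of the monitor's duties, atomic status updates and stall detection, lend themselves to a short sketch. Everything here is an assumption about how such a monitor could work rather than a description of Luzia's code: `write_status_atomic` uses the common write-to-temp-then-`os.replace` pattern, and `StallDetector` is a hypothetical helper that tracks the size of `output.log` against the 30-second threshold mentioned above.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path

STALL_SECONDS = 30  # threshold quoted in the text


def write_status_atomic(status_path: Path, status: dict) -> None:
    """Write JSON to a temp file in the same directory, then rename over the target."""
    fd, tmp_name = tempfile.mkstemp(dir=status_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(status, tmp)
        os.replace(tmp_name, status_path)  # atomic on the same filesystem
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


class StallDetector:
    """Flags a job as stalled when the observed log size stops changing."""

    def __init__(self, threshold: float = STALL_SECONDS):
        self.threshold = threshold
        self.last_size = -1
        self.last_change = None

    def observe(self, size: int, now: float) -> bool:
        """Record the current log size; return True once the job counts as stalled."""
        if self.last_change is None or size != self.last_size:
            self.last_size = size
            self.last_change = now
            return False
        return now - self.last_change >= self.threshold


# Usage: simulated (size, timestamp) readings of output.log.
detector = StallDetector()
readings = [detector.observe(s, t) for s, t in [(0, 0.0), (0, 10.0), (0, 30.0), (5, 31.0)]]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "status.json"
    write_status_atomic(path, {"status": "running", "progress": 42})
    reloaded = json.loads(path.read_text())
    leftovers = sorted(p.name for p in Path(tmp).iterdir())
```

Because readers only ever see the old file or the fully written new one, a CLI call that reads `status.json` mid-update should not encounter a half-written document.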
### Cache Strategy
Status caching ensures fast retrieval:
- A cached entry expires **1 second** after its last update
- `get_status(job_id, use_cache=True)` returns instantly from cache
- `get_status(job_id, use_cache=False)` reads from disk (fresh data)
- Cache is automatically invalidated when status is updated
```python
# Fast cached read (if < 1 sec old)
status = dispatcher.get_status(job_id)
# Force fresh read from disk
status = dispatcher.get_status(job_id, use_cache=False)
```
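The caching rules above amount to a time-to-live (TTL) cache with explicit invalidation. The following is a minimal sketch of that idea, not the dispatcher's real cache: the `StatusCache` class and its injectable `clock` parameter are assumptions made so the behavior can be demonstrated without sleeping.

```python
import time


class StatusCache:
    """Tiny TTL cache keyed by job ID (illustrative only)."""

    def __init__(self, ttl: float = 1.0, clock=time.monotonic):
        self.ttl = ttl
        self._clock = clock
        self._entries = {}  # job_id -> (stored_at, status)

    def get(self, job_id: str):
        """Return the cached status, or None if missing or older than the TTL."""
        entry = self._entries.get(job_id)
        if entry is None:
            return None
        stored_at, status = entry
        if self._clock() - stored_at >= self.ttl:
            del self._entries[job_id]
            return None
        return status

    def put(self, job_id: str, status: dict) -> None:
        self._entries[job_id] = (self._clock(), status)

    def invalidate(self, job_id: str) -> None:
        """Drop an entry, for example right after its status is updated."""
        self._entries.pop(job_id, None)


# Usage with a fake clock so expiry can be shown deterministically.
fake_now = [0.0]
cache = StatusCache(ttl=1.0, clock=lambda: fake_now[0])
cache.put("113754-a2f5", {"status": "running"})
fresh = cache.get("113754-a2f5")
fake_now[0] = 1.5
expired = cache.get("113754-a2f5")
```

A caller that gets `None` back would fall through to reading `status.json` from disk and then `put` the fresh value.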
## API Reference
### ResponseiveDispatcher
Core non-blocking dispatcher:
```python
from lib.responsive_dispatcher import ResponseiveDispatcher

dispatcher = ResponseiveDispatcher()

# Dispatch and get job_id immediately
job_id, status = dispatcher.dispatch_task(
    project="overbits",
    task="fix login button",
    priority=5,
)

# Get current status (with cache)
status = dispatcher.get_status(job_id)

# Update status (used by monitor)
dispatcher.update_status(
    job_id,
    status="running",
    progress=50,
    message="Processing...",
)

# List jobs
jobs = dispatcher.list_jobs(project="overbits", status_filter="running")

# Wait for completion (blocking)
final_status = dispatcher.wait_for_job(job_id, timeout=3600)

# Stream updates (for interactive display)
dispatcher.stream_status(job_id)

# Start background monitor
monitor_thread = dispatcher.start_background_monitor()
```
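A blocking wait such as `wait_for_job` is typically built on top of status polling. The sketch below shows one way that could look. It is not the library's implementation: `wait_until_done` is a hypothetical free function, the set of terminal states is inferred from the status transitions in this document, and returning `None` on timeout is an assumption.

```python
import time

# States after which a job is assumed not to change again.
TERMINAL_STATES = {"completed", "failed", "killed"}


def wait_until_done(get_status, job_id, timeout, poll_interval=0.5,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll `get_status(job_id)` until a terminal state or the timeout is reached."""
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)
        if status and status.get("status") in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            return None  # assumed timeout behavior
        sleep(poll_interval)


# Usage with a canned sequence of statuses instead of a real dispatcher.
canned = iter([
    {"status": "starting"},
    {"status": "running"},
    {"status": "completed", "exit_code": 0},
])
final = wait_until_done(lambda _job_id: next(canned), "113754-a2f5",
                        timeout=10, sleep=lambda _seconds: None)
```

With a real dispatcher, `get_status` could be a lambda that calls `dispatcher.get_status(job_id, use_cache=False)` so each poll sees fresh data.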
### CLIFeedback
Pretty-printed feedback for CLI:
```python
from lib.cli_feedback import CLIFeedback
feedback = CLIFeedback()
# Show job dispatch confirmation
feedback.job_dispatched(job_id, project, task)
# Display status with progress bar
feedback.show_status(status, show_full=True)
# List jobs formatted nicely
feedback.show_jobs_list(jobs)
# Show summary of concurrent jobs
feedback.show_concurrent_jobs(jobs)
```
### EnhancedDispatcher
High-level dispatcher with integrated feedback:
```python
from lib.dispatcher_enhancements import EnhancedDispatcher

enhanced = EnhancedDispatcher()

# Dispatch and show feedback automatically
job_id, status = enhanced.dispatch_and_report(
    project="overbits",
    task="fix button",
    show_details=True,
    show_feedback=True,
)

# Get status and display
status = enhanced.get_status_and_display(job_id, show_full=True)

# Show jobs summary
enhanced.show_jobs_summary(project="overbits")

# Show all concurrent jobs
enhanced.show_concurrent_summary()
```
## Integration with Luzia CLI
The responsive dispatcher is integrated into the main Luzia CLI:
```python
# In route_project_task() handler:
dispatcher = get_enhanced_dispatcher()
job_id, status = dispatcher.dispatch_and_report(
    project,
    task,
    show_details=True,
    show_feedback=True,
)

# Output job_id for tracking
print(f"agent:{project}:{job_id}")
```
## Testing
Run the comprehensive test suite:
```bash
python3 tests/test_responsive_dispatcher.py
```
Tests cover:
- ✓ Immediate dispatch with sub-100ms response
- ✓ Job status retrieval and updates
- ✓ Concurrent job handling
- ✓ Status caching behavior
- ✓ CLI feedback rendering
- ✓ Progress bar visualization
- ✓ Background monitoring queue
## Performance
Latency (measured):
- **Dispatch only**: <50ms
- **With feedback**: <100ms
- **Status retrieval (cached)**: <1ms
- **Status retrieval (fresh)**: <5ms
- **Job listing**: <20ms
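Figures like these can be reproduced with a simple timing harness. The sketch below is a generic approach, not the method used to obtain the numbers above: `measure_latency_ms` is a hypothetical helper, and the no-op lambda is a placeholder for a real call such as a dispatch or status lookup.

```python
import statistics
import time


def measure_latency_ms(fn, repeats: int = 100) -> dict:
    """Call `fn` repeatedly and report median and worst-case latency in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {"median": statistics.median(samples), "max": max(samples)}


# Usage: replace the no-op with the operation under test.
result = measure_latency_ms(lambda: None, repeats=50)
print(f"median={result['median']:.4f}ms max={result['max']:.4f}ms")
```

Reporting the maximum alongside the median helps show whether an occasional slow disk write pushes a call past its stated bound.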
Memory overhead:
- Per job: ~2KB (status.json + metadata)
- Monitor thread: ~5MB
- Cache: ~100KB per 1000 jobs
## Configuration
Dispatcher behavior can be customized via environment variables:
```bash
# Cache expiration (seconds)
export LUZIA_CACHE_TTL=2
# Monitor poll interval (seconds)
export LUZIA_MONITOR_INTERVAL=1
# Max job history
export LUZIA_MAX_JOBS=500
```
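Reading these variables with fallbacks might look like the sketch below. The variable names come from the block above; everything else is an assumption. In particular, `load_settings` is a hypothetical helper, and the defaults for the monitor interval and job history are placeholders taken from the example values, since this document only states a default for the cache (1 second).

```python
import os


def load_settings(env=None) -> dict:
    """Read dispatcher settings from a mapping, falling back to assumed defaults."""
    env = os.environ if env is None else env

    def number(name, default, cast):
        raw = env.get(name)
        if raw is None:
            return default
        try:
            return cast(raw)
        except ValueError:
            return default  # ignore malformed values rather than crash

    return {
        "cache_ttl": number("LUZIA_CACHE_TTL", 1.0, float),
        # Placeholder defaults below; the real ones are not documented here.
        "monitor_interval": number("LUZIA_MONITOR_INTERVAL", 1.0, float),
        "max_jobs": number("LUZIA_MAX_JOBS", 500, int),
    }


# Usage with an explicit mapping instead of the process environment.
settings = load_settings({"LUZIA_CACHE_TTL": "2", "LUZIA_MAX_JOBS": "not-a-number"})
print(settings)
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.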
## Troubleshooting
### Job stuck in "dispatched" status
The agent may have failed to start. Check:
```bash
cat /var/lib/luzia/jobs/<job_id>/output.log
cat /var/lib/luzia/jobs/<job_id>/meta.json
```
### Status not updating
Ensure the background monitor is running:
```bash
luzia monitor status
```
### Cache returning stale status
Force a fresh read from disk:
```python
status = dispatcher.get_status(job_id, use_cache=False)
```
## Future Enhancements
- [ ] Web dashboard for job monitoring
- [ ] WebSocket support for real-time updates
- [ ] Job retry with exponential backoff
- [ ] Job cancellation with graceful shutdown
- [ ] Resource-aware scheduling
- [ ] Job dependencies and DAG execution
- [ ] Slack/email notifications on completion