luzia/docs/TIME_METRICS.md

# Luzia Time Metrics Integration

**Last Updated:** 2026-01-11

## Overview

Luzia now includes comprehensive time-based metrics and context tracking for task execution. This enables:

- **Task Timing**: Track dispatch, start, and completion times for all tasks
- **Duration Tracking**: Calculate and display task durations in various formats
- **System Context**: Capture CPU load, memory, and disk usage at dispatch and completion
- **Performance Baselines**: Establish normal task durations and detect anomalies
- **Causality Tracking**: Understand task sequencing and dependencies

## Quick Start

```bash
# View jobs with timing information
luzia jobs --timing

# View specific job with detailed timing
luzia jobs 123456-abcd

# View logs with timing header
luzia logs 123456-abcd

# View aggregate metrics
luzia metrics

# View project-specific metrics
luzia metrics musica --days 30
```

## Commands

### `luzia jobs`

Lists jobs with optional timing display.

```bash
# Standard output (shows elapsed time for running jobs)
luzia jobs

# Detailed timing columns
luzia jobs --timing

# Specific job details
luzia jobs <job_id>
```

Example output with `--timing`:
```
Job ID             Project    Status     Dispatch   Duration   CPU
--------------------------------------------------------------------------------
123456-abcd        admin      completed  10:30:00   00:05:30   0.52
234567-efgh        musica     running    11:00:00   00:15:42   0.48
```

### `luzia logs`

Shows job output with timing header.

```bash
# With timing header
luzia logs <job_id>

# Without timing header
luzia logs <job_id> --no-header
```

Example header:
```
═════════════════════════════════════════════════════════════════
Job: agent:admin:123456-abcd
Agent: admin
Dispatched: 2026-01-11T10:30:00Z (America/Montevideo: 07:30:00)
Status: completed (took 00:05:30)
System: CPU 0.52, Memory 65%, Disk 45%
═════════════════════════════════════════════════════════════════
```

### `luzia metrics`

Shows aggregate task metrics and performance statistics.

```bash
# All projects summary
luzia metrics

# Specific project
luzia metrics <project>

# Custom time period
luzia metrics --days 30

# Success rate by duration bucket
luzia metrics --by-bucket

# Performance baseline
luzia metrics <project> --baseline
```

Example output:
```
=== Luzia Task Metrics (Last 7 Days) ===

Total Tasks:   45
Total Time:    02:15:30

By Project:
Project            Tasks        Time        Avg    Success
-------------------------------------------------------
admin                 15    01:00:00   00:04:00     93.3%
musica                20    00:45:30   00:02:17     95.0%
dss                   10    00:30:00   00:03:00     90.0%

Longest Running Tasks:
  1. dss: 00:15:42
  2. admin: 00:12:30
```

## Metadata Structure

Time metrics are stored in job `meta.json`:

```json
{
  "id": "123456-abcd",
  "project": "admin",
  "task": "implement feature X",
  "status": "completed",
  "started": "2026-01-11T10:30:00.123456",
  "time_metrics": {
    "dispatch": {
      "utc_time": "2026-01-11T10:30:00Z",
      "agent_timezone": "America/Montevideo",
      "system_load": [0.52, 0.48, 0.45],
      "memory_percent": 65,
      "disk_percent": 45
    },
    "completion": {
      "utc_time": "2026-01-11T10:35:30Z",
      "duration_seconds": 330,
      "duration_formatted": "00:05:30",
      "exit_code": 0,
      "system_load": [0.48, 0.50, 0.47],
      "memory_percent": 67,
      "disk_percent": 45
    }
  },
  "time_tracker_data": {
    "task_id": "123456-abcd",
    "project": "admin",
    "dispatch_time": "2026-01-11T10:30:00Z",
    "agent_timezone": "America/Montevideo"
  }
}
```

## Performance Baselines

The system automatically collects task duration data to establish performance baselines:

```bash
# View current baseline for a project
luzia metrics musica --baseline
```

Output:
```
Performance Baseline:
  Average: 00:02:17
  Median:  00:01:45
  P95:     00:05:30
  Samples: 50
```

### Anomaly Detection

Tasks that run significantly longer than the baseline are flagged:

- **Normal**: Within P95
- **Slow**: > average + 2σ (flagged as potential issue)
- **Very Slow**: > average + 3σ (highlighted in reports)
- **Extreme**: > average + 4σ (should be investigated)

## Success Rate by Duration

Analyze success rates based on task duration:

```bash
luzia metrics --by-bucket
```

Output:
```
Success Rate by Duration:
Duration         Total   Success     Rate
-----------------------------------------
< 1 minute          15       15    100.0%
1-5 minutes         20       19     95.0%
5-15 minutes        10        9     90.0%
15-30 minutes        5        4     80.0%
```

## Causality Tracking

Tasks can reference prior task completions to establish causal relationships:

```python
from time_metrics import find_prior_task

# Find the most recent task that completed before this dispatch
prior = find_prior_task(dispatch_time, project="admin")
if prior:
    print(f"Started {prior['gap_formatted']} after {prior['task_id']} completed")
```

## Implementation Details

### Module Location

```
/opt/server-agents/orchestrator/lib/time_metrics.py
```

### Key Functions

| Function | Description |
|----------|-------------|
| `get_utc_now()` | Get current UTC timestamp |
| `calculate_duration_seconds(start, end)` | Calculate duration between timestamps |
| `format_duration(seconds)` | Format as HH:MM:SS |
| `format_duration_human(seconds)` | Format as "1h 30m 45s" |
| `capture_system_context()` | Get CPU, memory, disk snapshot |
| `create_task_time_metadata(task_id, project)` | Create dispatch metadata |
| `update_task_completion_metadata(meta, exit_code)` | Add completion info |

### Storage

- **Per-task metadata**: `/var/log/luz-orchestrator/jobs/<job_id>/meta.json`
- **Performance baselines**: `/var/lib/luzia/metrics/baselines.db`

## Timezone Support

All internal timestamps are in UTC. Display functions can convert to local time:

```python
from time_metrics import convert_to_local_time, format_timestamp_with_local

utc_ts = "2026-01-11T10:30:00Z"

# Convert to local time
local = convert_to_local_time(utc_ts, "America/Montevideo")
# Returns: "07:30:00"

# Format with both
formatted = format_timestamp_with_local(utc_ts, "America/Montevideo")
# Returns: "2026-01-11T10:30:00Z (America/Montevideo: 07:30:00)"
```

## Integration with Time MCP

The time metrics system can be enhanced with the Time MCP for more accurate timezone handling:

```python
# Via MCP
mcp__time__get_current_time(timezone="UTC")
mcp__time__convert_time(source_timezone="UTC", time="10:30", target_timezone="America/Montevideo")
```

## Best Practices

1. **Always use UTC for storage** - Only convert for display
2. **Capture context at dispatch** - System state affects task performance
3. **Monitor baselines regularly** - Detect performance regressions
4. **Use duration buckets** - Identify problematic task lengths
5. **Track causality** - Understand task dependencies

## Troubleshooting

### Time metrics not appearing

Check that the module is loaded:
```bash
python3 -c "from time_metrics import TIME_METRICS_AVAILABLE; print(TIME_METRICS_AVAILABLE)"
```

### Baseline database missing

Create the metrics directory:
```bash
mkdir -p /var/lib/luzia/metrics
```

### Duration showing as unknown

Ensure both dispatch and completion times are recorded:
```bash
cat /var/log/luz-orchestrator/jobs/<job_id>/meta.json | jq '.time_metrics'
```

## Testing

Run the test suite:
```bash
cd /opt/server-agents/orchestrator
python3 -m pytest tests/test_time_metrics.py -v
```