# Skill and Knowledge Learning System

## Overview

The Skill and Knowledge Learning System automatically extracts learnings from completed tasks and QA passes, storing them in the shared knowledge graph for future skill recommendations and continuous improvement of decision-making.

This system enables Luzia to:

- **Learn from successes**: Extract patterns from passing QA validations
- **Build skill profiles**: Aggregate tool usage, patterns, and decision-making approaches
- **Make recommendations**: Suggest effective approaches for similar future tasks
- **Improve over time**: Store learnings persistently for cross-session learning

## Architecture

### Components

```
TaskExecution
    ↓
TaskAnalyzer → Patterns & Metadata
    ↓
SkillExtractor → Extracted Skills
    ↓
LearningEngine → Learning Objects
    ↓
KnowledgeGraph (research domain)
    ↓
SkillRecommender → Task Recommendations
```

### Core Classes

#### 1. **TaskAnalyzer**

Analyzes task executions to extract patterns and metadata.

```python
from lib.skill_learning_engine import TaskAnalyzer

analyzer = TaskAnalyzer()

# Analyze a single task
execution = analyzer.analyze_task({
    "task_id": "task_001",
    "prompt": "Refactor database schema",
    "project": "overbits",
    "status": "success",
    "tools_used": ["Bash", "Read", "Edit"],
    "duration": 45.2,
    "result_summary": "Schema refactored successfully",
    "qa_passed": True,
    "timestamp": "2026-01-09T12:00:00"
})

# Extract patterns from multiple executions
patterns = analyzer.extract_patterns(executions)
# Returns: success_rate, average_duration, common_tools, etc.
```

#### 2. **SkillExtractor**

Extracts skills from task executions and QA results.

```python
from lib.skill_learning_engine import SkillExtractor

extractor = SkillExtractor()

# Extract skills from a task
skills = extractor.extract_from_task(execution)

# Extract skills from QA results
qa_skills = extractor.extract_from_qa_results(qa_results)

# Aggregate multiple skill extractions
aggregated = extractor.aggregate_skills(all_skills)
```

**Skill Categories:**

- `tool_usage`: Tools used in the task (Read, Bash, Edit, etc.)
- `pattern`: Task patterns (optimization, debugging, testing, etc.)
- `decision`: Decision-making approaches
- `architecture`: Project/system knowledge

#### 3. **LearningEngine**

Processes and stores learnings in the knowledge graph.

```python
from lib.skill_learning_engine import LearningEngine

engine = LearningEngine()

# Extract a learning from a successful task
learning = engine.extract_learning(execution, skills, qa_results)

# Store in knowledge graph
learning_id = engine.store_learning(learning)

# Create skill entities
skill_id = engine.create_skill_entity(skill)
```

#### 4. **SkillRecommender**

Recommends skills for future tasks based on stored learnings.

```python
from lib.skill_learning_engine import SkillRecommender

recommender = SkillRecommender()

# Get recommendations for a task
recommendations = recommender.recommend_for_task(
    task_prompt="Optimize database performance",
    project="overbits"
)

# Get overall skill profile
profile = recommender.get_skill_profile()
```

#### 5. **SkillLearningSystem**

Unified orchestrator for the complete learning pipeline.

```python
from lib.skill_learning_engine import SkillLearningSystem

system = SkillLearningSystem()

# Process a completed task with QA results
result = system.process_task_completion(task_data, qa_results)
# Result includes: skills_extracted, learning_created, learning_id

# Get recommendations
recommendations = system.get_recommendations(prompt, project)

# Get learning summary
summary = system.get_learning_summary()
```

## Integration with QA Validator

The learning system integrates with the QA validator:

### Manual Integration

```python
from lib.qa_learning_integration import QALearningIntegrator

integrator = QALearningIntegrator()

# Run QA with automatic learning extraction
result = integrator.run_qa_and_sync_with_learning(sync=True, verbose=True)
```

### Via CLI

```bash
# Standard QA validation
python3 lib/qa_validator.py

# QA validation with learning extraction
python3 lib/qa_validator.py --learn --sync --verbose

# Get statistics on learning integration
python3 lib/qa_learning_integration.py --stats
```

## Knowledge Graph Storage

Learnings are stored in the `research` domain of the knowledge graph:

```
Entity Type: finding
Name: learning_20260109_120000_Refactor_Database_Schema
Content:
  - Title: Refactor Database Schema
  - Description: Task execution details
  - Skills Used: tool_bash, tool_read, tool_edit, ...
  - Pattern: refactoring_pattern
  - Applicability: overbits, tool_bash, decision, ...
  - Confidence: 0.85

Metadata:
  - skills: [list of skill names]
  - pattern: refactoring_pattern
  - confidence: 0.85
  - applicability: [projects, tools, categories]
  - extraction_time: ISO timestamp
```

### Accessing Stored Learnings

```python
from lib.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph("research")

# Search for learnings
learnings = kg.search("database optimization", limit=10)

# Get a specific learning
learning = kg.get_entity("learning_20260109_120000_Refactor_Database_Schema")

# Get related skills
relations = kg.get_relations("learning_20260109_120000_...")

# List all learnings
all_learnings = kg.list_entities(entity_type="finding")
```

## Usage Examples

### Example 1: Extract Learnings from Task Completion

```python
from lib.skill_learning_engine import SkillLearningSystem

system = SkillLearningSystem()

# Task data from execution
task_data = {
    "task_id": "deploy_overbits_v2",
    "prompt": "Deploy new frontend build to production with zero downtime",
    "project": "overbits",
    "status": "success",
    "tools_used": ["Bash", "Read", "Edit"],
    "duration": 120.5,
    "result_summary": "Successfully deployed with no downtime, 100% rollback verified",
    "qa_passed": True,
    "timestamp": "2026-01-09T15:30:00"
}

# QA validation results
qa_results = {
    "passed": True,
    "results": {
        "syntax": True,
        "routes": True,
        "command_docs": True,
    },
    "summary": {
        "errors": 0,
        "warnings": 0,
        "info": 5,
    }
}

# Process and extract learnings
result = system.process_task_completion(task_data, qa_results)

print(f"Skills extracted: {result['skills_extracted']}")
print(f"Learning created: {result['learning_id']}")
```

### Example 2: Get Recommendations for Similar Task

```python
# Later, for a similar deployment task
new_prompt = "Deploy database migration to production"

recommendations = system.get_recommendations(new_prompt, project="overbits")

for rec in recommendations:
    print(f"Skill: {rec['skill']}")
    print(f"From learning: {rec['source_learning']}")
    print(f"Confidence: {rec['confidence']:.1%}")
```

### Example 3: Build Skill Profile

```python
# Get an overview of learned skills
profile = system.get_learning_summary()

print(f"Total learnings: {profile['total_learnings']}")
print(f"Skills by category: {profile['by_category']}")
print("Top 5 skills:")
for skill, count in profile['top_skills'][:5]:
    print(f"  {skill}: {count} occurrences")
```

## Testing

Run the comprehensive test suite:

```bash
python3 -m pytest tests/test_skill_learning.py -v
```

**Test Coverage:**

- Task analysis and pattern extraction
- Skill extraction from tasks and QA results
- Decision pattern recognition
- Skill aggregation
- Learning extraction and storage
- Skill recommendations
- Full integration pipeline

All tests pass against a mocked knowledge graph, avoiding external dependencies; a sketch of the mocking approach follows.

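As a rough illustration, a test can patch `KnowledgeGraph` where the engine imports it, so no SQLite file is ever created. The patch target and the fixture field names below follow the examples in this guide, but the exact seam inside `lib/skill_learning_engine.py` may differ:

```python
from unittest.mock import MagicMock, patch

from lib.skill_learning_engine import SkillLearningSystem

# Minimal fixtures mirroring the field names used elsewhere in this guide.
TASK_DATA = {
    "task_id": "t1", "prompt": "Refactor database schema",
    "project": "overbits", "status": "success",
    "tools_used": ["Bash"], "duration": 1.0,
    "result_summary": "ok", "qa_passed": True,
    "timestamp": "2026-01-09T12:00:00",
}
QA_RESULTS = {"passed": True, "results": {"syntax": True},
              "summary": {"errors": 0, "warnings": 0, "info": 0}}


def test_process_task_completion_with_mocked_graph():
    # Patch target is an assumption: replace KnowledgeGraph where the
    # engine module looks it up, so no real database is touched.
    with patch("lib.skill_learning_engine.KnowledgeGraph", return_value=MagicMock()):
        system = SkillLearningSystem()
        result = system.process_task_completion(TASK_DATA, QA_RESULTS)

    assert result["learning_created"]
```
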
## Configuration

The system is configured in the QA validator integration:

**File:** `lib/qa_learning_integration.py`

Key settings:

- **Knowledge Graph Domain**: `research` (all learnings are stored here)
- **Learning Extraction Trigger**: a QA pass with all validations successful
- **Skill Categories**: `tool_usage`, `pattern`, `decision`, `architecture`
- **Confidence Calculation**: weighted average of skill confidence and QA pass rate (see the sketch below)

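A minimal sketch of that confidence calculation, assuming an equal 50/50 weighting; the shipped weights live in `lib/qa_learning_integration.py` and may differ:

```python
def learning_confidence(skill_confidences: list[float],
                        qa_checks_passed: int,
                        qa_checks_total: int,
                        skill_weight: float = 0.5) -> float:
    """Weighted average of mean skill confidence and QA pass rate.

    The 0.5/0.5 split is an illustrative assumption, not the shipped value.
    """
    mean_skill = sum(skill_confidences) / len(skill_confidences)
    qa_rate = qa_checks_passed / qa_checks_total
    return skill_weight * mean_skill + (1 - skill_weight) * qa_rate


# e.g. skills at 0.9 and 0.8 with 3 of 3 QA checks passed:
# 0.5 * 0.85 + 0.5 * 1.0 = 0.925
print(learning_confidence([0.9, 0.8], 3, 3))  # 0.925
```
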
## Data Flow

```
Task Execution
    ↓
Task Analysis
    ├─→ Success rate: 85%
    ├─→ Average duration: 45 min
    ├─→ Common tools: [Bash, Read, Edit]
    └─→ Project distribution: {overbits: 60%, dss: 40%}
    ↓
Skill Extraction
    ├─→ Tool skills (from tools_used)
    ├─→ Decision patterns (from prompt)
    ├─→ Project knowledge (from project)
    └─→ QA validation skills
    ↓
Learning Creation
    ├─→ Title & description
    ├─→ Skill aggregation
    ├─→ Pattern classification
    ├─→ Confidence scoring
    └─→ Applicability determination
    ↓
Knowledge Graph Storage
    └─→ Entity: finding
        Relations: skill → learning
        Metadata: skills, pattern, confidence, applicability
    ↓
Future Recommendations
    └─→ Search similar tasks
        Extract applicable skills
        Rank by confidence
```

## Performance Considerations

**Learning Extraction:**

- Runs only on successful QA passes (not a bottleneck)
- Async-ready (future enhancement)
- Minimal overhead (~100 ms per extraction)

**Recommendation:**

- Uses FTS5 full-text search on the knowledge graph (see the sketch below)
- Limited to the top 10 results
- Sorted by confidence

**Storage:**

- SQLite with FTS5 (efficient)
- Automatic indexing and triggers
- Scales to thousands of learnings

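For a feel of what an FTS5-backed lookup involves, here is a self-contained sketch using an in-memory stand-in; the real table and column names in `lib/knowledge_graph.py` are assumptions here and will differ:

```python
import sqlite3

# In-memory stand-in for the KG store; the real schema differs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE entities_fts USING fts5(name, content)")
conn.execute(
    "INSERT INTO entities_fts VALUES (?, ?)",
    ("learning_20260109_120000_Refactor_Database_Schema",
     "database schema refactoring with rollback verification"),
)

# BM25-ranked match, capped at 10 rows, like the recommender's search path.
rows = conn.execute(
    "SELECT name FROM entities_fts WHERE entities_fts MATCH ? "
    "ORDER BY rank LIMIT 10",
    ("database",),
).fetchall()
print(rows)  # [('learning_20260109_120000_Refactor_Database_Schema',)]
```
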
## Future Enhancements

1. **Async Extraction**: Background learning extraction during deployment
2. **Confidence Evolution**: Learnings gain or lose confidence based on outcomes
3. **Skill Decay**: Unused skills decrease in relevance over time
4. **Cross-Project Learning**: Share learnings between similar projects
5. **Decision Tracing**: Link recommendations back to specific successful tasks
6. **Feedback Loop**: Update learning confidence based on task outcomes
7. **Skill Trees**: Build hierarchies of related skills
8. **Collaborative Learning**: Share learnings across team instances

## Troubleshooting

### Learnings Not Being Created

Check:

1. QA validation passes (`qa_results["passed"] == True`)
2. The knowledge graph is accessible and writable
3. No errors appear in the `qa_learning_integration.py` output

```bash
python3 lib/qa_validator.py --learn --verbose
```

### Recommendations Are Empty

Possible causes:

1. No learnings stored yet (run a successful task with `--learn`)
2. The task prompt doesn't match stored learning titles
3. The knowledge graph search is not finding results

Test with:

```bash
python3 lib/skill_learning_engine.py recommend --task-prompt "Your task" --project overbits
```

### Knowledge Graph Issues

Check knowledge graph status:

```bash
python3 lib/knowledge_graph.py stats
python3 lib/knowledge_graph.py search "learning"
```

## API Reference

See inline documentation in:

- `lib/skill_learning_engine.py` - Main system classes
- `lib/qa_learning_integration.py` - QA integration
- `tests/test_skill_learning.py` - Usage examples via tests

## Contributing

To add new skill extraction patterns:

1. Add the pattern to `SkillExtractor._extract_decision_patterns()` (a sketch of the likely shape follows this list)
2. Update test cases in `TestSkillExtractor.test_extract_decision_patterns()`
3. Test with: `python3 lib/skill_learning_engine.py test`
4. Document the pattern in this guide

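A minimal sketch of what step 1 might look like, assuming the extractor matches prompt keywords against named decision patterns; the real structure inside `SkillExtractor` may differ:

```python
# Hypothetical keyword table; the actual container in SkillExtractor may differ.
DECISION_PATTERNS = {
    "rollback_safety": ["rollback", "zero downtime", "canary"],
    "incremental_migration": ["migration", "schema", "backfill"],
}


def extract_decision_patterns(prompt: str) -> list[str]:
    """Return names of decision patterns whose keywords appear in the prompt."""
    lowered = prompt.lower()
    return [
        name
        for name, keywords in DECISION_PATTERNS.items()
        if any(kw in lowered for kw in keywords)
    ]


print(extract_decision_patterns("Deploy with zero downtime and a rollback plan"))
# ['rollback_safety']
```
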
## License

Part of Luzia Orchestrator. See parent project license.