# Skill and Knowledge Learning System
## Overview
The Skill and Knowledge Learning System automatically extracts learnings from completed tasks and QA passes, storing them in the shared knowledge graph to drive future skill recommendations and to improve decision-making over time.
This system enables Luzia to:
- **Learn from successes**: Extract patterns from passing QA validations
- **Build skill profiles**: Aggregate tool usage, patterns, and decision-making approaches
- **Make recommendations**: Suggest effective approaches for similar future tasks
- **Improve over time**: Store learnings persistently for cross-session learning
## Architecture
### Components
```
TaskExecution
    ↓
TaskAnalyzer → Patterns & Metadata
    ↓
SkillExtractor → Extracted Skills
    ↓
LearningEngine → Learning Objects
    ↓
KnowledgeGraph (research domain)
    ↓
SkillRecommender → Task Recommendations
```
### Core Classes
#### 1. **TaskAnalyzer**
Analyzes task executions to extract patterns and metadata.
```python
from lib.skill_learning_engine import TaskAnalyzer

analyzer = TaskAnalyzer()

# Analyze a single task
execution = analyzer.analyze_task({
    "task_id": "task_001",
    "prompt": "Refactor database schema",
    "project": "overbits",
    "status": "success",
    "tools_used": ["Bash", "Read", "Edit"],
    "duration": 45.2,
    "result_summary": "Schema refactored successfully",
    "qa_passed": True,
    "timestamp": "2026-01-09T12:00:00"
})

# Extract patterns from multiple executions
patterns = analyzer.extract_patterns(executions)
# Returns: success_rate, average_duration, common_tools, etc.
```
#### 2. **SkillExtractor**
Extracts skills from task executions and QA results.
```python
from lib.skill_learning_engine import SkillExtractor
extractor = SkillExtractor()
# Extract skills from task
skills = extractor.extract_from_task(execution)
# Extract skills from QA results
qa_skills = extractor.extract_from_qa_results(qa_results)
# Aggregate multiple skill extractions
aggregated = extractor.aggregate_skills(all_skills)
```
**Skill Categories** (illustrated in the sketch below):
- `tool_usage`: Tools used in task (Read, Bash, Edit, etc.)
- `pattern`: Task patterns (optimization, debugging, testing, etc.)
- `decision`: Decision-making approaches
- `architecture`: Project/system knowledge
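For orientation, extracted skill records might look roughly like this; the field names and values are illustrative assumptions, with the real schema defined in `lib/skill_learning_engine.py`:

```python
# Illustrative only: field names are assumptions, one record per category.
example_skills = [
    {"name": "tool_bash",           "category": "tool_usage",   "confidence": 0.9},
    {"name": "refactoring_pattern", "category": "pattern",      "confidence": 0.85},
    {"name": "incremental_rollout", "category": "decision",     "confidence": 0.8},
    {"name": "project_overbits",    "category": "architecture", "confidence": 0.75},
]
```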
#### 3. **LearningEngine**
Processes and stores learnings in the knowledge graph.
```python
from lib.skill_learning_engine import LearningEngine
engine = LearningEngine()
# Extract a learning from successful task
learning = engine.extract_learning(execution, skills, qa_results)
# Store in knowledge graph
learning_id = engine.store_learning(learning)
# Create skill entities
skill_id = engine.create_skill_entity(skill)
```
#### 4. **SkillRecommender**
Recommends skills for future tasks based on stored learnings.
```python
from lib.skill_learning_engine import SkillRecommender
recommender = SkillRecommender()
# Get recommendations for a task
recommendations = recommender.recommend_for_task(
    task_prompt="Optimize database performance",
    project="overbits"
)
# Get overall skill profile
profile = recommender.get_skill_profile()
```
#### 5. **SkillLearningSystem**
Unified orchestrator for the complete learning pipeline.
```python
from lib.skill_learning_engine import SkillLearningSystem
system = SkillLearningSystem()
# Process a completed task with QA results
result = system.process_task_completion(task_data, qa_results)
# Result includes: skills_extracted, learning_created, learning_id
# Get recommendations
recommendations = system.get_recommendations(prompt, project)
# Get learning summary
summary = system.get_learning_summary()
```
## Integration with QA Validator
The learning system hooks into the QA validator, so passing validation runs can trigger learning extraction automatically:
### Manual Integration
```python
from lib.qa_learning_integration import QALearningIntegrator
integrator = QALearningIntegrator()
# Run QA with automatic learning extraction
result = integrator.run_qa_and_sync_with_learning(sync=True, verbose=True)
```
### Via CLI
```bash
# Standard QA validation
python3 lib/qa_validator.py
# QA validation with learning extraction
python3 lib/qa_validator.py --learn --sync --verbose
# Get statistics on learning integration
python3 lib/qa_learning_integration.py --stats
```
## Knowledge Graph Storage
Learnings are stored in the `research` domain of the knowledge graph:
```
Entity Type: finding
Name: learning_20260109_120000_Refactor_Database_Schema
Content:
  - Title: Refactor Database Schema
  - Description: Task execution details
  - Skills Used: tool_bash, tool_read, tool_edit, ...
  - Pattern: refactoring_pattern
  - Applicability: overbits, tool_bash, decision, ...
  - Confidence: 0.85
Metadata:
  - skills: [list of skill names]
  - pattern: refactoring_pattern
  - confidence: 0.85
  - applicability: [projects, tools, categories]
  - extraction_time: ISO timestamp
```
### Accessing Stored Learnings
```python
from lib.knowledge_graph import KnowledgeGraph
kg = KnowledgeGraph("research")
# Search for learnings
learnings = kg.search("database optimization", limit=10)
# Get specific learning
learning = kg.get_entity("learning_20260109_120000_Refactor_Database_Schema")
# Get related skills
relations = kg.get_relations("learning_20260109_120000_...")
# List all learnings
all_learnings = kg.list_entities(entity_type="finding")
```
## Usage Examples
### Example 1: Extract Learnings from Task Completion
```python
from lib.skill_learning_engine import SkillLearningSystem
system = SkillLearningSystem()
# Task data from execution
task_data = {
    "task_id": "deploy_overbits_v2",
    "prompt": "Deploy new frontend build to production with zero downtime",
    "project": "overbits",
    "status": "success",
    "tools_used": ["Bash", "Read", "Edit"],
    "duration": 120.5,
    "result_summary": "Successfully deployed with no downtime, 100% rollback verified",
    "qa_passed": True,
    "timestamp": "2026-01-09T15:30:00"
}

# QA validation results
qa_results = {
    "passed": True,
    "results": {
        "syntax": True,
        "routes": True,
        "command_docs": True,
    },
    "summary": {
        "errors": 0,
        "warnings": 0,
        "info": 5,
    }
}
# Process and extract learnings
result = system.process_task_completion(task_data, qa_results)
print(f"Skills extracted: {result['skills_extracted']}")
print(f"Learning created: {result['learning_id']}")
```
### Example 2: Get Recommendations for Similar Task
```python
# Later, for a similar deployment task
new_prompt = "Deploy database migration to production"
recommendations = system.get_recommendations(new_prompt, project="overbits")
for rec in recommendations:
    print(f"Skill: {rec['skill']}")
    print(f"From learning: {rec['source_learning']}")
    print(f"Confidence: {rec['confidence']:.1%}")
```
### Example 3: Build Skill Profile
```python
# Get overview of learned skills
profile = system.get_learning_summary()
print(f"Total learnings: {profile['total_learnings']}")
print(f"Skills by category: {profile['by_category']}")
print(f"Top 5 skills:")
for skill, count in profile['top_skills'][:5]:
print(f" {skill}: {count} occurrences")
```
## Testing
Run the comprehensive test suite:
```bash
python3 -m pytest tests/test_skill_learning.py -v
```
**Test Coverage:**
- Task analysis and pattern extraction
- Skill extraction from tasks and QA results
- Decision pattern recognition
- Skill aggregation
- Learning extraction and storage
- Skill recommendations
- Full integration pipeline
All tests run against a mocked knowledge graph, so they pass without touching a real database.
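For reference, a minimal sketch of that mocking approach; the patch target and fixture values are assumptions, and `tests/test_skill_learning.py` holds the real fixtures:

```python
from unittest.mock import MagicMock, patch

from lib.skill_learning_engine import SkillLearningSystem

# Patch target assumes the engine module imports KnowledgeGraph into its
# own namespace; adjust the dotted path if the real import differs.
@patch("lib.skill_learning_engine.KnowledgeGraph")
def test_pipeline_with_mocked_graph(mock_kg_class):
    mock_kg_class.return_value = MagicMock()
    system = SkillLearningSystem()
    task_data = {
        "task_id": "t1", "prompt": "Refactor database schema",
        "project": "overbits", "status": "success",
        "tools_used": ["Bash"], "duration": 10.0,
        "result_summary": "ok", "qa_passed": True,
        "timestamp": "2026-01-09T12:00:00",
    }
    qa_results = {
        "passed": True,
        "results": {"syntax": True},
        "summary": {"errors": 0, "warnings": 0, "info": 0},
    }
    result = system.process_task_completion(task_data, qa_results)
    assert result["learning_id"] is not None  # result shape documented above
```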
## Configuration
The system is configured in the QA validator integration:
**File:** `lib/qa_learning_integration.py`
Key settings:
- **Knowledge Graph Domain**: `research` (all learnings stored here)
- **Learning Extraction Trigger**: QA pass with all validations successful
- **Skill Categories**: tool_usage, pattern, decision, architecture
- **Confidence Calculation**: Weighted average of skill confidence and QA pass rate (sketched below)
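As a rough illustration of that weighting, a minimal sketch assuming a fixed 60/40 split between skill confidence and QA pass rate; the actual formula and weights live in `lib/qa_learning_integration.py` and may differ:

```python
def estimate_confidence(skill_confidences, qa_passed, qa_total,
                        skill_weight=0.6, qa_weight=0.4):
    """Weighted average of mean skill confidence and QA pass rate.

    The 0.6/0.4 split is an illustrative assumption, not the value
    used by the real implementation.
    """
    skill_avg = sum(skill_confidences) / len(skill_confidences)
    qa_rate = qa_passed / qa_total
    return skill_weight * skill_avg + qa_weight * qa_rate

# e.g. skills at [0.9, 0.8] with 3/3 QA checks: 0.6 * 0.85 + 0.4 * 1.0 = 0.91
```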
## Data Flow
```
Task Execution
    ↓
Task Analysis
    ├─→ Success rate: 85%
    ├─→ Average duration: 45 min
    ├─→ Common tools: [Bash, Read, Edit]
    └─→ Project distribution: {overbits: 60%, dss: 40%}
    ↓
Skill Extraction
    ├─→ Tool skills (from tools_used)
    ├─→ Decision patterns (from prompt)
    ├─→ Project knowledge (from project)
    └─→ QA validation skills
    ↓
Learning Creation
    ├─→ Title & description
    ├─→ Skill aggregation
    ├─→ Pattern classification
    ├─→ Confidence scoring
    └─→ Applicability determination
    ↓
Knowledge Graph Storage
    ├─→ Entity: finding
    ├─→ Relations: skill → learning
    └─→ Metadata: skills, pattern, confidence, applicability
    ↓
Future Recommendations
    ├─→ Search similar tasks
    ├─→ Extract applicable skills
    └─→ Rank by confidence
```
## Performance Considerations
**Learning Extraction:**
- Runs only on successful QA passes (not a bottleneck)
- Async-ready (future enhancement)
- Minimal overhead (~100ms per extraction)
**Recommendation:**
- Uses FTS5 full-text search on the KG (see the sketch below)
- Limited to top 10 results
- Confidence-ranked sorting
**Storage:**
- SQLite with FTS5 (efficient)
- Automatic indexing and triggers
- Scales to thousands of learnings
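For context on why the search stays fast, a sketch of the query shape; the database path, table name, and column layout here are assumptions, not the actual KG schema:

```python
import sqlite3

# All names below (knowledge_graph.db, entities_fts, column index 1) are
# assumptions used to illustrate the FTS5 query shape.
conn = sqlite3.connect("knowledge_graph.db")
rows = conn.execute(
    """
    SELECT name, snippet(entities_fts, 1, '[', ']', '...', 8) AS excerpt
    FROM entities_fts
    WHERE entities_fts MATCH ?      -- indexed full-text lookup
    ORDER BY rank                   -- BM25 relevance, best match first
    LIMIT 10                        -- matches the top-10 cap noted above
    """,
    ("database optimization",),
).fetchall()
conn.close()
```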
## Future Enhancements
1. **Async Extraction**: Background learning extraction during deployment
2. **Confidence Evolution**: Learnings gain/lose confidence based on outcomes
3. **Skill Decay**: Unused skills decrease in relevance over time
4. **Cross-Project Learning**: Share learnings between similar projects
5. **Decision Tracing**: Link recommendations back to specific successful tasks
6. **Feedback Loop**: Update learning confidence based on task outcomes
7. **Skill Trees**: Build hierarchies of related skills
8. **Collaborative Learning**: Share learnings across team instances
## Troubleshooting
### Learnings Not Being Created
Check:
1. QA validation passes (`qa_results["passed"] == True`)
2. Knowledge graph is accessible and writable
3. No errors in `qa_learning_integration.py` output
```bash
python3 lib/qa_validator.py --learn --verbose
```
### Recommendations Are Empty
Possible causes:
1. No learnings stored yet (run a successful task with `--learn`)
2. Task prompt doesn't match stored learning titles
3. Knowledge graph search not finding results
Test with:
```bash
python3 lib/skill_learning_engine.py recommend --task-prompt "Your task" --project overbits
```
### Knowledge Graph Issues
Check knowledge graph status:
```bash
python3 lib/knowledge_graph.py stats
python3 lib/knowledge_graph.py search "learning"
```
## API Reference
See inline documentation in:
- `lib/skill_learning_engine.py` - Main system classes
- `lib/qa_learning_integration.py` - QA integration
- `tests/test_skill_learning.py` - Usage examples via tests
## Contributing
To add new skill extraction patterns:
1. Add the pattern to `SkillExtractor._extract_decision_patterns()` (see the sketch after this list)
2. Update test cases in `TestSkillExtractor.test_extract_decision_patterns()`
3. Test with: `python3 lib/skill_learning_engine.py test`
4. Document the pattern in this guide
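If the decision patterns in step 1 are keyword-driven, a new entry might look like the following; this is a hypothetical sketch of the structure, so check the method itself before copying it:

```python
# Hypothetical: assumes _extract_decision_patterns() maps pattern names
# to prompt keywords. Verify the actual structure before adding entries.
DECISION_PATTERNS = {
    "zero_downtime_deploy": ["zero downtime", "blue-green", "rolling deploy"],
    "incremental_migration": ["migration", "backfill", "schema change"],
}

def match_decision_patterns(prompt: str) -> list[str]:
    lowered = prompt.lower()
    return [
        name
        for name, keywords in DECISION_PATTERNS.items()
        if any(keyword in lowered for keyword in keywords)
    ]
```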
## License
Part of Luzia Orchestrator. See parent project license.