# Skill and Knowledge Learning System

## Overview

The Skill and Knowledge Learning System automatically extracts learnings from completed tasks and QA passes, storing them in the shared knowledge graph for future skill recommendations and continuous improvement of decision-making.

This system enables Luzia to:

- **Learn from successes**: Extract patterns from passing QA validations
- **Build skill profiles**: Aggregate tool usage, patterns, and decision-making approaches
- **Make recommendations**: Suggest effective approaches for similar future tasks
- **Improve over time**: Store learnings persistently for cross-session learning

## Architecture

### Components

```
TaskExecution
    ↓
TaskAnalyzer → Patterns & Metadata
    ↓
SkillExtractor → Extracted Skills
    ↓
LearningEngine → Learning Objects
    ↓
KnowledgeGraph (research domain)
    ↓
SkillRecommender → Task Recommendations
```

### Core Classes

#### 1. **TaskAnalyzer**

Analyzes task executions to extract patterns and metadata.

```python
from lib.skill_learning_engine import TaskAnalyzer

analyzer = TaskAnalyzer()

# Analyze a single task
execution = analyzer.analyze_task({
    "task_id": "task_001",
    "prompt": "Refactor database schema",
    "project": "overbits",
    "status": "success",
    "tools_used": ["Bash", "Read", "Edit"],
    "duration": 45.2,
    "result_summary": "Schema refactored successfully",
    "qa_passed": True,
    "timestamp": "2026-01-09T12:00:00"
})

# Extract patterns from multiple executions
patterns = analyzer.extract_patterns(executions)
# Returns: success_rate, average_duration, common_tools, etc.
```

#### 2. **SkillExtractor**

Extracts skills from task executions and QA results.

```python
from lib.skill_learning_engine import SkillExtractor

extractor = SkillExtractor()

# Extract skills from a task
skills = extractor.extract_from_task(execution)

# Extract skills from QA results
qa_skills = extractor.extract_from_qa_results(qa_results)

# Aggregate multiple skill extractions
aggregated = extractor.aggregate_skills(all_skills)
```

**Skill Categories:**

- `tool_usage`: Tools used in the task (Read, Bash, Edit, etc.)
- `pattern`: Task patterns (optimization, debugging, testing, etc.)
- `decision`: Decision-making approaches
- `architecture`: Project/system knowledge

#### 3. **LearningEngine**

Processes and stores learnings in the knowledge graph.

```python
from lib.skill_learning_engine import LearningEngine

engine = LearningEngine()

# Extract a learning from a successful task
learning = engine.extract_learning(execution, skills, qa_results)

# Store in knowledge graph
learning_id = engine.store_learning(learning)

# Create skill entities
skill_id = engine.create_skill_entity(skill)
```

#### 4. **SkillRecommender**

Recommends skills for future tasks based on stored learnings.

```python
from lib.skill_learning_engine import SkillRecommender

recommender = SkillRecommender()

# Get recommendations for a task
recommendations = recommender.recommend_for_task(
    task_prompt="Optimize database performance",
    project="overbits"
)

# Get overall skill profile
profile = recommender.get_skill_profile()
```

#### 5. **SkillLearningSystem**

Unified orchestrator for the complete learning pipeline.

```python
from lib.skill_learning_engine import SkillLearningSystem

system = SkillLearningSystem()

# Process a completed task with QA results
result = system.process_task_completion(task_data, qa_results)
# Result includes: skills_extracted, learning_created, learning_id

# Get recommendations
recommendations = system.get_recommendations(prompt, project)

# Get learning summary
summary = system.get_learning_summary()
```

## Integration with QA Validator

The learning system integrates with the QA validator:

### Manual Integration

```python
from lib.qa_learning_integration import QALearningIntegrator

integrator = QALearningIntegrator()

# Run QA with automatic learning extraction
result = integrator.run_qa_and_sync_with_learning(sync=True, verbose=True)
```

### Via CLI

```bash
# Standard QA validation
python3 lib/qa_validator.py

# QA validation with learning extraction
python3 lib/qa_validator.py --learn --sync --verbose

# Get statistics on learning integration
python3 lib/qa_learning_integration.py --stats
```

## Knowledge Graph Storage

Learnings are stored in the `research` domain of the knowledge graph:

```
Entity Type: finding
Name: learning_20260109_120000_Refactor_Database_Schema
Content:
  - Title: Refactor Database Schema
  - Description: Task execution details
  - Skills Used: tool_bash, tool_read, tool_edit, ...
  - Pattern: refactoring_pattern
  - Applicability: overbits, tool_bash, decision, ...
  - Confidence: 0.85

Metadata:
  - skills: [list of skill names]
  - pattern: refactoring_pattern
  - confidence: 0.85
  - applicability: [projects, tools, categories]
  - extraction_time: ISO timestamp
```

### Accessing Stored Learnings

```python
from lib.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph("research")

# Search for learnings
learnings = kg.search("database optimization", limit=10)

# Get a specific learning
learning = kg.get_entity("learning_20260109_120000_Refactor_Database_Schema")

# Get related skills
relations = kg.get_relations("learning_20260109_120000_...")

# List all learnings
all_learnings = kg.list_entities(entity_type="finding")
```

## Usage Examples

### Example 1: Extract Learnings from Task Completion

```python
from lib.skill_learning_engine import SkillLearningSystem

system = SkillLearningSystem()

# Task data from execution
task_data = {
    "task_id": "deploy_overbits_v2",
    "prompt": "Deploy new frontend build to production with zero downtime",
    "project": "overbits",
    "status": "success",
    "tools_used": ["Bash", "Read", "Edit"],
    "duration": 120.5,
    "result_summary": "Successfully deployed with no downtime, 100% rollback verified",
    "qa_passed": True,
    "timestamp": "2026-01-09T15:30:00"
}

# QA validation results
qa_results = {
    "passed": True,
    "results": {
        "syntax": True,
        "routes": True,
        "command_docs": True,
    },
    "summary": {
        "errors": 0,
        "warnings": 0,
        "info": 5,
    }
}

# Process and extract learnings
result = system.process_task_completion(task_data, qa_results)

print(f"Skills extracted: {result['skills_extracted']}")
print(f"Learning created: {result['learning_id']}")
```

### Example 2: Get Recommendations for Similar Task

```python
# Later, for a similar deployment task
new_prompt = "Deploy database migration to production"

recommendations = system.get_recommendations(new_prompt, project="overbits")

for rec in recommendations:
    print(f"Skill: {rec['skill']}")
    print(f"From learning: {rec['source_learning']}")
    print(f"Confidence: {rec['confidence']:.1%}")
```

### Example 3: Build Skill Profile

```python
# Get an overview of learned skills
profile = system.get_learning_summary()

print(f"Total learnings: {profile['total_learnings']}")
print(f"Skills by category: {profile['by_category']}")
print("Top 5 skills:")
for skill, count in profile['top_skills'][:5]:
    print(f"  {skill}: {count} occurrences")
```

## Testing

Run the comprehensive test suite:

```bash
python3 -m pytest tests/test_skill_learning.py -v
```

**Test Coverage:**

- Task analysis and pattern extraction
- Skill extraction from tasks and QA results
- Decision pattern recognition
- Skill aggregation
- Learning extraction and storage
- Skill recommendations
- Full integration pipeline

All tests pass against a mocked knowledge graph, avoiding external dependencies; a sketch of the mocking approach follows.

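As a rough illustration, a test can patch `KnowledgeGraph` where the engine imports it, so no SQLite file is ever created. The patch target and the fixture field names below follow the examples in this guide, but the exact seam inside `lib/skill_learning_engine.py` may differ:

```python
from unittest.mock import MagicMock, patch

from lib.skill_learning_engine import SkillLearningSystem

# Minimal fixtures mirroring the field names used elsewhere in this guide.
TASK_DATA = {
    "task_id": "t1", "prompt": "Refactor database schema",
    "project": "overbits", "status": "success",
    "tools_used": ["Bash"], "duration": 1.0,
    "result_summary": "ok", "qa_passed": True,
    "timestamp": "2026-01-09T12:00:00",
}
QA_RESULTS = {"passed": True, "results": {"syntax": True},
              "summary": {"errors": 0, "warnings": 0, "info": 0}}


def test_process_task_completion_with_mocked_graph():
    # Patch target is an assumption: replace KnowledgeGraph where the
    # engine module looks it up, so no real database is touched.
    with patch("lib.skill_learning_engine.KnowledgeGraph", return_value=MagicMock()):
        system = SkillLearningSystem()
        result = system.process_task_completion(TASK_DATA, QA_RESULTS)

    assert result["learning_created"]
```
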
## Configuration

The system is configured in the QA validator integration:

**File:** `lib/qa_learning_integration.py`

Key settings:

- **Knowledge Graph Domain**: `research` (all learnings are stored here)
- **Learning Extraction Trigger**: a QA pass with all validations successful
- **Skill Categories**: `tool_usage`, `pattern`, `decision`, `architecture`
- **Confidence Calculation**: weighted average of skill confidence and QA pass rate (see the sketch below)

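A minimal sketch of that confidence calculation, assuming an equal 50/50 weighting; the shipped weights live in `lib/qa_learning_integration.py` and may differ:

```python
def learning_confidence(skill_confidences: list[float],
                        qa_checks_passed: int,
                        qa_checks_total: int,
                        skill_weight: float = 0.5) -> float:
    """Weighted average of mean skill confidence and QA pass rate.

    The 0.5/0.5 split is an illustrative assumption, not the shipped value.
    """
    mean_skill = sum(skill_confidences) / len(skill_confidences)
    qa_rate = qa_checks_passed / qa_checks_total
    return skill_weight * mean_skill + (1 - skill_weight) * qa_rate


# e.g. skills at 0.9 and 0.8 with 3 of 3 QA checks passed:
# 0.5 * 0.85 + 0.5 * 1.0 = 0.925
print(learning_confidence([0.9, 0.8], 3, 3))  # 0.925
```
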
## Data Flow

```
Task Execution
    ↓
Task Analysis
    ├─→ Success rate: 85%
    ├─→ Average duration: 45 min
    ├─→ Common tools: [Bash, Read, Edit]
    └─→ Project distribution: {overbits: 60%, dss: 40%}
    ↓
Skill Extraction
    ├─→ Tool skills (from tools_used)
    ├─→ Decision patterns (from prompt)
    ├─→ Project knowledge (from project)
    └─→ QA validation skills
    ↓
Learning Creation
    ├─→ Title & description
    ├─→ Skill aggregation
    ├─→ Pattern classification
    ├─→ Confidence scoring
    └─→ Applicability determination
    ↓
Knowledge Graph Storage
    └─→ Entity: finding
        Relations: skill → learning
        Metadata: skills, pattern, confidence, applicability
    ↓
Future Recommendations
    └─→ Search similar tasks
        Extract applicable skills
        Rank by confidence
```

## Performance Considerations

**Learning Extraction:**

- Runs only on successful QA passes (not a bottleneck)
- Async-ready (future enhancement)
- Minimal overhead (~100 ms per extraction)

**Recommendation:**

- Uses FTS5 full-text search on the knowledge graph (see the sketch below)
- Limited to the top 10 results
- Sorted by confidence

**Storage:**

- SQLite with FTS5 (efficient)
- Automatic indexing and triggers
- Scales to thousands of learnings

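For a feel of what an FTS5-backed lookup involves, here is a self-contained sketch using an in-memory stand-in; the real table and column names in `lib/knowledge_graph.py` are assumptions here and will differ:

```python
import sqlite3

# In-memory stand-in for the KG store; the real schema differs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE entities_fts USING fts5(name, content)")
conn.execute(
    "INSERT INTO entities_fts VALUES (?, ?)",
    ("learning_20260109_120000_Refactor_Database_Schema",
     "database schema refactoring with rollback verification"),
)

# BM25-ranked match, capped at 10 rows, like the recommender's search path.
rows = conn.execute(
    "SELECT name FROM entities_fts WHERE entities_fts MATCH ? "
    "ORDER BY rank LIMIT 10",
    ("database",),
).fetchall()
print(rows)  # [('learning_20260109_120000_Refactor_Database_Schema',)]
```
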
## Future Enhancements

1. **Async Extraction**: Background learning extraction during deployment
2. **Confidence Evolution**: Learnings gain or lose confidence based on outcomes
3. **Skill Decay**: Unused skills decrease in relevance over time
4. **Cross-Project Learning**: Share learnings between similar projects
5. **Decision Tracing**: Link recommendations back to specific successful tasks
6. **Feedback Loop**: Update learning confidence based on task outcomes
7. **Skill Trees**: Build hierarchies of related skills
8. **Collaborative Learning**: Share learnings across team instances

## Troubleshooting

### Learnings Not Being Created

Check:

1. QA validation passes (`qa_results["passed"] == True`)
2. The knowledge graph is accessible and writable
3. No errors appear in the `qa_learning_integration.py` output

```bash
python3 lib/qa_validator.py --learn --verbose
```

### Recommendations Are Empty

Possible causes:

1. No learnings stored yet (run a successful task with `--learn`)
2. The task prompt doesn't match stored learning titles
3. The knowledge graph search is not finding results

Test with:

```bash
python3 lib/skill_learning_engine.py recommend --task-prompt "Your task" --project overbits
```

### Knowledge Graph Issues

Check knowledge graph status:

```bash
python3 lib/knowledge_graph.py stats
python3 lib/knowledge_graph.py search "learning"
```

## API Reference

See inline documentation in:

- `lib/skill_learning_engine.py` - Main system classes
- `lib/qa_learning_integration.py` - QA integration
- `tests/test_skill_learning.py` - Usage examples via tests

## Contributing

To add new skill extraction patterns:

1. Add the pattern to `SkillExtractor._extract_decision_patterns()` (a sketch of the likely shape follows this list)
2. Update test cases in `TestSkillExtractor.test_extract_decision_patterns()`
3. Test with: `python3 lib/skill_learning_engine.py test`
4. Document the pattern in this guide

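A minimal sketch of what step 1 might look like, assuming the extractor matches prompt keywords against named decision patterns; the real structure inside `SkillExtractor` may differ:

```python
# Hypothetical keyword table; the actual container in SkillExtractor may differ.
DECISION_PATTERNS = {
    "rollback_safety": ["rollback", "zero downtime", "canary"],
    "incremental_migration": ["migration", "schema", "backfill"],
}


def extract_decision_patterns(prompt: str) -> list[str]:
    """Return names of decision patterns whose keywords appear in the prompt."""
    lowered = prompt.lower()
    return [
        name
        for name, keywords in DECISION_PATTERNS.items()
        if any(kw in lowered for kw in keywords)
    ]


print(extract_decision_patterns("Deploy with zero downtime and a rollback plan"))
# ['rollback_safety']
```
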
## License

Part of Luzia Orchestrator. See parent project license.