Refactor cockpit to use DockerTmuxController pattern
Based on claude-code-tools TmuxCLIController, this refactor:

- Added DockerTmuxController class for robust tmux session management
- Implements send_keys() with configurable delay_enter
- Implements capture_pane() for output retrieval
- Implements wait_for_prompt() for pattern-based completion detection
- Implements wait_for_idle() for content-hash-based idle detection
- Implements wait_for_shell_prompt() for shell prompt detection

Also includes workflow improvements:

- Pre-task git snapshot before agent execution
- Post-task commit protocol in agent guidelines

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
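For context, a minimal sketch of the content-hash idle detection behind `wait_for_idle()`; the `capture_pane` callable stands in for `DockerTmuxController.capture_pane()`, and the polling parameters are illustrative assumptions, not the shipped defaults:

```python
import hashlib
import time

def wait_for_idle(capture_pane, interval=1.0, stable_checks=3, timeout=60.0):
    """Treat N consecutive identical pane snapshots as 'idle'."""
    last_digest, stable, start = None, 0, time.monotonic()
    while time.monotonic() - start < timeout:
        # Hash the captured pane text; digests are cheaper to compare than full content
        digest = hashlib.sha256(capture_pane().encode()).hexdigest()
        if digest == last_digest:
            stable += 1
            if stable >= stable_checks:
                return True  # content stopped changing: session looks idle
        else:
            last_digest, stable = digest, 0
        time.sleep(interval)
    return False  # still changing when the timeout expired
```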
docs/SKILL_LEARNING_SYSTEM.md (new file, 425 lines)

# Skill and Knowledge Learning System
## Overview

The Skill and Knowledge Learning System automatically extracts learnings from completed tasks and QA passes, storing them in the shared knowledge graph for future skill recommendations and continuous decision-making improvements.

This system enables Luzia to:

- **Learn from successes**: Extract patterns from passing QA validations
- **Build skill profiles**: Aggregate tool usage, patterns, and decision-making approaches
- **Make recommendations**: Suggest effective approaches for similar future tasks
- **Improve over time**: Store learnings persistently for cross-session learning

## Architecture

### Components

```
TaskExecution
      ↓
TaskAnalyzer → Patterns & Metadata
      ↓
SkillExtractor → Extracted Skills
      ↓
LearningEngine → Learning Objects
      ↓
KnowledgeGraph (research domain)
      ↓
SkillRecommender → Task Recommendations
```
### Core Classes

#### 1. **TaskAnalyzer**
Analyzes task executions to extract patterns and metadata.

```python
from lib.skill_learning_engine import TaskAnalyzer

analyzer = TaskAnalyzer()

# Analyze a single task
execution = analyzer.analyze_task({
    "task_id": "task_001",
    "prompt": "Refactor database schema",
    "project": "overbits",
    "status": "success",
    "tools_used": ["Bash", "Read", "Edit"],
    "duration": 45.2,
    "result_summary": "Schema refactored successfully",
    "qa_passed": True,
    "timestamp": "2026-01-09T12:00:00"
})

# Extract patterns from multiple executions
patterns = analyzer.extract_patterns(executions)
# Returns: success_rate, average_duration, common_tools, etc.
```

#### 2. **SkillExtractor**
Extracts skills from task executions and QA results.

```python
from lib.skill_learning_engine import SkillExtractor

extractor = SkillExtractor()

# Extract skills from task
skills = extractor.extract_from_task(execution)

# Extract skills from QA results
qa_skills = extractor.extract_from_qa_results(qa_results)

# Aggregate multiple skill extractions
aggregated = extractor.aggregate_skills(all_skills)
```

**Skill Categories:**
- `tool_usage`: Tools used in task (Read, Bash, Edit, etc.)
- `pattern`: Task patterns (optimization, debugging, testing, etc.)
- `decision`: Decision-making approaches
- `architecture`: Project/system knowledge

One example skill object is sketched below.
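Field names here are assumptions for illustration, not the engine's actual schema:

```python
# Hypothetical shape of one extracted skill (illustrative field names)
skill = {
    "name": "tool_bash",
    "category": "tool_usage",   # one of the four categories above
    "confidence": 0.85,
    "source_task": "task_001",
}
```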
#### 3. **LearningEngine**
Processes and stores learnings in the knowledge graph.

```python
from lib.skill_learning_engine import LearningEngine

engine = LearningEngine()

# Extract a learning from a successful task
learning = engine.extract_learning(execution, skills, qa_results)

# Store in knowledge graph
learning_id = engine.store_learning(learning)

# Create skill entities
skill_id = engine.create_skill_entity(skill)
```

#### 4. **SkillRecommender**
Recommends skills for future tasks based on stored learnings.

```python
from lib.skill_learning_engine import SkillRecommender

recommender = SkillRecommender()

# Get recommendations for a task
recommendations = recommender.recommend_for_task(
    task_prompt="Optimize database performance",
    project="overbits"
)

# Get overall skill profile
profile = recommender.get_skill_profile()
```

#### 5. **SkillLearningSystem**
Unified orchestrator for the complete learning pipeline.

```python
from lib.skill_learning_engine import SkillLearningSystem

system = SkillLearningSystem()

# Process a completed task with QA results
result = system.process_task_completion(task_data, qa_results)
# Result includes: skills_extracted, learning_created, learning_id

# Get recommendations
recommendations = system.get_recommendations(prompt, project)

# Get learning summary
summary = system.get_learning_summary()
```

## Integration with QA Validator

The learning system integrates with the QA validator:

### Manual Integration

```python
from lib.qa_learning_integration import QALearningIntegrator

integrator = QALearningIntegrator()

# Run QA with automatic learning extraction
result = integrator.run_qa_and_sync_with_learning(sync=True, verbose=True)
```

### Via CLI

```bash
# Standard QA validation
python3 lib/qa_validator.py

# QA validation with learning extraction
python3 lib/qa_validator.py --learn --sync --verbose

# Get statistics on learning integration
python3 lib/qa_learning_integration.py --stats
```

## Knowledge Graph Storage

Learnings are stored in the `research` domain of the knowledge graph:

```
Entity Type: finding
Name: learning_20260109_120000_Refactor_Database_Schema
Content:
  - Title: Refactor Database Schema
  - Description: Task execution details
  - Skills Used: tool_bash, tool_read, tool_edit, ...
  - Pattern: refactoring_pattern
  - Applicability: overbits, tool_bash, decision, ...
  - Confidence: 0.85

Metadata:
  - skills: [list of skill names]
  - pattern: refactoring_pattern
  - confidence: 0.85
  - applicability: [projects, tools, categories]
  - extraction_time: ISO timestamp
```
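The entity name appears to combine an extraction timestamp with the task title. A sketch of how such a name could be composed; the exact scheme is an assumption:

```python
from datetime import datetime

def learning_name(title: str, when: datetime) -> str:
    # e.g. "learning_20260109_120000_Refactor_Database_Schema" (assumed scheme)
    slug = "_".join(title.title().split())
    return f"learning_{when:%Y%m%d_%H%M%S}_{slug}"

print(learning_name("Refactor database schema", datetime(2026, 1, 9, 12, 0)))
# learning_20260109_120000_Refactor_Database_Schema
```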
### Accessing Stored Learnings

```python
from lib.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph("research")

# Search for learnings
learnings = kg.search("database optimization", limit=10)

# Get specific learning
learning = kg.get_entity("learning_20260109_120000_Refactor_Database_Schema")

# Get related skills
relations = kg.get_relations("learning_20260109_120000_...")

# List all learnings
all_learnings = kg.list_entities(entity_type="finding")
```

## Usage Examples

### Example 1: Extract Learnings from Task Completion

```python
from lib.skill_learning_engine import SkillLearningSystem

system = SkillLearningSystem()

# Task data from execution
task_data = {
    "task_id": "deploy_overbits_v2",
    "prompt": "Deploy new frontend build to production with zero downtime",
    "project": "overbits",
    "status": "success",
    "tools_used": ["Bash", "Read", "Edit"],
    "duration": 120.5,
    "result_summary": "Successfully deployed with no downtime, 100% rollback verified",
    "qa_passed": True,
    "timestamp": "2026-01-09T15:30:00"
}

# QA validation results
qa_results = {
    "passed": True,
    "results": {
        "syntax": True,
        "routes": True,
        "command_docs": True,
    },
    "summary": {
        "errors": 0,
        "warnings": 0,
        "info": 5,
    }
}

# Process and extract learnings
result = system.process_task_completion(task_data, qa_results)

print(f"Skills extracted: {result['skills_extracted']}")
print(f"Learning created: {result['learning_id']}")
```

### Example 2: Get Recommendations for Similar Task

```python
# Later, for a similar deployment task
new_prompt = "Deploy database migration to production"

recommendations = system.get_recommendations(new_prompt, project="overbits")

for rec in recommendations:
    print(f"Skill: {rec['skill']}")
    print(f"From learning: {rec['source_learning']}")
    print(f"Confidence: {rec['confidence']:.1%}")
```

### Example 3: Build Skill Profile

```python
# Get an overview of learned skills
profile = system.get_learning_summary()

print(f"Total learnings: {profile['total_learnings']}")
print(f"Skills by category: {profile['by_category']}")
print("Top 5 skills:")
for skill, count in profile['top_skills'][:5]:
    print(f"  {skill}: {count} occurrences")
```

## Testing

Run the comprehensive test suite:

```bash
python3 -m pytest tests/test_skill_learning.py -v
```

**Test Coverage:**
- Task analysis and pattern extraction
- Skill extraction from tasks and QA results
- Decision pattern recognition
- Skill aggregation
- Learning extraction and storage
- Skill recommendations
- Full integration pipeline

All tests run against a mocked knowledge graph, so they have no external dependencies.
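A minimal sketch of that mocking approach, with an assumed patch target and trimmed fixtures:

```python
from unittest.mock import patch

from lib.skill_learning_engine import SkillLearningSystem

# Trimmed fixtures; see Example 1 above for full-sized versions.
task_data = {
    "task_id": "t1", "prompt": "Refactor schema", "project": "overbits",
    "status": "success", "tools_used": ["Bash"], "duration": 1.0,
    "result_summary": "ok", "qa_passed": True,
    "timestamp": "2026-01-09T12:00:00",
}
qa_results = {"passed": True, "results": {"syntax": True},
              "summary": {"errors": 0, "warnings": 0, "info": 0}}

@patch("lib.skill_learning_engine.KnowledgeGraph")  # assumed patch target
def test_process_task_completion(mock_kg):
    mock_kg.return_value.create_entity.return_value = "learning_test_001"  # assumed method
    system = SkillLearningSystem()
    result = system.process_task_completion(task_data, qa_results)
    assert result["learning_id"]
```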
## Configuration

The system is configured in the QA validator integration:

**File:** `lib/qa_learning_integration.py`

Key settings:
- **Knowledge Graph Domain**: `research` (all learnings stored here)
- **Learning Extraction Trigger**: QA pass with all validations successful
- **Skill Categories**: tool_usage, pattern, decision, architecture
- **Confidence Calculation**: Weighted average of skill confidence and QA pass rate

The confidence calculation is sketched below.
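The 60/40 weighting here is an assumed illustration, not the shipped coefficients:

```python
def learning_confidence(skill_confidences, qa_pass_rate, skill_weight=0.6):
    # Weighted average of mean skill confidence and QA pass rate.
    # The 0.6/0.4 split is illustrative; the real weights live in
    # lib/qa_learning_integration.py.
    mean_skill = sum(skill_confidences) / len(skill_confidences)
    return skill_weight * mean_skill + (1 - skill_weight) * qa_pass_rate

print(round(learning_confidence([0.9, 0.8, 0.85], qa_pass_rate=0.85), 2))  # 0.85
```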
## Data Flow

```
Task Execution
      ↓
Task Analysis
  ├─→ Success rate: 85%
  ├─→ Average duration: 45 min
  ├─→ Common tools: [Bash, Read, Edit]
  └─→ Project distribution: {overbits: 60%, dss: 40%}
      ↓
Skill Extraction
  ├─→ Tool skills (from tools_used)
  ├─→ Decision patterns (from prompt)
  ├─→ Project knowledge (from project)
  └─→ QA validation skills
      ↓
Learning Creation
  ├─→ Title & description
  ├─→ Skill aggregation
  ├─→ Pattern classification
  ├─→ Confidence scoring
  └─→ Applicability determination
      ↓
Knowledge Graph Storage
  └─→ Entity: finding
      Relations: skill → learning
      Metadata: skills, pattern, confidence, applicability
      ↓
Future Recommendations
  └─→ Search similar tasks
      Extract applicable skills
      Rank by confidence
```

## Performance Considerations

**Learning Extraction:**
- Runs only on successful QA passes (not a bottleneck)
- Async-ready (future enhancement)
- Minimal overhead (~100ms per extraction)

**Recommendation:**
- Uses FTS5 full-text search on the knowledge graph
- Limited to the top 10 results
- Confidence-ranked sorting

The kind of FTS5 query this implies is sketched below.
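Table and column names are assumptions about the KG schema, not its actual layout:

```python
import sqlite3

conn = sqlite3.connect("knowledge_graph.db")  # assumed database path
rows = conn.execute(
    """
    SELECT name, rank
    FROM entities_fts               -- assumed FTS5 virtual table name
    WHERE entities_fts MATCH ?      -- FTS5 full-text match
    ORDER BY rank                   -- bm25 rank: lower is a better match
    LIMIT 10                        -- top 10, per the note above
    """,
    ("database optimization",),
).fetchall()
# The recommender would then re-rank these hits by stored confidence.
```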
**Storage:**
- SQLite with FTS5 (efficient)
- Automatic indexing and triggers
- Scales to thousands of learnings

## Future Enhancements

1. **Async Extraction**: Background learning extraction during deployment
2. **Confidence Evolution**: Learnings gain/lose confidence based on outcomes
3. **Skill Decay**: Unused skills decrease in relevance over time
4. **Cross-Project Learning**: Share learnings between similar projects
5. **Decision Tracing**: Link recommendations back to specific successful tasks
6. **Feedback Loop**: Update learning confidence based on task outcomes
7. **Skill Trees**: Build hierarchies of related skills
8. **Collaborative Learning**: Share learnings across team instances

## Troubleshooting

### Learnings Not Being Created

Check:
1. QA validation passes (`qa_results["passed"] == True`)
2. The knowledge graph is accessible and writable
3. No errors in `qa_learning_integration.py` output

```bash
python3 lib/qa_validator.py --learn --verbose
```

### Recommendations Are Empty

Possible causes:
1. No learnings stored yet (run a successful task with `--learn`)
2. The task prompt doesn't match stored learning titles
3. Knowledge graph search not finding results

Test with:
```bash
python3 lib/skill_learning_engine.py recommend --task-prompt "Your task" --project overbits
```

### Knowledge Graph Issues

Check knowledge graph status:
```bash
python3 lib/knowledge_graph.py stats
python3 lib/knowledge_graph.py search "learning"
```

## API Reference

See inline documentation in:
- `lib/skill_learning_engine.py` - Main system classes
- `lib/qa_learning_integration.py` - QA integration
- `tests/test_skill_learning.py` - Usage examples via tests

## Contributing

To add new skill extraction patterns:

1. Add the pattern to `SkillExtractor._extract_decision_patterns()`
2. Update test cases in `TestSkillExtractor.test_extract_decision_patterns()`
3. Test with: `python3 lib/skill_learning_engine.py test`
4. Document the pattern in this guide

A sketch of step 1 follows.
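The keyword table is an assumption about that method's internals:

```python
# Hypothetical keyword-to-pattern table; the real method may differ.
DECISION_PATTERNS = {
    "optimization": ["optimize", "performance", "speed up"],
    "debugging": ["fix", "bug", "error"],
    "migration": ["migrate", "migration", "upgrade"],  # the newly added pattern
}

def match_decision_patterns(prompt: str) -> list[str]:
    lowered = prompt.lower()
    return [name for name, keywords in DECISION_PATTERNS.items()
            if any(keyword in lowered for keyword in keywords)]

print(match_decision_patterns("Deploy database migration to production"))
# ['migration']
```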
## License

Part of Luzia Orchestrator. See parent project license.