feat: implement CI (Continuous Intelligence) autonomous engineering harness

Implements the full PRD for CI - a fully autonomous AI-driven software engineering harness derived from Learnship's architecture.

Core components:
- CI Orchestrator agent with autonomous pipeline (SPECIFY → CLARIFY → RESEARCH → PLAN → EXECUTE → VERIFY → COMPLETE)
- Decision Engine with confidence thresholds (high/medium/low)
- Clarify Phase with question budget and default acceptance
- Escalation Protocol with timeout auto-proceed
- Audit Trail system (.ci/audit/) for post-hoc review
- Error Recovery with retry, plan revision, and rollback

18 agents (all Learnship agents + Orchestrator):
- Autonomous behavioral modifications per PRD §7.1
- Agent registry with factory pattern

11 CLI commands:
- ci init, ci run, ci quick, ci debug, ci verify
- ci review, ci status, ci audit, ci clarify
- ci rollback, ci ship

4-layer verification system:
- Structural, Behavioral, Security, Code Quality

3 autonomy levels: full, supervised, guided

Compatible with Learnship artifact schemas (.planning/)
2026-05-28 23:24:42 +00:00
commit 9cf5c000d9
57 changed files with 7336 additions and 0 deletions
@@ -0,0 +1,203 @@
+# CI — Continuous Intelligence
+
+Fully autonomous AI-driven software engineering harness.
+
+## Overview
+
+CI (Continuous Intelligence) is an autonomous-first software engineering harness that eliminates human-in-the-loop overhead while preserving the rigor of guided development. It receives a specification, resolves ambiguities through a single Clarify phase, then executes the full pipeline — research, plan, execute, verify — autonomously.
+
+## Installation
+
+```bash
+npm install -g @continuous-intelligence/ci
+```
+
+Or from source:
+
+```bash
+git clone <repo-url>
+cd ci
+npm install
+npm run build
+npm link
+```
+
+## Quick Start
+
+```bash
+# Initialize from inline specification
+ci init "Build a REST API for task management"
+
+# Initialize from a specification file
+ci init --spec ./specs/my-project.md
+
+# Initialize with interactive clarify phase
+ci init --clarify "Build a REST API for task management"
+
+# Run the full autonomous pipeline
+ci run --all
+
+# Run a specific phase
+ci run research
+ci run plan
+ci run execute
+ci run verify
+
+# Execute an ad-hoc task
+ci quick "Add authentication middleware"
+
+# Verify a phase
+ci verify 1
+
+# Check project status
+ci status
+
+# Review autonomous decisions
+ci audit
+ci audit --verbose
+
+# Debug an issue
+ci debug "Tests failing on CI"
+
+# Rollback a phase
+ci rollback 1
+
+# Ship a phase (verify, security, commit, tag)
+ci ship 1
+```
+
+## Autonomy Levels
+
+| Level | Behavior |
+|-------|----------|
+| `full` | No human interaction after Clarify. Escalate only irreversible decisions. |
+| `supervised` | Escalate on every Escalation Gate plus verification failures. |
+| `guided` | Escalate on every Decision Gate. Closest to Learnship behavior. |
+
+## Configuration
+
+CI uses `.ci/config.json` for project configuration:
+
+```json
+{
+  "autonomy": {
+    "level": "full",
+    "escalation_hooks": ["deploy", "delete_data", "merge_to_main"],
+    "clarify_budget": 10,
+    "decision_confidence_threshold": 0.6,
+    "max_revision_iterations": 3,
+    "max_verification_retries": 2,
+    "escalation_timeout_ms": 300000
+  },
+  "model_profile": "quality",
+  "parallelization": {
+    "enabled": true,
+    "max_concurrent_agents": 5,
+    "min_plans_for_parallel": 2
+  },
+  "verification": {
+    "automated_only": true,
+    "escalate_visual": true,
+    "escalate_external_integration": true,
+    "test_first": false
+  },
+  "security": {
+    "auto_accept_low_severity": true,
+    "auto_mitigate_medium_severity": true,
+    "escalate_high_severity": true
+  },
+  "git": {
+    "branching_strategy": "phase",
+    "auto_commit": true,
+    "auto_push": false
+  }
+}
+```
+
+## Architecture
+
+### Pipeline
+
+```
+SPECIFY → CLARIFY → RESEARCH → PLAN → EXECUTE → VERIFY → COMPLETE
+               ↕               ↕         ↕          ↕
+          (questions)    (auto-decide) (auto-run) (auto-verify)
+```
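+
+The phase ordering in the diagram can be sketched as an ordered list with a lookup for the next phase. This is a minimal illustration, not the harness's actual implementation: the phase names come from the diagram, while `PHASES`, `nextPhase`, and `remainingPhases` are hypothetical names.
+
+```typescript
+// Phase names are taken from the pipeline diagram; everything else is a hypothetical sketch.
+const PHASES = [
+  "SPECIFY",
+  "CLARIFY",
+  "RESEARCH",
+  "PLAN",
+  "EXECUTE",
+  "VERIFY",
+  "COMPLETE",
+] as const;
+
+type Phase = (typeof PHASES)[number];
+
+// Returns the phase that follows `current`, or null once the pipeline is complete.
+function nextPhase(current: Phase): Phase | null {
+  const index = PHASES.indexOf(current);
+  return index < PHASES.length - 1 ? PHASES[index + 1] : null;
+}
+
+// Walks the pipeline from `start` to COMPLETE and returns the phases visited, in order.
+function remainingPhases(start: Phase): Phase[] {
+  const visited: Phase[] = [];
+  for (let phase: Phase | null = start; phase !== null; phase = nextPhase(phase)) {
+    visited.push(phase);
+  }
+  return visited;
+}
+```
+
+For example, `remainingPhases("PLAN")` yields the phases from PLAN through COMPLETE in pipeline order.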
+
+### Decision Engine
+
+Every autonomous decision is classified by confidence:
+- **High (>0.85)**: Auto-decide, log to audit trail
+- **Medium (0.60-0.85)**: Auto-decide with assumption logging, flag for review
+- **Low (<0.60)**: Escalate to human
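+
+The thresholds above can be sketched as a small classifier. This is an illustration under stated assumptions rather than the Decision Engine's real code: the cut-offs (0.85 and 0.60) come from the list, while the type and function names (`ConfidenceBand`, `classifyConfidence`, `actionFor`) are hypothetical, and a score of exactly 0.85 is treated as medium because the high band is strictly greater than 0.85.
+
+```typescript
+type ConfidenceBand = "high" | "medium" | "low";
+type DecisionAction = "auto_decide" | "auto_decide_and_flag" | "escalate";
+
+// Cut-offs taken from the list above; the constant names are hypothetical.
+const HIGH_THRESHOLD = 0.85;
+const LOW_THRESHOLD = 0.6;
+
+// Maps a confidence score in [0, 1] to one of the three bands.
+function classifyConfidence(confidence: number): ConfidenceBand {
+  if (Number.isNaN(confidence) || confidence < 0 || confidence > 1) {
+    throw new RangeError(`confidence must be in [0, 1], got ${confidence}`);
+  }
+  if (confidence > HIGH_THRESHOLD) return "high";
+  if (confidence >= LOW_THRESHOLD) return "medium";
+  return "low";
+}
+
+// Maps a band to the behavior described for it in the list above.
+function actionFor(band: ConfidenceBand): DecisionAction {
+  switch (band) {
+    case "high":
+      return "auto_decide"; // logged to the audit trail
+    case "medium":
+      return "auto_decide_and_flag"; // assumptions logged, flagged for review
+    case "low":
+      return "escalate"; // handed to a human
+  }
+}
+```
+
+With these definitions, `actionFor(classifyConfidence(0.7))` returns `"auto_decide_and_flag"`.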
+
+### 18 Agents
+
+All 17 Learnship agents are retained, and the CI Orchestrator is added:
+
+| Agent | Role | Modification |
+|-------|------|-------------|
+| orchestrator | Pipeline controller | New — replaces interactive workflows |
+| planner | Plan creation | Never sets `autonomous: false` |
+| executor | Task execution | Never pauses for checkpoints |
+| verifier | Output verification | Generates automated tests, not human UAT |
+| researcher | Domain research | Logs assumptions, never flags for human |
+| challenger | Plan stress-testing | Binding verdicts; escalates only when confidence is below 0.60 |
+| security-auditor | Security audit | Auto-dispositions threats |
+| debugger | Bug fixing | Auto-fixes when confidence > threshold |
+| Others | Various | Unchanged from Learnship |
+
+### Verification Layers
+
+1. **Structural**: File existence, import/export wiring, no stubs
+2. **Behavioral**: Generated automated tests for must-haves
+3. **Security**: STRIDE analysis with auto-disposition
+4. **Code Quality**: Multi-persona review with P0 auto-fix
+
+## Specification Format
+
+```markdown
+# Project: My Project
+
+## Objective
+Build a REST API for task management.
+
+## Requirements
+- User authentication (JWT-based)
+- CRUD operations for tasks
+- Real-time notifications
+
+## Constraints
+- Must use Node.js
+- Must be production-ready
+
+## Out of Scope
+- Admin dashboard
+- Mobile apps
+```
+
+## Escalation Protocol
+
+CI escalates to a human when it cannot proceed autonomously. The escalation triggers are:
+
+1. **Irreversible Action**: Deploy, delete, merge to protected branch
+2. **Verification Failure**: Tests pass but functional verification fails
+3. **Low Confidence Decision**: Critical decision below threshold
+4. **Security Escalation**: High-severity threat detected
+5. **Specification Ambiguity**: Multiple valid interpretations
+
+Each escalation includes a recommended default and an auto-proceed timeout: if no response arrives before the timeout expires, CI proceeds with the recommended default.
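+
+The default-plus-timeout behavior can be sketched as a pure function over an escalation record. This is a hedged illustration, not the harness's API: the trigger names mirror the list above, while `Escalation`, `Resolution`, `resolveEscalation`, and their fields are hypothetical, and the timeout is assumed to be supplied by the caller (for example from `escalation_timeout_ms` in `.ci/config.json`).
+
+```typescript
+// Trigger names mirror the list above; all type, field, and function names are hypothetical.
+type EscalationTrigger =
+  | "irreversible_action"
+  | "verification_failure"
+  | "low_confidence_decision"
+  | "security_escalation"
+  | "specification_ambiguity";
+
+interface Escalation {
+  trigger: EscalationTrigger;
+  recommendedDefault: string;
+  raisedAtMs: number; // when the escalation was raised (epoch milliseconds)
+  timeoutMs: number; // assumed to come from configuration
+}
+
+type Resolution =
+  | { status: "pending" }
+  | { status: "resolved"; choice: string; decidedBy: "human" | "timeout_default" };
+
+// A human answer always wins; otherwise the default applies once the timeout has elapsed.
+function resolveEscalation(
+  escalation: Escalation,
+  nowMs: number,
+  humanChoice?: string,
+): Resolution {
+  if (humanChoice !== undefined) {
+    return { status: "resolved", choice: humanChoice, decidedBy: "human" };
+  }
+  if (nowMs - escalation.raisedAtMs >= escalation.timeoutMs) {
+    return {
+      status: "resolved",
+      choice: escalation.recommendedDefault,
+      decidedBy: "timeout_default",
+    };
+  }
+  return { status: "pending" };
+}
+```
+
+Keeping the clock as a parameter (`nowMs`) rather than reading it inside the function makes the timeout path easy to exercise in tests without waiting.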
+
+## Differences from Learnship
+
+| Dimension | Learnship | CI |
+|-----------|-----------|-----|
+| Human Interactions | 19+/lifecycle | 1-2/lifecycle |
+| Decision Making | Human decides, agent implements | Agent decides, human reviews post-hoc |
+| Verification | Human UAT | Automated tests + escalation |
+| Specification | Multi-round conversation | Single spec file |
+| Learning Curve | Moderate | Low (5 core commands) |
+
+## License
+
+MIT