Implements the full PRD for CI, a fully autonomous AI-driven software engineering harness derived from Learnship's architecture.

Core components:
- CI Orchestrator agent with autonomous pipeline (SPECIFY → CLARIFY → RESEARCH → PLAN → EXECUTE → VERIFY → COMPLETE)
- Decision Engine with confidence thresholds (high/medium/low)
- Clarify Phase with question budget and default acceptance
- Escalation Protocol with timeout auto-proceed
- Audit Trail system (.ci/audit/) for post-hoc review
- Error Recovery with retry, plan revision, and rollback

18 agents (all Learnship agents + Orchestrator):
- Autonomous behavioral modifications per PRD §7.1
- Agent registry with factory pattern

11 CLI commands:
- ci init, ci run, ci quick, ci debug, ci verify
- ci review, ci status, ci audit, ci clarify
- ci rollback, ci ship

4-layer verification system:
- Structural, Behavioral, Security, Code Quality

3 autonomy levels: full, supervised, guided

Compatible with Learnship artifact schemas (.planning/)
CI — Continuous Intelligence
Fully autonomous AI-driven software engineering harness.
Overview
CI (Continuous Intelligence) is an autonomous-first software engineering harness that minimizes human-in-the-loop overhead while preserving the rigor of guided development. It receives a specification, resolves ambiguities in a single Clarify phase, then runs the rest of the pipeline (research, plan, execute, verify) autonomously, escalating to a human only when it cannot safely proceed.
Installation
npm install -g @continuous-intelligence/ci
Or from source:
git clone <repo-url>
cd ci
npm install
npm run build
npm link
Quick Start
# Initialize from inline specification
ci init "Build a REST API for task management"
# Initialize from a specification file
ci init --spec ./specs/my-project.md
# Initialize with interactive clarify phase
ci init --clarify "Build a REST API for task management"
# Run the full autonomous pipeline
ci run --all
# Run a specific phase
ci run research
ci run plan
ci run execute
ci run verify
# Execute an ad-hoc task
ci quick "Add authentication middleware"
# Verify a phase
ci verify 1
# Check project status
ci status
# Review autonomous decisions
ci audit
ci audit --verbose
# Debug an issue
ci debug "Tests failing on CI"
# Rollback a phase
ci rollback 1
# Ship a phase (verify, security, commit, tag)
ci ship 1
Autonomy Levels
| Level | Behavior |
|---|---|
| full | No human interaction after Clarify. Escalate only irreversible decisions. |
| supervised | Escalate on every Escalation Gate plus verification failures. |
| guided | Escalate on every Decision Gate. Closest to Learnship behavior. |
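The table can be read as a small escalation policy. The sketch below is one possible encoding, not CI's actual implementation: the event names and the `shouldEscalate` function are hypothetical, and it assumes each stricter level also escalates everything the looser level does, which the table does not state explicitly.

```typescript
type AutonomyLevel = "full" | "supervised" | "guided";

// Hypothetical event kinds; the README names the gates but not their identifiers.
type GateEvent =
  | "irreversible_action"
  | "escalation_gate"
  | "verification_failure"
  | "decision_gate";

// Assumption: levels are cumulative (guided ⊇ supervised ⊇ full).
const ESCALATES: Record<AutonomyLevel, ReadonlyArray<GateEvent>> = {
  full: ["irreversible_action"],
  supervised: ["irreversible_action", "escalation_gate", "verification_failure"],
  guided: [
    "irreversible_action",
    "escalation_gate",
    "verification_failure",
    "decision_gate",
  ],
};

// True when the given event should be handed to a human at this level.
function shouldEscalate(level: AutonomyLevel, event: GateEvent): boolean {
  return ESCALATES[level].includes(event);
}
```

Under these assumptions, `shouldEscalate("full", "decision_gate")` is `false`, while the same event at `guided` is `true`.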
Configuration
CI uses .ci/config.json for project configuration:
{
  "autonomy": {
    "level": "full",
    "escalation_hooks": ["deploy", "delete_data", "merge_to_main"],
    "clarify_budget": 10,
    "decision_confidence_threshold": 0.6,
    "max_revision_iterations": 3,
    "max_verification_retries": 2,
    "escalation_timeout_ms": 300000
  },
  "model_profile": "quality",
  "parallelization": {
    "enabled": true,
    "max_concurrent_agents": 5,
    "min_plans_for_parallel": 2
  },
  "verification": {
    "automated_only": true,
    "escalate_visual": true,
    "escalate_external_integration": true,
    "test_first": false
  },
  "security": {
    "auto_accept_low_severity": true,
    "auto_mitigate_medium_severity": true,
    "escalate_high_severity": true
  },
  "git": {
    "branching_strategy": "phase",
    "auto_commit": true,
    "auto_push": false
  }
}
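For tooling that reads this file, a typed view of the `autonomy` block can catch mistakes early. The following is a minimal sketch: the field names come from the sample above, but the `AutonomyConfig` interface, the `resolveAutonomy` helper, and the idea of treating the sample values as defaults are assumptions, not documented CI behavior.

```typescript
// Field names mirror the "autonomy" block of .ci/config.json.
interface AutonomyConfig {
  level: "full" | "supervised" | "guided";
  escalation_hooks: string[];
  clarify_budget: number;
  decision_confidence_threshold: number;
  max_revision_iterations: number;
  max_verification_retries: number;
  escalation_timeout_ms: number;
}

// Assumption: the sample values double as defaults; the README does not say so.
const SAMPLE_AUTONOMY: AutonomyConfig = {
  level: "full",
  escalation_hooks: ["deploy", "delete_data", "merge_to_main"],
  clarify_budget: 10,
  decision_confidence_threshold: 0.6,
  max_revision_iterations: 3,
  max_verification_retries: 2,
  escalation_timeout_ms: 300000,
};

// Hypothetical helper: overlay partial settings and reject out-of-range values.
function resolveAutonomy(overrides: Partial<AutonomyConfig>): AutonomyConfig {
  const merged: AutonomyConfig = { ...SAMPLE_AUTONOMY, ...overrides };
  const t = merged.decision_confidence_threshold;
  if (!(t >= 0 && t <= 1)) {
    throw new RangeError(`decision_confidence_threshold must be in [0, 1], got ${t}`);
  }
  if (!Number.isInteger(merged.clarify_budget) || merged.clarify_budget < 0) {
    throw new RangeError("clarify_budget must be a non-negative integer");
  }
  return merged;
}
```

Calling `resolveAutonomy({ level: "supervised" })` keeps every other sample value and changes only the level.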
Architecture
Pipeline
SPECIFY → CLARIFY → RESEARCH → PLAN → EXECUTE → VERIFY → COMPLETE
↕ ↕ ↕ ↕
(questions) (auto-decide) (auto-run) (auto-verify)
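The pipeline is a fixed linear sequence, so it can be modeled as an ordered list of phases. This is an illustrative sketch only; `PHASES`, `nextPhase`, and `phasesFrom` are hypothetical names, and the sketch ignores retries and rollbacks.

```typescript
// Phase names and order are taken from the pipeline diagram.
const PHASES = [
  "SPECIFY",
  "CLARIFY",
  "RESEARCH",
  "PLAN",
  "EXECUTE",
  "VERIFY",
  "COMPLETE",
] as const;

type Phase = (typeof PHASES)[number];

// Returns the phase that follows `current`, or null once the pipeline is COMPLETE.
function nextPhase(current: Phase): Phase | null {
  const next = PHASES[PHASES.indexOf(current) + 1];
  return next ?? null;
}

// Returns `from` and every phase after it, in pipeline order.
function phasesFrom(from: Phase): Phase[] {
  return PHASES.slice(PHASES.indexOf(from));
}
```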
Decision Engine
Every autonomous decision is classified by confidence:
- High (>0.85): Auto-decide, log to audit trail
- Medium (0.60-0.85): Auto-decide with assumption logging, flag for review
- Low (<0.60): Escalate to human
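The three bands above map directly onto a small classifier. The sketch below assumes the boundaries are exactly as listed (0.85 falls in Medium, 0.60 falls in Medium); the function names and the action identifiers are hypothetical.

```typescript
type Band = "high" | "medium" | "low";
type Action = "auto_decide" | "auto_decide_and_flag" | "escalate";

// Classifies a confidence score in [0, 1] into the README's three bands.
function classify(confidence: number, low = 0.6, high = 0.85): Band {
  if (!(confidence >= 0 && confidence <= 1)) {
    throw new RangeError(`confidence must be in [0, 1], got ${confidence}`);
  }
  if (confidence > high) return "high";
  if (confidence >= low) return "medium";
  return "low";
}

// Maps each band to the behavior described in the list above.
function actionFor(band: Band): Action {
  switch (band) {
    case "high":
      return "auto_decide"; // logged to the audit trail
    case "medium":
      return "auto_decide_and_flag"; // assumptions logged, flagged for review
    case "low":
      return "escalate"; // handed to a human
  }
}
```

With the default thresholds, `classify(0.85)` returns `"medium"` because the High band is strictly greater than 0.85.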
18 Agents
All 17 Learnship agents are retained, plus the CI Orchestrator:
| Agent | Role | Modification |
|---|---|---|
| orchestrator | Pipeline controller | New — replaces interactive workflows |
| planner | Plan creation | Never sets autonomous: false |
| executor | Task execution | Never pauses for checkpoints |
| verifier | Output verification | Generates automated tests, not human UAT |
| researcher | Domain research | Logs assumptions, never flags for human |
| challenger | Plan stress-testing | Binding verdicts; escalates only when confidence is below 0.60 |
| security-auditor | Security audit | Auto-dispositions threats |
| debugger | Bug fixing | Auto-fixes when confidence > threshold |
| Others | Various | Unchanged from Learnship |
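The commit summary mentions an agent registry built on the factory pattern. A minimal sketch of that pattern follows; the `Agent` shape, the `AgentRegistry` class, and its method names are hypothetical and do not describe CI's real registry.

```typescript
// Hypothetical agent shape; only name and role are taken from the table above.
interface Agent {
  readonly name: string;
  readonly role: string;
}

type AgentFactory = () => Agent;

class AgentRegistry {
  private readonly factories = new Map<string, AgentFactory>();

  // Registers a factory under a unique agent name.
  register(name: string, factory: AgentFactory): void {
    if (this.factories.has(name)) {
      throw new Error(`agent already registered: ${name}`);
    }
    this.factories.set(name, factory);
  }

  // Builds a fresh agent instance on every call.
  create(name: string): Agent {
    const factory = this.factories.get(name);
    if (factory === undefined) {
      throw new Error(`unknown agent: ${name}`);
    }
    return factory();
  }

  names(): string[] {
    return Array.from(this.factories.keys());
  }
}

const registry = new AgentRegistry();
registry.register("orchestrator", () => ({ name: "orchestrator", role: "Pipeline controller" }));
registry.register("planner", () => ({ name: "planner", role: "Plan creation" }));
```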
Verification Layers
- Structural: File existence, import/export wiring, no stubs
- Behavioral: Generated automated tests for must-haves
- Security: STRIDE analysis with auto-disposition
- Code Quality: Multi-persona review with P0 auto-fix
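One way to picture the four layers is as an ordered list of checks, each returning its findings. The sketch below is hypothetical throughout: the README does not say whether layers run in this order or stop at the first failure, so both are assumptions, as are all names.

```typescript
// Assumption: layers run in the order they are listed above.
const LAYER_ORDER = ["structural", "behavioral", "security", "code_quality"] as const;
type Layer = (typeof LAYER_ORDER)[number];

// A check returns its findings; an empty list means the layer passed.
type Check = () => string[];

interface LayerReport {
  layer: Layer;
  passed: boolean;
  findings: string[];
}

// Runs each layer in order and stops at the first failing layer (assumed fail-fast).
function runVerification(checks: Record<Layer, Check>): LayerReport[] {
  const reports: LayerReport[] = [];
  for (const layer of LAYER_ORDER) {
    const findings = checks[layer]();
    reports.push({ layer, passed: findings.length === 0, findings });
    if (findings.length > 0) break;
  }
  return reports;
}
```

A fail-fast runner keeps later, more expensive layers from running against code that already fails a cheaper one; a real harness might instead collect findings from all four.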
Specification Format
# Project: My Project
## Objective
Build a REST API for task management.
## Requirements
- User authentication (JWT-based)
- CRUD operations for tasks
- Real-time notifications
## Constraints
- Must use Node.js
- Must be production-ready
## Out of Scope
- Admin dashboard
- Mobile apps
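A specification in this shape is simple to split into sections. The parser below is an illustrative sketch, not CI's own parser; `parseSpec` and `ParsedSpec` are hypothetical names, and it assumes a single `# Project:` title followed by `##` sections whose bodies are plain lines or `-` bullets.

```typescript
interface ParsedSpec {
  project: string;
  sections: Record<string, string[]>;
}

// Splits a specification into its project title and per-section lines.
function parseSpec(markdown: string): ParsedSpec {
  let project = "";
  let currentItems: string[] | null = null;
  const sections: Record<string, string[]> = {};

  for (const raw of markdown.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "") continue;

    const title = /^#\s+Project:\s*(.+)$/.exec(line);
    const heading = /^##\s+(.+)$/.exec(line);

    if (title) {
      project = title[1] ?? "";
    } else if (heading) {
      currentItems = [];
      sections[heading[1] ?? ""] = currentItems;
    } else if (currentItems !== null) {
      // Strip a leading bullet marker so list items and prose are stored alike.
      currentItems.push(line.replace(/^-\s+/, ""));
    }
  }
  return { project, sections };
}
```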
Escalation Protocol
When CI cannot proceed autonomously:
- Irreversible Action: Deploy, delete, merge to protected branch
- Verification Failure: Tests pass but functional verification fails
- Low Confidence Decision: Critical decision below threshold
- Security Escalation: High-severity threat detected
- Specification Ambiguity: Multiple valid interpretations
Each escalation includes a recommended default and an auto-proceed timeout.
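The auto-proceed rule can be expressed as a pure function of the escalation, the current time, and an optional human answer. This is a sketch under stated assumptions: the type and field names are hypothetical, and it assumes a human answer always takes precedence over the timeout.

```typescript
// Kinds follow the five escalation triggers listed above; identifiers are hypothetical.
type EscalationKind =
  | "irreversible_action"
  | "verification_failure"
  | "low_confidence_decision"
  | "security_escalation"
  | "specification_ambiguity";

interface Escalation {
  kind: EscalationKind;
  question: string;
  recommendedDefault: string;
  raisedAtMs: number;
  timeoutMs: number; // e.g. escalation_timeout_ms from the configuration
}

type Resolution =
  | { status: "answered"; choice: string }
  | { status: "auto_proceeded"; choice: string }
  | { status: "pending"; remainingMs: number };

// Decides the state of an escalation at time `nowMs`.
function resolveEscalation(e: Escalation, nowMs: number, humanChoice?: string): Resolution {
  if (humanChoice !== undefined) {
    return { status: "answered", choice: humanChoice };
  }
  const elapsed = nowMs - e.raisedAtMs;
  if (elapsed >= e.timeoutMs) {
    // No answer in time: fall back to the recommended default.
    return { status: "auto_proceeded", choice: e.recommendedDefault };
  }
  return { status: "pending", remainingMs: e.timeoutMs - elapsed };
}
```

Keeping the clock as a parameter rather than calling `Date.now()` inside the function makes the timeout behavior easy to test.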
Differences from Learnship
| Dimension | Learnship | CI |
|---|---|---|
| Human Interactions | 19+/lifecycle | 1-2/lifecycle |
| Decision Making | Human decides, agent implements | Agent decides, human reviews post-hoc |
| Verification | Human UAT | Automated tests + escalation |
| Specification | Multi-round conversation | Single spec file |
| Learning Curve | Moderate | Low (5 core commands) |
License
MIT