Add IntelligenceBackend abstraction with two categories:
- LLMBackend (OllamaLocal, OllamaCloud): CI runs tool loop, provides tools, constructs prompts
- AgentBackend (Opencode): agent runs own tool loop, CI serializes request

Refactor all 18 agents from hardcoded stubs to persona loaders that delegate to the active backend or fail honestly when no backend is available.

Refactor OrchestratorAgent.executeStage() from monolithic switch to agent delegation via STAGE_AGENT_MAP for intelligent stages (research, plan, execute, verify), with mechanical stages (specify, clarify, complete) staying inline.

Wire CLI commands with --backend flag and auto-detection (opencode → ollama-local → ollama-cloud). Harden rollback/ship with real git operations. No command returns fake success.
AGENTS.md — CI Project Guidelines
Build & Run Commands
- Build: `npm run build` (compiles TypeScript to `dist/`)
- Typecheck: `npm run typecheck` (runs `tsc --noEmit`)
- Test: `npm run test` (runs Jest with ts-jest)
- Dev: `npm run dev` (runs CLI via ts-node)
Project Overview
CI (Continuous Intelligence) is a fully autonomous AI-driven software engineering harness. It receives a specification, resolves ambiguities through a single Clarify phase, then executes the full pipeline (research → plan → execute → verify) autonomously, escalating only when it cannot safely proceed alone.
Architecture
src/
  agents/                 # 18 agent implementations (persona loaders delegating to backends)
  backends/               # Intelligence backend layer
    types.ts              # IntelligenceBackend, BackendRequest, BackendResult, BackendConfigSection
    tool-registry.ts      # CI-owned tool implementations (readFile, writeFile, editFile, runBash, glob, grep)
    ollama-base.ts        # Abstract base for Ollama backends (shared tool loop, prompt construction)
    ollama-local.ts       # OllamaLocalBackend (localhost:11434)
    ollama-cloud.ts       # OllamaCloudBackend (remote endpoint, auth, rate limiting)
    opencode.ts           # OpencodeBackend (shells out to opencode --non-interactive)
    index.ts              # Backend registry + auto-detection
  cli/                    # Commander.js CLI (commands.ts, index.ts)
  core/                   # Core engine components
    artifacts.ts          # Legacy .planning/ artifact management (retained for backward compat)
    audit.ts              # Legacy audit trail in .ci/audit/ (retained for backward compat)
    ci-files.ts           # .ci/ long-lived reference file management (PROJECT.md, ROADMAP.md, etc.)
    clarify.ts            # Clarify phase: question generation, default acceptance
    commit-builder.ts     # Structured commit message generation (---ci--- YAML blocks)
    commit-parser.ts      # ---ci--- YAML block extraction and parsing
    config.ts             # .ci/config.json load/save/init
    decision-engine.ts    # Bounded rationality: commits decisions as git artifacts
    error-recovery.ts     # Retry, plan revision, rollback logic
    escalation.ts         # Escalation protocol: commits escalations as git artifacts
    git-branch.ts         # Branch lifecycle: phase/NN-slug, milestone/vX.X-slug
    git-context.ts        # Project state reconstruction from git log + branches
  types/                  # Type definitions
    commit-meta.ts        # CiMetadata, CommitDecision, CommitEscalation, ParsedCiCommit
    config.ts             # CIConfig, AutonomyLevel, ModelProfile, DEFAULT_CI_CONFIG (includes backend)
    decisions.ts          # Decision, ConfidenceLevel, DecisionCategory
    escalation.ts         # Escalation, EscalationType, EscalationResolution
    clarify.ts            # ClarifyQuestion, ClarifyResult
    specification.ts      # Specification parser (objective, requirements, constraints, out_of_scope)
    pipeline.ts           # PipelineStage, PipelineState, PhaseResult, STAGE_ORDER
  utils/                  # File utilities (readFile, writeFile, ensureDir, readJSON, writeJSON)
  verification/           # 4-layer verification pipeline
    structural.ts         # Layer 1: file existence, imports wired, no stubs
    behavioral.ts         # Layer 2: test generation and execution (stub)
    security.ts           # Layer 3: STRIDE threat analysis (stub)
    quality.ts            # Layer 4: multi-persona code review (stub)
  index.ts                # Public API exports
  version.ts              # VERSION = "0.3.0"
templates/                # Template files (config.json, DECISIONS.md, specification.md)
Key Design Decisions
- Autonomy levels: `full` (no HITL after clarify), `supervised` (escalate on gates + verification failures), `guided` (escalate on every decision gate)
- Decision confidence thresholds: High (>0.85) auto-decide and log; Medium (0.60–0.85) auto-decide with assumption logging; Low (<0.60) escalate to human
- Escalation timeout: Default 5 minutes, then auto-proceeds with the recommended option. Set to `0` to require a human, `-1` to always auto-proceed
- 18 agents inherited from Learnship, all re-prompted for autonomous operation. OrchestratorAgent is CI-specific
- Git-native context: The git log IS the project memory. An agent's first impulse to gather context is `git log` + `git branch`, not file reads. Dynamic state (decisions, escalations, lessons, compounding) lives in `---ci---` YAML blocks in commit messages. `.ci/` holds only long-lived reference docs (PROJECT.md, ARCHITECTURE.md, ROADMAP.md, REQUIREMENTS.md, config.json).
- Artifact compatibility: CI no longer writes the `.planning/` schema. Dynamic state is derived from git history. `.ci/` files follow a CI-native schema.
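The confidence thresholds and the escalation timeout values listed above can be sketched as two small pure functions. This is an illustrative sketch only: the names `classifyConfidence`, `actionFor`, and `timeoutPolicy` are hypothetical and are not taken from `decision-engine.ts` or `escalation.ts`, and the unit of the timeout setting is assumed to be minutes.

```typescript
// Hypothetical helper names; the real decision engine may be structured differently.
type ConfidenceBand = "high" | "medium" | "low";
type DecisionAction = "auto-decide-and-log" | "auto-decide-with-assumption-log" | "escalate-to-human";
type TimeoutPolicy = "require-human" | "always-auto-proceed" | "auto-proceed-after-timeout";

// High is strictly above 0.85, Low is strictly below 0.60, Medium covers the range between.
function classifyConfidence(confidence: number): ConfidenceBand {
  if (confidence > 0.85) return "high";
  if (confidence >= 0.6) return "medium";
  return "low";
}

function actionFor(confidence: number): DecisionAction {
  const band = classifyConfidence(confidence);
  if (band === "high") return "auto-decide-and-log";
  if (band === "medium") return "auto-decide-with-assumption-log";
  return "escalate-to-human";
}

// Assumption: the setting is a number of minutes, with 0 and -1 as the special values described above.
function timeoutPolicy(timeoutMinutes: number): TimeoutPolicy {
  if (timeoutMinutes === 0) return "require-human";
  if (timeoutMinutes === -1) return "always-auto-proceed";
  return "auto-proceed-after-timeout";
}
```

Keeping the boundaries in one place makes the edge cases explicit: a confidence of exactly 0.85 or 0.60 falls into the Medium band under the ranges stated above.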
Code Conventions
- Language: TypeScript with ES2022 target, Node16 modules
- Module resolution: Node16 style with `.js` extensions in imports
- Agent pattern: All agents extend `BaseAgent` with `name` (AgentName), `description`, `workflow`, and `execute(context: AgentContext): Promise<AgentResult>`. Agents delegate to `context.backend` when available and fail honestly when not.
- No runtime validation library: Uses plain TypeScript types, not Zod schemas (Zod is a dependency but types are hand-defined)
- File I/O: Use `src/utils/file.ts` helpers (`writeFile`, `readFile`, `ensureDir`, `readJSON`, `writeJSON`) instead of raw `fs` calls in agent/business logic
- Config: `CIConfig` type and `DEFAULT_CI_CONFIG` in `src/types/config.ts`; always merge partial configs with defaults
- Error handling: Agents return `{ success: false, error: string }` rather than throwing
- No comments in code: Follow the existing pattern; agent files have no comments
- Naming: `camelCase` for functions/variables, `PascalCase` for classes/types/interfaces, `kebab-case` for file names
- Exports: Each module has an `index.ts` barrel file re-exporting its public API
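A minimal sketch of the agent pattern and error-handling convention described above. The shapes of `AgentContext`, `AgentResult`, and the backend interface are simplified assumptions for illustration; the project's real types live in `src/types/` and `src/backends/types.ts` and will differ. `ExampleResearcherAgent` and the `run` method are hypothetical names. Comments appear here only to label those assumptions.

```typescript
// Simplified, hypothetical shapes. The real project types are richer.
interface AgentResult {
  success: boolean;
  output?: string;
  error?: string;
}

interface ExampleBackend {
  run(prompt: string): Promise<AgentResult>;
}

interface AgentContext {
  task: string;
  backend?: ExampleBackend;
}

abstract class BaseAgent {
  abstract readonly name: string;
  abstract readonly description: string;
  abstract readonly workflow: string;
  abstract execute(context: AgentContext): Promise<AgentResult>;
}

class ExampleResearcherAgent extends BaseAgent {
  readonly name = "researcher";
  readonly description = "Gathers context for a task";
  readonly workflow = "research";

  async execute(context: AgentContext): Promise<AgentResult> {
    // Fail honestly: no backend means no result, never a fake success.
    if (!context.backend) {
      return { success: false, error: "No intelligence backend available" };
    }
    try {
      return await context.backend.run(`${this.workflow}: ${context.task}`);
    } catch (err) {
      // Convention: return a failure result rather than throwing.
      return { success: false, error: err instanceof Error ? err.message : String(err) };
    }
  }
}
```

Returning a result object in every path means callers can branch on `success` without wrapping each agent call in its own `try`/`catch`.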
Pipeline Flow
SPECIFY → CLARIFY → RESEARCH → PLAN → EXECUTE → VERIFY → COMPLETE
Each stage is executed by OrchestratorAgent.executeStage(). The orchestrator delegates intelligent stages (research, plan, execute, verify) to specialized agents via context.backend when available, falling back to mechanical execution when no backend is configured. Mechanical stages (specify, clarify, complete) are always handled by the orchestrator directly.
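The split between intelligent and mechanical stages can be sketched as a lookup table. `STAGE_ORDER` and `STAGE_AGENT_MAP` are names used by the project, but the contents shown here are assumptions: in particular, the agent name assigned to each stage is a guess based on the agent list, and `routeStage` is a hypothetical helper.

```typescript
type PipelineStage = "specify" | "clarify" | "research" | "plan" | "execute" | "verify" | "complete";

const STAGE_ORDER: PipelineStage[] = [
  "specify", "clarify", "research", "plan", "execute", "verify", "complete",
];

// Assumed mapping: only the four intelligent stages have an entry.
const STAGE_AGENT_MAP: Partial<Record<PipelineStage, string>> = {
  research: "researcher",
  plan: "planner",
  execute: "executor",
  verify: "verifier",
};

type StageRoute = { kind: "agent"; agent: string } | { kind: "inline" };

// Stages without a map entry (specify, clarify, complete) stay inline in the orchestrator.
function routeStage(stage: PipelineStage): StageRoute {
  const agent = STAGE_AGENT_MAP[stage];
  return agent ? { kind: "agent", agent } : { kind: "inline" };
}

const plan = STAGE_ORDER.map((stage) => ({ stage, route: routeStage(stage) }));
```

A table-driven route replaces a monolithic `switch`: adding an intelligent stage becomes a one-line map entry rather than a new branch.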
Intelligence Backend Architecture
IntelligenceBackend (unified interface)
├── LLMBackend (CI runs tool loop, provides tools, constructs prompts)
│ ├── OllamaLocalBackend (localhost:11434, no auth)
│ ├── OllamaCloudBackend (remote endpoint, API key, rate limits)
│ └── (future: OpenAI, Anthropic, Gemini, etc.)
└── AgentBackend (agent runs own tool loop, CI sends request)
├── OpencodeBackend (opencode --non-interactive)
└── (future: Codex, Claude Code, Hermes, etc.)
- LLM backends: CI constructs system prompts from persona.md + workflow.md, defines tool schemas, runs the tool-call loop via `ToolRegistry`, and parses structured JSON output
- Agent backends: CI serializes `BackendRequest`, invokes the agent, and parses JSON `BackendResult` from stdout
- Auto-detection (provider: "auto"): tries opencode → ollama-local → ollama-cloud → fails with instructions
- Per-command override: `ci run --backend ollama-local` forces a specific backend
- Config: `backend` section in `.ci/config.json` with provider, fallback, agent_backends, llm_backends
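The auto-detection order and the per-command override described above can be sketched as a single selection function. This is a sketch under stated assumptions: `selectBackend` and the injected `isAvailable` probe are hypothetical, the real registry in `src/backends/index.ts` may probe asynchronously, and the behavior when an explicitly requested backend is unavailable (throwing here) is a guess.

```typescript
type Provider = "opencode" | "ollama-local" | "ollama-cloud";

// Order taken from the auto-detection rule above.
const DETECTION_ORDER: Provider[] = ["opencode", "ollama-local", "ollama-cloud"];

// `isAvailable` is a hypothetical probe supplied by the caller
// (for example: is the binary on PATH, does the endpoint respond).
function selectBackend(
  requested: Provider | "auto",
  isAvailable: (provider: Provider) => boolean,
): Provider {
  if (requested !== "auto") {
    // Assumption: an explicit --backend choice is never silently replaced.
    if (!isAvailable(requested)) {
      throw new Error(`Backend "${requested}" was requested but is not available`);
    }
    return requested;
  }
  const found = DETECTION_ORDER.find((provider) => isAvailable(provider));
  if (!found) {
    throw new Error(
      "No intelligence backend found: install opencode, start a local Ollama server, or configure an Ollama cloud endpoint",
    );
  }
  return found;
}
```

Injecting the probe keeps the selection logic testable without a running Ollama server or an installed `opencode` binary.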
Agent Modification Rules (from PRD)
| Agent | Key Modification |
|---|---|
| planner | Never set autonomous: false. Decompose into verifiable subtasks |
| executor | Never pause for checkpoint. Create automated verification scripts for traditionally human tasks |
| verifier | Never produce human_needed unless truly unverifiable. Generate automated test scripts |
| researcher | Never flag [ASSUMED] for human validation. Log assumption to DECISIONS.md with confidence |
| challenger | Binding verdicts. Only escalate when confidence < 0.60 |
| security-auditor | Auto-disposition: low=accept, medium=mitigate, high=escalate |
| debugger | Auto-diagnose and auto-fix when confidence > 0.60 |
| code-reviewer | Auto-apply P0 fixes. Flag P1+ for post-hoc review |
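The security-auditor rule in the table above is a direct severity-to-disposition mapping, which can be sketched as a lookup. The type names, the `Finding` shape, and `disposeFindings` are hypothetical and are not taken from `security.ts`.

```typescript
type Severity = "low" | "medium" | "high";
type Disposition = "accept" | "mitigate" | "escalate";

// Mapping taken from the table: low=accept, medium=mitigate, high=escalate.
const AUTO_DISPOSITION: Record<Severity, Disposition> = {
  low: "accept",
  medium: "mitigate",
  high: "escalate",
};

// Hypothetical finding shape, for illustration only.
interface Finding {
  id: string;
  severity: Severity;
}

function disposeFindings(findings: Finding[]): Array<Finding & { disposition: Disposition }> {
  return findings.map((finding) => ({
    ...finding,
    disposition: AUTO_DISPOSITION[finding.severity],
  }));
}

const disposed = disposeFindings([
  { id: "T-1", severity: "low" },
  { id: "T-2", severity: "high" },
]);
```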
Verification Layers
- Structural: Files exist, imports wired, no stubs/TODOs
- Behavioral: Generate and run automated tests for must-haves (currently stub)
- Security: STRIDE analysis with auto-disposition (currently stub)
- Code Quality: Multi-persona review with P0 auto-fix (currently stub)
Testing
- Test framework: Jest with ts-jest
- Test file pattern: `**/*.test.ts` in `src/`
- Run: `npm run test`
- 25 test suites, 218 tests covering types, core, git-native, verification, and utility modules
- Tests use temp directories (`fs.mkdtempSync` under `os.tmpdir()`) and clean up after each test
- Module resolution in Jest uses `moduleNameMapper` to strip `.js` extensions
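To illustrate what stripping `.js` extensions means, the sketch below emulates a `moduleNameMapper` entry in plain TypeScript. The pattern shown is the one commonly used with ts-jest for Node16-style imports; whether this project's Jest config uses exactly this pattern is an assumption, and `mapModuleName` is a hypothetical helper that only mimics the rewrite Jest performs.

```typescript
// Assumed mapper entry: relative imports ending in .js are rewritten without the extension.
const moduleNameMapper: Record<string, string> = {
  "^(\\.{1,2}/.*)\\.js$": "$1",
};

// Emulates the rewrite for demonstration; Jest applies its own resolution logic.
function mapModuleName(request: string): string {
  for (const [pattern, replacement] of Object.entries(moduleNameMapper)) {
    const re = new RegExp(pattern);
    if (re.test(request)) {
      return request.replace(re, replacement);
    }
  }
  return request;
}

const examples = ["./config.js", "../utils/file.js", "commander"].map(mapModuleName);
```

Only relative specifiers are rewritten, so package imports such as `commander` pass through unchanged while `./config.js` resolves to the `.ts` source.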
Important Files
- `.ci/config.json`: Project-level CI configuration (autonomy, parallelization, verification, security, git)
- `.ci/PROJECT.md`: Vision, core value, requirements, constraints, key decisions table
- `.ci/ARCHITECTURE.md`: System architecture, component boundaries, data flow
- `.ci/ROADMAP.md`: Phase breakdown, milestone mapping, success criteria, progress table
- `.ci/REQUIREMENTS.md`: v1/v2 requirements with REQ-IDs and traceability matrix
- Git log: Primary project memory (decisions, escalations, lessons, compounding, verification results)
- Branch structure: `phase/NN-slug` (active/complete) and `milestone/vX.X-slug` branches
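Because branch names encode project structure, they can be parsed back into phase and milestone information. The sketch below is illustrative: `parseBranch` is a hypothetical helper, not the API of `git-branch.ts`, and it assumes `NN` is exactly two digits, `X.X` is two dot-separated numbers, and slugs contain only lowercase letters, digits, and hyphens.

```typescript
type BranchInfo =
  | { kind: "phase"; phase: number; slug: string }
  | { kind: "milestone"; version: string; slug: string }
  | { kind: "other" };

// Assumed naming rules; adjust the patterns if the project allows other characters.
const PHASE_PATTERN = /^phase\/(\d{2})-([a-z0-9-]+)$/;
const MILESTONE_PATTERN = /^milestone\/v(\d+\.\d+)-([a-z0-9-]+)$/;

function parseBranch(name: string): BranchInfo {
  const phase = PHASE_PATTERN.exec(name);
  if (phase) {
    return { kind: "phase", phase: Number(phase[1]), slug: String(phase[2]) };
  }
  const milestone = MILESTONE_PATTERN.exec(name);
  if (milestone) {
    return { kind: "milestone", version: String(milestone[1]), slug: String(milestone[2]) };
  }
  return { kind: "other" };
}

// The branch names below are made-up examples.
const phaseBranch = parseBranch("phase/03-backend-abstraction");
const milestoneBranch = parseBranch("milestone/v0.3-intelligence-backends");
```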
Release Flow
Automation Requirements (Full Autopilot)
- Zero-HITL Execution: Beyond initial milestone setup, operate fully autonomously. No confirmation, permission, or prompts to the user.
- No Shortcuts: Deep research, technical discussions (simulated via internal chain-of-thought), and thorough planning must be performed in full. Autonomy does not bypass rigor.
- Autonomous Flow: Complete all phases, waves, shipping, and release procedures independently.
- Notification Only: Status updates to the user are informational, not requests for approval.
- Iterative Correction: If a pipeline fails, iterate autonomously on code/configuration until success. Do not ask the user for guidance on fixing failures.
Execution Workflow
1. Pre-Development Setup
   - Define semver tag before any development work begins
   - Ensure milestone is defined with version mapping: Major → Project, Minor → Milestone, Patch → Phase
2. Development & Integration
   - Create a dedicated feature branch in Gitea
   - Create/configure the CI pipeline via coreci
   - Create comprehensive tests to validate the feature
   - Push all changes to the Gitea repository
3. PR & Quality Assurance
   - Open PR in Gitea
   - Set PR to auto-merge upon pipeline success
   - Never merge a PR with a failed pipeline test
   - Conduct thorough autonomous review of the PR
   - On success: approve, merge, notify user. On failure: iterate autonomously until pipeline passes
4. Release Finalization
   - Apply the previously defined semver tag in Gitea
   - Create distribution packages via coreci
   - Generate comprehensive release notes
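The version mapping from the Pre-Development Setup step (Major → Project, Minor → Milestone, Patch → Phase) can be sketched as a small tag builder. `ReleasePosition` and `toSemverTag` are hypothetical names, and the `v` prefix on the tag is an assumption based on the version strings used elsewhere in this document.

```typescript
// Hypothetical shape: one counter per level of the version mapping.
interface ReleasePosition {
  project: number;   // Major
  milestone: number; // Minor
  phase: number;     // Patch
}

function toSemverTag(position: ReleasePosition): string {
  const parts = [position.project, position.milestone, position.phase];
  for (const part of parts) {
    if (!Number.isInteger(part) || part < 0) {
      throw new Error(`Version components must be non-negative integers, got ${part}`);
    }
  }
  return `v${parts.join(".")}`;
}

// Example values only: project 0, milestone 3, phase 2.
const tag = toSemverTag({ project: 0, milestone: 3, phase: 2 });
```

Deriving the tag up front, before development starts, matches the workflow above, where the same tag is applied again at release finalization.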
Supported Ecosystem
| Component | Provider | Detail |
|---|---|---|
| VCS | Gitea | https://git.cloudinit.dev |
| CI | coreci | https://coreci.dev |
| CLI | tea | Gitea CLI |
Gitea serves strictly as the VCS. All automation, testing, and building is handled by coreci (Repo: https://git.cloudinit.dev/cloudinit-dev/coreci).
Current State
- v0.2.0: Git-native architecture (project memory lives in git log, not `.planning/` files)
- New modules: commit-parser (`---ci---` YAML block extraction/parsing), commit-builder (structured commit message generation), git-context (project state reconstruction from git log + branches), git-branch (phase/milestone branch lifecycle), ci-files (`.ci/` long-lived reference file management)
- Commit schema: Every CI-generated commit contains a `---ci---` YAML block with phase, milestone, status, decisions, escalations, requirements, lessons, and compound metadata
- Branch strategy: `phase/NN-slug` and `milestone/vX.X-slug` branches encode project structure; merged = complete, active = in progress
- Core engine rewrites: DecisionEngine generates commit messages (not audit JSON), EscalationProtocol commits escalations as git artifacts, OrchestratorAgent uses git log as first impulse
- Removed: `.ci/audit/` directory (audit trail is git log), `.planning/` directory (dynamic state derived from git history)
- `.ci/` contents: `config.json`, `PROJECT.md`, `ARCHITECTURE.md`, `ROADMAP.md`, `REQUIREMENTS.md` (long-lived reference docs updated with discipline)
- Reconstruction test: An agent with only commit message access can reconstruct project state (phase, decisions, requirements coverage, lessons, escalations)
- Verification layers: All 4 layers implemented: structural, behavioral, security (STRIDE), quality
- CLI: All 11 commands wired up (`init`, `run`, `quick`, `debug`, `verify`, `review`, `status`, `audit`, `clarify`, `rollback`, `ship`)
- Agent implementations: Persona loaders that delegate to the active backend. Fail honestly when no backend is available (no more fake success).
- Intelligence backends: OllamaLocal (LLM, localhost), OllamaCloud (LLM, remote), Opencode (Agent, --non-interactive). Auto-detection: opencode → ollama-local → ollama-cloud.
- Tests: 25 test suites, 218 tests covering types, config, decision-engine, escalation, clarify, commit-parser, commit-builder, git-context, git-branch, ci-files, all 4 verification layers, file utils
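As a rough illustration of how project state can be recovered from commit messages, the sketch below pulls a `---ci---` block out of a message and reads its top-level scalar fields. It is not the project's `commit-parser`: the exact block layout is not specified here, so the sketch assumes the block starts at a `---ci---` line and runs to a closing `---ci---` line or the end of the message, and it reads only flat `key: value` lines rather than full YAML. The sample commit message is invented.

```typescript
const CI_MARKER = "---ci---";

// Assumption: the block begins after a marker line and ends at the next marker or end of message.
function extractCiBlock(message: string): string | null {
  const lines = message.split(/\r?\n/);
  const start = lines.findIndex((line) => line.trim() === CI_MARKER);
  if (start === -1) return null;
  const rest = lines.slice(start + 1);
  const end = rest.findIndex((line) => line.trim() === CI_MARKER);
  return (end === -1 ? rest : rest.slice(0, end)).join("\n");
}

// Reads only unindented `key: value` lines; nested YAML needs a real YAML parser.
function parseTopLevelScalars(block: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const line of block.split("\n")) {
    const match = /^([A-Za-z_]+):\s*(\S.*)$/.exec(line);
    if (match) {
      result[String(match[1])] = String(match[2]).trim();
    }
  }
  return result;
}

// Invented sample message using field names from the commit schema above.
const sampleMessage = [
  "feat(backends): add IntelligenceBackend abstraction",
  "",
  "---ci---",
  "phase: 03-backends",
  "milestone: v0.3",
  "status: complete",
  "---ci---",
].join("\n");

const block = extractCiBlock(sampleMessage);
const fields = block === null ? {} : parseTopLevelScalars(block);
```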