# AGENTS.md — CIAgent Project Guidelines

## Build & Run Commands

- **Build**: `npm run build` (compiles TypeScript to `dist/`)
- **Typecheck**: `npm run typecheck` (runs `tsc --noEmit`)
- **Test**: `npm run test` (runs Jest with ts-jest)
- **Dev**: `npm run dev` (runs CLI via ts-node)

## Project Overview

CIAgent (Continuous Intelligence) is a fully autonomous AI-driven software engineering harness. It receives a specification, resolves ambiguities through a single Clarify phase, then executes the full pipeline (research → plan → execute → verify) autonomously, escalating only when it cannot safely proceed alone.

## Architecture

```
src/
  agents/               # 19 agent implementations (persona loaders delegating to backends)
  backends/             # Intelligence backend layer
    types.ts            # IntelligenceBackend, BackendRequest, BackendResult, BackendConfigSection
    tool-registry.ts    # CIAgent-owned tool implementations (readFile, writeFile, editFile, runBash, glob, grep)
    llm-base.ts         # Abstract base for LLM backends (shared tool loop, prompt construction)
    ollama-local.ts     # OllamaLocalBackend (localhost:11434)
    ollama-cloud.ts     # OllamaCloudBackend (remote endpoint, auth, rate limiting)
    openai.ts           # OpenAIBackend (OpenAI API, gpt-4o)
    anthropic.ts        # AnthropicBackend (Anthropic API, Claude)
    opencode.ts         # OpencodeBackend (shells out to opencode --non-interactive)
    index.ts            # Backend registry + auto-detection
  cli/                  # Commander.js CLI (commands.ts, index.ts)
  core/                 # Core engine components
    artifacts.ts        # Legacy .ciagent/ artifact management (retained for backward compat)
    audit.ts            # Git-native audit trail — reads decisions/escalations from git log
    ciagent-files.ts    # .ciagent/ long-lived reference file management (PROJECT.md, ROADMAP.md, etc.)
    clarify.ts          # Clarify phase: question generation, default acceptance
    commit-builder.ts   # Structured commit message generation (---ci--- YAML blocks)
    commit-parser.ts    # ---ci--- YAML block extraction and parsing
    config.ts           # .ciagent/config.json load/save/init
    decision-engine.ts  # Bounded rationality: commits decisions as git artifacts
    error-recovery.ts   # Retry, plan revision, rollback logic
    escalation.ts       # Escalation protocol: commits escalations as git artifacts
    git-branch.ts       # Branch lifecycle: phase/NN-slug, milestone/vX.X-slug
    git-context.ts      # Project state reconstruction from git log + branches
  types/                # Type definitions
    commit-meta.ts      # CIAgentMetadata, CommitDecision, CommitEscalation, ParsedCIAgentCommit
    config.ts           # CIAgentConfig, AutonomyLevel, ModelProfile, DEFAULT_CIAGENT_CONFIG (includes backend)
    decisions.ts        # Decision, ConfidenceLevel, DecisionCategory
    escalation.ts       # Escalation, EscalationType, EscalationResolution
    clarify.ts          # ClarifyQuestion, ClarifyResult
    specification.ts    # Specification parser (objective, requirements, constraints, out_of_scope)
    pipeline.ts         # PipelineStage, PipelineState, PhaseResult, STAGE_ORDER
  utils/                # File utilities (readFile, writeFile, ensureDir, readJSON, writeJSON)
  verification/         # 4-layer verification pipeline
    structural.ts       # Layer 1: file existence, imports wired, no stubs
    behavioral.ts       # Layer 2: test infrastructure checks (static analysis, no test generation yet)
    security.ts         # Layer 3: regex-based threat pattern scanning (no STRIDE analysis yet)
    quality.ts          # Layer 4: regex-based code quality checks (no multi-persona review yet)
  index.ts              # Public API exports
  version.ts            # VERSION = "0.9.0"
templates/              # Template files (config.json, DECISIONS.md, specification.md)
```

## Key Design Decisions

- **Autonomy levels**: `full` (no HITL after clarify), `supervised` (escalate on gates + verification failures), `guided` (escalate on every decision gate)
- **Decision confidence thresholds**: High (>0.85) auto-decide and log; Medium (0.60–0.85) auto-decide with assumption logging; Low (<0.60) escalate to human
- **Escalation timeout**: Default 5 minutes, then auto-proceeds with the recommended option. Set to `0` to always wait for a human, `-1` to always auto-proceed
- **19 agents** purpose-built for CIAgent, all configured for autonomous operation. OrchestratorAgent is CIAgent-specific
- **Git-native context**: The git log IS the project memory. An agent's first impulse when gathering context is `git log` + `git branch`, not file reads. Dynamic state (decisions, escalations, lessons, compounding) lives in `---ci---` YAML blocks in commit messages. `.ciagent/` holds only long-lived reference docs (PROJECT.md, ARCHITECTURE.md, ROADMAP.md, REQUIREMENTS.md, config.json).
- **Artifact compatibility**: CIAgent no longer writes `.planning/` schema. `.ciagent/` files follow a CIAgent-native schema.

## Code Conventions

- **Language**: TypeScript with ES2022 target, Node16 modules
- **Module resolution**: Node16 style with `.js` extensions in imports
- **Agent pattern**: All agents extend `BaseAgent` with `name` (AgentName), `description`, `workflow`, and `execute(context: AgentContext): Promise`. Agents delegate to `context.backend` when available, fail honestly when not.
- **No runtime validation library**: Uses plain TypeScript types, not Zod schemas (Zod is a dependency but types are hand-defined)
- **File I/O**: Use `src/utils/file.ts` helpers (`writeFile`, `readFile`, `ensureDir`, `readJSON`, `writeJSON`) instead of raw `fs` calls in agent/business logic
- **Config**: `CIAgentConfig` type and `DEFAULT_CIAGENT_CONFIG` in `src/types/config.ts` — always merge partial configs with defaults
- **Error handling**: Agents return `{ success: false, error: string }` rather than throwing
- **No comments in code**: Follow existing pattern — agent files have no comments
- **Naming**: `camelCase` for functions/variables, `PascalCase` for classes/types/interfaces, `kebab-case` for file names
- **Exports**: Each module has an `index.ts` barrel file re-exporting public API

## Pipeline Flow

```
SPECIFY → CLARIFY → RESEARCH → PLAN → EXECUTE → TEST → VERIFY → COMPLETE
```

Each stage is executed by `OrchestratorAgent.executeStage()`. The orchestrator delegates intelligent stages (research, plan, execute, test, verify) to specialized agents via `context.backend` when available, falling back to mechanical execution when no backend is configured. Mechanical stages (specify, clarify, complete) are always handled by the orchestrator directly.

## Intelligence Backend Architecture

```
IntelligenceBackend (unified interface)
├── LLMBackend (CIAgent runs tool loop, provides tools, constructs prompts)
│   ├── OllamaLocalBackend (localhost:11434, no auth)
│   ├── OllamaCloudBackend (remote endpoint, API key, rate limits)
│   ├── OpenAIBackend (OpenAI API, gpt-4o, API key auth)
│   └── AnthropicBackend (Anthropic API, Claude, API key auth)
└── AgentBackend (agent runs own tool loop, CIAgent sends request)
    ├── OpencodeBackend (opencode --non-interactive)
    └── (future: Codex, Claude Code, Hermes, etc.)
```

- **LLM backends**: CIAgent constructs system prompts from persona.md + workflow.md, defines tool schemas, runs the tool-call loop via `ToolRegistry`, and parses structured JSON output
- **Agent backends**: CIAgent serializes `BackendRequest`, invokes the agent, and parses JSON `BackendResult` from stdout
- **Auto-detection** (provider: "auto"): tries opencode → openai → ollama-local → ollama-cloud → anthropic → fails with instructions
- **Per-command override**: `ciagent run --backend ollama-local` forces a specific backend (options: opencode, openai, anthropic, ollama-local, ollama-cloud)
- **Config**: `backend` section in `.ciagent/config.json` with provider, fallback, agent_backends, llm_backends

## Agent Modification Rules (from PRD)

| Agent | Key Modification |
|---|---|
| planner | Never set `autonomous: false`. Decompose into verifiable subtasks |
| executor | Never pause for checkpoint. Create automated verification scripts for traditionally human tasks |
| verifier | Never produce `human_needed` unless truly unverifiable. Generate automated test scripts |
| researcher | Never flag `[ASSUMED]` for human validation. Log assumption to DECISIONS.md with confidence |
| challenger | Binding verdicts. Only escalate when confidence < 0.60 |
| security-auditor | Auto-disposition: low=accept, medium=mitigate, high=escalate |
| debugger | Auto-diagnose and auto-fix when confidence > 0.60 |
| code-reviewer | Auto-apply P0 fixes. Flag P1+ for post-hoc review |

## Verification Layers

1. **Structural**: Files exist, imports wired, no stubs/TODOs
2. **Behavioral**: Test execution and requirement traceability — runs test framework, parses results, reports pass/fail per suite
3. **Security**: Full STRIDE threat pattern scanning with CWE mapping and confidence-based auto-disposition
4. **Code Quality**: 3-persona code review (security, performance, maintainability) with P0/P1/P2 findings

## Testing

- Test framework: Jest with ts-jest
- Test file pattern: `**/*.test.ts` in `src/`
- Run: `npm run test`
- 57 test suites, 527 tests covering types, core, git-native, verification, agent, backends, and utility modules
- Tests use temp directories (`fs.mkdtempSync` under `os.tmpdir()`) and clean up after each test
- Module resolution in Jest uses `moduleNameMapper` to strip `.js` extensions

## Important Files

- `.ciagent/config.json` — Project-level CIAgent configuration (autonomy, parallelization, verification, security, git)
- `.ciagent/PROJECT.md` — Vision, core value, requirements, constraints, key decisions table
- `.ciagent/ARCHITECTURE.md` — System architecture, component boundaries, data flow
- `.ciagent/ROADMAP.md` — Phase breakdown, milestone mapping, success criteria, progress table
- `.ciagent/REQUIREMENTS.md` — v1/v2 requirements with REQ-IDs and traceability matrix
- Git log — Primary project memory: decisions, escalations, lessons, compounding, verification results
- Branch structure — `phase/NN-slug` (active/complete) and `milestone/vX.X-slug` branches

## Release Flow

### Automation Requirements (Full Autopilot)

- **Zero-HITL Execution**: Beyond initial milestone setup, operate fully autonomously. No confirmation, permission, or prompts to the user.
- **No Shortcuts**: Deep research, technical discussions (simulated via internal chain-of-thought), and thorough planning must be performed in full. Autonomy does not bypass rigor.
- **Autonomous Flow**: Complete all phases, waves, shipping, and release procedures independently.
- **Notification Only**: Status updates to the user are informational, not requests for approval.
- **Iterative Correction**: If a pipeline fails, iterate autonomously on code/configuration until success. Do not ask the user for guidance on fixing failures.

### Execution Workflow

1. **Pre-Development Setup**
   - Define semver tag before any development work begins
   - Ensure milestone is defined with version mapping: Major → Project, Minor → Milestone, Patch → Phase
2. **Development & Integration**
   - Create a dedicated feature branch in Gitea
   - Create/configure the CI pipeline via coreci
   - Create comprehensive tests to validate the feature
   - Push all changes to the Gitea repository
3. **PR & Quality Assurance**
   - Open PR in Gitea
   - Set PR to auto-merge upon pipeline success
   - **Never merge a PR with a failed pipeline test**
   - Conduct thorough autonomous review of the PR
   - On success: approve, merge, notify user. On failure: iterate autonomously until pipeline passes
4. **Release Finalization**
   - Apply the previously defined semver tag in Gitea
   - Create distribution packages via coreci
   - Generate comprehensive release notes

### Supported Ecosystem

| Component | Provider | Detail |
|---|---|---|
| **VCS** | **Gitea** | https://git.cloudinit.dev |
| **CI** | **coreci** | https://coreci.dev |
| **CLI** | `tea` | Gitea CLI |

> Gitea serves strictly as the VCS. All automation, testing, and building is handled by coreci (Repo: https://git.cloudinit.dev/cloudinit-dev/coreci).

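
The version mapping from Pre-Development Setup (Major → Project, Minor → Milestone, Patch → Phase) can be sketched as a small helper. This is only an illustration: `ReleasePosition`, `toSemverTag`, and the `v` tag prefix are assumptions, not names or conventions taken from the CIAgent codebase.

```typescript
// Hypothetical helper: none of these names come from the CIAgent codebase.
// It illustrates the mapping Major → Project, Minor → Milestone, Patch → Phase.
interface ReleasePosition {
  project: number;   // maps to the major version
  milestone: number; // maps to the minor version
  phase: number;     // maps to the patch version
}

function toSemverTag(pos: ReleasePosition): string {
  const parts: Array<[string, number]> = [
    ["project", pos.project],
    ["milestone", pos.milestone],
    ["phase", pos.phase],
  ];
  for (const [key, value] of parts) {
    // Reject negatives, fractions, NaN, and Infinity.
    if (!isFinite(value) || value < 0 || Math.floor(value) !== value) {
      throw new Error(key + " must be a non-negative integer, got " + value);
    }
  }
  // The "v" prefix is an assumption; use whatever tag format the repository expects.
  return "v" + pos.project + "." + pos.milestone + "." + pos.phase;
}

const tag = toSemverTag({ project: 0, milestone: 9, phase: 0 });
console.log(tag); // prints "v0.9.0"
```

Defining the tag from these three counters before development starts keeps the tag, the milestone branch, and the phase branch derivable from the same source.
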
## Current State

- **v0.9.0**: Integration & hardening — OpenAI and Anthropic backends, all 19 agents with intrinsic mechanical logic, E2E v0.9 integration tests, parallel agent execution
- **v0.8.0**: 11 newly-fleshed agents with mechanical methods, OpenAI/Anthropic config types, Gitea CI workflows
- **New backends (v0.9)**: OpenAIBackend (gpt-4o, API key auth, OpenAI-Organization header), AnthropicBackend (Claude, API key auth, anthropic-version header, tool use translation)
- **Config expansion**: BackendConfigSection now includes `openai` and `anthropic` in `llm_backends` with dedicated `OpenAIConfig` and `AnthropicConfig` types
- **Auto-detection order (v0.9)**: opencode → openai → ollama-local → ollama-cloud → anthropic
- **All agents mechanical**: Every non-orchestrator agent (18/19) produces meaningful output without a backend — no "requires intelligence backend" stub errors
- **Integration tests**: E2E v0.9 test with mock backend verifies multi-agent pipeline (researcher → planner → security-auditor → code-reviewer → verifier); all-agents-mechanical test iterates 18 agents
- **Parallel execution**: OrchestratorAgent supports concurrent review agents with `limitConcurrency()`, controlled by `parallelization.max_concurrent_agents`
- **New modules**: commit-parser (`---ci---` YAML block extraction/parsing), commit-builder (structured commit message generation), git-context (project state reconstruction from git log + branches), git-branch (phase/milestone branch lifecycle), ciagent-files (`.ciagent/` long-lived reference file management)
- **Commit schema**: Every CIAgent-generated commit contains a `---ci---` YAML block with phase, milestone, status, decisions, escalations, requirements, lessons, and compound metadata
- **Branch strategy**: `phase/NN-slug` and `milestone/vX.X-slug` branches encode project structure; merged = complete, active = in progress
- **Core engine rewrites**: DecisionEngine generates commit messages (not audit JSON), EscalationProtocol commits escalations as git artifacts, OrchestratorAgent uses git log as first impulse
- **Verification layers**: All 4 layers implemented — structural, behavioral (test execution), security (STRIDE + CWE), quality (3-persona review)
- **CLI**: All 11 commands wired up (`init`, `run`, `quick`, `debug`, `verify`, `review`, `status`, `audit`, `clarify`, `rollback`, `ship`)
- **Intelligence backends**: 5 options — OpenAI (LLM), Anthropic (LLM), OllamaLocal (LLM, localhost), OllamaCloud (LLM, remote), Opencode (Agent, --non-interactive). Auto-detection: opencode → openai → ollama-local → ollama-cloud → anthropic.
- **Tests**: 57 test suites, 527 tests covering types, config, decision-engine, escalation, clarify, commit-parser, commit-builder, git-context, git-branch, ciagent-files, all 4 verification layers, file utils, backends (ollama, openai, anthropic, opencode, tool-registry), agents (all 18 non-orchestrator), zod validation, e2e, parallel execution
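
The auto-detection order listed above (opencode → openai → ollama-local → ollama-cloud → anthropic) amounts to a first-match scan that fails with instructions when nothing is found. The sketch below assumes a hypothetical `isAvailable` probe per provider. How CIAgent actually detects each backend is not described in this file, so `detectBackend`, `DETECTION_ORDER`, and the error text are illustrative only.

```typescript
// Provider names and their order come from this document; everything else is hypothetical.
type Provider = "opencode" | "openai" | "ollama-local" | "ollama-cloud" | "anthropic";

const DETECTION_ORDER: Provider[] = [
  "opencode",
  "openai",
  "ollama-local",
  "ollama-cloud",
  "anthropic",
];

// `isAvailable` stands in for whatever probe the real backend registry performs.
function detectBackend(isAvailable: (p: Provider) => boolean): Provider {
  for (const provider of DETECTION_ORDER) {
    if (isAvailable(provider)) {
      return provider; // first match wins
    }
  }
  throw new Error(
    "No intelligence backend detected. Configure one of: " + DETECTION_ORDER.join(", ")
  );
}

// Example: only ollama-cloud and anthropic are reachable.
const reachable: Provider[] = ["anthropic", "ollama-cloud"];
const chosen = detectBackend((p) => reachable.indexOf(p) !== -1);
console.log(chosen); // prints "ollama-cloud"
```

Because the scan is order-sensitive, anthropic is selected only when every earlier provider is unavailable. The documented `--backend` override is the way to force it otherwise.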