feat(P06): docs & hardening — AGENTS.md/README fixes, agent tests, Gitea tests, multi-project tests, version 0.7.0

---ci--- phase: 6 milestone: v0.7.0 plan: 06 task: P06-all status: execute ---/ci---
2026-05-29 18:20:46 +00:00
parent e8c6c5c917
commit a416413c7d
15 changed files with 1031 additions and 21 deletions
@@ -211,9 +211,9 @@ CIAgent uses `.ciagent/config.json` for project configuration:
 ### Pipeline

 ```
-SPECIFY → CLARIFY → RESEARCH → PLAN → EXECUTE → VERIFY → COMPLETE
-                ↕               ↕         ↕          ↕
-           (questions)    (auto-decide) (auto-run) (auto-verify)
+SPECIFY → CLARIFY → RESEARCH → PLAN → EXECUTE → TEST → VERIFY → COMPLETE
+                 ↕               ↕         ↕     ↕      ↕
+            (questions)    (auto-decide) (auto-run) (auto-test) (auto-verify)
 ```

 ### Git-Native Core Modules
@@ -235,7 +235,7 @@ Every autonomous decision is classified by confidence:

 Decisions are committed to git as `decision` type commits. The audit trail is `git log --grep="decisions:"`.

-### 18 Agents
+### 19 Agents

 | Agent | Role | CIAgent Modification |
 |-------|------|----------------|
@@ -244,17 +244,18 @@ Decisions are committed to git as `decision` type commits. The audit trail is `g
 | executor | Task execution | Never pauses for checkpoints |
 | verifier | Output verification | Generates automated tests, not human UAT |
 | researcher | Domain research | Logs assumptions, never flags for human |
+| tester | Integration/e2e tests | Detects and runs existing test files, never writes tests |
 | challenger | Plan stress-testing | Binding verdicts, only escalates <0.60 |
 | security-auditor | Security audit | Auto-dispositions threats |
 | debugger | Bug fixing | Auto-fixes when confidence > threshold |
-| Others | Various | Retained from Learnship |
+| Others | Various | Delegate to the active intelligence backend |
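
The challenger row above says verdicts are binding and that only confidence below 0.60 escalates. A minimal sketch of that gate follows, assuming confidence is a number in [0, 1]. The names `ESCALATION_CUTOFF`, `Outcome`, and `challengerOutcome` are hypothetical and are not taken from the CIAgent source.

```typescript
// Hypothetical cutoff, taken from the "<0.60" figure in the agents table.
const ESCALATION_CUTOFF = 0.6;

// Hypothetical outcome labels for illustration only.
type Outcome = "binding-verdict" | "escalate";

// Returns "escalate" only when confidence falls strictly below the cutoff.
function challengerOutcome(confidence: number): Outcome {
  if (!Number.isFinite(confidence) || confidence < 0 || confidence > 1) {
    throw new RangeError(`confidence must be in [0, 1], got ${confidence}`);
  }
  return confidence < ESCALATION_CUTOFF ? "escalate" : "binding-verdict";
}
```

Under this reading a confidence of exactly 0.60 does not escalate, because the table states the condition as strictly less than 0.60.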

 ### Verification Layers

 1. **Structural**: File existence, import/export wiring, no stubs
-2. **Behavioral**: Generated automated tests for must-haves
-3. **Security**: STRIDE analysis with auto-disposition
-4. **Code Quality**: Multi-persona review with P0 auto-fix
+2. **Behavioral**: Checks of test infrastructure and requirement traceability (partially implemented: static analysis only, no test generation yet)
+3. **Security**: Regex-based threat pattern scanning with auto-disposition (partially implemented — no STRIDE analysis yet)
+4. **Code Quality**: Regex-based code quality checks (partially implemented — no multi-persona review yet)
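
To illustrate what regex-based scanning with auto-disposition in the security layer might look like, here is a minimal sketch. The rule shape, the two example patterns, and the names `ThreatRule`, `Finding`, `RULES`, and `scanSource` are all assumptions for illustration. They are not CIAgent's actual rules or API.

```typescript
// Hypothetical rule shape: each rule carries its own pre-assigned disposition.
interface ThreatRule {
  id: string;
  pattern: RegExp; // no "g" flag, so test() keeps no state between calls
  disposition: "block" | "warn";
}

interface Finding {
  ruleId: string;
  line: number; // 1-based line number of the matching line
  disposition: "block" | "warn";
}

// Illustrative patterns only; a real rule set would be broader.
const RULES: ThreatRule[] = [
  { id: "dynamic-eval", pattern: /\beval\s*\(/, disposition: "block" },
  {
    id: "hardcoded-secret",
    pattern: /(password|secret|api[_-]?key)\s*[:=]\s*["'][^"']+["']/i,
    disposition: "warn",
  },
];

// Scans source text line by line and records one finding per matching rule.
function scanSource(source: string, rules: ThreatRule[] = RULES): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const rule of rules) {
      if (rule.pattern.test(text)) {
        findings.push({ ruleId: rule.id, line: index + 1, disposition: rule.disposition });
      }
    }
  });
  return findings;
}
```

A line-by-line regex pass like this is cheap, but it cannot follow data flow across lines. That fits the "partially implemented" label in the list above.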

 ## Specification Format

@@ -292,9 +293,9 @@ Each escalation is committed as an `escalation` type commit. Resolved escalation

 ## Current Limitations

-- **Agent implementations are stubs**: All 18 agents return success immediately. Real LLM-based agent implementations are needed for research, planning, execution, and verification.
+- **Agent implementations**: 5 core agents have intrinsic logic (planner, executor, verifier, researcher, tester); the remaining 14 agents delegate to backends. Full LLM-powered agent behavior requires an intelligence backend.
 - **Package not published to npm**: Install from source only until a publishing pipeline is configured.
-- **Behavioral/Security/Quality verification layers**: Structural verification is fully implemented; behavioral, security, and quality layers are partially stubbed.
+- **Behavioral/Security/Quality verification layers**: Partially implemented — structural verification is complete; behavioral does static analysis; security does regex-based threat scanning; quality does regex-based code quality checks.

 ## Differences from Learnship