#21589: Post-Performance Roadmap: Milestones A–D (contracts, observability, context discipline, failure economics)

by Doji-Hammer open 2026-02-20 04:21

docs channel: telegram gateway cli commands agents size: XL

## Summary

Implements Post-Performance Roadmap Milestones **A–D** in one integrated branch.

### Milestone A: Contract enforcement

- Adds schema validators and invariant checks for key runtime data structures.
- Adds invariant-focused tests to prevent regressions.
- **Tests:** 39

### Milestone B: Observability

- Propagates `trace_id` end-to-end and adds telemetry hooks for runtime execution.
- **Tests:** observability suite (see CI / test output)

### Milestone C: Context discipline

- Introduces hot-state limits and tighter context management.
- Adds artifact references and a context budgeter to keep prompts bounded and reproducible.
- **Tests:** 59

### Milestone D: Failure economics

- Establishes an error taxonomy and structured retry policy.
- Adds escalation logic for persistent/expensive failures.
- **Tests:** 66 + 88

## Notes / Breaking changes

- Runtime now enforces stricter contracts (schema validation + invariants). Some previously-tolerated malformed data may now fail fast.
- Context handling is stricter (budgets/limits) and may change how large artifacts are referenced/loaded.
- Observability introduces trace propagation fields; downstream integrations should tolerate/forward `trace_id` where relevant.

## How to review

1. Review merged milestone branches (A–D) for scoped changes.
2. Start with contracts/invariants (A), then trace propagation (B), then context budgeting/artifacts (C), then failure policy (D).

## Commits

This branch is ~24 commits ahead of `main` and includes merge commits for each milestone plus a final conflict-resolution pass.

<h3>Greptile Summary</h3>

This PR implements Milestones A-D of the Post-Performance Roadmap, adding contract enforcement, observability, context discipline, and failure economics to OpenClaw. The changes introduce strict schema validation for internal messages, end-to-end trace propagation, hot-state limits with artifact references, and a structured error taxonomy with retry policy enforcement.
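To make the Milestone D description more concrete for reviewers, here is a minimal sketch of how an error taxonomy can drive a retry/escalate decision. It is not taken from the branch: the error class names, the `decide` function, the `RetryPolicy` shape, and the 250 ms backoff base are all hypothetical. The default `maxRetries` of 1 mirrors the retry limit cited in the review summary, and routing non-retryable errors to a dead-letter outcome is an assumption about how the pieces fit together.

```typescript
// Hypothetical error classes; the PR's actual taxonomy is not shown in the summary.
type ErrorClass = "transient" | "rate_limited" | "invalid_input" | "fatal";

type Decision =
  | { action: "retry"; delayMs: number }
  | { action: "escalate"; reason: string }
  | { action: "dead_letter"; reason: string };

interface RetryPolicy {
  maxRetries: number; // the review summary cites a maximum of 1 retry
  baseDelayMs: number; // assumed backoff base, not taken from the PR
  retryable: Record<ErrorClass, boolean>;
}

const defaultPolicy: RetryPolicy = {
  maxRetries: 1,
  baseDelayMs: 250,
  retryable: {
    transient: true,
    rate_limited: true,
    invalid_input: false,
    fatal: false,
  },
};

// Decide what to do after a failure, given how many retries were already spent.
function decide(
  errorClass: ErrorClass,
  retriesSoFar: number,
  policy: RetryPolicy = defaultPolicy
): Decision {
  // Non-retryable classes never consume retry budget.
  if (!policy.retryable[errorClass]) {
    return { action: "dead_letter", reason: errorClass + " is not retryable" };
  }
  // Retry budget exhausted: hand the failure to a human or a higher-level handler.
  if (retriesSoFar >= policy.maxRetries) {
    return {
      action: "escalate",
      reason: "retry budget of " + policy.maxRetries + " exhausted",
    };
  }
  // Exponential backoff: base, 2x base, 4x base, ...
  return {
    action: "retry",
    delayMs: policy.baseDelayMs * Math.pow(2, retriesSoFar),
  };
}
```

Keeping the decision a pure function of the error class and retry count makes the policy easy to unit-test separately from whatever performs the actual retry.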

**Major Changes:**

- Contract enforcement schemas (Zod) for all internal message types with dispatcher role exclusivity checks
- Observability module with `trace_id` propagation via AsyncLocalStorage and SQLite telemetry storage
- Context discipline with hot-state token caps (≤1000 tokens), artifact-by-reference storage, and budget validation
- Failure economics with error taxonomy, retry policy enforcer (max 1 retry), circuit breakers, and dead-letter queue
- Lobster plugin for workflow execution integrated into the plugin system
- Symlink resolution in plugin discovery for linked extensions

**Strengths:**

- Comprehensive test coverage (39 tests for contracts, 59 for context discipline, 66+88 for failure economics)
- Well-documented with inline comments and separate markdown docs
- Fail-closed design for budget validation (ambiguous checks treated as violations)
- Content-addressable artifact storage with SHA256 deduplication
- Structured error taxonomy with clear escalation paths

**Areas to Verify:**

- Integration between all four milestones is complex - ensure runtime testing covers cross-milestone scenarios
- Hot-state is injected into system prompt, which changes prompt composition - verify this doesn't break existing sessions
- Symlink resolution in plugin discovery changes behavior - ensure this is documented
- SQLite telemetry storage may need cleanup/rotation strategy for long-running deployments

<h3>Confidence Score: 4/5</h3>

- This PR is largely safe to merge with thorough testing recommended for the integrated runtime behavior
- Score reflects well-structured implementation with comprehensive test coverage (200+ tests), clear documentation, and adherence to fail-closed principles. Deducted one point due to complexity of integrating four major milestones simultaneously, which requires careful runtime validation. The changes are substantial (11,968 additions) but follow consistent patterns and include proper error handling.
- Pay close attention to `src/agents/pi-embedded-runner/run/attempt.ts` for hot-state system prompt integration and `src/plugins/discovery.ts` for symlink resolution behavior changes

<sub>Last reviewed commit: d25b47f</sub>
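
When checking the hot-state integration flagged above, it may help to keep in mind what a fail-closed budget check looks like. The sketch below is an illustration only, not the code in this branch: `checkHotStateBudget`, `estimateTokens`, and the 4-characters-per-token estimate are hypothetical, and the default cap of 1000 tokens comes from the hot-state limit listed under Major Changes.

```typescript
interface BudgetResult {
  ok: boolean;
  reason: string;
}

// Assumed crude estimate (about 4 characters per token).
// The PR's real token counting is not shown in the summary.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Fail-closed check: only a finite, non-negative count at or under the cap passes.
function checkHotStateBudget(
  tokenCount: unknown,
  capTokens: number = 1000
): BudgetResult {
  // Ambiguous input (missing, NaN, negative, wrong type) is treated as a violation.
  if (
    typeof tokenCount !== "number" ||
    !isFinite(tokenCount) ||
    tokenCount < 0
  ) {
    return {
      ok: false,
      reason: "token count unknown or invalid; treated as over budget",
    };
  }
  if (tokenCount > capTokens) {
    return {
      ok: false,
      reason: "hot state uses " + tokenCount + " tokens, cap is " + capTokens,
    };
  }
  return { ok: true, reason: "within budget" };
}

// Usage: validate a serialized hot state before injecting it into the system prompt.
const hotState = JSON.stringify({ task: "triage", step: 3 });
const result = checkHotStateBudget(estimateTokens(hotState));
console.log(result.ok, result.reason);
```

The point of the design is that the caller has to prove the state fits. A missing or malformed count never slips through as "probably fine".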