#19251: CLI: emit diagnostics for embedded Slack-context runs

by gg2uah open 2026-02-17 15:31

commands size: S

## Summary

Describe the problem and fix in 2–5 bullets:

- Problem: `openclaw agent` embedded runs (including Slack-context runs via `--channel slack` / `runContext.messageChannel=slack`) did not emit `model.usage` diagnostic events.
- Why it matters: `diagnostics-otel` turns diagnostic events into trace spans; without `model.usage`, these runs can look invisible in tracing backends.
- What changed: `agentCommand` now emits `model.usage` diagnostics from embedded run metadata when diagnostics are enabled and usage is non-zero.
- What did NOT change (scope boundary): webhook/message-flow diagnostics (`message.processed`, `webhook.*`) and channel monitor dispatch paths were not changed.

## Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [ ] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [x] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Closes #
- Related #

## User-visible / Behavior Changes

`openclaw agent` runs now emit OpenTelemetry-trace-producing `model.usage` diagnostics when diagnostics are enabled.

## Security Impact (required)

- New permissions/capabilities? (`Yes/No`) No
- Secrets/tokens handling changed? (`Yes/No`) No
- New/changed network calls? (`Yes/No`) No
- Command/tool execution surface changed? (`Yes/No`) No
- Data access scope changed? (`Yes/No`) No
- If any `Yes`, explain risk + mitigation:

## Repro + Verification

### Environment

- OS: macOS
- Runtime/container: Node 22 / pnpm
- Model/provider: embedded agent run metadata (`agentMeta.usage`)
- Integration/channel (if any): Slack-context embedded runs
- Relevant config (redacted): `diagnostics.enabled=true`

### Steps

1. Enable diagnostics and run an embedded agent command with Slack context.
2. Execute `openclaw agent ...` where the run resolves `messageChannel=slack`.
3. Inspect diagnostics/OTEL output for `model.usage`.

### Expected

- Embedded Slack-context runs emit `model.usage` diagnostics (and therefore OTEL spans when `diagnostics-otel` is active).

### Actual

- Before this change, `agentCommand` did not emit `model.usage` for embedded runs.

## Evidence

Attach at least one:

- [x] Failing test/log before + passing after
- [ ] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)

Added regression test:

- `src/commands/agent.e2e.test.ts` -> `emits model usage diagnostics for embedded Slack-context runs`

## Human Verification (required)

What you personally verified (not just CI), and how:

- Verified scenarios:
  - New regression test passes and asserts `emitDiagnosticEvent` is called with `type: "model.usage"` and `channel: "slack"`.
  - `pnpm build` passes.
  - `pnpm test` passes.
- Edge cases checked:
  - No diagnostics emission when usage is absent/zero (guarded by `hasNonzeroUsage`).
  - Channel/provider/model fall back to resolved runtime values if metadata is partial.
- What you did **not** verify:
  - Live Grafana end-to-end ingestion in a remote cloud account.

## Compatibility / Migration

- Backward compatible? (`Yes/No`) Yes
- Config/env changes? (`Yes/No`) No
- Migration needed? (`Yes/No`) No
- If yes, exact upgrade steps:

## Failure Recovery (if this breaks)

- How to disable/revert this change quickly:
  - Disable diagnostics (`diagnostics.enabled=false`) or revert this commit.
- Files/config to restore:
  - `src/commands/agent.ts`
- Known bad symptoms reviewers should watch for:
  - Unexpected duplicate `model.usage` diagnostics for `openclaw agent` runs.

## Risks and Mitigations

- Risk: Duplicate diagnostics if another layer starts emitting command-path `model.usage` for the same run.
  - Mitigation: Emission is limited to `agentCommand` embedded results and guarded by `hasNonzeroUsage` + `isDiagnosticsEnabled`.
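
For reviewers who want a picture of the guard and fallback behavior described above, here is a minimal TypeScript sketch. It is not the code in `src/commands/agent.ts`: the type shapes, the `maybeEmitModelUsage` wrapper, the injected `emit` callback, and the bodies of `isDiagnosticsEnabled` and `hasNonzeroUsage` are all assumptions made for illustration. Only the helper names and the `model.usage` event type come from this PR.

```typescript
// Hypothetical shapes for illustration; the repository's real types may differ.
type Usage = {
  input?: number;
  output?: number;
  cacheRead?: number;
  cacheWrite?: number;
  total?: number;
};

type Config = { diagnostics?: { enabled?: boolean } };

type EmbeddedRunMeta = {
  usage?: Usage;
  channel?: string;
  provider?: string;
  model?: string;
};

type ModelUsageEvent = {
  type: "model.usage";
  channel: string;
  provider: string;
  model: string;
  usage: Usage;
};

type Fallback = { channel: string; provider: string; model: string };

// Stand-in for the `isDiagnosticsEnabled` guard named in the PR (body assumed).
function isDiagnosticsEnabled(cfg: Config): boolean {
  return cfg.diagnostics?.enabled === true;
}

// Stand-in for `hasNonzeroUsage` (body assumed): true if any counter is positive.
function hasNonzeroUsage(usage: Usage | undefined): usage is Usage {
  if (!usage) return false;
  const counters = [usage.input, usage.output, usage.cacheRead, usage.cacheWrite, usage.total];
  return counters.some((n) => typeof n === "number" && n > 0);
}

// Hypothetical wrapper: emits at most one event and reports whether it did.
function maybeEmitModelUsage(
  cfg: Config,
  meta: EmbeddedRunMeta,
  fallback: Fallback,
  emit: (event: ModelUsageEvent) => void,
): boolean {
  if (!isDiagnosticsEnabled(cfg)) return false;
  const usage = meta.usage;
  if (!hasNonzeroUsage(usage)) return false;
  emit({
    type: "model.usage",
    // Partial metadata falls back to the resolved runtime values.
    channel: meta.channel ?? fallback.channel,
    provider: meta.provider ?? fallback.provider,
    model: meta.model ?? fallback.model,
    usage,
  });
  return true;
}
```

Passing the emitter in as a callback is a choice made only for this sketch, so the guard can be exercised without a real diagnostics bus. The regression test in the PR instead asserts on `emitDiagnosticEvent` directly.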
## AI-assisted + Testing

- AI-assisted: Yes (Codex)
- Testing level: Fully tested for this change (targeted e2e + full `pnpm test` + build)
- Prompts/session logs: available on request
- Confirmed understanding: Yes, code paths and event payload fields were reviewed before implementation.
- Note on required check command: `pnpm check` currently fails in this checkout at `pnpm format:check` with pre-existing repo-wide format mismatches (1119 files), unrelated to this diff.

<h3>Greptile Summary</h3>

This PR adds `model.usage` diagnostic event emission to the `agentCommand` CLI path (`src/commands/agent.ts`) for embedded agent runs. Previously, only the auto-reply/webhook path (`src/auto-reply/reply/agent-runner.ts`) emitted these events, leaving CLI-initiated runs (including Slack-context runs via `--channel slack` or `runContext.messageChannel=slack`) invisible in OpenTelemetry tracing backends.

- The new diagnostic block mirrors the existing pattern in `agent-runner.ts`: guarded by `isDiagnosticsEnabled(cfg)` and `hasNonzeroUsage(usage)`, with proper fallbacks for provider/model from `fallbackProvider`/`fallbackModel`
- Token accounting is correct: `promptTokens = input + cacheRead + cacheWrite`, `total = usage.total ?? promptTokens + output`
- Cost estimation uses the same `resolveModelCostConfig` + `estimateUsageCost` utilities
- No risk of duplicate emissions: the CLI command path and auto-reply webhook path are mutually exclusive entry points
- A focused regression test validates the positive case with Slack-context metadata

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge: it adds observability output guarded by feature flags with no changes to data flow or business logic.
- The change is narrowly scoped to adding diagnostic event emission behind an existing feature flag (diagnostics.enabled). It follows an identical pattern already proven in agent-runner.ts, introduces no new dependencies or network calls, and has a targeted regression test. The two emission sites (CLI vs webhook) are mutually exclusive, eliminating duplicate event risk.
- No files require special attention.

<sub>Last reviewed commit: 85479a3</sub>
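
The token accounting quoted in the Greptile summary (`promptTokens = input + cacheRead + cacheWrite`, `total = usage.total ?? promptTokens + output`) can be checked by hand with a small TypeScript sketch. The `TokenUsage` type and the `summarizeTokens` helper are hypothetical names introduced here, and treating missing counters as zero is an assumption. Only the two formulas come from the review text.

```typescript
// Hypothetical usage shape; field names follow the formulas quoted in the review.
type TokenUsage = {
  input?: number;
  output?: number;
  cacheRead?: number;
  cacheWrite?: number;
  total?: number;
};

type TokenSummary = {
  promptTokens: number;
  outputTokens: number;
  totalTokens: number;
};

// Hypothetical helper applying the two formulas from the summary.
function summarizeTokens(usage: TokenUsage): TokenSummary {
  // Assumption: counters missing from the metadata count as zero.
  const input = usage.input ?? 0;
  const output = usage.output ?? 0;
  const cacheRead = usage.cacheRead ?? 0;
  const cacheWrite = usage.cacheWrite ?? 0;

  // Prompt-side tokens include cache reads and cache writes.
  const promptTokens = input + cacheRead + cacheWrite;

  // A provider-reported total wins; otherwise derive it from the parts.
  const totalTokens = usage.total ?? (promptTokens + output);

  return { promptTokens, outputTokens: output, totalTokens };
}

// Derived total: 100 + 20 + 5 = 125 prompt tokens, plus 40 output = 165.
const derived = summarizeTokens({ input: 100, cacheRead: 20, cacheWrite: 5, output: 40 });

// Reported total: the explicit `total` of 200 is used as-is.
const reported = summarizeTokens({ input: 100, output: 40, total: 200 });
```

Because `??` only falls through on `null` or `undefined`, a reported `total` of `0` would be kept rather than recomputed. That edge only matters if the zero-usage guard has already let the run through on another counter.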