#18901: feat(diagnostics-otel): add trace context propagation and GenAI semantic conventions
extensions: diagnostics-otel
size: M
Cluster: Plugin Enhancements and Fixes
# feat(diagnostics-otel): Add trace context propagation and GenAI semantic conventions
## Summary
This PR adds two related improvements to the diagnostics-otel plugin:
1. **Trace context propagation** — Diagnostic events now carry `traceId` and `parentSpanId` fields, enabling the OTel plugin to create proper parent-child span relationships instead of disconnected root spans.
2. **GenAI semantic convention attributes** — Model usage spans now include standardized `gen_ai.*` attributes alongside existing `openclaw.*` attributes, following the [OTel GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/).
## Trace Hierarchy
Before (all root spans, unlinked):
```
openclaw.message.processed (standalone)
openclaw.model.usage (standalone)
openclaw.model.usage (standalone)
```
After (linked parent-child traces):
```
openclaw.message.processed ← parent span (traceId + spanId)
├── chat claude-opus-4-6 ← child span (same traceId, parentSpanId = parent's spanId)
└── chat claude-opus-4-6 ← child span (same traceId, parentSpanId = parent's spanId)
```
The trace context lifecycle:
1. `logWebhookReceived` generates a 32-hex-char `traceId` (UUID without dashes)
2. `logMessageQueued` stores the `traceId` on the session state when provided
3. `logMessageProcessed` reads the session's `traceId`, generates a 16-hex-char `spanId`, and emits both on the event
4. Model usage events inherit `traceId` and `parentSpanId` from the session state
5. The OTel plugin uses `trace.setSpanContext()` to create child spans under the proper parent
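The ID generation and hand-off in steps 1 to 4 can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `SessionTraceState`, `newTraceId`, `newSpanId`, `onMessageQueued`, `onMessageProcessed`, and `traceFieldsForModelUsage` are hypothetical names, and only the `traceId`, `currentSpanId`, and `parentSpanId` field names come from the description above.

```typescript
import { randomBytes, randomUUID } from "node:crypto";

// Hypothetical stand-in for the two trace fields added to SessionState.
interface SessionTraceState {
  traceId?: string;
  currentSpanId?: string;
}

// Step 1: a 32-hex-char trace ID, i.e. a UUID with the dashes removed.
function newTraceId(): string {
  return randomUUID().replace(/-/g, "");
}

// A 16-hex-char span ID built from 8 random bytes.
function newSpanId(): string {
  return randomBytes(8).toString("hex");
}

// Step 2: store the trace ID on the session only when one is provided.
function onMessageQueued(state: SessionTraceState, traceId?: string): void {
  if (traceId) state.traceId = traceId;
}

// Step 3: read the session's trace ID, mint a span ID, and return both.
function onMessageProcessed(
  state: SessionTraceState,
): { traceId?: string; spanId?: string } {
  if (!state.traceId) return {}; // no trace context: nothing to emit
  state.currentSpanId = newSpanId();
  return { traceId: state.traceId, spanId: state.currentSpanId };
}

// Step 4: model usage events inherit trace context from the session.
function traceFieldsForModelUsage(
  state: SessionTraceState,
): { traceId?: string; parentSpanId?: string } {
  if (!state.traceId || !state.currentSpanId) return {};
  return { traceId: state.traceId, parentSpanId: state.currentSpanId };
}
```

In this sketch a session with no stored `traceId` yields empty trace fields, which matches the root-span fallback described under Backwards Compatibility.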
## GenAI Convention Attributes Added
On `model.usage` spans (alongside existing `openclaw.*` attributes):
| Attribute | Value | Source |
|-----------|-------|--------|
| `gen_ai.operation.name` | `"chat"` | Static |
| `gen_ai.system` | Provider name | `evt.provider` |
| `gen_ai.request.model` | Model name | `evt.model` |
| `gen_ai.usage.input_tokens` | Input token count | `evt.usage.input` |
| `gen_ai.usage.output_tokens` | Output token count | `evt.usage.output` |
Span names updated for GenAI conventions:
- Model usage: `chat ${model}` (e.g., `chat claude-opus-4-6`)
- Message processed: unchanged (`openclaw.message.processed`)
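As an illustration of the mapping in the table and the span naming rule, a sketch along these lines would produce the `gen_ai.*` attributes. The `ModelUsageEvent` shape and the helper names `genAiAttributes` and `modelUsageSpanName` are hypothetical, and falling back to a bare `chat` span name when no model is set is an assumption rather than something the PR states.

```typescript
// Hypothetical event shape, limited to the fields named in the table above.
interface ModelUsageEvent {
  provider?: string;
  model?: string;
  usage?: { input?: number; output?: number };
}

type SpanAttributes = Record<string, string | number>;

// Build the gen_ai.* attributes; these are meant to be merged with the
// existing openclaw.* attributes, not to replace them.
function genAiAttributes(evt: ModelUsageEvent): SpanAttributes {
  const attrs: SpanAttributes = { "gen_ai.operation.name": "chat" };
  if (evt.provider) attrs["gen_ai.system"] = evt.provider;
  if (evt.model) attrs["gen_ai.request.model"] = evt.model;
  const input = evt.usage?.input;
  if (input !== undefined) attrs["gen_ai.usage.input_tokens"] = input;
  const output = evt.usage?.output;
  if (output !== undefined) attrs["gen_ai.usage.output_tokens"] = output;
  return attrs;
}

// Span name follows the `chat ${model}` pattern.
// Assumption: use plain "chat" when the event carries no model name.
function modelUsageSpanName(evt: ModelUsageEvent): string {
  return evt.model ? `chat ${evt.model}` : "chat";
}
```

Skipping unset fields keeps absent token counts from being reported as zero.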
## Files Changed
| File | Change |
|------|--------|
| `src/infra/diagnostic-events.ts` | Added optional `traceId` and `parentSpanId` to `DiagnosticBaseEvent` |
| `src/logging/diagnostic-session-state.ts` | Added `traceId` and `currentSpanId` to `SessionState` |
| `src/logging/diagnostic.ts` | Generate and propagate trace context in webhook/message lifecycle |
| `extensions/diagnostics-otel/src/service.ts` | Accept parent context in span creation, add GenAI attributes, update span names |
| `src/infra/diagnostic-events.test.ts` | New: verify trace context fields pass through events |
| `extensions/diagnostics-otel/src/service.test.ts` | Added tests for GenAI attributes and trace context linking |
## Backwards Compatibility
- **Fully backwards compatible**: All trace context fields are optional (`traceId?: string`, `parentSpanId?: string`)
- **No attributes removed**: `openclaw.*` attributes remain on all spans — `gen_ai.*` attributes are added alongside
- **No breaking changes to event types**: `DiagnosticEventInput` omits `ts` and `seq` as before; the new fields are optional in the intersection types
- **Span creation fallback**: When no `traceId` is present, spans are created as root spans (existing behavior)
- **Optional parameter**: The `traceId` parameter of `logMessageQueued` is optional, so existing callers need no changes
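The root-span fallback can be pictured with a small sketch like the one below. It assumes the plugin derives a parent span context from the optional event fields and otherwise creates a root span. `ParentSpanContext` and `parentContextFor` are hypothetical names, the local interface only mirrors the fields a span context carries, and the hex-format validation is an added assumption rather than documented behavior.

```typescript
// Local stand-in for a span context, so the sketch has no dependencies.
// In the plugin this would be the object handed to trace.setSpanContext().
interface ParentSpanContext {
  traceId: string;
  spanId: string;
  traceFlags: number;
  isRemote: boolean;
}

const TRACE_ID_RE = /^[0-9a-f]{32}$/;
const SPAN_ID_RE = /^[0-9a-f]{16}$/;

// Returns a parent context when the event carries usable trace fields,
// or undefined to signal "create a root span" (the pre-existing behavior).
function parentContextFor(evt: {
  traceId?: string;
  parentSpanId?: string;
}): ParentSpanContext | undefined {
  const { traceId, parentSpanId } = evt;
  if (!traceId || !parentSpanId) return undefined;
  // Assumption: malformed IDs are treated the same as missing ones.
  if (!TRACE_ID_RE.test(traceId) || !SPAN_ID_RE.test(parentSpanId)) {
    return undefined;
  }
  // traceFlags 1 is the "sampled" flag; isRemote false is an assumption.
  return { traceId, spanId: parentSpanId, traceFlags: 1, isRemote: false };
}
```

Events emitted by callers that never set `traceId` take the `undefined` branch, so their spans stay root spans.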
## How to Test
1. **Unit tests:**
   ```bash
   npx vitest run src/infra/diagnostic-events.test.ts
   npx vitest run extensions/diagnostics-otel/src/service.test.ts
   ```
2. **Manual verification with a collector:**
   - Configure `diagnostics.otel.endpoint` to point at a local OTLP collector or Jaeger
   - Send a message through any channel
   - Verify in the trace UI that `openclaw.message.processed` and `chat <model>` spans share the same `traceId` and have a parent-child relationship
   - Verify `gen_ai.*` attributes appear on model usage spans
3. **Backwards compat check:**
   - Run the full test suite to confirm no regressions
   - Verify `openclaw.*` attributes still appear on all spans
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR adds trace context propagation and GenAI semantic convention attributes to the diagnostics-otel plugin. It introduces `traceId`/`parentSpanId` fields on diagnostic events, stores trace context on session state, and adds `gen_ai.*` attributes alongside existing `openclaw.*` attributes on model usage spans. Span names for model usage are updated to follow GenAI conventions (`chat <model>`).
- **Trace context infrastructure**: `DiagnosticBaseEvent` gains optional `traceId` and `parentSpanId`; `SessionState` gains `traceId` and `currentSpanId`. The OTel plugin's `spanWithDuration` now accepts parent context and uses `trace.setSpanContext` to establish parent-child links.
- **GenAI semantic conventions**: Model usage spans include `gen_ai.operation.name`, `gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.input_tokens`, and `gen_ai.usage.output_tokens` per the OTel GenAI spec.
- **Trace hierarchy concern**: `logMessageProcessed` emits its own generated `spanId` as `parentSpanId`, causing the OTel plugin to create a self-referential parent span rather than the intended root span. See inline comment for details.
- **End-to-end wiring gap**: No existing callers of `logMessageQueued` pass `traceId`, and `model.usage` emitters don't pass trace context, so the propagation chain is incomplete in practice. This may be intentional as groundwork for a follow-up PR.
<h3>Confidence Score: 3/5</h3>
- The PR is backwards-compatible and low-risk for regressions, but the trace hierarchy logic has a bug that will produce incorrect span relationships.
- The GenAI attribute additions and event type changes are clean and backwards-compatible. However, the trace context propagation has a logic issue where message.processed spans become self-referential instead of root spans, and the end-to-end wiring through callers is incomplete. These issues don't break existing functionality but mean the new trace linking feature won't produce the intended parent-child hierarchy.
- `src/logging/diagnostic.ts` — the `parentSpanId` emitted on message.processed events creates a self-referential span parent instead of the intended trace hierarchy.
<sub>Last reviewed commit: 52ad825</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #21290: feat(diagnostics-otel): OpenTelemetry diagnostics with GenAI semant... · by Baukebrenninkmeijer · 2026-02-19 · 85.9%
- #19353: fix(diagnostics-otel): fix cross-chunk module isolation breaking even… · by nez · 2026-02-17 · 78.4%
- #19251: CLI: emit diagnostics for embedded Slack-context runs · by gg2uah · 2026-02-17 · 75.8%
- #4255: fix(diagnostics-otel): complete OpenTelemetry v2.x compatibility · by arbgjr · 2026-01-29 · 75.7%
- #16865: fix(diagnostics-otel): share listeners/transports across module bun... · by leonnardo · 2026-02-15 · 75.2%
- #11530: diagnostics-otel: fix OpenTelemetry v2 resource/logs API compatibility · by erain · 2026-02-07 · 73.8%
- #10199: fix(diagnostics-otel): opentelemetry bug fix · by yourtion · 2026-02-06 · 71.6%
- #12966: feat(logging): Add session context and breadcrumbs to error logs · by trevorgordon981 · 2026-02-10 · 69.6%
- #14719: UI: fix debug event log layout and health history toggle · by detecti1 · 2026-02-12 · 69.2%
- #8270: fix: support snake_case 'tool_use' in transcript repair (#8264) · by heliosarchitect · 2026-02-03 · 68.5%