#8976: Add structured tracing for agent runs
cli
commands
agents
stale
Cluster: Security Enhancements and Fixes
This PR adds a low-overhead tracing layer for agent execution.
When enabled via `--trace <path>`, agent runs record structured step data
(LLM prompts, tool calls, outputs, and deterministic state hashes) to disk.
The feature is disabled by default and does not alter execution behavior.
Motivation:
- Make agent failures reproducible
- Improve debuggability without rerunning LLM calls
- Provide a foundation for future replay and regression testing
Scope:
- Optional trace writer and schema
- Deterministic state hashing
- Minimal wiring at existing LLM/tool chokepoints
- No replay, inspection UI, or behavioral changes
All changes are additive and fully optional.
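To make the "deterministic state hashing" item concrete, here is a minimal sketch of one way such a hash could be computed. It is not the PR's implementation: `canonicalize` and `hashState` are hypothetical names, and the approach (sorted-key JSON serialization fed into SHA-256) is an assumption chosen for illustration. It is written so that odd input shapes degrade to placeholder values instead of throwing.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper (not from the PR): serialize a value with object keys
// sorted, so key insertion order cannot influence the resulting hash.
// Only primitives, arrays, and plain objects are handled meaningfully;
// class instances such as Date or Map are reduced to their enumerable keys.
function canonicalize(value: unknown, path: Set<object> = new Set()): string {
  if (value === undefined || typeof value === "function" || typeof value === "symbol") {
    return "null"; // values JSON cannot represent collapse to null
  }
  if (typeof value === "bigint") {
    return JSON.stringify(String(value)); // JSON.stringify would throw on bigint
  }
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value); // string, number, boolean, null
  }
  if (path.has(value)) {
    return '"[Circular]"'; // break cycles instead of recursing forever
  }
  path.add(value);
  let out: string;
  if (Array.isArray(value)) {
    out = "[" + value.map((item) => canonicalize(item, path)).join(",") + "]";
  } else {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj)
      .filter((key) => obj[key] !== undefined) // mirror JSON's omission of undefined
      .sort();
    out =
      "{" +
      keys.map((key) => JSON.stringify(key) + ":" + canonicalize(obj[key], path)).join(",") +
      "}";
  }
  path.delete(value); // shared (non-circular) references are allowed
  return out;
}

// Hypothetical entry point: hex-encoded SHA-256 of the canonical form.
function hashState(state: unknown): string {
  return createHash("sha256").update(canonicalize(state)).digest("hex");
}
```

With this sketch, `hashState({ a: 1, b: 2 })` and `hashState({ b: 2, a: 1 })` yield the same digest, which is the property a replay or regression check would rely on.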
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR introduces an optional structured tracing subsystem for agent runs, enabled via `--trace <path>`. It adds a versioned trace schema (`TraceFile`/entries), deterministic state hashing, and a `TraceWriter` that persists LLM/tool step records plus final run metadata. The agent CLI plumbing wires `--trace` through `agent-via-gateway` → `agentCommand` → embedded runner params, and `runEmbeddedAttempt` records LLM calls and run completion when tracing is enabled.
<h3>Confidence Score: 3/5</h3>
- This PR is likely mergeable but has correctness/performance footguns in the tracing implementation.
- Core wiring is straightforward and optional. However, the current writer rewrites the full file on every flush, so it is neither append-only nor low-overhead, and state hashing can throw on some real message shapes, which changes behavior when `--trace` is used. Schema parsing is also too permissive for a boundary function.
- src/agents/tracing/writer.ts, src/agents/pi-embedded-runner/run/attempt.ts, src/agents/tracing/schema.ts
<!-- greptile_other_comments_section -->
**Context used:**
- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))
- Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13))
<!-- /greptile_comment -->
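The review's concern about rewriting the full file on every flush can be illustrated with an append-only alternative. The following is a sketch under stated assumptions, not the PR's `TraceWriter`: the `TraceEntry` shape, the `JsonlTraceWriter` class, and the `readTrace` helper are all hypothetical, and the one-JSON-object-per-line (JSONL) layout is a swapped-in format rather than the PR's versioned `TraceFile` schema.

```typescript
import { appendFileSync, mkdtempSync, readFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical entry shape for illustration; the PR's real schema may differ.
interface TraceEntry {
  step: number;
  kind: "llm" | "tool";
  payload: unknown;
}

// Append-only sketch: each record becomes one JSON line, so recording a step
// costs one small append instead of rewriting everything recorded so far.
class JsonlTraceWriter {
  private readonly path: string;

  constructor(path: string) {
    this.path = path;
  }

  record(entry: TraceEntry): void {
    appendFileSync(this.path, JSON.stringify(entry) + "\n", "utf8");
  }
}

// Read a trace back by parsing every non-empty line.
function readTrace(path: string): TraceEntry[] {
  return readFileSync(path, "utf8")
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as TraceEntry);
}

// Usage: write two steps to a temporary file and read them back.
const demoTracePath = join(mkdtempSync(join(tmpdir(), "trace-")), "run.jsonl");
const demoWriter = new JsonlTraceWriter(demoTracePath);
demoWriter.record({ step: 1, kind: "llm", payload: { prompt: "hello" } });
demoWriter.record({ step: 2, kind: "tool", payload: { name: "search" } });
const demoEntries = readTrace(demoTracePath);
```

The trade-off is that a line-oriented file has no single enclosing document, so final run metadata would need to be written as its own trailing record rather than as a top-level field.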
Most Similar PRs
- #15253: Adding structured log content (by emailhxn · 2026-02-13 · 76.1%)
- #15852: fix: pass agentId when resolving IRC session paths (by MisterGuy420 · 2026-02-14 · 74.3%)
- #11226: Fix system property assignment in cache-trace.ts (by ygypt · 2026-02-07 · 72.6%)
- #9974: refactor(agents): replace console.warn with SubsystemLogger in comp... (by dinakars777 · 2026-02-05 · 72.5%)
- #7892: Claude/setup agent firewall ww xsv (by starwreckntx · 2026-02-03 · 71.8%)
- #14136: feat: add agent collapse safeguards and fix TUI display on abort (by liangweigain-create · 2026-02-11 · 71.6%)
- #11743: fix: remove redundant file reads from AGENTS.md template (by shogunsea · 2026-02-08 · 71.2%)
- #8893: fix: enhance subagent error reporting with diagnostic context (by joetomasone · 2026-02-04 · 71.1%)
- #8919: Pr/memory flush improvements (by shortbus · 2026-02-04 · 70.8%)
- #18889: feat(hooks): add agent and tool lifecycle boundaries (by vincentkoc · 2026-02-17 · 70.8%)