#9049: fix: prevent subagent stuck loops and ensure user feedback
channel: slack
agents
stale
size: XL
Cluster: Subagent Enhancements and Features
## Problem
When a background subagent fails, OpenClaw can get stuck and the user receives no feedback:
1. **Compaction infinite loop:** `handleAutoCompactionEnd()` calls `noteCompactionRetry()` with no upper bound. As long as pi-ai keeps signaling `willRetry: true`, the `while(true)` run loop in `run.ts` never exits: the subagent hangs and the user's session appears frozen.
2. **Silent announce failure:** When `runSubagentAnnounceFlow()` fails (e.g. gateway timeout, network error), the error is caught and logged but the user never learns the subagent completed. The only recovery is an app restart via `resumeSubagentRun()`.
3. **Context overflow on completion:** When subagents complete, their full output is dumped into the parent session, causing context overflow. The 8000-character truncation was a band-aid; output from multiple subagents still floods the parent.
## Changes
### Fix 1: Compaction retry circuit breaker
Add `MAX_COMPACTION_RETRIES = 3`. When `handleAutoCompactionEnd` receives a 4th `willRetry: true`, it sets `compactionRetryExhausted = true`, resets the pending count, and resolves the compaction wait. The main loop in `attempt.ts` then sets a `CompactionRetryExhaustedError` as `promptError`. The outer loop in `run.ts` classifies this as a `compaction_failure` (via `isCompactionFailureError`) and returns a user-facing context overflow error — no further retry.
**Files:** `pi-embedded-subscribe.handlers.lifecycle.ts`, `pi-embedded-subscribe.handlers.types.ts`, `pi-embedded-subscribe.ts`, `pi-embedded-runner/run/attempt.ts`
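
The circuit breaker described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `CompactionRetryTracker` class, its internal fields, and the `promptErrorFor` helper are hypothetical, and the pending-count bookkeeping is reduced to the minimum needed to show the cap.

```typescript
// Hypothetical sketch: the constant and method names mirror the PR text,
// but the class shape and all bodies are assumptions for illustration.
const MAX_COMPACTION_RETRIES = 3;

class CompactionRetryExhaustedError extends Error {
  constructor() {
    super(`Compaction retried more than ${MAX_COMPACTION_RETRIES} times`);
    this.name = "CompactionRetryExhaustedError";
  }
}

class CompactionRetryTracker {
  private retryCount = 0;
  private pendingCount = 0;
  private exhausted = false;
  private waiters: Array<() => void> = [];

  // Called once per auto_compaction_end event.
  handleAutoCompactionEnd(event: { willRetry: boolean }): void {
    if (!event.willRetry) {
      // Compaction finished without asking for another retry.
      this.pendingCount = 0;
      this.resolveWaiters();
      return;
    }
    this.retryCount += 1;
    if (this.retryCount > MAX_COMPACTION_RETRIES) {
      // Circuit breaker: stop counting retries and release anyone waiting.
      this.exhausted = true;
      this.pendingCount = 0;
      this.resolveWaiters();
      return;
    }
    this.pendingCount += 1;
  }

  isCompactionRetryExhausted(): boolean {
    return this.exhausted;
  }

  // Resolves immediately when nothing is pending, otherwise when released.
  waitForCompactionRetry(): Promise<void> {
    if (this.pendingCount === 0) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.waiters.push(() => resolve());
    });
  }

  private resolveWaiters(): void {
    const waiters = this.waiters;
    this.waiters = [];
    for (const release of waiters) release();
  }
}

// Hypothetical helper: what the attempt loop might surface as promptError.
function promptErrorFor(tracker: CompactionRetryTracker): Error | undefined {
  return tracker.isCompactionRetryExhausted()
    ? new CompactionRetryExhaustedError()
    : undefined;
}
```

With a cap of 3, the first three `willRetry: true` events are tolerated and the fourth trips the breaker, which matches the "4+ events" case in the test plan.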
### Fix 2: Announce retry with fallback notification
Wrap the announce delivery section in a retry loop (3 attempts, 2s delay between retries). If all attempts fail, send a brief fallback notification to the requester: *"A background task completed but results could not be delivered."* If even the fallback fails, log the error — the existing retry-on-wake path in `finalizeSubagentCleanup` still applies as last resort.
**Files:** `subagent-announce.ts`
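
A minimal sketch of the retry-then-fallback flow, assuming delivery and fallback are both plain async callbacks. The function name `announceWithRetry`, the options object, the injectable `sleep`, and the string status it returns are all hypothetical; the test plan indicates the real code reports total failure as `false`.

```typescript
// Hypothetical sketch of the retry loop; not the code in subagent-announce.ts.
interface AnnounceRetryOptions {
  maxAttempts?: number; // the PR describes 3 attempts
  delayMs?: number; // the PR describes a 2s delay between retries
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

type AnnounceOutcome = "delivered" | "fallback" | "failed";

const FALLBACK_MESSAGE =
  "A background task completed but results could not be delivered.";

async function announceWithRetry(
  deliver: () => Promise<void>,
  sendFallback: (message: string) => Promise<void>,
  options: AnnounceRetryOptions = {},
): Promise<AnnounceOutcome> {
  const maxAttempts = options.maxAttempts ?? 3;
  const delayMs = options.delayMs ?? 2000;
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(() => resolve(), ms)));

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await deliver();
      return "delivered";
    } catch (err) {
      console.warn(`announce attempt ${attempt}/${maxAttempts} failed`, err);
      // Only wait when another attempt will follow.
      if (attempt < maxAttempts) await sleep(delayMs);
    }
  }

  // Every delivery attempt failed: tell the requester something happened.
  try {
    await sendFallback(FALLBACK_MESSAGE);
    return "fallback";
  } catch (err) {
    // Last resort is logging; a retry-on-wake path would pick this up later.
    console.error("fallback notification failed", err);
    return "failed";
  }
}
```

Injecting `sleep` keeps the 2s delay out of unit tests, which is one way to cover the three cases listed in the test plan without slowing the suite.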
### Fix 3: Stream subagent progress to dedicated threads
Stream progress to dedicated threads (Discord/Slack) instead of dumping the full output into the parent session:
- **Progress threads:** Create a progress thread when a subagent spawns (Discord/Slack)
- **Batched digests:** Queue tool events with debounced batching (3s delay, max 5 tools per digest)
- **Brief summaries:** Send only a 300-character summary on completion instead of the 8000-character full output
- **Fallback for non-threaded channels:** Use `[task-label]` prefixed messages with a longer debounce (5s)
**Files:** `subagent-progress-stream.ts` (new), `subagent-registry.ts`, `subagent-announce.ts`, `slack/send.ts`
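
The debounced batching above could look something like the sketch below. It is an illustration under assumptions, not the contents of `subagent-progress-stream.ts`: the `DigestBatcher` class, the `ToolEvent` shape, and the `Timer` abstraction are hypothetical, and "max 5 tools per digest" is interpreted here as "flush immediately once 5 events are queued".

```typescript
// Hypothetical sketch of debounced digest batching.
interface ToolEvent {
  tool: string;
  summary: string;
}

// Timer is abstracted so the debounce can be driven manually in tests.
interface Timer {
  set(callback: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realTimer: Timer = {
  set: (callback, ms) => setTimeout(callback, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

class DigestBatcher {
  private queue: ToolEvent[] = [];
  private handle: unknown = null;
  private readonly send: (digest: string) => void;
  private readonly debounceMs: number;
  private readonly maxPerDigest: number;
  private readonly timer: Timer;

  constructor(
    send: (digest: string) => void,
    debounceMs = 3000, // 3s for threaded channels; the PR uses 5s otherwise
    maxPerDigest = 5,
    timer: Timer = realTimer,
  ) {
    this.send = send;
    this.debounceMs = debounceMs;
    this.maxPerDigest = maxPerDigest;
    this.timer = timer;
  }

  push(event: ToolEvent): void {
    this.queue.push(event);
    if (this.queue.length >= this.maxPerDigest) {
      // Digest is full: send now rather than waiting for a quiet period.
      this.flush();
      return;
    }
    // Debounce: each new event restarts the quiet-period timer.
    if (this.handle !== null) this.timer.clear(this.handle);
    this.handle = this.timer.set(() => this.flush(), this.debounceMs);
  }

  flush(): void {
    if (this.handle !== null) {
      this.timer.clear(this.handle);
      this.handle = null;
    }
    if (this.queue.length === 0) return;
    const batch = this.queue.splice(0, this.maxPerDigest);
    this.send(batch.map((e) => `- ${e.tool}: ${e.summary}`).join("\n"));
  }
}
```

A non-threaded channel would construct the batcher with `debounceMs = 5000` and prefix each digest with its `[task-label]` inside the `send` callback.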
### Enhancement: Parent context for subagents
Add an optional `context` field to `sessions_spawn`. When provided, it's appended as a `## Background Context` section in the subagent's system prompt. This lets the main agent pass relevant conversation context (user preferences, prior findings) to subagents that would otherwise start with zero context.
**Files:** `sessions-spawn-tool.ts`, `subagent-announce.ts`
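
As a rough illustration of the optional `context` field, the sketch below appends a `## Background Context` section only when context is supplied. The function name `buildPromptSketch`, the `SpawnParams` shape, and the base prompt wording are hypothetical stand-ins, not the real `buildSubagentSystemPrompt`.

```typescript
// Hypothetical sketch: only the `context` field and the section heading
// come from the PR text; everything else is assumed for illustration.
interface SpawnParams {
  task: string;
  context?: string; // optional background passed down by the main agent
}

function buildPromptSketch(params: SpawnParams): string {
  const sections: string[] = [`You are a subagent. Your task:\n${params.task}`];
  const context = params.context?.trim();
  // Skip the section entirely when context is missing or blank.
  if (context) {
    sections.push(`## Background Context\n${context}`);
  }
  return sections.join("\n\n");
}
```

Treating blank context the same as missing context avoids emitting an empty heading, which is the presence/absence behavior the test plan checks for.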
## Test plan
- [x] Extended compaction test: send 4+ `auto_compaction_end` events with `willRetry: true`, verify `isCompactionRetryExhausted()` returns `true` and `waitForCompactionRetry()` resolves
- [x] New announce retry tests: mock `callGateway` to fail once then succeed (retry works), fail all 3 attempts (fallback sent), and fail all attempts plus the fallback (returns `false`)
- [x] New `buildSubagentSystemPrompt` tests: verify context section presence/absence
- [x] New progress stream tests: 26 tests covering state management, batching, thread creation, parallel subagents, edge cases
- [x] Updated announce format/retry tests: match new brief summary format
- [x] `pnpm build` — compiles cleanly
- [x] `pnpm check` — no lint/format errors
- [x] `pnpm test` — all tests pass
🤖 Generated with [Claude Code](https://claude.com/claude-code)