#22642: fix(discord): Discord status state machine 2.0 (clean restart, follow-up to #18248)
docs
channel: discord
size: XL
Cluster:
Signal and Discord Fixes
Supersedes #22563 (frozen draft for history). This is a clean restart from main with minimal diff.
# PR Title (recommended)
`fix(discord): Discord status state machine 2.0 (follow-up to #18248)`
---
## Summary
Describe the problem and fix in 2โ5 bullets:
- Problem:
- v1 (authored by **@victorGPT**, merged as PR #18248) solved baseline transitions, but under stress it could still show **semantic/progress rollback** on the same message: eye emoji and active emojis could flip back and forth (`๐ โ ๐ค`, or `โณ -> ๐ป -> โณ`).
- In bursty multi-message queues, v1 did not reliably provide a clear per-message waiting signal for every queued message.
- Why it matters:
- This creates false progress signals: operators can see a message appear to "go backward" even though execution already moved forward.
- What changed:
- v2 enforces semantic monotonic progression (no ACTIVE/WAITING rollback), keeps ACTIVE-internal icon switches valid, unifies deferred queue outcomes, adds explicit clear/cancel-before-start cleanup, and makes queued-message waiting visibility explicit for burst scenarios.
- What did NOT change (scope boundary):
- No non-Discord behavior changes, no thread/channel rename strategy changes, no new slash command in this PR.
## Change Type (select all)
- [x] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [x] Integrations
- [x] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Closes #(fill-if-any)
- Related #18248 (v1 baseline)
- v1 PR (authored by @victorGPT): [#18248](https://github.com/openclaw/openclaw/pull/18248)
## User-visible / Behavior Changes
- v2 explicitly prevents semantic/progress rollback once a message reaches ACTIVE (blocks `ACTIVE -> WAITING`).
- v1-style thrash patterns such as `๐ โ ๐ค` and `โณ -> ๐ป -> โณ` are treated as invalid rollback and suppressed.
- ACTIVE internal switching (thinking/tool variants like `๐ค โ ๐ป/๐ ๏ธ/๐`) remains valid and is not treated as rollback.
- In multi-message bursts, each queued message now shows explicit WAITING (`โณ`) while queued, then transitions when processing actually starts.
- Off mode remains ack-oriented while handling deferred bookkeeping correctly (no premature ack-clear during queued/deferred lifecycle).
- Deferred state cleanup is explicit on queue clear/cancel-before-start paths to avoid stale status artifacts.
### Before vs After (v1 -> v2)
| Area | v1 (#18248) | v2 (this PR) |
| ------------------------------------------ | ---------------------------------------------------------------------------- | --------------------------------------------------------------------------------- |
| Baseline behavior | Faster reaction status machine | Semantic hardening under real queue/deferred stress |
| Semantic/progress rollback on same message | Could still thrash between eye + active emojis (`๐ โ ๐ค`, `โณ -> ๐ป -> โณ`) | Monotonic semantic phases block rollback; no false backward progress |
| Deferred queue outcomes | Partial handling | Unified outcome model (`queued/skipped/dropped/merged/failed`) with cleanup hooks |
| Multi-message queue visibility | Under burst load, some queued messages lacked clear waiting state | Every queued message shows WAITING (`โณ`), then transitions when execution starts |
| Clear/cancel-before-start | Could leave stale deferred state | Explicit failure outcome + deferred cleanup |
| off mode + deferred | Could lose deferred context and clear ack too early in edge cases | Keeps ack-only display while preserving deferred lifecycle bookkeeping |
### Behavior Signatures (quick reviewer view)
- **V1 false rollback signature** (same message): `๐ -> ๐ค -> ๐ป -> ๐ -> ๐ค` or `โณ -> ๐ป -> โณ`
- **V2 monotonic signature** (same message): WAITING `โณ` (if queued) -> ACTIVE (`๐ค/๐ป/๐ ๏ธ/๐`) -> terminal (`โ
/โ`) without semantic backward transition
- **Queue visibility in bursts**: for A/B/C stacked messages, B and C show `โณ` while waiting; each one switches out of `โณ` when its own execution actually begins
## Security Impact (required)
- New permissions/capabilities? (`No`)
- Secrets/tokens handling changed? (`No`)
- New/changed network calls? (`No`)
- Command/tool execution surface changed? (`No`)
- Data access scope changed? (`No`)
- If any `Yes`, explain risk + mitigation:
- N/A
## Repro + Verification
### Environment
- OS: macOS (Darwin arm64)
- Runtime/container: Node.js + pnpm local workspace
- Model/provider: openai-codex/gpt-5.3-codex (coding/review workers)
- Integration/channel (if any): Discord
- Relevant config (redacted):
- `messages.statusReactionMode = full|off`
- `messages.removeAckAfterReply = true|false`
### Steps
1. Send multi-message burst to create queued/deferred transitions.
2. Observe status progression across waiting/active/terminal phases.
3. Test `statusReactionMode=off` with deferred queued flow and `removeAckAfterReply=true`.
4. Trigger clear/cancel-before-start path and ensure no stale deferred status remains.
### Expected
- No semantic rollback after ACTIVE starts.
- No `๐ โ ๐ค` oscillation and no `โณ -> ๐ป -> โณ` fallback for the same message.
- In burst sends, queued messages are clearly signaled as waiting and then promoted when execution starts.
- off mode remains ack-oriented without premature ack clear during deferred flow.
- No stale deferred state after queue-clear/cancel-before-start.
### Actual
- Matches expected in targeted regression tests and local verification commands.
## Evidence
Attach at least one:
- [x] Failing test/log before + passing after
- [x] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)
## Human Verification (required)
What you personally verified (not just CI), and how:
- Verified scenarios:
- queued/deferred lifecycle correctness,
- rollback prevention invariants,
- off-mode deferred behavior,
- queue clear/cancel cleanup behavior.
- Edge cases checked:
- error/retry progression,
- cancellation before lifecycle start.
- What you did **not** verify:
- full production-scale long-duration multi-guild soak run.
## Compatibility / Migration
- Backward compatible? (`Yes`)
- Config/env changes? (`No`)
- Migration needed? (`No`)
- If yes, exact upgrade steps:
- N/A
## Failure Recovery (if this breaks)
- How to disable/revert this change quickly:
- Roll back this PR commit range.
- Files/config to restore:
- `src/discord/monitor/message-handler.process.ts`
- `src/discord/monitor/message-handler.process.test.ts`
- `src/auto-reply/reply/queue/*` (outcome/cleanup related)
- Known bad symptoms reviewers should watch for:
- `๐ โ ๐ค` jitter on one message,
- `โณ -> tool -> โณ` fallback,
- ack disappearing too early in off-mode deferred runs,
- stale deferred status after queue clear/cancel.
## Risks and Mitigations
- Risk:
- Deferred cleanup timing tradeoff in rare delayed-retry scenarios.
- Mitigation:
- Unified outcome cleanup + targeted lifecycle regression tests.
- Risk:
- Future call paths bypassing semantic invariants.
- Mitigation:
- Centralized semantic-phase guard in status controller + regression tests for observed live signatures.
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Hardens Discord status reaction state machine to prevent semantic rollback (e.g., `๐ โ ๐ค` or `โณ -> ๐ป -> โณ` on same message) by enforcing monotonic phase progression and unifying deferred queue lifecycle tracking.
## Key Changes
- Implements semantic phase gates (`queued` -> `waiting` -> `active` -> `terminal`) to block backward transitions in `src/discord/monitor/message-handler.process.ts:307-318`
- Introduces deferred status controller registry with lifecycle bridge in `src/discord/monitor/message-handler.process.ts:115-244` to coordinate queue outcomes with status reactions
- Adds `onFollowupQueued` callback plumbing through `GetReplyOptions` and queue system to signal queued/skipped/dropped/merged/failed outcomes
- Establishes per-message waiting visibility for multi-message bursts by detecting active sessions and busy lanes at ingress (`src/discord/monitor/message-handler.process.ts:637-656`)
- Preserves waiting hourglass (`โณ`) until lifecycle actually starts, avoiding premature clear for queued runs
## Test Coverage
- Comprehensive regression tests for monotonic transitions, active-to-waiting rollback prevention, deferred lifecycle coordination, error-retry recovery, and queue outcome emissions
- Validates `statusReactionMode=off` preserves ack-only behavior while maintaining deferred bookkeeping
- Tests cover rate-limiting, debounce, stall timers, and cleanup edge cases
## Config Schema
- Introduces `messages.statusReactionMode` (`full` | `off`) to control transition behavior
- Preserves backward compatibility (defaults to `full`)
- Documents new mode in `docs/channels/discord.md`
<h3>Confidence Score: 4/5</h3>
- Safe to merge with high confidence; well-tested monotonic state machine prevents observed rollback patterns
- Implementation is solid with comprehensive test coverage for core invariants. The semantic phase transition logic is sound and properly gates rollback. Deferred lifecycle coordination is well-architected. Minor complexity around cleanup timing in edge cases (error-retry TTL) is acceptable given unified outcome model and targeted regression test...
Most Similar PRs
#19917: feat(discord): allow disabling intermediate status reactions
by Gitjay11 ยท 2026-02-18
76.3%
#20009: fix(discord): immediately defer interactions to prevent timeouts
by Gitjay11 ยท 2026-02-18
76.3%
#23158: discord: harden preflight/reply path against slow lookup latency
by danielstarman ยท 2026-02-22
74.1%
#17316: fix: ack reaction not removed when block streaming is enabled (Tele...
by czmathew ยท 2026-02-15
73.7%
#21463: fix(discord): prevent WebSocket death spiral + fix numeric channel IDโฆ
by akropp ยท 2026-02-20
73.1%
#17254: fix(discord): intercept text-based slash commands instead of forwar...
by robbyczgw-cla ยท 2026-02-15
72.7%
#22557: fix(discord): coerce exec approval approver IDs to string to preven...
by zwffff ยท 2026-02-21
72.6%
#21346: [AI-assisted] Telegram: add reaction state machine with fallback an...
by Archie818 ยท 2026-02-19
72.5%
#20406: fix(slack): respect replyToMode when computing statusThreadTs in DMs
by QuinnYates ยท 2026-02-18
72.5%
#12953: fix: defer gateway restart until all replies are sent
by zoskebutler ยท 2026-02-10
72.2%