#15383: fix(sessions_send): avoid announce delivery when announce step resolves to ANNOUNCE_SKIP

by Zjianru open 2026-02-13 11:14

agents stale size: M

Cluster: Subagent Enhancements and Features

## Description

### Background

In real multi-agent orchestration (the `sessions_send` A2A flow), we observed an intermittent bug:

- the target agent explicitly returns `ANNOUNCE_SKIP` in `Agent-to-agent announce step.`
- but an announce is still delivered to the user channel (a duplicate or mistaken send)

This matches the report in #14046.

### Root cause

The announce delivery logic in `runSessionsSendA2AFlow` relies on `runAgentStep -> chat.history` to read the latest reply. Under certain timing races, `chat.history` can briefly return stale content, so:

- the announce step has already produced `ANNOUNCE_SKIP`
- but the code reads a previous non-skip reply
- and then incorrectly calls `send`

### Fix

Added a `shouldDeliverAnnounce(...)` guard before delivery:

1. keep the existing check on `announceReply`
2. perform up to two lightweight re-reads (`readLatestAssistantReply(limit: 20)`)
3. if the latest reply is confirmed to be `ANNOUNCE_SKIP`, skip delivery
4. only deliver when the latest reply matches the candidate announce text

This suppresses false deliveries caused by transient history lag.

### Tests

Added two tests:

1. `sessions_send should not deliver when announce step returns ANNOUNCE_SKIP`
2. `sessions_send should not deliver when announce step is ANNOUNCE_SKIP even if history is stale`

`src/agents/openclaw-tools.sessions.test.ts` passes fully (12/12).

Related: #14046.

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds a `shouldDeliverAnnounce()` safeguard to the `sessions_send` A2A flow to prevent delivering an announce when the announce step resolves to `ANNOUNCE_SKIP`, addressing observed stale reads from `chat.history`. It also adds two tests intended to cover a direct `ANNOUNCE_SKIP` and a simulated stale-history scenario.

Key integration point: `runSessionsSendA2AFlow()` now calls `shouldDeliverAnnounce()` before `callGateway({ method: "send" })`, and `shouldDeliverAnnounce()` re-reads the latest assistant reply via `readLatestAssistantReply()` (which internally calls `chat.history`).

<h3>Confidence Score: 3/5</h3>

- This PR is directionally correct but has correctness gaps in the new guard and in the test modeling that should be fixed before merging.
- The announce-delivery guard currently doesn't enforce the advertised "latest reply must equal candidate" invariant, so it can still deliver stale announces under plausible message ordering. Additionally, the new tests' `chat.history` mock is keyed to `runId` state even though production reads history by `sessionKey`, which reduces confidence that the tests validate the real race.
- src/agents/tools/sessions-send-tool.a2a.ts, src/agents/openclaw-tools.sessions.test.ts

<sub>Last reviewed commit: 419a675</sub>

## Relation to #15402

This PR is the application-level mitigation for the same issue.

- This PR (#15383) adds a guard check before `sessions_send` announce delivery. The change is localized and low-risk, and it quickly mitigates false deliveries.
- The complementary PR (#15402) changes `agent.wait` at the lower level to provide a deterministic `finalAssistantText`, reducing history races at the root.

### Why split into two PRs

1. Different risk surface: #15383 is a local behavior fix; #15402 touches the gateway response contract and a broader call chain.
2. More flexible merge strategy: the low-risk mitigation can be merged first, and the lower-level approach reviewed afterwards.
3. Cleaner rollback: either change can be reverted independently without blocking the other.

### Outcome

Both PRs target #14046. Merging either one alone improves the situation; merging both gives the best result.

Related: #14046, #15402
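
### Illustrative sketch of the guard

For readers who want a concrete picture of the four-step guard described under Fix, here is a minimal sketch. It is not the code from this PR. It assumes a synchronous, injected reader in place of the real `readLatestAssistantReply(limit: 20)` (which goes through `chat.history` and would be asynchronous), it assumes the existing check means "non-empty and not `ANNOUNCE_SKIP`", and it assumes delivery is skipped when no re-read confirms the candidate. `ReadLatestReply`, `isAnnounceSkip`, and `makeReader` are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch: only ANNOUNCE_SKIP, shouldDeliverAnnounce and
// readLatestAssistantReply are names taken from the PR description.
const ANNOUNCE_SKIP = "ANNOUNCE_SKIP";

// Stand-in for readLatestAssistantReply(limit: 20); synchronous for brevity.
type ReadLatestReply = () => string | undefined;

function isAnnounceSkip(text: string | undefined): boolean {
  return (text ?? "").trim() === ANNOUNCE_SKIP;
}

function shouldDeliverAnnounce(
  announceReply: string | undefined,
  readLatest: ReadLatestReply,
  maxRereads = 2,
): boolean {
  // Step 1: check the candidate itself (assumed: non-empty and not a skip).
  const candidate = (announceReply ?? "").trim();
  if (candidate === "" || isAnnounceSkip(candidate)) return false;

  // Step 2: up to maxRereads lightweight confirmations against history.
  for (let attempt = 0; attempt < maxRereads; attempt++) {
    const latest = (readLatest() ?? "").trim();
    // Step 3: history now shows ANNOUNCE_SKIP, so the candidate was stale.
    if (isAnnounceSkip(latest)) return false;
    // Step 4: the latest reply matches the candidate, so delivery is allowed.
    if (latest === candidate) return true;
  }

  // Assumption: an unconfirmed candidate is not delivered.
  return false;
}

// Test helper: returns the given replies in order, repeating the last one.
function makeReader(replies: string[]): ReadLatestReply {
  let index = 0;
  return () => replies[Math.min(index++, replies.length - 1)];
}

// Stale candidate, but history has caught up by the time of the re-read.
const staleCandidate = "Task finished, notifying user.";
console.log(shouldDeliverAnnounce(staleCandidate, makeReader([ANNOUNCE_SKIP]))); // false

// Candidate confirmed by the re-read.
console.log(shouldDeliverAnnounce("hello", makeReader(["hello"]))); // true
```

In this sketch, a re-read that is still stale and happens to equal the candidate would still allow delivery, which is consistent with treating the guard as a mitigation and leaving the deterministic fix to #15402.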