
#15383: fix(sessions_send): avoid announce delivery when announce step resolves to ANNOUNCE_SKIP

by Zjianru open 2026-02-13 11:14
agents stale size: M
### Background

In real multi-agent orchestration (the `sessions_send` A2A flow), we observed an intermittent bug:

- the target agent explicitly returns `ANNOUNCE_SKIP` in `Agent-to-agent announce step.`
- but an announce is still delivered to the user channel (a duplicate or spurious delivery)

This matches the report in #14046.

### Root cause

`runSessionsSendA2AFlow` relies on `runAgentStep -> chat.history` to read the latest announce reply. Under certain timing races, `chat.history` can briefly return stale content (a stale read), so:

- the announce step has already produced `ANNOUNCE_SKIP`
- but the code reads the previous, non-skip reply
- and therefore incorrectly calls `send`

### Fix

Added a `shouldDeliverAnnounce(...)` guard before delivery:

1. keep the existing check on `announceReply`
2. perform up to two lightweight re-reads (`readLatestAssistantReply(limit: 20)`)
3. if the latest reply is confirmed to be `ANNOUNCE_SKIP`, force-skip delivery
4. deliver only when the latest reply matches the candidate announce text

This suppresses false deliveries caused by transient history lag.

### Tests

Added two tests:

1. `sessions_send should not deliver when announce step returns ANNOUNCE_SKIP`
2. `sessions_send should not deliver when announce step is ANNOUNCE_SKIP even if history is stale`

`src/agents/openclaw-tools.sessions.test.ts` passes fully (12/12).

Related: #14046.

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds a `shouldDeliverAnnounce()` safeguard to the `sessions_send` A2A flow to prevent delivering an announce when the announce step resolves to `ANNOUNCE_SKIP`, addressing observed stale reads from `chat.history`. It also adds two tests intended to cover direct `ANNOUNCE_SKIP` and a simulated stale-history scenario.

Key integration point: `runSessionsSendA2AFlow()` now calls `shouldDeliverAnnounce()` before `callGateway({ method: "send" })`, and `shouldDeliverAnnounce()` re-reads the latest assistant reply via `readLatestAssistantReply()` (which internally calls `chat.history`).

<h3>Confidence Score: 3/5</h3>

- This PR is directionally correct but has correctness gaps in the new guard and test modeling that should be fixed before merging.
- The announce-delivery guard currently doesn’t enforce the advertised “latest reply must equal candidate” invariant, so it can still deliver stale announces under plausible message ordering. Additionally, the new tests’ `chat.history` mock is keyed to `runId` state even though production reads history by `sessionKey`, reducing confidence that the tests validate the real race.
- src/agents/tools/sessions-send-tool.a2a.ts, src/agents/openclaw-tools.sessions.test.ts

<sub>Last reviewed commit: 419a675</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

## Relation to #15402

This PR is the application-level mitigation for the same problem.

- This PR (#15383) adds a guard check before `sessions_send` announce delivery. The change is localized and low-risk, and it can quickly mitigate mistaken deliveries.
- The complementary PR (#15402) changes `agent.wait` at a lower level to provide a deterministic `finalAssistantText`, reducing the history race at its root.

### Why split into two PRs

1. Different risk surfaces: #15383 is a local behavior fix; #15402 touches the gateway return contract and a broader call chain.
2. More flexible merge strategy: the low-risk mitigation can be merged first, and the lower-level approach reviewed afterwards.
3. Cleaner rollback: either change can be reverted independently without blocking the other.

### Outcome

Both PRs target #14046. Merging either one alone improves the situation; merging both gives the best result.

Related: #14046, #15402
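
For readers who want a concrete picture of the four-step guard listed under "Fix", here is a minimal sketch. It is not the code from this PR: the signature, the injected `readLatestReply` callback (standing in for `readLatestAssistantReply(limit: 20)`), and the `maxRereads` parameter are assumptions made for illustration. Unlike the behavior flagged in the review, this sketch returns `false` when no re-read ever confirms the candidate.

```typescript
// Hypothetical sketch of the guard; names and signature are assumptions,
// not the actual implementation in sessions-send-tool.a2a.ts.
const ANNOUNCE_SKIP = "ANNOUNCE_SKIP";

// Stand-in for readLatestAssistantReply(limit: 20), injected so the
// sketch stays self-contained and testable.
type ReadLatestReply = () => Promise<string | undefined>;

async function shouldDeliverAnnounce(
  announceReply: string | undefined,
  readLatestReply: ReadLatestReply,
  maxRereads = 2,
): Promise<boolean> {
  // Step 1: existing check on the candidate reply.
  const candidate = announceReply?.trim();
  if (!candidate || candidate === ANNOUNCE_SKIP) {
    return false;
  }

  // Step 2: up to `maxRereads` lightweight re-reads of the latest reply.
  for (let attempt = 0; attempt < maxRereads; attempt++) {
    const latest = (await readLatestReply())?.trim();

    // Step 3: the latest reply is ANNOUNCE_SKIP, so never deliver.
    if (latest === ANNOUNCE_SKIP) {
      return false;
    }

    // Step 4: deliver only once the latest reply matches the candidate.
    if (latest === candidate) {
      return true;
    }
  }

  // The candidate was never confirmed; fail closed.
  return false;
}
```

A guard of this shape is still only a mitigation. If every re-read returns the same stale text as the candidate, it confirms the stale text and delivers, which is consistent with the PR describing #15402 as the complementary root-level change.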
