#10456: fix: preserve persona and language continuity in compaction summaries

by keepitmello open 2026-02-06 14:39

agents stale

Cluster: Compaction Safeguards and Summaries

## Background

I run a Korean-language persona agent on OpenClaw (custom SOUL.md + IDENTITY.md setup). After long conversations, when auto-compaction kicks in, the agent suddenly starts responding in English for a few turns before recovering. The English narration text also leaks into Telegram messages because of `blockStreamingBreak: "text_end"`.

After digging into it, I found the root cause: the SDK's `autoCompact()` in `agent-session.js` hardcodes `customInstructions: undefined` when emitting `session_before_compact`. The summarization prompt and system prompt are both English-only, so the summary always comes out in English. Since the summary gets injected as a `user` message (via `COMPACTION_SUMMARY_PREFIX`), the large block of English text right before the model's next response biases it toward English output.

The system prompt (with SOUL.md etc.) is correctly re-injected every run, so the persona eventually recovers, but for the first few turns after compaction the agent is broken.

## Approach

The `customInstructions` parameter already exists in the SDK's `generateSummary()` pipeline; it's just never populated during auto-compaction. Since we can't change the SDK directly, this PR works within the safeguard extension layer:

1. **Config field**: adds `compaction.customInstructions` to the agent config schema, so users can provide explicit instructions if needed.
2. **Default instructions**: when no config is set, a `DEFAULT_COMPACTION_INSTRUCTIONS` constant is injected that tells the summarizer to:
   - Write the summary body in the conversation's language
   - Focus on factual content (what was discussed, decisions made, current state)
   - Keep the SDK's required section headers unchanged
   - Not translate code, paths, or error messages
3. **Precedence chain**: `event (SDK) → config (runtime) → default constant`, with normalization (trim, empty-string-to-undefined) to prevent blank values from short-circuiting the chain.
4. **All three summarization paths covered**: dropped messages, history, and split-turn prefixes all go through the same resolver. The split-turn path composes the existing `TURN_PREFIX_INSTRUCTIONS` with the resolved instructions.

## Changes

| File | What |
|------|------|
| `compaction-instructions.ts` | New: DEFAULT constant, `resolveCompactionInstructions()`, `composeSplitTurnInstructions()`, Unicode-safe truncation (800 char cap) |
| `compaction-instructions.test.ts` | New: 35 tests covering precedence, normalization edge cases, surrogate pair safety, composition |
| `zod-schema.agent-defaults.ts` | Add `customInstructions` to compaction schema |
| `types.agent-defaults.ts` | Add `customInstructions` to `AgentCompactionConfig` |
| `compaction-safeguard-runtime.ts` | Add `customInstructions` to `CompactionSafeguardRuntimeValue` |
| `extensions.ts` | Pass config value to runtime via `setCompactionSafeguardRuntime()` |
| `compaction-safeguard.ts` | Use `resolveCompactionInstructions()` across all three paths |

## Notes

- Only affects `safeguard` mode; `default` mode is untouched.
- This is an intentional behavior change for safeguard users: summaries will now include language preservation instructions by default. In my testing this significantly improved post-compaction continuity without affecting summary quality.
- The default instructions deliberately avoid persona-specific directives (e.g. "preserve character cues") to prevent the summarizer from injecting persona descriptions into the summary. Persona context belongs in the system prompt, not the compaction summary.
- The 800-char cap on custom instructions (~200 tokens) prevents prompt bloat in the multi-stage summarization pipeline.
- Truncation uses `Array.from()` to avoid splitting surrogate pairs (emoji, CJK supplementary characters, etc.).
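
For reviewers who want the precedence chain and the truncation rule in one place, here is a minimal sketch of how they could fit together. It is not the code in `compaction-instructions.ts`: the function signature, the `normalize` and `truncateCodePoints` helpers, the `MAX_INSTRUCTION_CHARS` name, and the placeholder default text are all assumptions made for illustration. The sketch also assumes the cap applies only to event or config values, not to the built-in default.

```typescript
// Illustrative sketch only; names mirror the PR description, bodies are assumed.
const MAX_INSTRUCTION_CHARS = 800; // hypothetical name for the 800-char cap

// Placeholder text: the real default constant is not quoted in this PR description.
const DEFAULT_COMPACTION_INSTRUCTIONS =
  "Write the summary body in the conversation's language. Keep section headers unchanged.";

// Trim the value and map blank or missing input to undefined,
// so an empty string cannot short-circuit the precedence chain.
function normalize(value: string | null | undefined): string | undefined {
  if (typeof value !== "string") return undefined;
  const trimmed = value.trim();
  return trimmed.length > 0 ? trimmed : undefined;
}

// Truncate by code point rather than UTF-16 code unit.
// Array.from(string) iterates code points, so a surrogate pair is never split.
function truncateCodePoints(text: string, max: number): string {
  const codePoints = Array.from(text);
  return codePoints.length <= max ? text : codePoints.slice(0, max).join("");
}

// Precedence: event (SDK) -> config (runtime) -> default constant.
function resolveCompactionInstructions(
  eventInstructions?: string | null,
  configInstructions?: string | null,
): string {
  const custom = normalize(eventInstructions) ?? normalize(configInstructions);
  return custom !== undefined
    ? truncateCodePoints(custom, MAX_INSTRUCTION_CHARS)
    : DEFAULT_COMPACTION_INSTRUCTIONS;
}
```

In this sketch a whitespace-only event value such as `"   "` falls through to the config value, and a missing config value falls through to the default. Counting code points instead of `string.length` means the cap is measured in characters a user would recognize, at the cost of one extra array allocation per call.
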
## Test plan

- [x] `tsc --noEmit` passes
- [x] All 35 new unit tests pass (precedence, empty strings, whitespace, Unicode truncation, composition)
- [x] Existing compaction-safeguard tests (17) and config tests (2) still pass
- [x] Verified in a live Korean persona session: post-compaction summary now preserves Korean, agent stays in character