#20241: fix(memory-lancedb): consolidate preference keyword/category detection (EN+CS+CJK)
extensions: memory-lancedb
size: XS
## Summary
- align preference category detection with capture triggers
- include CJK capture trigger coverage (e.g. 偏好/喜欢/习惯/通常/会)
- include CJK category detection for preference/decision/entity/fact while preserving EN+CS keywords
- consolidate follow-up logic from prior split PRs into one branch

## Validation
- RUN v4.0.18 /mnt/d/codex_openai/openclaw_codex/.tmp/openclaw_upstream
  ✓ extensions/memory-lancedb/index.test.ts (12 tests | 1 skipped) 3028ms
  ✓ memory plugin registers and initializes correctly 3020ms
  Test Files 1 passed (1)
  Tests 11 passed | 1 skipped (12)
  Start at 01:55:32
  Duration 5.14s (transform 372ms, setup 532ms, import 53ms, tests 3.03s, environment 0ms)
- > openclaw@2026.2.18 protocol:gen /mnt/d/codex_openai/openclaw_codex/.tmp/openclaw_upstream
  > node --import tsx scripts/protocol-gen.ts
  wrote /mnt/d/codex_openai/openclaw_codex/.tmp/openclaw_upstream/dist/protocol.schema.json
- > openclaw@2026.2.18 protocol:gen:swift /mnt/d/codex_openai/openclaw_codex/.tmp/openclaw_upstream
  > node --import tsx scripts/protocol-gen-swift.ts
  wrote /mnt/d/codex_openai/openclaw_codex/.tmp/openclaw_upstream/apps/macos/Sources/OpenClawProtocol/GatewayModels.swift
  wrote /mnt/d/codex_openai/openclaw_codex/.tmp/openclaw_upstream/apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR consolidates preference keyword and category detection in the `memory-lancedb` extension to support CJK (Chinese) text alongside the existing English and Czech patterns. It adds CJK memory capture triggers, expands the English/Czech preference regex with additional keywords (`need`, `always`, `never`, `important`, `preferuji`, `nechci`), and introduces CJK-specific branches in `detectCategory` for the preference, decision, entity, and fact categories. It also adds a guard so that questions containing broad CJK copulas are not classified as facts.
- Adds 5 new CJK regex patterns to `MEMORY_TRIGGERS` for capture detection
- Expands `detectCategory` with CJK branches for all categories and broader EN preference keywords
- Adds a question-mark guard to avoid misclassifying CJK questions as facts
- **Bug: duplicate CJK preference regex block** at lines 278-280 in `index.ts` — identical to lines 275-277, making it dead code
- Good test coverage added for all new CJK paths and expanded EN keywords
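
To make the category flow described above concrete, here is a minimal sketch of how a `detectCategory` with CJK branches and a question-mark guard might look. It is not the code from `extensions/memory-lancedb/index.ts`. The constant names, the exact regexes, the branch order, and the `"other"` fallback are assumptions for illustration; only the EN/CS keywords and the CJK preference terms (偏好/喜欢/习惯/通常) come from the PR text.

```typescript
// Hypothetical sketch: names, patterns, and ordering are illustrative
// assumptions, not the actual contents of the extension.
type MemoryCategory = "preference" | "decision" | "entity" | "fact" | "other";

// EN + CS preference keywords; the last six are the ones named in the review summary.
const PREFERENCE_EN_CS =
  /\b(prefer|like|love|hate|want|need|always|never|important|preferuji|nechci)\b/i;
// CJK preference terms taken from the PR description's examples.
const PREFERENCE_CJK = /偏好|喜欢|习惯|通常/;
// The remaining CJK patterns are assumed placeholders.
const DECISION_CJK = /决定|选择/;
const ENTITY_CJK = /名字|叫做|电话|邮箱/;
const FACT_CJK_COPULA = /是|有/; // broad copulas, hence the guard below
const QUESTION_MARK = /[?？]/; // ASCII and full-width question marks

function detectCategory(text: string): MemoryCategory {
  if (PREFERENCE_EN_CS.test(text) || PREFERENCE_CJK.test(text)) return "preference";
  if (/\b(decided|will use)\b/i.test(text) || DECISION_CJK.test(text)) return "decision";
  if (/\bis called\b/i.test(text) || ENTITY_CJK.test(text)) return "entity";
  if (/\b(is|are|has|have)\b/i.test(text)) return "fact";
  // Guard: a copula inside a question is not treated as a stated fact.
  if (FACT_CJK_COPULA.test(text) && !QUESTION_MARK.test(text)) return "fact";
  return "other";
}
```

In this sketch the guard applies only to the CJK copula branch, because characters such as 是 appear in most questions as well as statements. Whether the real implementation scopes the guard the same way is not stated in the summary.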
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge after removing the duplicate code block; no behavioral risk otherwise.
- The changes are additive regex patterns for CJK support with no risk to existing EN/CS behavior. The only issue is a copy-paste duplicate block that is harmless dead code but should be cleaned up. Tests pass and cover all new branches.
- `extensions/memory-lancedb/index.ts` has a duplicate CJK preference block (lines 278-280) that should be removed.
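
The reason a verbatim duplicate of an early-return branch is dead code can be shown with a small reduction. The function name, the regex, and the hit counter below are hypothetical and exist only for the demonstration; they are not copied from `index.ts`.

```typescript
// Hypothetical reduction of the duplicated branch; the regex is an assumption.
const CJK_PREFERENCE = /偏好|喜欢|习惯/;
let secondBranchHits = 0; // instrumentation for this demo only

function detectWithDuplicate(text: string): string {
  if (CJK_PREFERENCE.test(text)) {
    return "preference";
  }
  // Verbatim copy of the check above: any matching text has already returned,
  // and any non-matching text fails this test too, so the body never runs.
  if (CJK_PREFERENCE.test(text)) {
    secondBranchHits += 1;
    return "preference";
  }
  return "other";
}
```

Because the pattern has no `g` or `y` flag, `test` is stateless and both checks always agree, so removing the second block cannot change behavior.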
<sub>Last reviewed commit: 25185e4</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#15896: fix(memory-lancedb): capture even with injected recall context
by aelaguiz · 2026-02-14
75.9%
#16411: fix(agents): support CJK sentence punctuation in block chunker
by ciberponk · 2026-02-14
73.2%
#3401: fix(memory-lancedb): improve autoCapture with turn-by-turn processing
by mike-nott · 2026-01-28
73.1%
#8504: fix: prevent false positives in isSilentReplyText for CJK content
by hanxiao · 2026-02-04
72.0%
#17686: fix(memory): support non-ASCII characters in FTS query tokenization
by Phineas1500 · 2026-02-16
71.8%
#19916: fix: strict silent-reply detection to prevent false positives with ...
by hayoial · 2026-02-18
71.7%
#17624: Fix memory flush YYYY-MM-DD placeholder resolution
by grunt3714-lgtm · 2026-02-16
70.5%
#16669: feat(memory-lancedb): add memory_search/memory_get compatibility re...
by ciberponk · 2026-02-15
70.4%
#22692: fix(memory-lancedb): [P1] add missing runtime deps — plugin broken ...
by mahsumaktas · 2026-02-21
70.1%
#22375: fix(audit): use human-readable labels for RegExp patterns in post-c...
by aldoeliacim · 2026-02-21
69.9%