
#20147: feat(routing): add L1.5 semantic router for task classification 🤖

by irchelper · open · 2026-02-18 15:48
Labels: docs, channel: discord, gateway, agents, size: XL
## Summary

Add a semantic router (L1.5) between L1 keyword matching and L2 LLM classification for intelligent task-type resolution using local embeddings.

- **SemanticRouter class** (`semantic-router.ts`): cosine-similarity matching against pre-computed utterance embeddings, with a configurable threshold (default 0.68) and a confidence-gap check (default 0.05)
- **Utterance library** (`utterances.ts`): 300+ example utterances across 23 TaskTypes (Chinese/English); FALLBACK is excluded from matching
- **Task resolver async upgrade** (`task-resolver.ts`): L1 keyword match (sync) → L1.5 semantic match → fallback, with debug logging for hit tracking (`[routing] L1/L1.5/fallback`)
- **Short-window context enrichment**: prepends the last 2 `InboundHistory` messages to the semantic query for terse inputs such as "做" ("do"), "好" ("ok"), "继续" ("continue")
- **LRU embedding cache**: 100-entry query cache to avoid redundant embedding computation
- **Routing instance integration**: `setEmbeddingProvider()` on `RoutingInstance` for lazy background init; opt-in activation via `routing.semantic_router.enabled: true`, a no-op when disabled or unconfigured (existing deployments unaffected)
- **Zod schema**: `routing.semantic_router` config validation (`enabled`, `threshold`, `min_gap`, `custom_utterances`)

### Design decisions

- Zero new dependencies: reuses the existing `EmbeddingProvider` (node-llama-cpp / embeddinggemma-300m-qat-Q8_0)
- L1 keyword matching intentionally ignores `recentContext` to prevent keyword leakage from history
- The confidence gap prevents ambiguous classifications (top-1 vs top-2 score delta < `min_gap` → fallback)
- FALLBACK has no utterances, so unmatched inputs naturally fall through
- **Opt-in by default**: the semantic router only activates when `routing.semantic_router.enabled: true` is explicitly set; existing deployments are unaffected
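A minimal sketch of the L1.5 matching step described above: score the query embedding against each utterance embedding, keep the best score per TaskType, then apply the threshold and confidence-gap checks. The names `Utterance` and `matchTaskType` are illustrative rather than the PR's actual exports, and comparing per-TaskType bests (rather than raw top-2 utterance scores) for the gap check is an assumption.

```typescript
// Illustrative sketch; not the PR's actual implementation.
type Utterance = { taskType: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  // Guard against zero vectors to avoid dividing by zero.
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function matchTaskType(
  query: number[],
  library: Utterance[],
  threshold = 0.68,
  minGap = 0.05,
): string | null {
  // Track the best score per TaskType so the gap check compares
  // competing *types*, not two utterances of the same type.
  const best = new Map<string, number>();
  for (const u of library) {
    const score = cosineSimilarity(query, u.embedding);
    if (score > (best.get(u.taskType) ?? -Infinity)) best.set(u.taskType, score);
  }
  const ranked = [...best.entries()].sort((a, b) => b[1] - a[1]);
  // Below threshold: no confident match, fall through.
  if (ranked.length === 0 || ranked[0][1] < threshold) return null;
  // Ambiguous: top-1 vs top-2 delta under the minimum gap, fall through.
  if (ranked.length > 1 && ranked[0][1] - ranked[1][1] < minGap) return null;
  return ranked[0][0];
}
```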
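The 100-entry query cache could be implemented as a small Map-based LRU; this sketch relies on JavaScript `Map` preserving insertion order for eviction, and the class name is illustrative, not the PR's actual code.

```typescript
// Illustrative LRU sketch; the 100-entry default matches the PR
// description, everything else is an assumption.
class LruEmbeddingCache {
  private map = new Map<string, number[]>();
  constructor(private capacity = 100) {}

  get(query: string): number[] | undefined {
    const hit = this.map.get(query);
    if (hit !== undefined) {
      // Refresh recency: re-insert so this key becomes the newest.
      this.map.delete(query);
      this.map.set(query, hit);
    }
    return hit;
  }

  set(query: string, embedding: number[]): void {
    if (this.map.has(query)) this.map.delete(query);
    this.map.set(query, embedding);
    if (this.map.size > this.capacity) {
      // Evict the least recently used entry (first key in the Map).
      const oldest = this.map.keys().next().value!;
      this.map.delete(oldest);
    }
  }

  clear(): void { this.map.clear(); }
}
```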
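A hedged sketch of what the `routing.semantic_router` Zod schema might look like: the field names come from the PR description, while the exact types, bounds, and defaults beyond the stated 0.68 threshold and 0.05 gap are assumptions.

```typescript
// Illustrative config-schema fragment; not the PR's actual schema.
import { z } from "zod";

const semanticRouterSchema = z
  .object({
    // Opt-in: the router is a no-op unless explicitly enabled.
    enabled: z.boolean().default(false),
    // Minimum cosine similarity for a match (PR default: 0.68).
    threshold: z.number().min(0).max(1).default(0.68),
    // Required top-1 vs top-2 score delta (PR default: 0.05).
    min_gap: z.number().min(0).default(0.05),
    // Extra example phrases merged per TaskType over the built-ins.
    custom_utterances: z.record(z.array(z.string())).optional(),
  })
  .strict();
```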
Custom utterances can be merged per TaskType via `custom_utterances` to extend or override the built-in examples without forking the utterance library.

## Testing

- **241 tests**, all passing across 11 test files
- Unit tests: cosine similarity, threshold, gap check, cache (hit/miss/eviction/clear), edge cases (empty input, long input, concurrent resolve, re-init)
- Integration tests: L1 → L1.5 → fallback chain, `recentContext` forwarding, mock SemanticRouter
- E2E tests: full `resolveTaskType` + mock EmbeddingProvider, threshold/`min_gap` config wiring, `setEmbeddingProvider` lifecycle
- Schema regression tests: `semantic_router` field validation (full/minimal/strict)
- Manager wiring test: verifies `setEmbeddingProvider` is called with the correct args

```bash
npx vitest run src/gateway/routing/ src/config/config.schema-regressions src/auto-reply/reply/get-reply-directives.antiflap src/memory/__tests__/manager-routing-wiring
# 11 test files, 241 tests passed

npx oxlint src/gateway/routing/
# 0 errors
```
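The per-TaskType `custom_utterances` merge described above might look like the following; `mergeUtterances` and the `UtteranceLibrary` shape are assumptions for illustration, not the PR's actual code.

```typescript
// Illustrative sketch of merging custom utterances over the
// built-in library; names and shapes are assumptions.
type UtteranceLibrary = Record<string, string[]>; // TaskType -> example phrases

function mergeUtterances(
  builtIn: UtteranceLibrary,
  custom: UtteranceLibrary = {},
): UtteranceLibrary {
  const merged: UtteranceLibrary = { ...builtIn };
  for (const [taskType, phrases] of Object.entries(custom)) {
    // Extend the built-in examples and dedupe, so repeated phrases
    // don't skew nearest-neighbour matching.
    merged[taskType] = [...new Set([...(merged[taskType] ?? []), ...phrases])];
  }
  return merged;
}
```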
