
#18134: feat(discord): add semantic search tool for Discord messages

by zerone0x open 2026-02-16 14:54
size: L experienced-contributor
## Summary

Implements Discord semantic search capability as requested in #17875. This enables finding Discord messages by meaning and context rather than just keywords.

- **Problem:** Discord's built-in search is keyword-only and terrible at finding old threads/conversations
- **Why it matters:** Users need semantic search to find discussions by concept/topic
- **What changed:** Added `discord_search` tool with vector database backend
- **What did NOT change:** No modifications to existing Discord message handling

## Change Type (select all)

- [x] Feature

## Scope (select all touched areas)

- [x] Skills / tool execution
- [x] Integrations

## Linked Issue/PR

- Closes #17875

## User-visible / Behavior Changes

- New `discord_search` tool available to agents
- Semantic search of Discord message history
- Automatic message indexing in background (when implemented)

## Security Impact (required)

- New permissions/capabilities? **No**
- Secrets/tokens handling changed? **No**
- New/changed network calls? **Yes** - OpenAI embeddings API calls
- Command/tool execution surface changed? **Yes** - New `discord_search` tool
- Data access scope changed? **Yes** - Stores Discord message content locally

**Risk + mitigation:** Local SQLite storage of Discord messages for search indexing. Data stays within workspace, no external transmission except OpenAI embeddings API (standard usage).

## Repro + Verification

### Environment

- OS: Linux
- Integration/channel: Discord

### Steps

1. Configure OpenAI API key for embeddings
2. Use `discord_search` tool with semantic query
3. Verify results match message content semantically

### Expected

- Returns relevant Discord messages ranked by semantic similarity
- Includes message links for easy navigation
- Handles filters (channel, author, date range)

### Actual

- ✅ Tool registers correctly in agent system
- ⚠️ Testing requires running Discord integration (not tested in isolation)

## Evidence

- [x] New tool implementation with proper TypeBox schema
- [x] SQLite vector database integration with sqlite-vec
- [x] OpenAI embeddings integration
- [x] Fallback to text search when embeddings unavailable

## Human Verification (required)

**What you personally verified:**

- File structure follows existing patterns
- Tool integration matches existing tools
- Database schema handles Discord message format
- Basic instantiation works

**Edge cases checked:**

- Missing OpenAI API key (graceful fallback)
- SQLite-vec extension not available (text search fallback)
- Empty search results handling

**What you did NOT verify:**

- Full end-to-end with live Discord messages
- Performance with large message volumes
- Embedding generation rate limits

## Compatibility / Migration

- Backward compatible? **Yes**
- Config/env changes? **No** (uses existing OpenAI config)
- Migration needed? **No**

## Failure Recovery (if this breaks)

- **How to disable:** Remove from agent tools list or disable via config
- **Files to restore:** Only new files added, no existing files modified significantly
- **Known bad symptoms:** Tool errors, database connection issues

## Risks and Mitigations

- **Risk:** SQLite-vec extension not available
  - **Mitigation:** Graceful fallback to text-based search
- **Risk:** OpenAI API quota/rate limits
  - **Mitigation:** Error handling with fallback search
- **Risk:** Database performance with large message history
  - **Mitigation:** Indexed queries, configurable limits

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

Adds a new `discord-semantic-search` extension plugin implementing vector-based semantic search over Discord messages using sqlite-vec and OpenAI embeddings. The extension is structured as an opt-in plugin with its own `openclaw.plugin.json` config schema.

**Critical issues that will prevent the plugin from functioning:**

- `index.ts` calls `api.getPluginConfig()`, which does not exist on `OpenClawPluginApi`, so the plugin will never activate. Should use `api.pluginConfig`.
- `tool.ts` uses the `inputSchema` property and a single-arg `execute(input)` signature, but the agent framework reads `parameters` and calls `execute(toolCallId, params, signal, onUpdate)`. The tool will have no visible schema and receive wrong arguments at runtime.
- `index.ts` calls `api.log?.()`, which doesn't exist; it should use `api.logger.info()`.

**Additional logic issues:**

- Config constructor spreads `...config` after defaults, which overwrites them with `undefined` values when config fields are omitted.
- `indexMessagesFromResults` has a dead `skipped` counter: `indexMessage()` silently returns on duplicates instead of throwing the `"already indexed"` error the caller expects.
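The config-defaults issue is a general TypeScript/JavaScript pitfall rather than something specific to this plugin: object spread copies a key that is present with the value `undefined`, so it replaces the default (a key that is absent altogether leaves the default in place). A minimal sketch of the problem and one possible fix follows; the `SearchConfig` shape, its field names, and the default values are hypothetical and are not taken from the PR's code.

```typescript
// Hypothetical config shape for illustration; the plugin's real fields may differ.
interface SearchConfig {
  dbPath: string;
  maxResults: number;
  embeddingModel: string;
}

// Illustrative defaults only (assumed values, not the PR's).
const DEFAULTS: SearchConfig = {
  dbPath: "discord-search.db",
  maxResults: 10,
  embeddingModel: "text-embedding-3-small",
};

// Pitfall: a key that is present but undefined still overwrites the default.
function mergeNaive(config: Partial<SearchConfig>): SearchConfig {
  return { ...DEFAULTS, ...config } as SearchConfig;
}

// One possible fix: fall back per field, so undefined never replaces a default.
function mergeSafe(config: Partial<SearchConfig>): SearchConfig {
  return {
    dbPath: config.dbPath ?? DEFAULTS.dbPath,
    maxResults: config.maxResults ?? DEFAULTS.maxResults,
    embeddingModel: config.embeddingModel ?? DEFAULTS.embeddingModel,
  };
}

// A caller that forwards an optional field explicitly produces this shape.
const partial: Partial<SearchConfig> = { dbPath: undefined, maxResults: 25 };
const naive = mergeNaive(partial); // naive.dbPath is undefined at runtime
const safe = mergeSafe(partial); // safe.dbPath keeps the default
```

The per-field fallback is more verbose than a spread, but it keeps the returned object honest about its declared type; filtering out `undefined` entries before spreading would be an equally reasonable choice.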
Previous review threads (not repeated here) also flagged: using `better-sqlite3` instead of `node:sqlite`, hardcoded `process.env.OPENAI_API_KEY` instead of auth helpers, `console.warn/error` instead of subsystem logger, missing directory creation before DB init, and `better-sqlite3` not declared in `package.json` dependencies.

<h3>Confidence Score: 1/5</h3>

- This PR has multiple critical bugs that will prevent the plugin from activating or its tool from working at all; it should not be merged as-is.
- Score of 1 reflects three independent critical issues: (1) `api.getPluginConfig()` doesn't exist so the plugin never activates, (2) tool uses wrong property name (`inputSchema` vs `parameters`) and wrong `execute` signature so the tool cannot function, (3) config defaults are overwritten by the spread pattern. Each of these alone would prevent the extension from working.
- All files need attention: `index.ts` (API misuse), `src/tool.ts` (incompatible tool interface), `src/semantic-search.ts` (config bug, dead code path)

<sub>Last reviewed commit: 61d5cd9</sub>

<!-- greptile_other_comments_section -->

**Context used:**

- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))

<!-- /greptile_comment -->
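The dead `skipped` counter flagged in the review comes from a mismatch between how `indexMessage()` signals a duplicate and what its caller checks for. One way to make the count reliable is to have the indexing function report whether it inserted anything. The sketch below is an assumption-laden illustration, not the PR's code: it uses an in-memory `Map` in place of the SQLite table, and the `DiscordMessage` shape, the `MessageIndex` class, and the boolean return convention are all hypothetical.

```typescript
// Hypothetical message shape; the PR's real type is not shown on this page.
interface DiscordMessage {
  id: string;
  content: string;
}

class MessageIndex {
  // Stand-in for the SQLite table, keyed by message id (assumption).
  private readonly rows = new Map<string, DiscordMessage>();

  // Returns true if the message was newly indexed, false if it was a duplicate.
  indexMessage(message: DiscordMessage): boolean {
    if (this.rows.has(message.id)) {
      return false;
    }
    this.rows.set(message.id, message);
    return true;
  }

  // Derives both counters from the explicit return value,
  // so neither depends on an exception being thrown.
  indexMessagesFromResults(
    messages: DiscordMessage[],
  ): { indexed: number; skipped: number } {
    let indexed = 0;
    let skipped = 0;
    for (const message of messages) {
      if (this.indexMessage(message)) {
        indexed += 1;
      } else {
        skipped += 1;
      }
    }
    return { indexed, skipped };
  }
}

// Usage: the third message repeats id "1" and is counted as skipped.
const index = new MessageIndex();
const summary = index.indexMessagesFromResults([
  { id: "1", content: "deploy failed on staging" },
  { id: "2", content: "rollback plan" },
  { id: "1", content: "deploy failed on staging" },
]);
```

Returning a boolean keeps duplicates on the normal control path; throwing a dedicated error and catching it in the caller would also work, provided both sides agree on the convention.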
