← Back to PRs

#18595: feat: native PostgreSQL + pgvector memory backend

by IrriVisionTechnologies open 2026-02-16 21:50 View on GitHub →
stale size: L
## Summary Implements RFC #15093 — adds a native `postgres` memory backend (`memory.backend = "postgres"`) that stores and searches memory documents directly in PostgreSQL with pgvector, as an alternative to the QMD CLI sidecar approach. ## Motivation The QMD backend (OpenClaw → shell → QMD CLI → SQLite → GGUF models) creates a fragile chain with 30+ open issues related to empty results, timeouts, cold starts, and subprocess failures. Many users already run PostgreSQL — a native backend eliminates the subprocess layer entirely. ## What it does ### PostgresMemoryManager (~500 lines) - **Hybrid search**: pgvector cosine similarity + tsvector BM25 ranking with configurable weights - **Embedding provider reuse**: Uses OpenClaw's existing Voyage/OpenAI/Gemini providers (no local GGUF models needed) - **Hash-based sync**: Only re-indexes changed files, same efficiency model as QMD - **Auto-migration**: `CREATE TABLE/INDEX IF NOT EXISTS` on first connect - **Connection pooling**: via `pg.Pool` with configurable limits - **Table prefix isolation**: Multiple agents can share one database - **Graceful degradation**: Wrapped in `FallbackMemoryManager` — falls back to builtin on failure ### Config ```json5 { memory: { backend: "postgres", postgres: { connectionString: "postgresql://user:pass@host:5432/dbname", tablePrefix: "openclaw_memory", // default embedding: { provider: "voyage", // or "openai", "gemini" model: "voyage-3-lite", dimensions: 512 }, hybrid: { enabled: true, // default vectorWeight: 0.7, // default textWeight: 0.3 // default } } } } ``` ### Schema (auto-created) ```sql CREATE TABLE openclaw_memory_documents ( id SERIAL PRIMARY KEY, collection TEXT NOT NULL, doc_path TEXT NOT NULL, doc_hash TEXT NOT NULL, content TEXT, snippet TEXT, embedding vector(512), tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', coalesce(content, ''))) STORED, active BOOLEAN DEFAULT true, created_at TIMESTAMPTZ DEFAULT now(), updated_at TIMESTAMPTZ DEFAULT now(), UNIQUE(collection, doc_path) ); ``` ## Dependency `pg` (node-postgres) added as an **optional peer dependency**. Only loaded via dynamic import when `memory.backend = "postgres"` — zero impact on users who don't need it. ## Real-World Validation Built and tested by the IrriVision Technologies team running OpenClaw on an LXC container with PostgreSQL 15 + pgvector. Our proof-of-concept (`hubert-qmd`) indexed 1,145 documents with ~200ms search latency and no cold starts. ## Changes | File | Change | |------|--------| | `src/config/types.memory.ts` | Add `MemoryPostgresConfig` + `MemoryPostgresEmbeddingConfig` types | | `src/config/zod-schema.ts` | Add Postgres config validation schema | | `src/config/schema.help.ts` | Update backend help text | | `src/memory/backend-config.ts` | Add `ResolvedPostgresConfig` type + resolver | | `src/memory/postgres-manager.ts` | `PostgresMemoryManager` implementation (~500 lines) | | `src/memory/search-manager.ts` | Factory integration for postgres backend | | `package.json` | Add `pg` as optional peer dependency | | `src/memory/postgres-config.test.ts` | 5 config resolution tests | Closes #15093 <!-- greptile_comment --> <h3>Greptile Summary</h3> Implements native PostgreSQL + pgvector memory backend as an alternative to QMD CLI sidecar. The implementation includes hybrid search (vector + BM25), hash-based sync, auto-migration, and connection pooling. **Major changes:** - Added ~500 LOC `PostgresMemoryManager` with pgvector cosine similarity and tsvector full-text search - Reuses existing embedding providers (Voyage/OpenAI/Gemini) instead of local GGUF models - Factory integration with graceful fallback to builtin backend - Configuration validation via Zod schema with proper type definitions - Unit tests for config resolution (5 test cases) **Critical issues found:** - SQL injection vulnerability via unsanitized `tablePrefix` (line 138-152, 157-193 in postgres-manager.ts) - SSL certificate validation disabled with `rejectUnauthorized: false` (line 51) - Syntax error in UPSERT query referencing wrong table alias (line 417) **Recommendations:** - Sanitize `tablePrefix` to only allow `[a-zA-Z0-9_]` characters before SQL interpolation - Enable proper SSL certificate validation by default - Fix table reference in ON CONFLICT clause - Add integration tests with real PostgreSQL instance - Consider adding `@types/pg` to devDependencies <h3>Confidence Score: 1/5</h3> - This PR contains a critical SQL injection vulnerability and should NOT be merged without fixes - Score reflects the SQL injection vulnerability in `tablePrefix` handling (lines 138-193) which allows arbitrary SQL execution, plus the disabled SSL certificate validation that creates MITM attack surface. These are security-critical issues that must be resolved before merge. - Pay close attention to `src/memory/postgres-manager.ts` (SQL injection + SSL security) and `src/memory/backend-config.ts` (needs tablePrefix sanitization) <sub>Last reviewed commit: ff42b39</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment -->

Most Similar PRs