# llm vs sqlite-vec — replacement evaluation
Evaluation of two candidate substrates for the role qmd plays in Commonplace today. For what qmd is used for and what the replacement must cover, see README.md; for the qmd problems being addressed, see qmd-issues.md.
Both candidates share the key property qmd lacks: search is pure vector math; embedding generation is a separate, pluggable concern. They also avoid qmd MCP's single-client daemon problem — parallel agents can run independent read-only search commands against the same SQLite-backed state without negotiating session ownership. Reseek was checked and excluded from the comparison because it is a hosted second-brain product, not a local substrate.
## Option A — `llm` CLI with SQLite storage
Simon Willison's `llm` CLI (`pipx install llm`) plus an embeddings plugin.
- CLI verbs: `llm embed-multi <collection> --files <dir> '**/*.md' -m <model>`; `llm similar <collection> -c "query" -n 5`; `llm collections list`. (The same verbs via the Python API are sketched after this list.)
- Process model: pure CLI. No server, no daemon, no port. Each invocation is a short-lived Python process that opens the SQLite file, runs, and exits. Parallel agents just open the same DB; SQLite handles concurrent readers natively.
- Storage: single SQLite DB, path controllable via `--database` or `LLM_USER_PATH`. Easy to place under `.qmd/` (or `.search/`) in the repo.
- Embedding backends: plugins cover OpenAI (built-in), Voyage, Gemini, Cohere, Ollama, sentence-transformers. We can pick a CPU-friendly local model (`all-MiniLM-L6-v2` via `llm-sentence-transformers`) or an API. No GPU ever required.
- Collections: first-class concept, named and stored in the same DB.
- Incremental refresh: `embed-multi` skips unchanged IDs when the input stream repeats them; for directory-tree refresh we'd script "list files, pass to stdin" and let `llm` de-dup by ID.
- Metadata: arbitrary JSON per record, indexable.
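The same verbs are available through `llm`'s documented Python API, which is likely how a Commonplace driver would call them. A minimal sketch, assuming `sqlite-utils` is installed and the `llm-sentence-transformers` plugin provides the model; the paths and collection name are illustrative choices, not fixed by `llm`:

```python
import sqlite_utils
import llm

# Open (or create) the SQLite store at a path we control.
db = sqlite_utils.Database(".search/embeddings.db")

# A named collection bound to an embedding model. The model ID assumes
# the llm-sentence-transformers plugin; any llm embedding model works.
collection = llm.Collection(
    "notes", db, model_id="sentence-transformers/all-MiniLM-L6-v2"
)

# Equivalent of `llm embed-multi`: (id, text) pairs, keyed by ID.
collection.embed_multi(
    [("notes/example.md", "markdown body here...")], store=True
)

# Equivalent of `llm similar notes -c "query" -n 5`.
for entry in collection.similar("query text", number=5):
    print(entry.id, entry.score)
```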
Strengths: working CLI surface, large plugin ecosystem, single well-known maintainer, standard SQLite schema. Adopting it collapses "install + configure + command surface" into one step.
Gaps vs qmd surface:
- No `get` / `multi-get` by URI — trivial to replace with `Read` against the stored path.
- No built-in keyword search — we already use `rg` for that; no regression.
- No MCP endpoint — the dropped requirement.
- No directory → collection binding in config. We'd keep the `qmd-collections.yml` shape (renamed) as a Commonplace-side config and feed paths to `embed-multi` from a small `commonplace-search` driver (sketched below).
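A sketch of that driver, with a loudly hypothetical config: the file name and shape below are assumptions, not the actual `qmd-collections.yml` schema. The `llm embed-multi` flags are the real ones listed above.

```python
import subprocess
import yaml  # pip install pyyaml

# Hypothetical: renamed qmd-collections.yml; shape assumed as
# {collection_name: {"path": <dir>, "glob": <pattern>}}.
CONFIG_PATH = "commonplace-collections.yml"

def refresh_all() -> None:
    with open(CONFIG_PATH) as f:
        collections = yaml.safe_load(f)
    for name, spec in collections.items():
        subprocess.run(
            [
                "llm", "embed-multi", name,
                "--files", spec["path"], spec.get("glob", "**/*.md"),
                "-m", "sentence-transformers/all-MiniLM-L6-v2",
                "--database", ".search/embeddings.db",
            ],
            check=True,  # fail loudly so a broken refresh is visible
        )
```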
## Option B — `sqlite-vec` SQLite extension with a Commonplace-owned driver
Alex Garcia's `sqlite-vec` (`vec0` virtual table, `vec_distance_*` functions) embedded into a `commonplace-search` Python module.
- CLI verbs: we write them. `commonplace-search update`, `commonplace-search query`, plus whatever else we need.
- Process model: no process at all. `sqlite-vec` is a loadable SQLite extension — a single `.so`/`.dylib` that gets loaded into whatever SQLite client already opened the DB (Python's `sqlite3`, the `sqlite3` CLI, etc.). No daemon, no port, and even less surface than `llm`. Concurrent reads are SQLite's default; writes serialize via the file lock.
- Storage: single SQLite DB with a `vec0` virtual table and a sibling metadata table. We own the schema entirely (sketched after this list).
- Embedding backends: we pick. Easiest: OpenAI `text-embedding-3-small` via the `openai` SDK, or `sentence-transformers` for offline. Pluggable only if we invest.
- Collections: just a column in our table. Trivial.
- Incremental refresh: we own the mtime / content-hash bookkeeping.
- Metadata: arbitrary columns.
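A minimal sketch of that schema plus a KNN query, using the `sqlite-vec` Python package. The table names, the two-table split, and the 384 dimensions (matching `all-MiniLM-L6-v2`) are our choices, not anything `sqlite-vec` dictates:

```python
import sqlite3
import sqlite_vec
from sqlite_vec import serialize_float32

db = sqlite3.connect(".search/commonplace.db")
db.enable_load_extension(True)
sqlite_vec.load(db)  # load the vec0 extension into this connection
db.enable_load_extension(False)

# Vectors live in a vec0 virtual table; everything else in a plain
# sibling table sharing rowids.
db.execute(
    "CREATE VIRTUAL TABLE IF NOT EXISTS doc_vec USING vec0(embedding float[384])"
)
db.execute(
    """CREATE TABLE IF NOT EXISTS doc_meta (
         rowid INTEGER PRIMARY KEY,  -- shared with doc_vec
         path TEXT, collection TEXT, content_hash TEXT)"""
)

def query(vector: list[float]) -> list[tuple[str, float]]:
    # KNN search in a subquery, then join back for metadata.
    return db.execute(
        """SELECT m.path, knn.distance
           FROM (SELECT rowid, distance FROM doc_vec
                 WHERE embedding MATCH ? ORDER BY distance LIMIT 5) AS knn
           JOIN doc_meta m ON m.rowid = knn.rowid""",
        (serialize_float32(vector),),
    ).fetchall()
```

Per-collection filtering is deliberately omitted here; it would be a post-filter on the join or a per-collection table, a design choice we'd own.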
Strengths: smallest runtime surface, zero non-SQLite processes, cleanest tailored data model, no third-party CLI conventions to fight, single-file distribution. No plugin stack to vet for sandbox friendliness.
Gaps vs qmd surface: everything in the user-visible CLI must be built. That includes the driver for "walk a directory, diff against stored content hashes, call the embedder for changed docs, write vectors and metadata". Rough size: a few hundred lines of Python. No plugin ecosystem — switching embedder means code changes.
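The core of that driver is a small loop. A sketch, where `embed` is whichever backend we wire in and `upsert` is a hypothetical helper that writes both tables from the sketch above; deletion handling is omitted:

```python
import hashlib
from pathlib import Path

def refresh(db, root: Path, collection: str, embed) -> None:
    # Hashes of what's already embedded, keyed by path.
    stored = dict(db.execute(
        "SELECT path, content_hash FROM doc_meta WHERE collection = ?",
        (collection,),
    ))
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if stored.get(str(path)) == digest:
            continue  # unchanged: skip the expensive embedding call
        vector = embed(text)
        upsert(db, str(path), collection, digest, vector)  # hypothetical helper
```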
## Hybrid — `llm` as the embedder, `sqlite-vec` as the store
Use `llm`'s plugin ecosystem solely for embedding generation (run the CLI non-interactively or call the Python library), and persist into a `sqlite-vec` table we own. Gets the plugin breadth of Option A with the schema control of Option B, at the cost of maintaining the glue.
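A sketch of that glue, using `llm`'s documented `llm.get_embedding_model()` for the embedding call and the `vec0` table from the Option B sketch as the store; the model ID again assumes the `llm-sentence-transformers` plugin:

```python
import llm
from sqlite_vec import serialize_float32

model = llm.get_embedding_model("sentence-transformers/all-MiniLM-L6-v2")

def embed_and_store(db, rowid: int, text: str) -> None:
    vector = model.embed(text)  # a plain list of floats from the plugin
    # Insert into the vec0 table; an update would be delete-then-insert.
    db.execute(
        "INSERT INTO doc_vec(rowid, embedding) VALUES (?, ?)",
        (rowid, serialize_float32(vector)),
    )
```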
## Side-by-side
| Concern | `llm` | `sqlite-vec` |
|---|---|---|
| CLI already exists | Yes | No — we build it |
| Server / daemon required | No — short-lived CLI process per call | No — SQLite extension loaded in-process |
| Storage is SQLite | Yes (schema fixed by `llm`) | Yes (we own schema) |
| Embedding backends | Plugin ecosystem | Whatever we import |
| Collections | First-class | A column we add |
| Incremental refresh | Content-ID de-dup; we drive the list | We drive everything |
| Sandbox friendliness | No GPU need; DB path configurable | No GPU need; nothing except SQLite |
| Parallel Codex sessions | Multiple CLI reads against one DB; no MCP session ownership | Multiple SQLite readers by default; no daemon |
| Maintainer footprint | Large user base | Large user base |
| Dependency weight | `llm` + chosen plugin(s) | SQLite extension (single `.so`) |
| Build time | Hours | Days |
| Long-term fit for Commonplace-specific needs | Medium — we bend to its schema | High — we own the shape |
## Recommendation
Lead with Option A (`llm`) for v1. Reasons:
- It closes the gap fastest: the CLI verbs exist, the storage is SQLite, the sandbox story is clean, and the embedding choice is deferrable.
- The parts of qmd we actually use map onto `llm` commands directly: `embed-multi` replaces `update` + `embed`, `similar` replaces `query`, collection scoping replaces `-c <collection>`.
- If `llm`'s schema ever pinches — for example if we need custom metadata not expressible in its JSON column, or much higher throughput — Option B is a migration, not a rewrite: the embeddings are just floats, and the corpus is just markdown files on disk (see the sketch after this list).
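To make that migration claim concrete: `llm` stores vectors in an `embeddings` table in a binary format it can decode back to floats via `llm.decode()`. A hedged sketch of the copy loop, with table and column names taken from `llm`'s docs as of this writing; verify against the actual DB before relying on it:

```python
import sqlite3
import llm  # provides llm.decode() for its binary vector format
from sqlite_vec import serialize_float32

def migrate(src_path: str, dst: sqlite3.Connection) -> None:
    """Copy vectors from llm's store into the sqlite-vec tables above."""
    src = sqlite3.connect(src_path)
    rows = src.execute("SELECT id, embedding FROM embeddings")
    for rowid, (doc_id, blob) in enumerate(rows, start=1):
        vector = llm.decode(blob)  # binary -> list of floats
        dst.execute(
            "INSERT INTO doc_vec(rowid, embedding) VALUES (?, ?)",
            (rowid, serialize_float32(vector)),
        )
        dst.execute(
            "INSERT INTO doc_meta(rowid, path) VALUES (?, ?)",
            (rowid, doc_id),  # llm's doc ID is the stored path in our scheme
        )
```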
Keep Option B (`sqlite-vec`) as the fallback if (a) `llm`'s schema or CLI assumptions become load-bearing against us, or (b) we want to eliminate one more upstream dependency once the shape has stabilized.
Reject the hybrid unless we hit a specific need that forces it.