GBrain
Type: agent-memory-system-review · Status: current · Tags: related-systems, trace-derived
GBrain is Garry Tan's personal-brain system: a Bun/TypeScript CLI with a pluggable embedded-Postgres or Supabase engine, an MCP server generated from the same operations contract, a library of fat markdown skills that an external agent executes, and a recipe catalog that turns integration setup (email, calendar, voice, Twitter, meetings) into markdown instructions an agent installs against the user's own infrastructure. The shipped code has moved from "files-plus-index" toward "thin harness with deterministic primitives," with a stronger editorial line that the skillpack — not the code — is the product.
Repository: https://github.com/garrytan/gbrain
Core Ideas
PGLite is now the default engine, and the code is engine-pluggable. src/core/engine-factory.ts dispatches on config.engine to either PGLiteEngine (embedded Postgres 17.5 via @electric-sql/pglite, plus vector and pg_trgm extensions) or PostgresEngine (Supabase/self-hosted). gbrain init creates a local brain in seconds with no server, and gbrain migrate --to supabase|pglite moves data bidirectionally when the brain outgrows local. Both backends implement the same BrainEngine interface, and search fusion, embedding, and chunking live above the engine on SearchResult[] arrays. The zero-config path is now real, which matters because the earlier review's "pays the Postgres cost early" claim no longer holds.
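The dispatch described above can be sketched in TypeScript. The interface and class names follow the review's description of src/core/engine-factory.ts, but every signature here is an assumption for illustration, not GBrain's actual code:

```typescript
// Sketch of the engine-pluggable pattern: one BrainEngine interface, two
// backends, and a factory that dispatches on config.engine. Signatures are
// hypothetical; search fusion and chunking would live above this layer.

interface SearchResult {
  path: string;
  score: number;
  snippet: string;
}

interface BrainEngine {
  search(query: string, limit: number): Promise<SearchResult[]>;
  upsertPage(path: string, content: string): Promise<void>;
}

// Embedded Postgres (e.g. via @electric-sql/pglite) -- zero-config local path.
class PGLiteEngine implements BrainEngine {
  async search(_query: string, _limit: number): Promise<SearchResult[]> {
    return [];
  }
  async upsertPage(_path: string, _content: string): Promise<void> {}
}

// Supabase / self-hosted Postgres -- the scale-up path.
class PostgresEngine implements BrainEngine {
  async search(_query: string, _limit: number): Promise<SearchResult[]> {
    return [];
  }
  async upsertPage(_path: string, _content: string): Promise<void> {}
}

type EngineConfig = { engine: "pglite" | "postgres" };

function createEngine(config: EngineConfig): BrainEngine {
  switch (config.engine) {
    case "pglite":
      return new PGLiteEngine();
    case "postgres":
      return new PostgresEngine();
  }
}
```

Because both backends satisfy the same interface, a migration command can be the only code path that ever holds two engines at once.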
The page model is unchanged: compiled truth above the line, timeline below. docs/guides/compiled-truth.md codifies the contract explicitly — compiled truth is REWRITTEN when evidence changes; timeline is APPEND-ONLY; every compiled-truth claim must trace to timeline entries; the first standalone --- after frontmatter is the split marker. This is the most stable structural idea in the repo, and the guides now tie staleness detection ("compiled truth older than latest timeline entry") to this contract rather than to generic freshness metrics.
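The split contract above is mechanical enough to sketch. This is a minimal illustration assuming the layout the guide describes (frontmatter fenced by `---`, then compiled truth, then the first standalone `---` as the split marker); the function name and return shape are hypothetical:

```typescript
// Hypothetical parser for the compiled-truth-plus-timeline page contract:
// everything between the frontmatter and the first standalone "---" is
// rewritable compiled truth; everything after it is the append-only timeline.

interface BrainPage {
  frontmatter: string;
  compiledTruth: string;
  timeline: string;
}

function splitPage(raw: string): BrainPage {
  const lines = raw.split("\n");
  let i = 0;
  let frontmatter = "";

  // Consume YAML frontmatter fenced by a leading "---" pair, if present.
  if (lines[0] === "---") {
    const close = lines.indexOf("---", 1);
    frontmatter = lines.slice(1, close).join("\n");
    i = close + 1;
  }

  // The first standalone "---" after frontmatter is the split marker.
  const marker = lines.indexOf("---", i);
  if (marker === -1) {
    return { frontmatter, compiledTruth: lines.slice(i).join("\n").trim(), timeline: "" };
  }
  return {
    frontmatter,
    compiledTruth: lines.slice(i, marker).join("\n").trim(),
    timeline: lines.slice(marker + 1).join("\n").trim(),
  };
}
```

A staleness check then reduces to comparing the latest timeline date against the compiled truth's last-rewritten date, exactly as the dream-cycle guide prescribes.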
Contract-first operations fan out to CLI, MCP, and tools-json. src/core/operations.ts still holds ~30 operations with params, handler, mutating, and cliHints. The MCP stdio server and the gbrain --tools-json output regenerate from the same array. What is new is breadth: dedicated command modules for backlinks, lint, publish, report, integrations, migrate-engine, and files now sit next to the engine as deterministic primitives that skills invoke.
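The fan-out can be pictured as one array that every surface reads. The field names (params, handler, mutating) come from the review's description of src/core/operations.ts; the rest of this sketch, including the tools-json shape, is an assumption:

```typescript
// Contract-first operations sketch: a single registry drives the CLI, the MCP
// stdio server, and --tools-json, so the three surfaces cannot drift apart.

interface Operation {
  name: string;
  description: string;
  params: Record<string, { type: string; required: boolean }>;
  mutating: boolean;
  handler: (args: Record<string, unknown>) => Promise<unknown>;
}

const operations: Operation[] = [
  {
    name: "search",
    description: "Hybrid search over the brain",
    params: { query: { type: "string", required: true } },
    mutating: false,
    handler: async (_args) => [],
  },
];

// One generator per surface; this one emits a tools-json-style listing.
function toToolsJson(ops: Operation[]) {
  return ops.map((op) => ({
    name: op.name,
    description: op.description,
    inputSchema: { type: "object", properties: op.params },
    readOnly: !op.mutating,
  }));
}
```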
Deterministic primitives for quality enforcement. src/commands/backlinks.ts scans entity references in markdown, detects missing back-links from entity pages, and fixes them by appending dated timeline entries — zero LLM calls. src/commands/lint.ts catches LLM preamble artifacts, wrapping code fences, placeholder dates, and missing frontmatter with regex rules and a --fix flag. src/commands/publish.ts strips [Source: ...] citations, confirmation numbers, and internal cross-links, then produces an AES-GCM encrypted self-contained HTML. This is the concrete form of the "push intelligence UP into skills, push execution DOWN into deterministic tooling" doctrine that the ethos essays preach.
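The zero-LLM lint pass is easy to picture concretely. The rules below are illustrative stand-ins for the artifact classes the review names (preambles, wrapping fences, placeholder dates), not GBrain's actual rule set:

```typescript
// Minimal sketch of regex-based artifact linting with an optional --fix mode.
// Each rule is a detection pattern plus an optional deterministic repair.

interface LintRule {
  id: string;
  pattern: RegExp;
  fix?: (text: string) => string;
}

const rules: LintRule[] = [
  // Chatty LLM preamble at the start of a line.
  { id: "llm-preamble", pattern: /^(Sure|Certainly|Here is|Here's)\b/m },
  // Entire file wrapped in a ```markdown fence; fix strips the wrapper.
  {
    id: "wrapping-fence",
    pattern: /^```(markdown)?\n[\s\S]*\n```\s*$/,
    fix: (t) => t.replace(/^```(markdown)?\n/, "").replace(/\n```\s*$/, "\n"),
  },
  // Placeholder date the agent forgot to resolve.
  { id: "placeholder-date", pattern: /\bYYYY-MM-DD\b/ },
];

function lint(text: string, fixMode = false): { issues: string[]; text: string } {
  const issues: string[] = [];
  for (const rule of rules) {
    if (rule.pattern.test(text)) {
      issues.push(rule.id);
      if (fixMode && rule.fix) text = rule.fix(text);
    }
  }
  return { issues, text };
}
```

The point of the pattern is that detection and repair are both deterministic, so the pass can run in CI or pre-commit without a model in the loop.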
The skillpack has been broken out and given an architectural thesis. The old monolithic docs/GBRAIN_SKILLPACK.md is now an index over 17 standalone guides in docs/guides/ (brain-agent-loop, entity-detection, compiled-truth, cron-schedule, quiet-hours, sub-agent-routing, meeting-ingestion, diligence-ingestion, idea-capture, live-sync, upgrades-auto-update, ...) plus two ethos essays: THIN_HARNESS_FAT_SKILLS.md and MARKDOWN_SKILLS_AS_RECIPES.md. The skills directory now contains a publish skill and a shared _brain-filing-rules.md that other skills reference as cross-cutting filing law. The intelligence loop still lives in markdown that an external agent must execute, but the skill files are now smaller, more modular, and cross-link to each other.
Integrations are "recipes," not pipelines. src/commands/integrations.ts is a standalone module that parses recipe markdown files in recipes/ (email-to-brain, calendar-to-brain, twilio-voice-brain, x-to-brain, meeting-sync, credential-gateway, ngrok-tunnel). Each recipe is a YAML-fronted markdown file that tells an agent what credentials to ask for, what CLI to install, how to validate, and what cron to register. The integrations command does the listing, status, and doctor checks deterministically; the setup flow is explicitly delegated to skills. The "Homebrew for personal AI" framing in the essay — markdown as a distribution format the agent compiles into a local implementation — is now wired into the codebase.
The dream cycle and other cron jobs remain prescribed in docs, not executed by code. docs/guides/cron-schedule.md lists 20+ recurring jobs (email every 30 min, X every 30 min, meeting sync 3x/day, calendar weekly, daily briefing, weekly maintenance, nightly dream cycle), but the scheduler itself is still a cron expression the operator registers with their platform. The skills/migrations/ directory is new: per-version migration files with feature_pitch frontmatter that the auto-update agent reads post-upgrade and executes. That is the strongest new wiring between declared schedule and actual agent behavior, but it is one-shot post-upgrade, not a continuous scheduler.
Recipe-driven agent work still leans heavily on trace ingestion into durable entity pages. Email, calendar events, meetings, voice calls, and social posts all flow into people/, companies/, concepts/, or media/ pages; the ingest skill enforces back-linking, inline [Source: ...] citations on every fact, and timeline entries duplicated on every mentioned entity's page. Source attribution is now codified — the lint command will flag pages without it, and publish strips it before sharing.
Comparison with Our System
| Dimension | GBrain | Commonplace |
|---|---|---|
| Primary substrate | Markdown brain in the user's own git repo, derived PGLite or Postgres index | Markdown files in git, scoped derived indexes only where needed |
| Knowledge shape | Personal-intelligence pages (people, companies, concepts, media) with compiled-truth plus timeline | Methodology notes, reference docs, ADRs, instructions, workshop artifacts |
| Engine posture | Pluggable PGLite (default, zero-config) or Supabase Postgres, bidirectional migration | Plain files plus rg plus narrow validation/index commands |
| Agent contract | Fat markdown skills and recipes that an external agent executes | Skills and type instructions embedded in the KB's own methodology |
| Operations surface | Contract-first operations.ts regenerates CLI, MCP, and tools-json | Standalone commonplace-* CLI commands; no unified operation registry |
| Quality enforcement | Deterministic backlinks, lint, and publish commands with --fix flags | Deterministic validate-notes plus semantic review bundles and human gates |
| Learning loop | Agent ingests traces (meetings, email, social, voice) into entity pages; cron-prescribed dream cycle consolidates | Workshop-to-library distillation with human+agent curation and review gates |
| Link model | Typed DB edges plus wiki-style back-links enforced by check-backlinks | Markdown links with articulated relationship prose in the argument |
| Governance | RLS checks, idempotent imports, version snapshots, lint, back-link audit, gbrain doctor | Frontmatter validation, semantic review, type instructions, linking contracts |
GBrain and Commonplace now agree more explicitly on the architectural shape: thin harness, deterministic primitives at the bottom, markdown skills on top, and the model as the runtime that binds them. The shipped code has moved toward this position since the previous review — PGLite default, standalone command modules, broken-out guides, explicit ethos essays. The durable divergences are still about what the knowledge is for. GBrain answers high-volume personal-intelligence questions about people, companies, deals, and ideas; Commonplace preserves transferable methodology claims. That keeps entity identity, recipe integrations, and back-link enforcement central to GBrain, and keeps link semantics, claim maturation, and review quality central to Commonplace.
Two specific gaps are worth naming. GBrain's lint is regex-based artifact detection; Commonplace's validate-notes and review bundles check frontmatter discipline and semantic coherence. GBrain's check-backlinks enforces a structural invariant (every mention of a page has a timeline entry on the target); Commonplace has no equivalent automated back-link pass, relying on link articulation during /connect. These are complementary automations, not competitors.
Borrowable Ideas
Deterministic lint for authored artifacts, with a --fix mode. Ready to borrow. GBrain's lint catches LLM preambles, wrapping ```markdown fences, placeholder YYYY-MM-DD dates, missing frontmatter — the mechanical tax of agent-authored content. Commonplace writes and reviews a lot of agent-authored notes; a commonplace-lint pass focused on common artifact patterns would complement validate-notes without duplicating it. Frontmatter-level quality is already covered; prose-level artifact detection is not.
Back-link audit as a deterministic command. Needs adaptation. GBrain's check-backlinks treats every entity mention as an obligation and a timeline entry as the fix. Commonplace's link model is relational rather than transactional — a grounds link is not the same as a mention. But the pattern of "find asymmetric or missing links and surface them for resolution" generalizes: a command that finds notes referenced in argument prose but not linked formally would be a useful pre-review gate.
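The adapted audit could be as small as a mention-versus-link diff. Everything in this sketch is hypothetical: the function name, the title-matching heuristic, and the assumption that formal links use standard markdown syntax:

```typescript
// Hypothetical pre-review gate: find known note titles that appear in a note's
// prose but have no formal markdown link, and surface them for resolution.

function findUnlinkedMentions(body: string, knownTitles: string[]): string[] {
  // Collect the link text of every formal markdown link in the body.
  const linked = new Set(
    [...body.matchAll(/\[([^\]]+)\]\([^)]+\)/g)].map((m) => m[1].toLowerCase()),
  );
  // A title mentioned in prose but absent from the link set is an asymmetry.
  return knownTitles.filter(
    (title) =>
      body.toLowerCase().includes(title.toLowerCase()) &&
      !linked.has(title.toLowerCase()),
  );
}
```

Unlike GBrain's check-backlinks, this would only surface candidates; whether a mention deserves a formal link stays a judgment call for review.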
Bidirectional engine migration as a first-class command. Just a reference for now. GBrain's gbrain migrate --to supabase|pglite is cleaner than most data-migration stories because both sides implement the same interface. Commonplace has no analogue because it does not currently own a database, but if that ever changes, the two-backend-one-interface pattern (with a migration command that is the only engine-crossing code path) is the right shape.
Separating the integration catalog from the setup flow. Ready to borrow as a framing. GBrain's recipes are catalog entries (metadata plus setup instructions) that the integrations command lists and validates, while the actual installation is a skill the agent runs. The boundary — deterministic catalog in code, conversational setup in markdown — maps cleanly onto how Commonplace could handle workshop-to-library promotion or external-source ingestion: a typed catalog with deterministic status checks, plus a skill for the judgment-bearing work.
Per-version migration files that the auto-update agent executes. Needs a use case first. GBrain's skills/migrations/v*.md files are executed once post-upgrade by the auto-update agent to ensure existing installs get new setup steps. Commonplace does not currently ship to external users, but if it starts versioning its own skills and instructions, this pattern — declarative migration markdown read by a scheduler — is a cleaner handoff than hoping users re-read the changelog.
Source attribution as a hard lint rule. Ready to borrow at a smaller scope. GBrain requires [Source: ...] citations on every fact in a brain page; the lint and publish commands treat them as structural. Commonplace already wants provenance on claims in notes, but enforcement is informal. A lint rule that requires a citation-like reference on factual claims in structured-claim notes would be a cheap structural pressure.
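At its smallest scope, the rule reduces to flagging claim lines without a citation marker. The bullet-only heuristic below is an assumption chosen for illustration, not GBrain's actual enforcement:

```typescript
// Hypothetical citation-presence rule: flag bullet-point claims that carry no
// [Source: ...] reference. Narrower than GBrain's every-fact requirement.

function uncitedClaims(page: string): string[] {
  return page
    .split("\n")
    .filter((line) => line.trimStart().startsWith("- "))
    .filter((line) => !/\[Source:[^\]]+\]/.test(line));
}
```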
Curiosity Pass
Does "thin harness, fat skills" describe what the repo does, or what the author wants other systems to do? The ethos essays are sharp and the architecture in the repo now matches them more closely — broken-out guides, deterministic command modules, MCP generated from operations. But the repo still carries fat skill files that assume a capable external agent (OpenClaw, Hermes) to execute them faithfully, and the dream cycle is still a cron expression the operator installs on their own platform. The harness is thin because the agent is big, not because the logic is simple. That is an honest trade, but worth naming: the capability is split across code, skills, and the agent's own judgment, and the boundary is not always obvious from reading the repo alone.
Compiled-truth plus timeline is still the strongest idea, and it is starting to pay interest. The lint, back-link, and publish commands all assume the page structure. The dream cycle guide defines stale pages in terms of timeline-vs-compiled-truth comparison. The migrations pattern uses feature_pitch frontmatter. The format has gone from a page convention to an operational contract that multiple tools depend on. That is exactly how a knowledge shape earns its keep — by becoming the reference that other mechanisms coordinate against.
"Markdown is code" risks overclaiming, but the mechanism is real. The recipes in recipes/ are genuinely agent-installable — not because the markdown is executed, but because it encodes the setup specification densely enough that a frontier model can realize it against the user's infrastructure. The claim "there is no upstream code, only a description of what to build" in MARKDOWN_SKILLS_AS_RECIPES.md overstates things (the recipes assume GBrain's own CLI, API keys, and cron); but the narrower claim — that sufficient specificity in markdown becomes a portable installation artifact when coupled with a capable agent — is defensible, and the recipes directory is the evidence.
Pluggable engines reduce a real barrier. The previous review worried that "GBrain pays the Postgres/pgvector cost early." With PGLite as default, that cost is two seconds and zero accounts for small brains, with a migration path when scale forces it. The curious question is whether the engine abstraction survives real workload divergence — for instance, if Supabase gains features (realtime, remote MCP, OAuth) that PGLite cannot mirror, the interface starts to leak.
Chunking strategy is still simpler than advertised. The README advertises three chunking strategies (recursive, semantic, LLM-guided) dispatched by content type. src/core/chunkers/ and related modules exist, but recursive chunking is still the dominant path in importFromContent and embed.ts. The claim is not vapor — the code is there — but the live behavior is simpler than the three-tier story the README tells.
What to Watch
- Whether the migrations pattern in skills/migrations/ generalizes from post-upgrade one-shots into a continuous agent-executed scheduler, or whether cron remains the only actual runtime contract
- Whether the engine abstraction holds as Supabase-specific features (remote MCP, OAuth, storage) expand and PGLite stays single-process
- Whether the recipe catalog becomes a genuine distribution format (third-party recipes, signature verification, dependency resolution) or stays a first-party catalog maintained in-tree
- Whether lint grows from artifact regex into semantic quality checks, and whether check-backlinks evolves from structural audit into relationship-aware link analysis
- Whether the dream cycle's consolidation, contradiction surfacing, and merge-quality checks appear in code, or continue to be described in guides and executed by the agent's judgment
Trace-derived learning placement. GBrain consumes a wide variety of agent-observable signals (inbound messages, email threads, calendar events, meeting transcripts, social posts, voice calls) and prescribes that every signal terminate in durable markdown pages.
- Trace source. Conversation messages, service events (email, X, calendar, meeting transcripts), and voice transcripts; each recipe defines its own trigger (cron every 30 min for email/X, 3x/day for meetings, real-time for voice) and ingestion boundary.
- Extraction. Per-recipe extraction into entity pages: compiled-truth rewrites, timeline appends, cross-reference links, source citations. The oracle is the external agent's judgment plus the filing rules in skills/_brain-filing-rules.md and the iron laws in each skill (back-linking, citation).
- Promotion target. Symbolic artifacts only: markdown pages in the structured compiled-truth-plus-timeline format, stored in the user's git repo; the Postgres/PGLite index is derived. No weights, no fine-tuning, no compiled runtime state.
- Scope. Per-entity and cross-task: pages accumulate across meetings, emails, and conversations, and the dream cycle is meant to consolidate them further.
- Timing. Online during deployment (entity detection fires on every message), online scheduled (cron jobs every 30 min to weekly), and staged in cycles (nightly dream cycle for consolidation, weekly maintenance, per-version migration skills).

On the survey's axes: axis 1 (ingestion pattern) has no exact bucket here; GBrain is closest to service-owned trace backend because recipe-defined collectors normalize heterogeneous traces into compiled-truth-plus-timeline markdown, but the backend is user-owned and recipe-mediated rather than a shared schema-owning service. Axis 2 (substrate class) places GBrain firmly in symbolic artifact learning: the artifact is the markdown page, the maintenance path is compiled-truth rewrite plus append-only timeline, and the backend is either embedded Postgres or Supabase. GBrain strengthens the survey's observation that "trace ingestion should terminate in domain pages, not a generic memory bucket," and extends it with the recipe-as-distribution pattern, in which each trace pathway ships as a self-contained markdown installer. This does not warrant a new subtype; it is a sharper instance of the service-owned-backend pattern where the service boundary is drawn at the recipe catalog rather than a shared server.
Relevant Notes:
- files beat a database for agent-operated knowledge bases — contrasts: GBrain keeps markdown as the source of truth but now ships a zero-config PGLite index, narrowing the database tax considerably
- trace-derived learning techniques in related systems — extends: GBrain's recipe catalog is a new boundary for trace ingestion that sits between service-owned backends and single-session extensions
- Distillation — exemplifies: compiled-truth rewriting is distillation of timeline evidence into a current synthesis, codified as a page-local contract
- pointer design tradeoffs in progressive disclosure — extends: GBrain's chunks, search scores, and stale flags are query-time and page-local pointers; the recipe frontmatter is an authored pointer for the agent
- Claw learning loops must improve action capacity not just retrieval — exemplifies: GBrain's recipes target communication, briefing, meeting prep, and publishing, not only question answering
- Napkin — compares: both adapt markdown knowledge for agents; Napkin stays closer to Obsidian/file tooling, GBrain now ships a PGLite engine plus recipe catalog
- Cognee — contrasts: cognee is pipeline-first and developer-managed; GBrain is agent-operated with deterministic primitives and human-readable compiled pages