KB design
Type: index · Status: current
How agent-operated knowledge bases are built, installed, and evaluated. Architecture decisions, skill design, and the evaluation loop for the knowledge system itself. For document structure and types, see document-system. For the learning theory knowledge bases draw on, see learning-theory.
Architecture
- context-efficiency-is-the-central-design-concern-in-agent-systems — context is the scarce resource in agent systems; context cost has two dimensions (volume and complexity) and nearly every architectural pattern is a response to one or both
- areas-exist-because-useful-operations-require-reading-notes-together — areas are operational scopes for orientation and comparative reading; boundaries should optimize yield-per-context, not taxonomy
- files-not-database — files with git beat a database for agent KBs: universal interface, free versioning, zero infrastructure; derived indexes solve scale without replacing the source of truth
- commonplace-architecture — the commonplace repo structure: kb/, scripts/, and how they compose
- commonplace-installation-architecture — how commonplace installs into projects: symlinks, CLAUDE.md generation, directory layout
- agents-md-should-be-organized-as-a-control-plane — theory for AGENTS.md as a control plane: invariants, routing, escalation boundaries, nested topology, and exclusion rules
- instruction-specificity-should-match-loading-frequency — CLAUDE.md should be a slim router, not a manual; match instruction specificity to loading frequency
- scenario-decomposition-drives-architecture — deriving architectural requirements from concrete user stories decomposed into step-by-step context needs; confirms the loading hierarchy
- scenarios — concrete use cases the knowledge system must serve
- generate-instructions-at-build-time — generate CLAUDE.md and routing tables at build time rather than maintaining them by hand
- indirection-is-costly-in-llm-instructions — in LLM instructions, every layer of indirection costs context and interpretation overhead on every read, unlike code where indirection is nearly free at runtime
- frontloading spares execution context — pre-compute static parts of instructions and insert results; the mechanism is partial evaluation; indirection elimination and build-time generation are specific cases
- injectable configuration extends frontloading to installation-specific values — values static per-installation but variable across installations (sibling repo paths, local tools) are frontloadable through config the orchestrator injects into sub-agent frames
Skills & Methodology
- agent-statelessness-makes-routing-architectural-not-learned — all knowledge routing infrastructure (skills, type templates, routing tables, naming conventions) is permanent architecture for LLM agents, not scaffolding that learners outgrow
- capability-placement-should-follow-autonomy-readiness — capabilities belong in skills only when autonomy-ready; otherwise keep them in instructions or methodology artifacts, not AGENTS inventories
- skills-derive-from-methodology-through-distillation — methodology-to-skill derivation is distillation (extracting procedures from reasoning in the same medium), distinct from codification and constraining
- methodology-enforcement-is-constraining — instructions, skills, hooks, and scripts form a constraining gradient for methodology; practices start stochastic and harden as they prove out
- instructions-are-typed-callables — skills and tasks are typed callables: they accept document types as input and produce types as output, and should declare their signatures
- ad-hoc-prompts-extend-the-system-without-schema-changes — the other end of the spectrum: ad hoc instructions notes absorb new requirements without any schema change; the collections problem is a concrete example
- design-methodology-borrow-widely-filter-by-first-principles — borrow from software engineering, library science, knowledge management — but filter through first principles before adopting
- agent-statelessness-means-harness-should-inject-context-automatically — since agents never internalize, the harness should inject context automatically rather than relying on agent initiative
- instructions-are-skills-without-automatic-routing — reusable distilled procedures in kb/instructions/ — same format as skills but without activation triggers; invoked when a human points the agent at them
- distillation-status-determines-directory-placement — (seedling) hunch that procedural artifacts distilled for execution belong in kb/instructions/; the boundary is "distilled into a procedure", not just "compressed"
Evaluation
- what-works — proven patterns: prose-as-title, template nudges, frontmatter queries, discovery-first
- what-doesnt-work — anti-patterns and insufficient evidence: auto-commits, queue overhead
- needs-testing — promising but unconfirmed: extract/connect/review cycle, input classification
- what-cludebot-teaches-us — techniques from cludebot worth borrowing, what we already cover, and what to watch for at scale
- prompt-ablation-converts-human-insight-to-deployable-framing — methodology for testing prompt framings: vary only the framing against a known-correct target, analyze mechanisms, deploy the winner as instruction
Design Principles
- a good agentic KB maximizes contextual competence through discoverable, composable, trustworthy knowledge — unifying theory: three properties (discoverable, composable, trustworthy) serve contextual competence under bounded context; three learning operations (constraining, distillation, discovery) improve them; Deutsch's reach criterion measures knowledge quality
- a-knowledge-base-should-support-fluid-resolution-switching — good thinking requires moving between abstraction levels; KB quality should be measured by how fluidly it supports this resolution-switching, not just retrieval accuracy
- mechanistic constraints make Popperian KB recommendations actionable — bridges Popperian conjecture-and-refutation with bounded-context mechanics and proposes concrete upgrades (falsifiers, contradiction passes, oracle-aware hardening)
- Alexander's patterns connect to knowledge system design at multiple levels — (speculative) pattern language as document types, generative processes as codification, centers as mutual reinforcement in the note graph
Workshop Layer
- a-functioning-kb-needs-a-workshop-layer-not-just-a-library — the library type system models durable knowledge but not work-in-motion with state machines, dependencies, and expiration
-
active-campaign-understanding-needs-a-single-coherent-narrative-not-composed-notes — working understanding during an active campaign needs holistic rewrite, not graph composition; theorist exemplifies this as a workshop artifact
-
vibe-noting — (seedling) vibe coding works because code is inspectable, not just verifiable; a KB adds that same inspectability to knowledge work, enabling a similar flywheel for reasoning
Gaps
- automating-kb-learning-is-an-open-problem — the KB already learns through manual human+agent work; the open problem is automating the judgment-heavy mutations (connections, groupings, synthesis)
- claw-learning-is-broader-than-retrieval — a Claw's learning must improve action capacity (classification, planning, communication), not just retrieval; needs different knowledge types and evaluation criteria
- deep-search-is-connection-methodology-applied-to-temporarily-expanded-corpus — /connect's dual discovery and articulation testing are corpus-agnostic, so deep search means temporarily expanding the corpus with web results
Decisions
- 001-generate-topic-links-from-frontmatter — replace LLM-generated Topics footers with deterministic script
Reference material
- Toulmin argument — formal argumentation model (claim/grounds/warrant/qualifier/rebuttal/backing) that grounds claim-title conventions and the
structured-claimtype - Agentic Note-Taking 23: Notes Without Reasons — practitioner validation of propositional links over embedding-based adjacency; confirms the Goodhart risk in quality signals
- A-MEM: Agentic Memory for LLM Agents — academic paper implementing Zettelkasten-inspired automated memory with link generation and memory evolution; provides empirical evidence for boiling cauldron mutations and scaling data for embedding-based linking
- Context Engineering for AI Agents in OSS — empirical study of AGENTS.md/CLAUDE.md adoption in 466 OSS projects; validates the loading-frequency principle's content categories, provides evolution data showing constraining maturation in the wild, and confirms the dual-audience split between human READMEs and machine context files
Related Tags
- document-system — types, writing conventions, and validation that the KB's documents follow
- learning-theory — the learning mechanisms (constraining, codification, distillation) that KB operations instantiate
- computational-model — PL concepts (scheduling, partial evaluation, scoping) that inform KB architecture; the scheduling notes moved here
- links — linking methodology, navigation, and link contracts
- maintenance — detection, operations, and dynamics that keep the KB healthy over time
- related-systems — external system comparisons
All notes
- 004-Replace areas with tags — Replaces the areas frontmatter field with freeform tags and restructures index pages to have both curated and generated sections, decoupling navigation from comparative reading
- A functioning knowledge base needs a workshop layer, not just a library — The current type system models permanent knowledge (library) but not in-flight work with state machines, dependencies, and expiration (workshop) — tasks are a prototype of the missing layer, and a functioning knowledge base needs both plus bridges between them
- A good agentic KB maximizes contextual competence through discoverable, composable, trustworthy knowledge — Theory of why commonplace's arrangements work — three properties (discoverable, composable, trustworthy) serve contextual competence under bounded context; accumulation is the basic learning operation (reach distinguishes facts from theories); constraining, distillation, and discovery transform accumulated knowledge; Deutsch's reach criterion distinguishes knowledge that transfers from knowledge that merely fits
- A knowledge base should support fluid resolution-switching — Good thinking requires moving between abstraction levels — broad for context, narrow for mechanism, back out for pattern. A KB's quality should be measured by how fluidly it supports this resolution-switching, not just retrieval accuracy.
- Active-campaign understanding needs a single coherent narrative, not composed notes — Why durable-knowledge graph composition (many linked notes) is wrong for tracking understanding during active engineering — a single holistically rewritten narrative maintains the coherence that working memory requires
- Ad hoc prompts extend the system without schema changes — When a new requirement doesn't fit existing types or skills, writing an ad hoc instructions note absorbs it without any schema change — the collections problem is a concrete example
- Agent statelessness makes routing architectural, not learned — Agents never develop navigation intuition — every session is day one — so all knowledge routing infrastructure (skills, type templates, routing tables, naming conventions, activation triggers) is permanent architecture, not scaffolding that learners outgrow
- Agent statelessness means the harness should inject context automatically — Since agents can't carry vocabulary or decisions between reads, the harness should auto-inject referenced context — definitions once per session, ADRs when relevant. The trigger mechanism (type, link semantics, term detection) is an open question; the need follows directly from statelessness.
- AGENTS.md should be organized as a control plane — Theory for deciding what belongs in AGENTS.md using loading frequency and failure cost, with layers, exclusion rules, and migration paths
- Alexander's patterns connect to knowledge system design at multiple levels — Christopher Alexander's pattern language, generative processes, and centers may connect to our knowledge system design at multiple levels — from structured document types to codification to link semantics. Vague but persistent.
- Always-loaded context has two surfaces with different affordances — CLAUDE.md enforces universal constraints (imperative/push); skill descriptions advertise opt-in capabilities (suggestive/pull) — guidance belongs on whichever surface matches its enforcement model
- Areas exist because useful operations require reading notes together — Areas are defined by operations that require reading notes together — orientation and comparative reading — which need sets that are both small enough for context and related enough to yield results
- Automating KB learning is an open problem — The KB already learns through manual work (every improvement is capacity change per Simon). The open problem is automating the judgment-heavy mutations — connections, groupings, synthesis — which require oracles we can't yet manufacture.
- Capability placement should follow autonomy readiness — Capability artifacts should be placed by autonomy readiness so AGENTS.md stays free of inventories and only routes or constrains behavior
- Claw learning is broader than retrieval — A Claw's learning loop must improve action capacity (classification, planning, communication), not just retrieval — question-answering is one mode among many
- Commonplace architecture — The commonplace repo's own internal layout — what exists, what's missing, and the decision to put global types in CLAUDE.md instead of kb/types/
- Commonplace installation architecture — Design for how commonplace installs into a project — two trees (user's kb/ and framework's commonplace/), operational artifacts copied for prompt simplicity, methodology referenced for deeper reasoning
- Context efficiency is the central design concern in agent systems — Context — not compute, memory, or storage — is the scarce resource in agent systems; context cost has two dimensions (volume and complexity) that require different architectural responses, making context efficiency the central design concern analogous to algorithmic complexity in traditional systems
- Deep search is connection methodology applied to a temporarily expanded corpus — Design exploration for a deep search skill that reuses /connect's dual discovery and articulation testing on web search results, building a temporary research graph before bridging to KB
- Design methodology — borrow widely, filter by first principles — We borrow from any source but adopt based on first-principles support — except programming patterns, which get a fast pass because the bet is that knowledge bases are a new kind of software system
- Distillation status determines directory placement — Hunch that procedural artifacts distilled for execution belong in kb/instructions/ — the directory boundary is "distilled into a procedure", not "compressed" or "frequently loaded"
- Enforcement without structured recovery is incomplete — The enforcement gradient covers detection and blocking but has no recovery column — recovery strategies (corrective → fallback → escalation) are the missing layer, and oracle strength determines which are viable at each level
- Files beat a database for agent-operated knowledge bases — Files beat a database early on — a schema commits to access patterns before you know them, and files let you constrain incrementally while getting free browsing, versioning, and agent access from day one
- Frontloading spares execution context — Pre-computing static parts of LLM instructions and inserting results spares execution context — the primary bottleneck in instructing LLMs; the mechanism is partial evaluation applied to instructions with underspecified semantics
- Generate KB skills at build time, don't parameterise them — KB skills should be generated from templates at setup time, not parameterised with runtime variables — applying the general principle that indirection is costly in LLM instructions
- Indirection is costly in LLM instructions — In code, indirection (variables, config, abstraction layers) is nearly free at runtime — in LLM instructions, every layer of indirection costs context and interpretation overhead on every read
- Injectable configuration extends frontloading to installation-specific values — Values static within an installation but variable across installations — sibling repo paths, local tool locations — are frontloadable through configuration the orchestrator resolves and injects into sub-agent frames; the context savings depend on sub-agent isolation since injection into the main context just adds tokens
- Instruction specificity should match loading frequency — The loading hierarchy (CLAUDE.md → skill descriptions → skill bodies → task docs) should match instruction specificity to loading frequency — always-loaded context competes for attention every session
- Instructions are skills without automatic routing — Reusable distilled procedures that live in kb/instructions/ — same format as skills but without activation triggers or CLAUDE.md routing entries; invoked when a human points the agent at them
- Instructions are typed callables with document type signatures — Skills and tasks are typed callables — they accept document types as input and produce types as output, and should declare their signatures like functions declare parameter types.
- MCP bundles stateless tools with a stateful runtime — MCP forces stateless tool operations through a persistent server process — most tools are pure functions that don't need session state, connections, or lifecycle management, but pay the complexity tax anyway
- Mechanistic constraints make Popperian KB recommendations actionable — Bounded context and underspecification don't just permit conjecture-and-refutation — they require it; derives three concrete practices (falsifier blocks, contradiction-first connection, rejected-interpretation capture) from KB mechanics.
- Methodology enforcement is constraining — Instructions, skills, hooks, and scripts form a constraining gradient for methodology — from underspecified and indeterministic (LLM interprets and may not follow) to fully deterministic (code always runs), with hooks occupying a middle ground of deterministic triggers with indeterministic responses
- Needs testing — Promising ideas without enough evidence — extract/connect/review cycle, input classification before processing
- Prompt ablation converts human insight into deployable agent framing — Methodology for testing prompt framings — uses controlled variation against a human-verified finding to identify which cognitive moves agents can reliably execute, then deploys the winning framing as instruction
- Scenario decomposition drives architecture — Deriving architectural requirements by decomposing concrete user stories into step-by-step context needs — not from abstract read/write operations but from what the agent actually has to load at each stage, in both the commonplace repo and installed projects
- Scenarios — Concrete use cases for the knowledge system — upstream change analysis and proposing our own changes
- Skills derive from methodology through distillation — The methodology→skill relationship is distillation (extracting operational procedures from discursive reasoning in the same medium) — distinct from codification (prompt→code phase transition) and constraining (narrowing output distribution)
- The fundamental split in agent memory is not storage format but who decides what to remember — Comparative analysis of eleven agent memory systems across six architectural dimensions — storage unit, agency model, link structure, temporal model, curation operations, and extraction schema — revealing that the agency question (who decides what to remember) is the most consequential design choice and that no system combines high agency, high throughput, and high curation quality.
- Two context boundaries govern collection operations — Any note collection faces two context boundaries — a full-text boundary where all bodies can be loaded together, and an index boundary where all titles+descriptions fit — creating three operational regimes that govern areas, /connect, and whole-KB operations differently
- Vibe-noting — Vibe coding works because code is inspectable, not just verifiable — a KB adds that same inspectability to knowledge work, enabling augmentation even where automation is blocked on oracle construction
- What cludebot teaches us — Techniques from cludebot worth borrowing — what we already cover, what to adopt now, and what to watch for as the KB grows
- What doesn't work — Anti-patterns and areas with insufficient evidence — auto-commits, queue overhead, validation ceremony, session rhythm
- What works — Patterns proven valuable in practice — prose-as-title, template nudges, frontmatter queries, semantic search via qmd, discovery-first, public/internal boundary