Notes Directory
Type: index
- ADR-001: Generate Topic links from frontmatter
- A functioning knowledge base needs a workshop layer, not just a library (note) — The current type system models permanent knowledge (library) but not in-flight work with state machines, dependencies, and expiration (workshop) — tasks are a prototype of the missing layer, and a functioning knowledge base needs both plus bridges between them
- A good agentic KB maximizes contextual competence through discoverable, composable, trustworthy knowledge (note) — Theory of why commonplace's arrangements work — three properties (discoverable, composable, trustworthy) serve contextual competence under bounded context; accumulation is the basic learning operation (reach distinguishes facts from theories); stabilisation, distillation, and discovery transform accumulated knowledge; Deutsch's reach criterion distinguishes knowledge that transfers from knowledge that merely fits
- A knowledge base should support fluid resolution-switching (note) — Good thinking requires moving between abstraction levels — broad for context, narrow for mechanism, back out for pattern. A KB's quality should be measured by how fluidly it supports this resolution-switching, not just retrieval accuracy.
- Active-campaign understanding needs a single coherent narrative, not composed notes (note) — Why durable-knowledge graph composition (many linked notes) is wrong for tracking understanding during active engineering — a single holistically rewritten narrative maintains the coherence that working memory requires
- Ad hoc prompts extend the system without schema changes (note) — When a new requirement doesn't fit existing types or skills, writing an ad hoc instructions note absorbs it without any schema change — the collections problem is a concrete example
- 002-inline-global-types-in-writing-guide (adr) — Decision to inline note and structured-claim templates into WRITING.md so the agent gets type structure and writing conventions in a single hop — eliminates one read for the two most common note types
- 003-connect-skill-discovery-strategy (adr) — Design options and scaling strategy for how the connect skill discovers candidate connections — index-first with semantic search backup, and what changes when the KB grows
- Agent statelessness makes routing architectural, not learned (note) — Agents never develop navigation intuition — every session is day one — so all knowledge routing infrastructure (skills, type templates, routing tables, naming conventions, activation triggers) is permanent architecture, not scaffolding that learners outgrow
- Agent statelessness means the harness should inject context automatically (structured-claim) — Since agents can't carry vocabulary or decisions between reads, the harness should auto-inject referenced context — definitions once per session, ADRs when relevant. The trigger mechanism (type, link semantics, term detection) is an open question; the need follows directly from statelessness.
- Agentic systems interpret underspecified instructions (note) — LLM-based systems have two distinct properties — semantic underspecification of natural language specs (the deeper difference from traditional programming) and execution indeterminism (present in all practical systems) — the spec-to-program projection model captures the first, which indeterminism tends to obscure
- AGENTS.md should be organized as a control plane (note) — Theory for deciding what belongs in AGENTS.md using loading frequency and failure cost, with layers, exclusion rules, and migration paths
- Agents navigate by deciding what to read next (note) — An agent doing a task navigates by deciding what to read — links, index entries, search tools, and skill descriptions are all pointers with varying amounts of context for that decision
- Alexander's patterns connect to knowledge system design at multiple levels (note) — Christopher Alexander's pattern language, generative processes, and centers may connect to our knowledge system design at multiple levels — from structured document types to crystallisation to link semantics. Vague but persistent.
- Areas exist because useful operations require reading notes together (note) — Areas are defined by operations that require reading notes together — orientation and comparative reading — which need sets that are both small enough for context and related enough to yield results
- Areas (index) — Hub for all knowledge areas, linking each curated area index so readers can browse the KB by conceptual domain rather than by directory
- Automated tests for text (note) — Text artifacts can be tested with the same pyramid as software — deterministic checks, LLM rubrics, corpus compatibility — built from real failures, not taxonomy
- Automating KB learning is an open problem (note) — The KB already learns through manual work (every improvement is capacity change per Simon). The open problem is automating the judgment-heavy mutations — connections, groupings, synthesis — which require oracles we can't yet manufacture.
- Backlinks — use cases and design space (note) — Analysis of where backlinks (inbound link visibility) would concretely help agents working in the KB — use cases, trade-offs, and design options
- The bitter lesson stops at calculators (structured-claim) — The bitter lesson has a boundary — calculators vs vision features illustrate when exact solutions survive scaling and when they don't
- Capability placement should follow autonomy readiness (note) — Capability artifacts should be placed by autonomy readiness so AGENTS.md stays free of inventories and only routes or constrains behavior
- Claim notes should use Toulmin-derived sections for structured argument (structured-claim) — Three independent threads converged on Toulmin's argument structure — adopting Toulmin sections as base type; structured-claim separates claim-titled notes (any note) from fully argued claims (the type)
- Claw learning is broader than retrieval (note) — A Claw's learning loop must improve action capacity (classification, planning, communication), not just retrieval — question-answering is one mode among many
- Commonplace architecture (note) — The commonplace repo's own internal layout — what exists, what's missing, and the decision to put global types in CLAUDE.md instead of kb/types/
- Commonplace installation architecture (note) — Design for how commonplace installs into a project — two trees (user's kb/ and framework's commonplace/), operational artifacts copied for prompt simplicity, methodology referenced for deeper reasoning
- Computational model (index) — Index of notes applying programming language theory to LLM instructions — scoping, homoiconicity, partial evaluation, typing; the computational model of LLM-based systems viewed through PL concepts
- Context efficiency is the central design concern in agent systems (note) — Context — not compute, memory, or storage — is the scarce resource in agent systems; context cost has two dimensions (volume and complexity) that require different architectural responses, making context efficiency the central design concern analogous to algorithmic complexity in traditional systems
- CLAUDE.md is a router, not a manual (note) — CLAUDE.md should be a slim router to task-specific docs, not a comprehensive manual — because it's loaded every session
- Convert still requires semantic description
- Crystallisation (note) — Definition — crystallisation is stabilisation that crosses a medium boundary, the phase transition from natural language to executable code where medium, consumer, and verification regime all change — the far end of the stabilisation spectrum
- Decomposition rules for bounded-context scheduling (note) — Practical rules for symbolic scheduling over bounded LLM calls — separate selection from joint reasoning, choose representations not just subsets, save reusable intermediates in scheduler state
- Deep search is connection methodology applied to a temporarily expanded corpus (note) — Design exploration for a deep search skill that reuses /connect's dual discovery and articulation testing on web search results, building a temporary research graph before bridging to KB
- Deploy-time learning is agile for human-AI systems (note) — Argues deploy-time learning and agile share the same core innovation — co-evolving prose and code — but deploy-time learning extends it by treating some prose as permanently load-bearing
- Deploy-time learning: The Missing Middle (note) — Deploy-time learning fills the gap between training and in-context — repo artifacts provide durable, inspectable adaptation through three mechanisms (stabilisation, crystallisation, distillation) with a verifiability gradient from prompt tweaks to deterministic code
- Design methodology — borrow widely, filter by first principles (note) — We borrow from any source but adopt based on first-principles support — except programming patterns, which get a fast pass because the bet is that knowledge bases are a new kind of software system
- Deterministic validation should be a script (note) — Half of /validate's checks are hard-oracle (enums, link resolution, frontmatter structure) and could run as a Python script in milliseconds instead of burning LLM tokens via the skill
- Directory-scoped types are cheaper than global types (note) — Global types tax every session's context; directory-scoped types load only when working in that directory — most structural affordances are directory-local, so the type system should match that economy
- Discovery is seeing the particular as an instance of the general (note) — Critiques the topic-vs-mechanism linking dichotomy — discovery varies by abstraction depth, not link kind. The hard problem is positing a new general concept and simultaneously recognizing existing particulars as instances of it. Darwin, Fleming, and mathematical lemma extraction share this dual structure.
- Distillation (note) — Definition — distillation is targeted extraction from a larger body of reasoning into a focused artifact shaped by specific circumstances (use case, context budget, agent) — one of two co-equal learning mechanisms alongside stabilisation, and the dominant one in knowledge work
- Distilled artifacts need source tracking at the source (note) — Distilled artifacts should not link back to sources (focus), but sources should link forward to distilled targets ("Distilled into:") so that source changes trigger staleness review of downstream artifacts
- Document classification (spec) — Taxonomy overview — the base types table and migration from old flat types; global field definitions, status, and traits live in types/note.md
- Document system (index) — Index of notes about document types, writing conventions, validation, and structural quality — how notes are classified, structured, and checked
- Document types should be verifiable (note) — Document types should assert verifiable structural properties, not subject matter — with a base type + traits model inspired by gradual and structural typing
- Error correction works with above-chance oracles and decorrelated checks (note) — Error correction for LLM output is viable whenever the oracle has discriminative power (TPR > FPR) and checks are decorrelated — amplification cost scales with 1/(TPR-FPR)² and independence of errors
- Error messages that teach are a stabilisation technique (note) — The most effective stabilisation artifacts simultaneously constrain (block wrong output) and inform (teach the fix) — because in agent systems the error channel is an instruction channel; fills the gap between the stabilisation gradient's layers and the context they deliver
- Files beat a database for agent-operated knowledge bases (note) — Files with git beat a database for agent-operated knowledge bases — universal interface, free versioning, no infrastructure to maintain
- First-principles reasoning selects for explanatory reach over adaptive fit (note) — Deutsch's adaptive-vs-explanatory distinction — explanatory knowledge has "reach" (transfers to new contexts) because it captures why, not just what works; grounds the KB's first-principles filter as selecting for reach over fit
- Frontloading spares execution context (note) — Pre-computing static parts of LLM instructions and inserting results spares execution context — the primary bottleneck in instructing LLMs; the mechanism is partial evaluation applied to instructions with underspecified semantics
- Generate KB skills at build time, don't parameterise them (note) — KB skills should be generated from templates at setup time, not parameterised with runtime variables — applying the general principle that indirection is costly in LLM instructions
- Human-LLM differences are load-bearing for knowledge system design (note) — Knowledge systems both inherit human-oriented materials and produce dual-audience documents (human + LLM), making human-LLM cognitive differences a first-class design concern rather than a generic disclaimer
- Human writing structures transfer to LLMs because failure modes overlap (note) — Human writing genres evolved to prevent specific reasoning failures; the same structures help LLMs because LLMs exhibit empirically demonstrated human-like failure modes (content effects on reasoning) — per-convention transfer evaluation, not wholesale analogy
- Indirection is costly in LLM instructions (note) — In code, indirection (variables, config, abstraction layers) is nearly free at runtime — in LLM instructions, every layer of indirection costs context and interpretation overhead on every read
- Information value is observer-relative because extraction requires computation (note) — A deterministic transformation adds zero classical information but can make structure accessible to bounded observers — this reframe connects distillation and discovery depth as instances of the same gap.
- Injectable configuration extends frontloading to installation-specific values (note) — Values static within an installation but variable across installations — sibling repo paths, local tool locations — are frontloadable through configuration the orchestrator resolves and injects into sub-agent frames; the context savings depend on sub-agent isolation since injection into the main context just adds tokens
- Inspectable substrate, not supervision, defeats the blackbox problem (note) — Chollet frames agentic coding as ML producing blackbox codebases — crystallisation counters this not by requiring human review but by choosing a substrate (repo artifacts) that any agent can inspect, diff, test, and verify
- Instructions are skills without automatic routing (note) — Reusable distilled procedures that live in kb/instructions/ — same format as skills but without activation triggers or CLAUDE.md routing entries; invoked when a human points the agent at them
- Instructions are typed callables with document type signatures (note) — Skills and tasks are typed callables — they accept document types as input and produce types as output, and should declare their signatures like functions declare parameter types.
- KB design (index) — Index of notes about agent-operated KB architecture, operations, and evaluation — how agent-operated knowledge bases are built, installed, and assessed
- Learning is not only about generality (note) — Per Simon, any capacity change is learning; accumulation is the most basic learning operation and reach is its key property — facts (low reach) vs theories (high reach); capacity also decomposes into generality vs a reliability/speed/cost compound
- Learning theory (index) — Index of notes about how systems learn, verify, and improve — accumulation as the basic operation (reach as its key property), stabilisation and distillation as transformation mechanisms, discovery as the source of high-reach theories, oracle theory, and memory architecture
- Legal drafting solves the same problem as context engineering (note) — Law has centuries of methodology for writing natural language specifications interpreted by a judgment-exercising processor — the same problem as context engineering for LLMs. Legal techniques (defined terms, structural conventions, precedent) are stabilisation techniques native to the underspecified medium; law mostly lacks crystallisation because statutes remain natural language.
- Link contracts framework — source material (note) — Reference framework for systematic, testable linking — link contracts, intent taxonomy, automated checks, agent implications
- Link graph plus timestamps enables make-like staleness detection (note) — Existing links already encode dependency information; comparing note and target timestamps flags notes that may be stale without any new annotation, analogous to make's file-based rebuild logic.
- Link strength is encoded in position and prose (note) — Not all links are equal — inline premise links ("since [X]") carry more weight than footer "related" links. Position and prose encode commitment level, creating a weighted graph that affects traversal, scoring, and quality signals.
- Links (index) — Index of notes about linking — how links work as decision points, navigation modes, link contracts, and automated link management
- LLM context is a homoiconic medium (note) — LLM context windows are homoiconic — instructions and data share the same representation (natural language tokens), so there is no structural boundary between program and content, producing both the extensibility benefits and the scoping hazards of Lisp, Emacs, and Smalltalk
- LLM context is composed without scoping (note) — LLM context is flat concatenation — no scoping, everything global, producing dynamic scoping's pathologies (spooky action at a distance, name collision, inability to reason locally) but without even a stack; sub-agents are the one mechanism that provides isolation through lexically scoped frames
- LLM-mediated schedulers are a degraded variant of the clean model (note) — When the agent scheduler lives inside an LLM conversation it becomes bounded and degrades; three recovery strategies — compaction, externalisation, factoring into code — restore the clean separation to increasing degrees
- Maintenance operations catalogue should stage distillation into instructions (note) — Catalogue of periodic KB maintenance operations and distillation status, used as a staging ground before promotion into kb/instructions procedures
- Mechanistic constraints make Popperian KB recommendations actionable (note) — Bounded context and underspecification don't just permit conjecture-and-refutation — they require it; derives three concrete practices (falsifier blocks, contradiction-first connection, rejected-interpretation capture) from KB mechanics.
- Memory management policy is learnable but oracle-dependent (note) — AgeMem learns on two substrates — facts accumulated in memory (low-reach) and policy learned in weights (when to accumulate, distil, curate) — confirming memory policy is vision-feature-like; but the learning depends on task-completion oracles, which is exactly the evaluation gap that makes automating KB learning hard
- RLM ephemeral code prevents accumulation — RLM discards generated code after each run — the single design choice that separates it from llm-do
- Methodology enforcement is stabilisation (note) — Instructions, skills, hooks, and scripts form a stabilisation gradient for methodology — from underspecified and indeterministic (LLM interprets and may not follow) to fully deterministic (code always runs), with hooks occupying a middle ground of deterministic triggers with indeterministic responses
- Needs testing (review) — Promising ideas without enough evidence — extract/connect/review cycle, input classification before processing
- Notes need quality scores to scale curation (note) — As the KB grows, /connect will retrieve too many candidates — note quality scores (status, type, inbound links, recency, link strength) filter candidates and prioritise what's worth connecting
- The bitter lesson boundary is a gradient, not a binary (structured-claim) — The bitter lesson boundary is a gradient — oracle strength (how cheaply and reliably you can verify correctness) determines where a component sits and how to invest engineering effort
- Periodic KB hygiene should be externally triggered, not embedded in routing (note) — Periodic hygiene checks belong in externally triggered operations (user request, scheduler, CI), not in always-loaded routing instructions
- Programming practices apply to prompting (note) — Programming practices — typing, testing, progressive compilation, version control — apply to LLM prompting and knowledge systems, with semantic underspecification and execution indeterminism making some practices harder in distinct ways
- Quality signals for KB evaluation (note) — Catalogues graph-topology, content-proxy, and LLM-hybrid signals that could be combined into a weak composite oracle to drive a mutation-based KB learning loop without requiring usage data.
- Agent Skills for Context Engineering (note) — Skill-based context engineering framework — 14 instructional modules covering attention mechanics, multi-agent patterns, memory, evaluation. Strong on operational patterns, weaker on learning theory.
- Ars Contexta (note) — Claude Code plugin that generates knowledge systems from conversation, backed by 249 research claims. Ancestor of our KB — we borrowed link semantics, propositional titles, and three-space architecture, then diverged in theory and structure.
- ClawVault (note) — TypeScript memory system for AI agents with scored observations, session handoffs, and reflection pipelines — has a working workshop layer where we have theory, making it the strongest source of borrowable patterns for ephemeral knowledge
- Related Systems (index) — Comparable knowledge/agent systems tracked for evolving ideas, convergence signals, and borrowable patterns
- sift-kg (note) — LLM-powered document-to-knowledge-graph pipeline with schema discovery, human-in-the-loop entity resolution, and interactive visualization
- Siftly (note) — Next.js + SQLite bookmark ingestion system whose deterministic-first, resumable enrichment pipeline offers concrete patterns for scaling KB source loading with explicit progress state
- Thalo entity types compared to commonplace document types (note) — Reference for borrowing recurring note shapes from Thalo — their entity types (opinion, reference, lore, journal, synthesis) map onto our types with concrete gaps still open (supersedes links, source status tracking)
- Thalo (note) — Custom plain-text language for knowledge management with Tree-Sitter grammar, typed entities, 27 validation rules, and LSP. Makes the same programming-theory-over-psychology bet we do, but went further into formalization with a custom DSL.
- Eric Evans: AI Components for a Deterministic System
- Granular Software
- Professional Software Developers and AI Agent Use
- RLM Implementations vs llm-do
- RLM (Recursive Language Model) — For Programmers
- Shesha vs llm-do
- Reliability dimensions map to oracle-hardening stages (note) — The four reliability dimensions from Rabanser et al. (consistency, robustness, predictability, safety) each harden a different oracle question — mapping empirical agent evaluation onto the oracle-strength spectrum
- Analysis: Adaptation of Agentic AI (arXiv:2512.16301) — Analysis of agentic AI adaptation paper and llm-do implications
- What Survives in Multi-Agent Systems — Analysis of what multi-agent patterns will survive stronger models
- Scenario decomposition drives architecture (note) — Deriving architectural requirements by decomposing concrete user stories into step-by-step context needs — not from abstract read/write operations but from what the agent actually has to load at each stage, in both the commonplace repo and installed projects
- Scenarios (note) — Concrete use cases for the knowledge system — upstream change analysis and proposing our own changes
- Semantic review catches content errors that structural validation cannot (note) — Four specific semantic checks (enumeration completeness, grounding alignment, boundary-case coverage, internal consistency) that require LLM adversarial reading — structural validation catches form errors but misses content errors like incomplete enumerations that contradict their own grounding definitions
- Skills derive from methodology through distillation (structured-claim) — The methodology→skill relationship is distillation (extracting operational procedures from discursive reasoning in the same medium) — distinct from crystallisation (prompt→code phase transition) and stabilisation (narrowing output distribution)
- Operational signals that a component is a softening candidate
- Solve low-degree-of-freedom subproblems first to avoid blocking better designs
- Spec mining is crystallisation's operational mechanism
- Stabilisation and distillation both trade generality for reliability, speed, and cost (note) — Both learning mechanisms — stabilisation (constraining) and distillation (extracting) — sacrifice generality for compound gains in reliability, speed, and cost; they differ in the operation and how much compound they yield
- Stabilisation during deployment is continuous learning (note) — Continuous learning — adapting deployed systems to new data and tasks — is what stabilisation with versioned artifacts already achieves per Simon's definition; fine-tuning and prompt optimization target the same behavioral changes through different mechanisms
- Stabilisation (note) — Definition — stabilisation constrains the space of valid interpretations an underspecified spec admits, from partial narrowing (conventions, structured sections) to full commitment (stored outputs, deterministic code) — one of two co-equal learning mechanisms alongside distillation
- Stale indexes are worse than no indexes (note) — An agent trusts an index as exhaustive — a missing entry doesn't trigger search, it makes the note invisible
- Storing LLM outputs is stabilization (note) — Choosing to keep a specific LLM output resolves semantic underspecification to one interpretation and freezes it against execution indeterminism — the same stabilizing move the parent note describes for code, applied to artifacts
- Structure activates higher-quality training distributions (note) — Structured templates like Evidence/Reasoning sections steer autoregressive generation toward higher-quality training data (scientific papers, legal analyses) rather than unstructured web text — the structure acts as a distribution selector
- Structured output is easier for humans to review (note) — Separated Evidence and Reasoning sections let human reviewers check facts and logic independently — a purely readability argument that doesn't depend on LLM behavior at all
- Symbolic scheduling over bounded LLM calls is the right model for agent orchestration (note) — Agent orchestration is best modelled as an unbounded symbolic scheduler making bounded LLM calls; the scheduler chooses decompositions, prompt representations, and intermediate artifacts over its evolving symbolic state
- Text testing framework — source material (note) — Reference framework for automated text testing — contracts per document type, test pyramid (deterministic/LLM rubric/corpus), production workflow
- The frontloading loop is an iterative optimisation over bounded context (note) — Extending frontloading from a single partial-evaluation step to an iterative loop reveals a sequential optimisation problem — at each step the orchestrator selects what to frontload into a fixed-capacity sub-agent window, with each iteration's results expanding the knowledge available for the next selection
- Three-space agent memory maps to Tulving's taxonomy — Agent memory split into knowledge, self, and operational spaces mirrors Tulving's semantic/episodic/procedural distinction
- Three-space memory separation predicts measurable failure modes — The three-space memory claim is testable because flat memory predicts specific cross-contamination failures
- Title as claim enables traversal as reasoning (note) — When note titles are claims rather than topics, following links between them reads as a chain of reasoning — the file tree becomes a scan of arguments, and link semantics (since, because, but) encode relationship types
- Traversal improves the graph (note) — Every traversal is a read-write opportunity — agents should log improvement opportunities during reading, then process them separately to avoid context-switching
- Two context boundaries govern collection operations (note) — Any note collection faces two context boundaries — a full-text boundary where all bodies can be loaded together, and an index boundary where all titles+descriptions fit — creating three operational regimes that govern areas, /connect, and whole-KB operations differently
- Two kinds of navigation (note) — Link-following is local with context; search is long-range with titles/descriptions; indexes bridge both modes
- Type system enforces metadata that navigation depends on (note) — Descriptions don't appear spontaneously — they exist because the note base type requires them; without enforcement, metadata degrades and navigation collapses to opening every document
- Type system (index) — Index of notes about the document type system — why types exist, what roles they serve, how they improve output quality, and how they're structured
- {NNN}-{decision-title} (adr)
- {area-name} index (index)
- {System name} (note)
- {Claim as title — an assertion, not a topic label} (structured-claim)
- Types give agents structural hints before opening documents (note) — Types and descriptions let agents make routing decisions without loading full documents — the type says what operations a document affords, the description filters among instances of that type
- Unified calling conventions enable bidirectional refactoring between neural and symbolic (note) — When agents and tools share a calling convention, components can move between neural and symbolic without changing call sites — llm-do demonstrates this with name-based dispatch over a hybrid VM
- Unit testing LLM instructions requires mocking the tool boundary (note) — Skills are programs whose I/O boundary is tool calls — mocking that boundary creates controlled environments for testing whether instructions produce correct behavior, complementing text artifact testing with instruction-level regression detection
- What cludebot teaches us — Techniques from cludebot worth borrowing — what we already cover, what to adopt now, and what to watch for as the KB grows
- What doesn't work (review) — Anti-patterns and areas with insufficient evidence — auto-commits, queue overhead, validation ceremony, session rhythm
- What works (review) — Patterns proven valuable in practice — prose-as-title, template nudges, frontmatter queries, semantic search via qmd, discovery-first, public/internal boundary
- Why directories despite their costs (note) — Directories buy one–two orders of magnitude of human-navigable scale over flat files, and enable local conventions per subsystem — but each new directory taxes routing, search config, skills, and cross-directory linking
- Why notes have types (note) — Six roles of the type system — navigation hints, metadata enforcement, verifiable structure, local extensibility, output quality through structured writing discipline, and maturation through stabilisation
- The wikiwiki principle: lowest-friction capture, then progressive refinement in place (note) — Ward Cunningham's wiki design principle — minimize capture friction, then refine in place — is the animating idea behind the text→note→structured-claim crystallisation ladder
- Writing styles are strategies for managing underspecification (note) — The five empirically observed context-file writing styles (descriptive, prescriptive, prohibitive, explanatory, conditional) are not stylistic variation — they correspond to different strategies for narrowing the interpretation space agents face, trading off constraint against generalisability