Learning theory
Type: kb/types/index.md · Status: current
How systems learn, verify, and improve. These notes define learning mechanisms, verification gradients, and memory architecture that KB design draws on but that aren't KB-specific — they apply to any system that adapts through durable artifacts, including but not limited to inspectable ones.
The collection is organized around deploy-time learning as the unifying framework. Accumulation — adding knowledge to the store — is the most basic learning operation, with reach as its key property: facts sit at the low end, theories at the high end. Two orthogonal mechanisms (constraining and distillation) transform accumulated knowledge. A third operation (discovery) produces the high-reach theories that are accumulation's most valuable items.
Foundations
- agentic-systems-interpret-underspecified-instructions — two distinct properties (semantic underspecification and execution indeterminism); the spec-to-program projection model, semantic boundaries, and the constrain/relax cycle
- learning-is-not-only-about-generality — accumulation is the most basic learning operation, with reach as its key property (facts at the low end, theories at the high end); capacity decomposes into generality vs a reliability/speed/cost compound; Simon's definition grounds the decomposition
- continual-learning-open-problem-is-behaviour-not-knowledge — splits continual learning into knowledge accumulation (solved by ordinary engineering) and behaviour change (open); names readable system-definition artifacts as the cheap behaviour-change mechanism alongside expensive weight updates
- llm-learning-phases-fall-between-human-learning-modes — LLM phases (pre-training, in-context, deploy-time) occupy intermediate positions on the evolution-to-reaction spectrum rather than mapping 1:1 to human learning modes; warns against literal human-LLM learning analogies
- in-context-learning-presupposes-context-engineering — in-context learning depends on deploy-time learning to select and organize the right knowledge; Amodei's "no continual learning needed" claim relocates the learning to the system layer rather than eliminating it
Deploy-time learning
The organizing framework: deployed systems adapt through symbolic artifacts — durable, inspectable, and verifiable — filling the gap between training and in-context learning.
- deploy-time-learning-the-missing-middle — three timescales of system adaptation; co-evolving prose and code as agile-style deploy-time learning (prose and code co-evolve, hybrid as end state); concrete before-and-after examples of constraining at different grades
- the verifiability gradient — the ladder deploy-time artifacts sit on, from restructured prompts through schemas and evals to deterministic code; artifacts move along it in both directions, hardening up toward code and relaxing back down
- axes-of-artifact-analysis — three axes of artifact analysis: two structural (class, backend) plus a relational role axis (knowledge vs system-definition) determined by the consumer
- changing-requirements-conflate-genuine-change-with-disambiguation-failure — reframes agile: "changing requirements" hide late-surfacing interpretation errors in underspecified specs; short iterations bound interpretation-error propagation, not just change-response latency
- specification strategy should follow where understanding lives — names the lifecycle choice across spec-first, bidirectional, and behavior-extracted approaches; the right strategy depends on whether understanding is present before work, discovered during execution, or only visible after repeated runs
- evaluation automation is phase-gated by comprehension — concretizes the lifecycle for eval loops: comprehension and specification must precede optimization, or automation amplifies the wrong objective
- constraining-and-distillation-both-trade-generality-for-reliability-speed-and-cost — both mechanisms sacrifice generality for compound gains in reliability, speed, and cost; they differ in the operation (constraining vs extracting) and how much compound they yield
- fixed-artifacts-split-into-exact-specs-and-proxy-theories — determines when constraining can be hardened confidently (spec IS the problem) vs when relaxing must remain available (spec approximates the problem); composition failure is the tell that specs are theories, not definitions
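The verifiability gradient above can be made concrete with a toy rung-by-rung sketch. This is my own illustration, not code from the notes: the same rule ("dates are ISO 8601") expressed as a prose convention, a schema-level shape check, and deterministic code, with verifiability increasing at each rung.

```python
# Hypothetical sketch of three rungs on the verifiability gradient.
import re
from datetime import date

# Rung 1 — prose convention: only checkable by reading or LLM judgment.
CONVENTION = "Always write dates as YYYY-MM-DD."

# Rung 2 — schema-level check: machine-checkable shape, not full validity.
ISO_SHAPE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def looks_iso(s: str) -> bool:
    return bool(ISO_SHAPE.match(s))

# Rung 3 — deterministic code: fully verifiable, rejects impossible dates.
def parse_iso(s: str) -> date:
    return date.fromisoformat(s)  # raises ValueError on violation

print(looks_iso("2024-02-30"))   # True — the shape check passes
try:
    parse_iso("2024-02-30")      # but the hardened check rejects Feb 30
except ValueError:
    print("rejected at rung 3")
```

Hardening is the move from rung 1 toward rung 3; relaxing is the reverse, taken when the deterministic check turns out to over-constrain.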
Constraining
Constraining the interpretation space — from partial narrowing (conventions) to full commitment (deterministic code). The primary mechanism for hardening deployed systems.
- constraining — definition and spectrum: storing an output, writing a convention, adding structured sections, extracting deterministic code; codification is the far end where the medium itself changes from natural language to executable code
- storing-llm-outputs-is-constraining — the simplest instance: keeping a specific LLM output resolves underspecification to one interpretation; develops the generator/verifier pattern and verbatim risk
- constraining-during-deployment-is-continuous-learning — AI labs' continuous learning is achievable through constraining with versioned artifacts, which beats weight updates on inspectability and rollback
- spec-mining-as-codification — codification's operational mechanism: observe behavior, extract deterministic rules, grow the calculator surface monotonically
- operational-signals-that-a-component-is-a-relaxing-candidate — five testable signals (paraphrase brittleness, isolation-vs-integration gap, process constraints, unspecifiable failures, distribution sensitivity) for detecting when to reverse codification
- error-messages-that-teach-are-a-constraining-technique — the dual-function property: effective enforcement artifacts simultaneously constrain and inform, because in agent systems the error channel is an instruction channel
- enforcement-without-structured-recovery-is-incomplete — the enforcement gradient covers detection and blocking but not recovery; maps ABC's corrective → fallback → escalation onto each enforcement layer, with oracle strength determining viable recovery strategies
- codify-versus-LLM decision heuristics — synthesis: four lenses on the codify-vs-LLM decision (spec completeness, oracle strength, interpretation space, pattern stability) with evidence they come apart at the edges
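The simplest constraining move, storing an LLM output behind a verifier, can be sketched in a few lines. All names here are hypothetical illustrations (the `generate` stub stands in for a nondeterministic LLM call), assuming only the generator/verifier pattern the notes describe:

```python
# Hedged sketch: generate once, verify deterministically, pin the result.
# Later runs reuse the pinned artifact — one interpretation, not many.
import json
import pathlib

STORE = pathlib.Path("pinned_output.json")

def generate(spec: str) -> dict:
    # placeholder for a nondeterministic LLM call
    return {"greeting": "Hello, world!"}

def verify(out: dict) -> bool:
    # weak deterministic oracle: shape check only
    return isinstance(out.get("greeting"), str) and bool(out["greeting"])

def run(spec: str) -> dict:
    if STORE.exists():                    # constrained path: underspecification resolved
        return json.loads(STORE.read_text())
    out = generate(spec)                  # open path: interpretation space still wide
    if not verify(out):
        raise ValueError("generator output failed verification")
    STORE.write_text(json.dumps(out))     # pinning it is the constraining move
    return out
```

The verbatim risk the notes flag shows up here directly: once pinned, a wrong-but-verifier-passing output is reproduced forever until someone relaxes (deletes the pin).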
Distillation
Targeted extraction from a larger body of reasoning into a focused artifact shaped by use case, context budget, or agent. Orthogonal to constraining — you can distil without constraining (extract a skill, still underspecified) or constrain without distilling (store an output, no extraction from reasoning).
- distillation — definition: the rhetorical mode shifts to match the target (argumentative → procedural, exploratory → assertive); the dominant mechanism in knowledge work because it creates new artifacts from existing reasoning
Information & bounded observers
- information-value-is-observer-relative — deterministic transformations add zero classical information but can make structure accessible to bounded observers; names the gap that distillation and discovery each describe operationally
- epiplexity-eli5 — ELI5 explanation of epiplexity through encrypted messages, shuffled textbooks, CSPRNGs, and chess notation; contrasts surprise, shortest description, and observer-relative usable structure
- minimum-viable-vocabulary-is-the-naming-set-that-most-reduces-extraction-cost-for-a-bounded-observer — reframes "minimum viable ontology" as the vocabulary that maximally reduces extraction cost for a bounded observer entering a domain; synthesizes information-value, discovery, and distillation
- first-principles-reasoning-selects-for-explanatory-reach-over-adaptive-fit — Deutsch's adaptive-vs-explanatory distinction: explanatory knowledge transfers because it captures why, not just what works; grounds the KB's first-principles filter as selecting for reach
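The observer-relative claim has a classic concrete instance, added here as my own illustration (not an example from the notes): sorting is deterministic and adds zero classical information, yet it collapses a bounded observer's extraction cost from linear scan to binary search.

```python
# Same multiset, zero new bits — but very different extraction cost
# for an observer bounded to pairwise comparisons.
import random

random.seed(0)
shuffled = random.sample(range(10_000), k=10_000)
sorted_view = sorted(shuffled)            # deterministic transform, no new information

def linear_probes(xs, target):
    for i, x in enumerate(xs):            # bounded observer scanning unstructured data
        if x == target:
            return i + 1                  # comparisons spent
    return len(xs)

def binary_probes(xs, target):
    lo, hi, probes = 0, len(xs), 0
    while lo < hi:                        # structure made accessible: halve each step
        probes += 1
        mid = (lo + hi) // 2
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return probes

print(linear_probes(shuffled, 9_999))     # position-dependent, up to 10,000 probes
print(binary_probes(sorted_view, 9_999))  # a dozen or so probes
```

This is the gap that epiplexity names: surprise and shortest-description measures see no difference between the two views; usable structure for a bounded observer differs enormously.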
Discovery
A third operation, distinct from both constraining and distillation: positing a new general concept and simultaneously recognizing existing particulars as instances of it. Discovery produces theories — the highest-reach items accumulation can store.
- discovery-is-seeing-the-particular-as-an-instance-of-the-general — the dual structure of discovery (posit the general, recognize the particular); three depths from shared feature through shared structure to generative model; the hard problem is recognition, not linking
Synthesis
- An agentic KB maximizes contextual competence through discoverable, composable, trusted knowledge — accumulation as the basic operation plus three transformation operations (constraining, distillation, discovery) mapped to three knowledge properties (trustworthy, discoverable, composable) serving contextual competence under bounded context; reach as the quality dimension of what's accumulated
- agent context is constrained by soft degradation not hard token limits — the binding constraint is the soft degradation curve (dilution, compositional collapse), not the hard token limit; programmatic constructability is the genuine differentiator
- soft-bound traditions as sources for context engineering strategies — catalog of twelve traditions with transfer assessment: what's already working, what's plausible, and what blocks transfer (optimization target mismatch, feedback absence, different failure modes)
Oracle & verification
Moved to LLM interpretation errors — oracle theory, error correction, reliability dimensions, and the augmentation/automation boundary now live in the dedicated error-theory area. Key notes:
- error-correction-works-above-chance-oracles-with-decorrelated-checks — the core theory of error correction via decorrelated weak oracles
- oracle-strength-spectrum — the gradient from hard to no oracle that determines engineering priorities
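The decorrelated-weak-oracles result is easy to see numerically. A minimal sketch, assuming independent checks (the idealized decorrelation condition, which real checks only approximate): each oracle is only slightly better than chance, but majority vote compounds them.

```python
# P(majority of n independent oracles, each with accuracy p, is correct).
from math import comb

def majority_accuracy(p: float, n: int) -> float:
    k = n // 2 + 1  # votes needed for a strict majority
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

for n in (1, 5, 25, 101):
    print(n, round(majority_accuracy(0.6, n), 4))
# accuracy climbs from 0.6 toward 1.0 as decorrelated checks are added;
# below-chance oracles (p < 0.5) compound in the wrong direction,
# which is why "above-chance" is load-bearing in the core theory
```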
Memory & architecture
- three-space-agent-memory-maps-to-tulving-taxonomy — agent memory split into knowledge, self, and operational spaces mirrors Tulving's semantic/episodic/procedural distinction
- flat-memory-predicts-specific-cross-contamination-failures-that-are-empirically-testable — the three-space claim is testable: flat memory predicts specific cross-contamination failures
- inspectable-artifact-not-supervision-defeats-the-blackbox-problem — codification counters the blackbox problem not by requiring human review but by choosing readable artifacts (code, prompts, schemas) that any agent can inspect, diff, test, and verify
- A-MEM: Agentic Memory for LLM Agents — academic paper: Zettelkasten-inspired agent memory with automated link generation; flat single-space design provides a test case for whether three-space separation matters at QA-benchmark scale
- memory-management-policy-is-learnable-but-oracle-dependent — AgeMem's RL-trained memory policy demonstrates low-reach accumulation (facts) and distillation (STM); frames memory policy as a proxy theory over exact-spec operations, but requires a task-completion oracle the KB cannot yet provide
- agent memory is a crosscutting concern, not a separable niche — memory decomposes into storage (solved), retrieval/activation (context engineering), and learning (learning theory); the hard problems live at the intersections, not inside a standalone "memory system"
- Multi-Agent Memory from a Computer Architecture Perspective — computer-architecture analogy for multi-agent memory: shared/distributed paradigms, three-layer hierarchy, and consistency protocols as the critical unsolved problem
- Graphiti — temporally-aware knowledge graph with bi-temporal edge invalidation; strongest temporal model in the surveyed memory systems and strongest counterexample to files-first architecture
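The three-space split can be sketched as a routing layer. This is my own minimal construction under the notes' assumptions, not an implementation from A-MEM or AgeMem: writes and recalls are scoped to one of the knowledge/self/operational spaces, so a query in one space cannot surface entries from another — exactly the cross-contamination a flat store permits.

```python
# Hypothetical three-space memory mirroring Tulving's
# semantic / episodic / procedural distinction.
from dataclasses import dataclass, field

SPACES = ("knowledge", "self", "operational")

@dataclass
class ThreeSpaceMemory:
    stores: dict = field(default_factory=lambda: {s: [] for s in SPACES})

    def write(self, space: str, entry: str) -> None:
        if space not in SPACES:
            raise ValueError(f"unknown space: {space}")
        self.stores[space].append(entry)

    def recall(self, space: str, needle: str) -> list:
        # scoped retrieval: a flat store would search all three at once
        return [e for e in self.stores[space] if needle in e]

mem = ThreeSpaceMemory()
mem.write("knowledge", "fromisoformat rejects invalid dates")
mem.write("self", "I tend to over-abbreviate error messages")
mem.write("operational", "retry flaky network calls up to 3 times")
print(mem.recall("knowledge", "dates"))  # scoped hit
print(mem.recall("self", "dates"))       # empty — scoping blocks contamination
```

The flat-memory note's testable predictions are the failure modes this scoping removes by construction.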
Applications
- unified-calling-conventions-enable-bidirectional-refactoring — when agents and tools share a calling convention, constraining and codification become local operations; llm-do as primary evidence
- programming-practices-apply-to-prompting — typing, testing, progressive compilation, and version control transfer from programming to LLM prompting, with probabilistic execution making some practices harder
- ad-hoc-prompts-extend-the-system-without-schema-changes — the counterpoint: sometimes staying at the prompt level is the right choice; ad hoc instructions absorb new requirements faster than schema changes
- legal-drafting-solves-the-same-problem-as-context-engineering — law as an independent source discipline for the underspecified instructions problem: precedent and codification are constraining; legal techniques are native to the underspecified medium
- Ephemeral computation prevents accumulation — ephemeral vs persistent artifacts as inverse of codification; discarding generated artifacts trades accumulation for simplicity
- Ephemerality is safe where embedded operational knowledge has low reach — synthesizes Kirsch's four barriers with the reach concept: the ephemeral/malleable boundary sits where embedded operational knowledge crosses from low reach (adaptive, safe to discard) to high reach (explanatory, must accumulate)
Reference material
- Context Engineering for AI Agents in OSS — empirical study of AGENTS.md/CLAUDE.md evolution in 466 OSS projects; commit-level analysis shows constraining maturation trajectory confirming continuous learning through versioned artifacts
- On the "Induction Bias" in Sequence Models — 190k-run empirical study showing transformers need orders-of-magnitude more data than RNNs for state tracking; architectural induction bias determines data efficiency and weight sharing, grounding the computational bounds dimension of learning capacity
Related tags
- llm-interpretation-errors — oracle theory, error correction, and reliability dimensions migrated here; the error-theory area applies verification concepts specifically to LLM interpretation failures
- tags — applies learning theory to KB architecture and evaluation; methodology-enforcement-is-constraining bridges both areas
- document-system — the type ladder (text→note→structured-claim) instantiates the constraining gradient for documents
Other tagged notes
- Apparent success is an unreliable health signal in framework-owned tool loops — When framework-owned tool loops recover from broken tools via agent workarounds, final success stops being a reliable signal that the underlying scripts and workflows are healthy
- Automated synthesis is missing good oracles — Generating synthesis candidates (cross-note connections, novel combinations) is easy — LLMs do it readily. The hard part is evaluating whether a candidate is genuine insight or noise.
- Brainstorming: how reach informs KB design — Brainstorming on Deutsch's "reach" concept applied to KB notes — reach is a maintenance risk signal (not a retrieval signal) because high-reach revisions break downstream reasoning silently
- Codification and relaxing navigate the bitter lesson boundary — Since you can't identify which side of the bitter lesson boundary you're on until scale tests it, practical systems must codify and relax — with spec mining avoiding the vision-feature failure mode
- Designing a Memory System for LLM-Based Agents — Requirements map for realistic agent memory systems: direct creation, import, evidence retention, contracts, activation, promotion, authority, lifecycle, compiled views, and evaluation
- Evolving understanding needs re-distillation, not composition — When understanding evolves, reconciling fragments into a coherent picture can exceed effective context; a pre-distilled narrative keeps the whole picture within feasible bounds
- Links encode conditional possibilities, not obligations — Every label must name a specific reader-need (the condition under which following pays off); content required for all reachable readers should be inlined, not linked
- LLM debugging starts with retry-versus-rewrite triage — The two-phenomena model makes the first LLM debugging question diagnostic — is the failure a bad execution of a good interpretation (retry) or a consistent choice of a bad interpretation (rewrite the spec)? — because the fixes differ and do not substitute
- LLM↔code boundaries are natural checkpoints — At each LLM↔code transition both semantic underspecification and execution indeterminism collapse simultaneously, making these boundaries natural places to anchor debugging, testing, and refactoring
- Opacity is a scale threshold, not a class property — Opacity isn't a property of an artifact class; any representation becomes practically opaque at sufficient scale. Classes differ in the scale at which they cross that threshold.
- Progressive constraining commits only after patterns stabilize — Constraining via LLM code generation freezes a single projection of the spec in one shot, but progressive constraining observes behavior across many runs and commits only the interpretations that consistently emerge
- Psychology-to-agent transfer needs per-principle failure-mode testing — Brainstorming a methodology for evaluating cognitive-science-to-agent transfer — assembled from three existing KB notes and tested against Youssef's five psychology principles as worked examples
- Reverse compression is when LLM output expands without adding information — LLMs can inflate a compact seed into verbose prose that carries no more extractable structure — the test for whether a KB resists this is whether notes accumulate epiplexity across the network, not just token count
- Selector-loaded review gates could let review-revise learn from accepted edits — Brainstorm on learning reusable review gates from accepted note edits: mine candidate gates from before/after diffs, store them atomically, and load a bounded subset into future reviews
- Short composable notes maximize combinatorial discovery — The library's purpose is to produce notes that can be co-loaded for combinatorial discovery — short atomic notes are a consequence of this goal; longer synthesized artifacts belong in workshops or distilled instructions
- Silent disambiguation is the semantic analogue of tool fallback — When an agent silently resolves unacknowledged material ambiguity in a spec, final success hides that the contract failed to determine the path — an extension of the tool-fallback observability problem
- System-definition artifacts are crystallized reasoning under context scarcity — Heuristic system-definition artifacts (tips, playbooks, rules) are mostly crystallized reasoning — pre-compiled shortcuts for what a capable LLM would derive if it had all facts and unlimited compute; under unbounded context heuristic prose collapses into knowledge plus read-time derivation, authority-bearing constraints persist regardless, and codification has a compound motivation that partly collapses (scale decomposition) and partly survives (constraining compound)
- Systematic prompt variation serves verification and diagnosis, not explanatory-reach testing — Controlled prompt variation either decorrelates checks or measures brittleness under fixed task semantics; Deutsch's variation test instead changes the explanation to test mechanism and reach
- The readable-artifact loop is the tractable unit for continual learning — Within substrate coevolution, the readable pair (prose + symbolic) is the tractable unit to build a first automated loop around — shared context, current tempo, and an existing codification boundary make joint optimization clean; the pair is also under-explored relative to the mainstream opaque effort
- Treat continual learning as substrate coevolution — Behaviour change spans three artifact classes — opaque (weights), prose (prompts, notes, specs), and symbolic (code, schemas, tests) — so the coevolution question is how their improvement loops relate, not which is the real locus of learning