Learning theory
Type: kb/types/tag-readme.md · Status: current
How systems learn, verify, and improve. These notes define learning mechanisms, verification gradients, and memory architecture that KB design draws on but that aren't KB-specific — they apply to any system that adapts through durable artifacts.
The area is organized around deploy-time learning as the unifying framework. Accumulation (adding knowledge to the store) is the most basic learning operation, with reach as its key property: facts sit at the low end, theories at the high end. Two orthogonal mechanisms (constraining and distillation) transform accumulated knowledge; a third operation (discovery) produces the high-reach theories that are the most valuable things to accumulate.
The kinds of notes under this tag
Every note carrying learning-theory also carries at least one of these child tags. Validation enforces this, so the typed routing below is trustworthy:
- deploy-time-learning — the framework itself: adaptation through durable inspectable artifacts, learning fundamentals, and feedback quality
- constraining — narrowing the interpretation space, from conventions to deterministic code; codification, relaxing, and the decision heuristics
- distillation — targeted extraction of use-shaped artifacts from larger reasoning
- discovery — positing a general concept and recognizing particulars as its instances; reach as what it produces
- artifact-analysis — the four-field vocabulary (substrate, form, lineage, authority) for retained behavior-shaping artifacts
- agent-memory — memory architecture: spaces, contamination, policy learnability, and the crosscutting decomposition
- llm-interpretation-errors — oracle theory, error correction, and reliability; the error-theory area applies verification concepts to LLM interpretation failures
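The child-tag rule above can be sketched as a small check. This is only an illustration of the rule, not the KB's actual validator: the `Note` class, the `missing_child_tag` function, and the example file names are hypothetical, and real notes presumably carry their tags in frontmatter rather than in an in-memory set.

```python
from dataclasses import dataclass, field

PARENT_TAG = "learning-theory"

# Child tags exactly as listed in this README.
CHILD_TAGS = frozenset({
    "deploy-time-learning",
    "constraining",
    "distillation",
    "discovery",
    "artifact-analysis",
    "agent-memory",
    "llm-interpretation-errors",
})


@dataclass
class Note:
    """Hypothetical in-memory view of a note: a path plus its tags."""
    path: str
    tags: set = field(default_factory=set)


def missing_child_tag(notes):
    """Return paths of notes that carry the parent tag but none of its child tags."""
    return [
        note.path
        for note in notes
        if PARENT_TAG in note.tags and not (note.tags & CHILD_TAGS)
    ]


# Illustrative data only; these file names are made up.
notes = [
    Note("codification.md", {"learning-theory", "constraining"}),
    Note("untyped.md", {"learning-theory"}),
    Note("unrelated.md", {"document-system"}),
]
violations = missing_child_tag(notes)
print(violations)
```

Notes that do not carry learning-theory are ignored by the check, so it can run over the whole KB without special-casing other areas.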
Start here
- deploy-time learning is the missing middle — the unifying framework: three timescales of system adaptation
- learning is not only about generality — accumulation with reach as its key property; Simon's definition grounds the decomposition
- agentic systems interpret underspecified instructions — the underspecification foundation: spec-to-program projection and the constrain/relax cycle
- the verifiability gradient — the ladder deploy-time artifacts sit on
- constraining and distillation both trade generality for reliability, speed, and cost — how the two transforming mechanisms relate
- discovery is seeing the particular as an instance of the general — the third operation, and why recognition is its hard problem
Related tags
- tags — the hub; applies learning theory to KB architecture and evaluation
- document-system — the type ladder (text→note→structured-claim) instantiates the constraining gradient for documents
- context-engineering — where in-context learning meets the system layer that selects and organizes knowledge
Other tagged notes
- Abstract an experience into a lesson only when you can state where the lesson stops - Abstract an episode into a lesson only when you can state its boundary, else preserve the instance; an over-generalized lesson is one that drops the condition clause
- Activate Behavior-Changing Memory Before The Mistake - Behavior-changing memory must activate before relevant actions rather than waiting for explicit retrospective search
- Ad hoc prompts extend the system without schema changes - Any system with an LLM agent layer can absorb new requirements through natural language prompts without changing the deterministic base
- Adaptation signals choose pressure; artifact analysis chooses the retained surface - Maps agentic-adaptation signals onto artifact-analysis axes so KB learning records which retained surface changes, what authority it gains, and how to review it
- Agent context is constrained by soft degradation, not hard token limits - Agent context is bounded by silent reliability degradation across volume, complexity, and relevance/interference, not just by provider token limits
- Agent memory is a crosscutting concern, not a separable niche - Memory decomposes into storage (solved), retrieval/activation (context engineering), and learning (learning theory) — treating it as a standalone category hides that the hard problems are at the intersections
- Agent memory needs discoverable, composable, trusted knowledge under bounded context - Frames discoverable, composable, trusted remembered knowledge as the minimal artifact-quality basis for agent memory under bounded context.
- An accepted edit verifies the change, not the rule - Human acceptance of an edit is a strong oracle for 'this change was wanted here' but a weak oracle for 'this generalizes' — mining rules from accepted edits inherits instance-level verification while the generalization step stays oracle-poor
- An agentic KB maximizes contextual competence through discoverable, composable, trusted knowledge - Retired note kept as a backlink target; its general memory-quality claim and KB-specific ingress claim now live in narrower successor notes.
- An LLM's generation confidence tracks typicality, not soundness - An LLM's next-token confidence measures how typical a continuation is, not whether it's true or valid; the two are decoupled, so soundness can't be read off confidence and needs a separate check
- An outcome check licenses replay; a rule needs the process verified - Outcome and process verification are two verify-rung oracles — an outcome check licenses replaying an instance, not distilling a rule, since only a process check inspects the generalizing mechanism
- Apparent success is an unreliable health signal in framework-owned tool loops - When framework-owned tool loops recover from broken tools via agent workarounds, final success stops being a reliable signal that the underlying scripts and workflows are healthy
- Automated synthesis is missing good oracles - Generating synthesis candidates (cross-note connections, novel combinations) is easy — LLMs do it readily. The hard part is evaluating whether a candidate is genuine insight or noise.
- Axes of artifact analysis - Artifact analysis records retained behavior-shaping artifacts by storage substrate, representational form, lineage, and behavioral authority so review evidence, invalidation, and rollback follow how artifacts actually act
- Behavioral authority - Definition - behavioral authority records who consumes a retained artifact, through which channel, and with what force
- Brainstorming: how reach informs KB design - Brainstorming on Deutsch's "reach" concept applied to KB notes — reach is a maintenance risk signal (not a retrieval signal) because high-reach revisions break downstream reasoning silently
- Changing requirements conflate genuine change with disambiguation failure - Agile's 'changing requirements' hide two distinct phenomena — genuine change (world moved) and late discovery that downstream specs committed to a wrong interpretation of an underspecified upstream spec — short iterations limit interpretation-error propagation, not just change-response latency
- Choosing what to learn requires both validity and learning-value gates - Separates two promotion checks for learning loops: whether a candidate is trustworthy enough to learn from, and whether learning it would improve the current system.
- Codification - Definition - codification is constraining that crosses from natural language into a symbolic artifact with formal semantics or assigned consequences; executable code is the main practical KB case
- Codification and relaxing navigate the bitter lesson boundary - Since you can't identify which side of the bitter lesson boundary you're on until scale tests it, practical systems must codify and relax — with spec mining avoiding the vision-feature failure mode
- Codify-versus-LLM decision heuristics - Four lenses on the codify-vs-LLM decision — spec completeness, oracle strength, interpretation space, pattern stability — collected from across the KB, with evidence they come apart at the edges
- Constraining during deployment is continuous learning - Continuous learning can happen outside of weights; constraining is one symbolic-artifact form where prompts, schemas, tools, and tests accumulate durable adaptive capacity during deployment
- Continual learning's open problem is behaviour, not knowledge - Continual learning's hard part is behaviour change; knowledge accumulation fits ordinary stores, while durable behaviour change comes through weights or readable artifacts
- Designing a Memory System for LLM-Based Agents - Derives agent-memory design pressures and links to a requirements inventory for agents designing or evaluating memory systems
- Diagnostic richness constrains outer-loop learning quality - Outer-loop learning depends on inspectable failure evidence, not only on the oracle used to select winning candidates
- Enforcement without structured recovery is incomplete - The enforcement gradient covers detection and blocking but has no recovery column — recovery strategies (corrective → fallback → escalation) are the missing layer, and oracle strength determines which are viable at each level
- Ephemeral computation prevents accumulation - Ephemeral computation — discarding generated artifacts after use — trades accumulation for simplicity, making it the inverse of codification
- Ephemerality is safe where embedded operational knowledge has low reach - Kirsch's barriers all mark cases where software carries decisions that must survive into future runs, users, and audits; ephemerality is safe only when that knowledge stays local
- Error messages that teach are a constraining technique - In agent systems the error channel is an instruction channel — making errors teach the fix is nearly free and eliminates the agent's need to diagnose, an orthogonal axis to enforcement strength
- Evaluate Memory By Effects, Not By Existence - Memory should be evaluated by downstream effects on tasks, artifacts, answers, behavior, context efficiency, and lineage alignment
- Evaluation automation is phase-gated by comprehension - Optimization loops require manual error analysis and judge calibration before automation can improve behavior rather than just score
- Evolving understanding needs re-distillation, not composition - When understanding evolves, reconciling fragments into a coherent picture can exceed effective context; a pre-distilled narrative keeps the whole picture within feasible bounds
- Fixed artifacts split into exact specs and proxy theories - Fixed artifacts are safe when their spec fully captures the problem; they are risky when they encode proxy theories whose components may not compose into the larger capability
- Flat memory predicts specific cross-contamination failures that are empirically testable - Flat memory predicts three cross-contamination failures — search pollution, identity scatter, insight trapping — testable via an observation protocol against real agent systems
- In-context learning presupposes context engineering - In-context learning only works when the right knowledge reaches the context window — the selection machinery that ensures this is itself learned and refined over deployment
- Information value is observer-relative - The value of information depends on the observer — prior knowledge, computational capacity, tools, and goals determine what they can extract. Grounds distillation, discovery, and context arrangement as observer-relative operations.
- Inspectable artifact, not supervision, defeats the blackbox problem - Chollet frames agentic coding as ML producing blackbox codebases — codification counters this not by requiring human review but by choosing readable artifacts (code, prompts, schemas) that any agent can inspect, diff, test, and verify
- Knowledge artifact - Definition - a knowledge artifact is a retained artifact consumed as evidence, reference, context, explanation, or advice
- Known-target discovery benchmarks show reachability, not discovery closure - Distinguishes backcast and reinvention benchmarks from autonomous discovery: they show that target insights are reachable from supplied ingredients, not that a system can select and verify new discoveries prospectively.
- Legal drafting solves the same problem as context engineering - Legal drafting parallels context engineering because both write ambiguous natural-language specifications for judgment-based interpreters, but law develops constraining more than codification
- Lineage - Definition - lineage records the source dependencies needed to invalidate, regenerate, retire, or review retained behavior-shaping artifacts
- Links encode conditional possibilities, not obligations - Links encode conditional possibilities, not obligations — every label must name a specific reader-need (the condition under which following pays off); content required for all reachable readers should be inlined, not linked
- LLM debugging starts with retry-versus-rewrite triage - The two-phenomena model makes the first LLM debugging question diagnostic — is the failure a bad execution of a good interpretation (retry) or a consistent choice of a bad interpretation (rewrite the spec)? — because the fixes differ and do not substitute
- LLM generation relaxes a goal it can't satisfy and hides the constraint a human writer stalls on - A human writer stalls at the constraint they can't satisfy; an LLM instead ships fluent output that looks solved but silently drops it — hiding the error, so the check falls on the reader
- LLM learning phases fall between human learning modes rather than mapping onto them - Pre-training acquires both structural priors (evolution's role in humans) and world knowledge in one pass, so both pre-training and in-context learning sit at intermediate points on the evolution-to-reaction spectrum
- LLM↔code boundaries are natural checkpoints - At each LLM↔code transition both semantic underspecification and execution indeterminism collapse simultaneously, making these boundaries natural places to anchor debugging, testing, and refactoring
- Memory design adds operational axes to artifact analysis - Memory design needs operational policy axes (capture, derivation, activation, authority assignment, lifecycle, evaluation) on top of substrate, form, lineage, and behavioral authority
- Memory management policy is learnable but oracle-dependent - AgeMem stores facts in memory but learns the governing policy in distributed-parametric state; it is a clean durable-learning case, but one that depends on task-completion oracles the KB lacks
- Methodology enforcement is constraining - Instructions, skills, hooks, and scripts form a constraining gradient for methodology — from underspecified and indeterministic (LLM interprets and may not follow) to fully deterministic (code always runs), with hooks occupying a middle ground of deterministic triggers with indeterministic responses
- Minimum viable vocabulary is the naming set that most reduces extraction cost for a bounded observer - Reframes "minimum viable ontology" as an optimization problem — the vocabulary that, once acquired, maximally reduces a bounded observer's extraction cost for a domain; grounds the pedagogical intuition of "conceptual thresholds" in the KB's information-theoretic framework
- Opacity is a scale threshold, not a class property - Opacity is not a representational form; any representation becomes practically opaque at sufficient scale, though distributed-parametric artifacts cross that threshold earliest.
- Operational signals that a component is a relaxing candidate - Six operational signals — five early-detection (paraphrase brittleness, isolation-vs-integration gap, process constraints, unspecifiable failure modes, distribution sensitivity) plus composition failure as late-stage confirmation — for shifting confidence about whether a component encodes theory or specification.
- Operative part - Definition - an operative part is the behavior-affecting content, structure, parameterization, or mechanism within a retained artifact or consumption path
- Orchestration strategies and run-state have opposite persistence economics - Inside a host-language scheduler, run-state K is task-specific so it has near-zero cross-task reuse value and should stay ephemeral, while select-strategies recur and are expensive to rediscover so they are the high-value promotion target — RLM discards both, losing the valuable half
- Progressive constraining commits only after patterns stabilize - Constraining via LLM code generation freezes a single projection of the spec in one shot, but progressive constraining observes behavior across many runs and commits only the interpretations that consistently emerge
- Prose has no reliable dereference, so a declared fact must be reinforced where it applies - Code resolves a name to its value everywhere; LLM-read prose has no such dereference, so a fact stated once may not govern where it applies — restate it at the point of use, kept honest by a check
- Psychology-to-agent transfer needs per-principle failure-mode testing - Brainstorming a methodology for evaluating cognitive-science-to-agent transfer — assembled from three existing KB notes and tested against Youssef's five psychology principles as worked examples
- Raw accumulation does not create usable memory - Accumulation preserves material, but usable agent memory requires ingress work that adds handles, scope, relationships, provenance, trust signals, and lifecycle pressure.
- Representational form - Definition - representational form classifies how an operative part is encoded and consumed: prose, symbolic, distributed-parametric, or mixed
- Retained artifact - Definition - a retained artifact is retained state that a later agentic loop can consume in a behavior-shaping way, regardless of storage substrate
- Reverse compression is when LLM output expands without adding information - LLMs can inflate a compact seed into verbose prose that carries no more extractable structure — the test for whether a KB resists this is whether notes accumulate epiplexity across the network, not just token count
- RLM, Tendril, and llm-do place symbolic work at different persistence boundaries - Compares RLM variants, Tendril, and llm-do as placements for symbolic work and interfaces: ephemeral REPL code, typed RLM combinators, workspace-local generated tools, and durable unified callables
- Short composable notes maximize combinatorial discovery - The library's purpose is to produce notes that can be co-loaded for combinatorial discovery — short atomic notes are a consequence of this goal; longer synthesized artifacts belong in workshops or distilled instructions
- Silent disambiguation is the semantic analogue of tool fallback - When an agent silently resolves unacknowledged material ambiguity in a spec, final success hides that the contract failed to determine the path — an extension of the tool-fallback observability problem
- Soft-bound traditions as sources for context engineering strategies - Survey of twelve soft-bound traditions as candidate sources for context engineering strategies, with a three-tier assessment of what transfers, what's plausible, and what's blocked
- Spec mining is codification's operational mechanism - Operationalizes codification by extracting deterministic verifiers from observed stochastic behavior — the mechanism that converts blurry-zone components into calculators
- Specification strategy should follow where understanding lives - Among durable artifacts, spec-first, bidirectional spec, and spec mining fit different phases: when understanding is available upfront, discovered during execution, or only visible after observation
- Storage substrate - Definition - storage substrate records where retained state persists, as an operational field distinct from form, lineage, and authority
- Storing LLM outputs is constraining - Choosing to keep a specific LLM output resolves semantic underspecification to one interpretation and freezes it against execution indeterminism — the same constraining move the parent note describes for code, applied to artifacts
- System-definition artifact - Definition - a system-definition artifact is a retained artifact consumed with instruction, enforcement, routing, validation, configuration, evaluation, or learning force
- System-definition artifacts are crystallized reasoning under context scarcity - Heuristic system-definition artifacts (tips, playbooks, rules) are mostly crystallized reasoning; under unbounded context heuristic prose collapses into knowledge artifacts plus read-time derivation, while authority-bearing constraints and symbolic codification persist for other reasons
- Systematic prompt variation serves verification and diagnosis, not explanatory-reach testing - Controlled prompt variation either decorrelates checks or measures brittleness under fixed task semantics; Deutsch's variation test instead changes the explanation to test mechanism and reach
- The adaptation survey corroborates memory requirements but misses artifact governance - The agentic-adaptation survey supports the memory requirements map by treating memory and skills as adaptive tools, but it needs substrate, form, lineage, and authority governance to become design guidance
- The four-field record exposes an efficiency, security, and sovereignty risk triad - The four artifact-analysis fields exist to surface three architectural review concerns over retained behavior — efficiency, security, and sovereignty — with sovereignty (owner control to inspect, regenerate, delete, roll back) as the new axis
- The readable-artifact loop is the tractable unit for continual learning - Within substrate coevolution, the readable pair (prose + symbolic) is the tractable unit to build a first automated loop around — shared context, current tempo, and an existing codification boundary make joint optimization clean; the pair is also under-explored relative to distributed-parametric optimization
- Three-space agent memory echoes Tulving's taxonomy but the analogy may be decorative - The value of separating knowledge, self, and operational memory is that each has a different lifecycle — accumulation, slow evolution, and high churn; whether the Tulving mapping adds explanatory power beyond different retention policies is open
- Trace-derived memory earns authority per operation, not at capture - Trace-derived memory arrives as a record, not knowledge — authority is earned through post-capture operations (verify, distill, consult) with increasingly hard oracles; stores that stall before verification accumulate guesses masquerading as knowledge
- Treat continual learning as substrate coevolution - Behaviour change spans three representational forms — distributed-parametric, prose, and symbolic — so the coevolution question is how their improvement loops relate, not which is the real locus of learning
- Underspecification and indeterminism complicate programming for prompts in distinct ways - Indeterminism doubles test runs (statistical testing over distributions); underspecification doubles test targets (spec analysis for ambiguity). Conflating the two leads to misdiagnosis
- Unified calling conventions enable bidirectional refactoring between neural and symbolic - When agents and tools share a calling convention, components can move between neural and symbolic without changing call sites — llm-do demonstrates this with name-based dispatch over a hybrid VM
- Use Trace-Derived Extraction As Meta-Learning - Trace-derived extraction is an after-the-fact learning path that must respect signal quality, review, and readable-artifact versus distributed-parametric learning boundaries