Distillation
Type: note · Status: current
One of two co-equal learning mechanisms in deployed agentic systems, alongside stabilisation. Distillation is targeted extraction — taking a body of reasoning and producing a focused artifact shaped by specific circumstances: a use case, a context budget, an agent.
Why distillation exists
Different operational contexts need different things from the same body of knowledge. An agent connecting notes needs a step-by-step procedure — not fifteen methodology notes about Toulmin argument structure, link contracts, and title-as-claim conventions. An agent validating notes needs a different extraction from the same methodology. A smaller-context agent needs a more compressed version of either.
Agent statelessness makes this an architectural requirement rather than a convenience. Each session starts fresh, so the reasoning that produced a procedure can't be "remembered": it must either be loaded (expensive) or distilled into something that fits the context budget.
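A minimal sketch of that trade-off, assuming a token-counted artifact model. All names here are hypothetical, not part of any actual tooling; the point is only that when the full body of reasoning exceeds the agent's context budget, a targeted distillate gets loaded instead.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    name: str
    tokens: int        # rough size when loaded into context
    content: str = ""


def select_context(sources: list[Artifact], distillate: Artifact,
                   budget_tokens: int) -> list[Artifact]:
    """Load the full sources if they fit the budget; otherwise fall back
    to the distilled artifact targeted at this agent."""
    if sum(a.tokens for a in sources) <= budget_tokens:
        return sources            # expensive but complete
    return [distillate]           # targeted extraction, fits the budget


# Example: fifteen methodology notes vs one /connect skill.
methodology = [Artifact(f"methodology-{i}", tokens=1200) for i in range(15)]
connect_skill = Artifact("/connect skill", tokens=900)

loaded = select_context(methodology, connect_skill, budget_tokens=8000)
print([a.name for a in loaded])   # -> ['/connect skill']
```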
How distillation works
The rhetorical mode shifts to match the target. The content is selected and compressed to fit the circumstances. What stays constant is the medium — unlike crystallisation, distillation typically stays in natural language consumed by an LLM.
| Source → Distillate | Rhetorical shift | Target |
|---|---|---|
| Methodology → Skill | Argumentative → procedural | Agents performing a specific workflow |
| Workshop → Note | Exploratory → assertive | Future agents and sessions needing the insight |
| Research → Design principle | Observational → prescriptive | Decision-making in a particular area |
Targeting is itself information loss — selecting what's relevant to one context means discarding what's relevant to others. This is why the source persists: it serves many targets, and each distillation chooses a different subset. Multiple distillations of the same source are normal. Reading only the /connect skill, you can connect notes but can't adapt the procedure to a novel situation. The methodology notes handle that.
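A toy illustration of the same point, with invented claims and target names: each distillate keeps a different subset of the source, which is exactly why the source has to persist for the subsets other targets will need later.

```python
# One source, several targeted distillates (hypothetical content).
source_claims = {
    "why": "Titles are claims so links read as arguments.",
    "how": "1. Read the note. 2. Search for related claims. 3. Add typed links.",
    "edge_cases": "Skip linking when the claim is still exploratory.",
}

# Each distillation keeps only what its target needs; the rest is discarded
# from that distillate but not from the source.
targets = {
    "/connect skill (procedural)": ["how"],
    "design principle (prescriptive)": ["why"],
    "review checklist": ["how", "edge_cases"],
}

for target, keys in targets.items():
    distillate = {k: source_claims[k] for k in keys}
    print(target, "->", list(distillate))
```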
The dominant mechanism in knowledge work
Most KB learning is distillation. The typical cycle: explore messily → notice patterns → extract insight → write a note. The resulting note might then get stabilised (better description, structured sections, eventually code), but the initial learning act — the one that creates new knowledge — is extraction from a larger body of reasoning.
Stabilisation constrains what already exists. Distillation creates something new from something larger. In a knowledge system, creation matters more than hardening — you need something worth hardening first.
Relationship to stabilisation
Stabilisation and distillation are orthogonal — they operate on different dimensions of the same artifacts:
| | Not distilled | Distilled |
|---|---|---|
| Not stabilised | Raw capture (text file, session notes) | Extracted but loose (draft skill, rough note) |
| Stabilised | Committed but not extracted (stored output, frozen config) | Extracted AND hardened (validated skill, crystallised script) |
Stabilisation asks: how constrained is this artifact? Distillation asks: was this artifact extracted from something larger?
You can distil without stabilising (extract a skill — still natural language, still underspecified). You can stabilise without distilling (store an LLM output — no extraction from reasoning involved). The full compound gain comes when both apply.
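A small sketch of the two dimensions as independent flags. The class and example names are illustrative only, not an actual data model in the system; it just shows that the quadrants fall out of two questions answered separately.

```python
from dataclasses import dataclass


@dataclass
class KbItem:
    name: str
    distilled: bool    # extracted from a larger body of reasoning?
    stabilised: bool   # constrained or hardened (validated, crystallised)?


def quadrant(item: KbItem) -> str:
    if item.distilled and item.stabilised:
        return "extracted AND hardened (e.g. validated skill)"
    if item.distilled:
        return "extracted but loose (e.g. draft skill)"
    if item.stabilised:
        return "committed but not extracted (e.g. stored output)"
    return "raw capture (e.g. session notes)"


for item in [
    KbItem("session notes", distilled=False, stabilised=False),
    KbItem("draft /connect skill", distilled=True, stabilised=False),
    KbItem("stored LLM output", distilled=False, stabilised=True),
    KbItem("crystallised validation script", distilled=True, stabilised=True),
]:
    print(f"{item.name}: {quadrant(item)}")
```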
Not distillation: moving a validation check to code (crystallisation — the operation is commitment, not extraction); storing an LLM output (stabilisation — commitment, no extraction from reasoning).
Relevant Notes:
- stabilisation — co-equal mechanism: constraining the interpretation space, orthogonal to distillation
- crystallisation — the far end of stabilisation; sometimes follows distillation (extract a procedure, then crystallise it to code)
- skills derive from methodology through distillation — the full argument for distillation as the mechanism behind skill creation
- agent statelessness makes routing architectural — why distillation is architecturally necessary: context budget constraints
- deploy-time learning — the substrate (repo artifacts) through which distillation operates
- learning is not only about generality — foundation: capacity decomposes into generality vs reliability+speed+cost; distillation trades source completeness for operational efficiency
- information value is observer-relative — grounds: reframes distillation as bounded information extraction; deterministic transformations create information for bounded observers
- Epiplexity (Finzi et al., 2026) — grounds: epiplexity measures theoretically what distillation does operationally — quantifies extractable structure for a given observer under computational bounds
Topics: