Context engineering
Type: note · Status: seedling · Tags: computational-model
Context engineering is the architectural discipline of designing systems around bounded-context computation. The immediate problem is getting the right knowledge into a bounded context at the right time, but the scope is wider than prompt assembly. If context is the governing constraint, the structures that determine what can be loaded, when, and what survives across boundaries also belong to context engineering.
Anthropic defines it as "strategies for curating and maintaining the optimal set of tokens during LLM inference" (Anthropic, 2025). This KB's treatment is consistent with that operational view but broader in architectural scope, because context efficiency is the central design concern: when bounded context is the scarce resource, whole-system structure must be designed around it.
The operational core decomposes into four components within a single bounded call:
Routing — deciding what knowledge is relevant before loading it. The instruction-specificity/loading-frequency match (always-loaded → on-reference → on-invoke → on-demand) is routing. CLAUDE.md as a router is routing. Retrieval-oriented descriptions that let agents decide "don't follow this" without loading the target are routing.
Loading — assembling the prompt from selected knowledge. The select function in the bounded-context orchestration model formalizes this: given state K and budget M, build prompt P with |P| ≤ M. Loading includes both what to include and how to frame it — the same knowledge under different framing has different extractable value.
Scoping — isolating what each consumer sees. Sub-agents as lexically scoped frames is scoping. The flat context has no scoping; architecture must impose it.
Maintenance — keeping loaded context healthy over time. Compaction, observation masking, and the workshop layer's holistic-rewrite discipline are maintenance. Without maintenance, context accumulates debris that degrades reasoning even when token counts are low.
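The loading step above can be sketched as greedy budgeted assembly. This is a minimal illustration, not an implementation from the source: the `Item` type, the relevance scores, and the relevance-per-token heuristic are all hypothetical stand-ins for whatever the routing step produces.

```python
from dataclasses import dataclass

@dataclass
class Item:
    """A candidate piece of knowledge with a routing score and token cost."""
    text: str
    relevance: float  # hypothetical score produced by the routing step
    tokens: int       # cost against the context budget M

def select(items: list[Item], budget: int) -> list[Item]:
    """Greedy loading: pack the most relevant items while |P| <= budget holds."""
    chosen, used = [], 0
    # Prefer high relevance per token, so cheap relevant items beat bulky ones.
    for item in sorted(items, key=lambda i: i.relevance / i.tokens, reverse=True):
        if used + item.tokens <= budget:
            chosen.append(item)
            used += item.tokens
    return chosen

candidates = [
    Item("project conventions", relevance=0.9, tokens=400),
    Item("full transcript of last session", relevance=0.3, tokens=6000),
    Item("note on the routing hierarchy", relevance=0.8, tokens=300),
]
prompt = select(candidates, budget=1000)  # the bulky transcript loses out
```

Note that framing is deliberately absent here: as the loading component says, what to include is only half the job.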
Distillation — compressing knowledge for a specific task under a context budget — is the main operation these components perform, but not the only one. The bounded-context orchestration model formalizes the machinery: the solve loop where a symbolic scheduler drives routing, loading, and scoping decisions for each bounded LLM call.
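The shape of that solve loop can be sketched as follows. Everything here is illustrative: the `ToyScheduler`, its method names, and the use of `str.upper` as a stand-in LLM are assumptions, not an API from the orchestration model itself.

```python
def solve(task, scheduler, llm, budget):
    """Sketch of the solve loop: a symbolic scheduler makes routing,
    loading, and scoping decisions for each bounded LLM call."""
    state = {"task": task, "artifacts": []}
    while not scheduler.done(state):
        refs = scheduler.route(state)              # routing: decide what is relevant
        prompt = scheduler.load(refs, budget)      # loading: assemble P with |P| <= budget
        frame = scheduler.scope(prompt, state)     # scoping: isolate what this call sees
        result = llm(frame)                        # one bounded call
        state = scheduler.maintain(state, result)  # maintenance: compress, mask, rewrite
    return state["artifacts"]

class ToyScheduler:
    """Degenerate scheduler for illustration: finishes after one call."""
    def done(self, state): return bool(state["artifacts"])
    def route(self, state): return ["note-a"]
    def load(self, refs, budget): return " ".join(refs)[:budget]
    def scope(self, prompt, state): return prompt
    def maintain(self, state, result):
        return {**state, "artifacts": state["artifacts"] + [result]}

artifacts = solve("demo task", ToyScheduler(), llm=str.upper, budget=100)
```

The point of the sketch is the division of labor: the LLM only ever sees one bounded frame; everything symbolic (routing, loading, scoping, maintenance) lives in the scheduler.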
Architectural scope beyond a single call
The operational core succeeds or fails based on decisions made before and after prompt assembly:
Storage format — knowledge stored in forms that are cheap to retrieve selectively. Notes, descriptions, and indexes are context-engineering structures because they determine whether routing can happen before full loading.
Knowledge lifecycle — how raw interaction becomes reusable knowledge and how that knowledge is curated over time. A KB that only accumulates transcripts has already failed the context problem upstream.
Session boundaries — a system can inherit transcript history by default or treat each call as a fresh assembly problem. Deciding that session history should not be the default next context is context engineering at the boundary level, not just the prompt level.
Inter-agent communication — when sub-agents return compressed artifacts instead of full transcripts, the boundary itself becomes a context-engineering primitive. Execution boundaries are natural sites for distillation.
Tool and interface design — tool descriptions, instruction surfaces, and generated interfaces consume context budget too. Frontloading and build-time generation shift interpretive cost out of the live context window.
A system with poor storage shape, transcript-oriented boundaries, or verbose tool surfaces cannot be rescued by a clever selector alone.
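To make the storage-format point concrete, here is a minimal sketch of routing before loading: a hypothetical note store where short retrieval-oriented descriptions are always cheap to scan and full bodies are only loaded once routing has decided they matter. The store layout, note names, and word-overlap heuristic are all illustrative assumptions.

```python
# Hypothetical note store: descriptions are cheap to scan at all times;
# full bodies are only paid for after routing decides they are relevant.
NOTES = {
    "routing-hierarchy": {
        "description": "when to load instructions: always, on-reference, on-demand",
        "body": "(full note text, loaded lazily)",
    },
    "session-boundaries": {
        "description": "why transcript history should not be the default next context",
        "body": "(full note text, loaded lazily)",
    },
}

def route(query_terms: set[str]) -> list[str]:
    """Decide relevance from descriptions alone, without loading any body."""
    return [
        name for name, note in NOTES.items()
        if query_terms & set(note["description"].split())
    ]

def load(names: list[str]) -> str:
    """Only now pay the token cost of the full bodies."""
    return "\n\n".join(NOTES[n]["body"] for n in names)

relevant = route({"transcript", "history"})  # matched on descriptions only
```

If the notes were stored as raw transcripts instead, `route` would have nothing cheap to scan, which is exactly the upstream failure the storage-format point describes.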
Relevant Notes:
- distillation — the main operation context engineering performs: compressing knowledge for a task under a budget
- context efficiency is the central design concern — grounds: if bounded context is the governing cost model, context engineering must be architectural rather than local to prompt assembly
- bounded-context orchestration model — formalisation: the select/call loop that structures context engineering decisions
- instruction specificity should match loading frequency — mechanism: the routing hierarchy (always-loaded → on-demand)
- LLM context is composed without scoping — mechanism: sub-agents as the scoping component
- agents navigate by deciding what to read next — mechanism: routing through retrieval-oriented descriptions
- session history should not be the default next context — extends: session-boundary design determines whether context is inherited as transcript or reassembled per call
- frontloading spares execution context — extends: interface and instruction design can move interpretive cost out of the live context window
- legal drafting solves the same problem as context engineering — parallel: law's centuries of methodology for the same problem
- in-context learning presupposes context engineering — extends: in-context learning only works when context engineering has selected the right knowledge; this makes context engineering a prerequisite, not just an optimization
- a functioning KB needs a workshop layer not just a library — extends: knowledge lifecycle and artifact shape are upstream determinants of what later becomes cheap to load
- agent runtimes decompose into scheduler context engine and execution substrate — component view: the runtime's context engine is the operational core inside the broader architectural discipline named here