Context engineering

Type: note · Status: seedling · Tags: computational-model

Context engineering is the architectural discipline of designing systems around bounded-context computation. The immediate problem is getting the right knowledge into a bounded context at the right time, but the scope is wider than prompt assembly. If context is the governing constraint, the structures that determine what can be loaded, when, and what survives across boundaries also belong to context engineering.

Anthropic defines it as "strategies for curating and maintaining the optimal set of tokens during LLM inference" (Anthropic, 2025). This KB's treatment is consistent with that operational view but wider in architectural scope: when bounded context is the scarce resource, context efficiency becomes the central design concern, and whole-system structure must be designed around it.

The operational core decomposes into four components within a single bounded call:

Routing — deciding what knowledge is relevant before loading it. The instruction-specificity/loading-frequency match (always-loaded → on-reference → on-invoke → on-demand) is routing. CLAUDE.md as a router is routing. Retrieval-oriented descriptions that let agents decide "don't follow this" without loading the target are routing.
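The tiered routing decision can be sketched in code. This is a minimal illustration, not an implementation from this KB: the tier names mirror the always-loaded → on-reference → on-invoke → on-demand ladder above, and the matching rules (substring checks against the request, membership in an invoked set) are stand-in heuristics. The point the sketch makes is that routing consults only cheap metadata (name, tier, description), never the entry bodies.

```python
from enum import Enum

class Tier(Enum):
    ALWAYS = "always-loaded"
    ON_REFERENCE = "on-reference"
    ON_INVOKE = "on-invoke"
    ON_DEMAND = "on-demand"

def route(entries, request: str, invoked: set[str]) -> list[str]:
    """Decide which entries to load *before* loading their bodies.

    Each entry is (name, tier, description); the decision uses only this
    cheap metadata, so nothing is paid for until an entry is chosen.
    """
    chosen = []
    for name, tier, description in entries:
        if tier is Tier.ALWAYS:
            chosen.append(name)
        elif tier is Tier.ON_REFERENCE and name in request:
            chosen.append(name)
        elif tier is Tier.ON_INVOKE and name in invoked:
            chosen.append(name)
        # Tier.ON_DEMAND entries load only when the agent asks mid-task.
    return chosen
```

A retrieval-oriented description would let an agent make the same "don't follow this" call itself; the sketch just shows that call being made without loading the target.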

Loading — assembling the prompt from selected knowledge. The select function in the bounded-context orchestration model formalizes this: given state K and budget M, build prompt P with |P| ≤ M. Loading includes both what to include and how to frame it — the same knowledge under different framing has different extractable value.
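A greedy sketch of the select function under the |P| ≤ M constraint, assuming relevance scores were already assigned by the routing step and approximating token cost by character count for brevity; the `Item` type is hypothetical:

```python
from typing import NamedTuple

class Item(NamedTuple):
    text: str
    relevance: float  # assumed to come from the routing step

def select(K: list[Item], M: int) -> str:
    """Greedily pack the highest-relevance items into prompt P, |P| <= M.

    Cost is measured in characters here as a stand-in for tokens.
    """
    P: list[str] = []
    used = 0
    for item in sorted(K, key=lambda i: i.relevance, reverse=True):
        cost = len(item.text)
        if used + cost <= M:
            P.append(item.text)
            used += cost
    return "\n".join(P)
```

The framing half of loading is not captured here: a real select would also choose how each item is rendered into P, since the same knowledge under different framing has different extractable value.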

Scoping — isolating what each consumer sees. Sub-agents as lexically scoped frames is scoping. The flat context has no scoping; architecture must impose it.
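The lexically-scoped-frame idea can be made concrete with a sketch. All names here (`run_subagent`, `work`, `summarize`) are hypothetical stand-ins: the child sees only the keys its parent explicitly passes, and only a compressed artifact crosses back, so the flat context never leaks into the frame or out of it.

```python
def work(task: str, frame: dict) -> str:
    # Stand-in for a bounded LLM call that only sees its frame.
    return f"log of {task} using {sorted(frame)}"

def summarize(transcript: str) -> str:
    # Stand-in for distillation at the boundary: artifact, not transcript.
    return transcript[:60]

def run_subagent(task: str, parent_context: dict, visible: list[str]) -> str:
    """Run a sub-agent inside a lexically scoped frame.

    Scoping is explicit: the frame holds only the keys the parent names,
    nothing is inherited from the flat context.
    """
    frame = {k: parent_context[k] for k in visible}
    transcript = work(task, frame)
    return summarize(transcript)
```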

Maintenance — keeping loaded context healthy over time. Compaction, observation masking, and the workshop layer's holistic-rewrite discipline are maintenance. Without maintenance, context accumulates debris that degrades reasoning even when token counts are low.
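Observation masking, one of the maintenance moves named above, can be sketched as a pass over message history. This is an illustrative shape, not a prescribed implementation: stale tool observations are replaced with a short placeholder while the most recent ones stay verbatim, which is the debris-removal half of maintenance (compaction and holistic rewriting go further).

```python
def mask_observations(history: list[dict], keep_last: int = 2) -> list[dict]:
    """Replace bodies of old tool observations with a placeholder.

    Keeps the last `keep_last` tool messages verbatim; everything else in
    the history (user/assistant turns) passes through untouched.
    """
    obs_indices = [i for i, m in enumerate(history) if m["role"] == "tool"]
    stale = set(obs_indices[:-keep_last]) if keep_last else set(obs_indices)
    out = []
    for i, m in enumerate(history):
        if i in stale:
            out.append({"role": "tool",
                        "content": f"[masked: {len(m['content'])} chars]"})
        else:
            out.append(m)
    return out
```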

Distillation — compressing knowledge for a specific task under a context budget — is the main operation these components perform, but not the only one. The bounded-context orchestration model formalizes the machinery: the solve loop where a symbolic scheduler drives routing, loading, and scoping decisions for each bounded LLM call.
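The solve loop can be sketched end to end. Everything here is a hypothetical stand-in for the bounded-context orchestration model's machinery: `route_keys` and `load_prompt` are toy routing and loading steps, the `DONE:` convention is invented for the sketch, and the essential property is that each iteration assembles a fresh prompt under budget M rather than growing a transcript.

```python
def route_keys(state: dict, K: dict) -> list[str]:
    # Toy routing: a knowledge key is relevant if the task mentions it.
    return [k for k in K if k in state["task"]]

def load_prompt(state: dict, K: dict, names: list[str], M: int) -> str:
    # Toy loading: assemble P from selected knowledge plus distilled notes,
    # with hard truncation keeping |P| <= M.
    body = "\n".join(K[k] for k in names) + "\n" + "\n".join(state["notes"])
    return (state["task"] + "\n" + body)[:M]

def solve(task: str, K: dict, M: int, llm, max_steps: int = 8) -> str:
    """Symbolic scheduler driving bounded LLM calls.

    State lives in `state`, outside the model; each call gets a freshly
    routed, loaded, budget-bounded prompt.
    """
    state = {"task": task, "notes": []}
    for _ in range(max_steps):
        names = route_keys(state, K)            # routing
        prompt = load_prompt(state, K, names, M)  # loading, |P| <= M
        result = llm(prompt)                    # one bounded call
        if result.startswith("DONE:"):
            return result[5:].strip()
        state["notes"].append(result)           # distilled carry-over, not transcript
    return "budget exhausted"
```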

Architectural scope beyond a single call

The operational core succeeds or fails based on decisions made before and after prompt assembly:

Storage format — knowledge stored in forms that are cheap to retrieve selectively. Notes, descriptions, and indexes are context-engineering structures because they determine whether routing can happen before full loading.

Knowledge lifecycle — how raw interaction becomes reusable knowledge and how that knowledge is curated over time. A KB that only accumulates transcripts has already failed the context problem upstream.

Session boundaries — a system can inherit transcript history by default or treat each call as a fresh assembly problem. The principle that session history should not be the default next context is context engineering at the boundary level, not just the prompt level.

Inter-agent communication — when sub-agents return compressed artifacts instead of full transcripts, the boundary itself becomes a context-engineering primitive. Execution boundaries are natural sites for distillation.

Tool and interface design — tool descriptions, instruction surfaces, and generated interfaces consume context budget too. Frontloading and build-time generation shift interpretive cost out of the live context window.

A system with poor storage shape, transcript-oriented boundaries, or verbose tool surfaces cannot be rescued by a clever selector alone.


Relevant Notes: