Harness taxonomy convergence
Five sources address agent harness architecture, but they split into two kinds: component decompositions (what a runtime is made of) and operational/control vocabularies (how a runtime is steered and corrected). The alignment question is narrower and more defensible once this split is explicit.
Sources
Component decompositions
- Vtrivedy10 — derives 6 components from model limitations (filesystem, bash, sandboxes, memory/search, context management, long-horizon execution)
- Raschka — derives 6 components from "what makes agent-mode outperform plain chat" (live repo context, prompt shape, tool access, context bloat minimization, session memory, bounded subagents)
- This KB (commonplace) — derives components from first principles of LLM limitations, expressed as implementation rather than taxonomy. CLAUDE.md as control-plane router, skills/instructions as on-demand context loading, file-backed notes as execution substrate, sub-agents for scoped work
Operational / control vocabularies
- Lopopolo — derives 3 pillars from production practice at 1M LOC (constrain, inform, verify/correct) plus entropy management
- Cybernetics thread — reframes as control theory (sensors, actuators, feedback loops, externalized judgment)
The alignment question
The three component decompositions were developed independently. Do they converge on the same structural boundaries?
Separately: the operational vocabularies describe what you do with a runtime, not what it's made of. Do they describe a second axis (governance/maintenance) that the structural decomposition doesn't capture?
Structural convergence table
| Function | Vtrivedy10 | Raschka | Commonplace | KB component |
|---|---|---|---|---|
| Control flow / iteration / decomposition | Long-horizon execution (Ralph Loop) | Bounded Subagents (delegation, recursion limits) | Sub-agents for scoped work; skill orchestration | Scheduler |
| What enters each call | Context management (compaction, progressive loading) | Live Repo Context + Prompt Shape | CLAUDE.md as control-plane router; routing table; progressive disclosure via skills/instructions | Context engine |
| Context maintenance / bloat | Context management | Context Bloat Minimization | Instruction specificity matching loading frequency; on-demand skill bodies | Context engine |
| Working memory / retrieval | Memory/search | Structured Session Memory | MEMORY.md; /connect discovery; search patterns in CLAUDE.md | Context engine + substrate |
| Durable state | Filesystem | — | File-backed notes, sources, instructions; git as versioning layer | Execution substrate |
| Tool execution | Bash | Tool Access | Python scripts (validate, review, selector); bash via harness | Execution substrate |
| Safety boundaries | Sandboxes | — | Permission modes; hooks | Execution substrate |
Governance / maintenance axis
The operational vocabularies and commonplace's own systems populate a second axis that runs across the structural components:
| Function | Lopopolo | Cybernetics | Commonplace |
|---|---|---|---|
| Quality enforcement | Constraints (structural tests, linters) | Externalized judgment | /validate (deterministic); review gates (LLM judgment) |
| Correction | — | — | Fix system (applies corrections from review findings) |
| Drift detection / staleness | Entropy management (cleanup agents) | Feedback loops | Review sweeps; staleness detection; ack for trivial changes |
| Informing | Context engineering (AGENTS.md as map) | Sensors (what the system observes) | CLAUDE.md routing; skill descriptions |
Observations
- The category error is productive. This workshop started as if there were five peer decompositions. The observations reveal a cleaner split: three component taxonomies and two operational vocabularies. The real convergence claim is narrower and more defensible.
- The three component decompositions converge. Vtrivedy10, Raschka, and commonplace independently land on something like scheduler, context engine, and execution substrate. They use different vocabulary and granularity, but the structural boundaries align.
- The runtimes-decompose note's three questions are structural, not operational. "What happens next?" (scheduler), "what does this call see?" (context engine), "where do exact state and actions live?" (substrate). They don't answer "how is the system monitored and corrected over time?" — that's the governance axis.
- Governance emerged as a separate axis, not a fourth component. Commonplace's validate/review/fix systems were built as distinct machinery, not as extensions of the scheduler, context engine, or substrate. Lopopolo's pillars (constrain/inform/verify) and the cybernetics vocabulary (sensors/actuators/feedback) describe the same axis from outside. This is evidence for two axes (structure × governance), not a missing fourth structural component.
- Commonplace is convergence-through-construction, not independent corroboration. The KB was built on the same first principles it uses to analyze external sources. This is strong construct validity (the split was operationally useful enough to build against) but weaker than outside evidence. Vtrivedy10 and Raschka are the independent corroboration; commonplace confirms the split is buildable.
Outcome paths
- If the claim stays structural: fold the three-way convergence table into the convergence section of
agent-runtimes-decompose-into-scheduler-context-engine-and-execution-substrate.md. Mention commonplace as construct validity, not peer evidence. - If the stronger claim emerges — runtime structure and runtime governance are separate axes: that's a standalone note. The structural decomposition answers "what is the runtime made of?" The governance axis answers "how is the runtime monitored and corrected?" Both are needed; neither subsumes the other. The review/fix system documentation we just wrote is evidence that governance needed its own machinery.
Current workshop artifacts for the stronger claim: