Ingest: From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence

Type: kb/sources/types/ingest-report.md

Source: from-entropy-to-epiplexity-rethinking-information-computationally-bounded.md
Captured: 2026-03-05
From: https://arxiv.org/html/2601.03220v1

Classification

Type: scientific-paper -- The source is an arXiv paper with formal definitions, theorems, proposed measurement methods, and applications to machine learning phenomena.
Domains: information-theory, learning-theory, context-engineering, computational-complexity
Authors: Marc Finzi, Shikai Qiu, Yiding Jiang, Pavel Izmailov, J. Zico Kolter, and Andrew Gordon Wilson are ML researchers at Carnegie Mellon University and New York University; Kolter and Wilson are established figures in optimization, robust ML, Bayesian deep learning, and ML theory.

Summary

The paper introduces epiplexity, or epistemic complexity, as a measure of the structural information a computationally bounded observer can extract. Its motivating claim is that Shannon entropy and Kolmogorov complexity miss phenomena that matter in learning systems: training on the output of deterministic generators can improve model capabilities, data ordering can change what models learn, and observers can extract structure that is not obvious from the data-generating process alone. Epiplexity separates time-bounded entropy, which remains unpredictable under a computation budget, from learnable structure, which becomes accessible to a bounded learner. The paper's main value for this KB is not a new agent architecture but a formal vocabulary for why the same tokens, source, or artifact can have different value depending on the reader's computational budget, prior knowledge, tools, and framing.
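
One way to picture the split the summary describes, in notation of ours rather than the paper's (the snapshot does not record the formal definitions): under a compute budget T, the codelength a bounded observer achieves on data X divides into structure it can learn and residue it cannot predict.

```latex
% Sketch only; the paper's actual definitions and notation may differ.
L_T(X) \;\approx\;
  \underbrace{S_T(X)}_{\text{epiplexity: learnable structure}}
  \;+\;
  \underbrace{H_T(X)}_{\text{time-bounded entropy: residual unpredictability}}
```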

Connections Found

The connect report found seven direct connections. The central one is information value is observer-relative: the paper grounds the note's claim by giving a formal, ML-facing measure of pattern extractability under bounded computation, though the note is still a seedling, so this grounding should remain provisional. The rest:

  - Grounds distillation: deterministic restructuring can create usable information for a bounded consumer even when it adds no classical information.
  - Grounds reverse compression: the paper supplies a candidate measure for whether expanded prose adds extractable structure.
  - Enables minimum viable vocabulary: prequential coding could test whether a vocabulary makes more domain structure accessible.
  - Extends codification and relaxing navigate the bitter lesson boundary: the paper offers a way to reason about whether an extracted regularity is genuine learnable structure or an observer-relative artifact.
  - Grounds context efficiency and extends agent context is constrained by soft degradation: complexity-sensitive context failure becomes limited structure extraction, not merely token volume.

Extractable Value

  1. Observer-relative information value gets a formal anchor -- The KB already has the concept, but epiplexity gives it a sharper mechanism: value depends on what structure a bounded observer can extract within a computational budget. This is high-reach because it explains distillation, context arrangement, and vocabulary design with one frame. [quick-win]

  2. Distillation can be framed as information creation for bounded observers -- Classical information theory makes deterministic restructuring look informationally neutral or lossy; epiplexity explains why a distillate can be more valuable than its source to a bounded agent. This strengthens distillation without changing its operational definition. [quick-win]

  3. Prequential coding suggests a measurement path for KB artifact quality -- The paper's cheap heuristic, the area under the loss curve above the final loss, could inspire experiments comparing note structures, vocabulary packs, or context-loading strategies by extractability rather than by token count or subjective clarity (a minimal sketch of the heuristic follows this list). This is promising, but method-transfer risk is high because the paper measures model learning on datasets, not one-shot agent reading. [experiment]

  4. Context complexity can be treated as extractability, not only indirection or token load -- The context-efficiency notes currently describe complexity as a cost axis; epiplexity gives a unifying account of that axis: tokens are costly when the relevant structure is hard to extract under bounded compute. This could tie together context complexity, framing, ordering, and soft degradation. [deep-dive]

  5. Minimum viable vocabulary becomes testable in principle -- If a vocabulary lowers extraction cost, the before/after difference should show up as more accessible structure for the same observer and corpus (see the second sketch after this list). The exact measurement is not ready, but the source gives a concrete target for future evaluation design. [experiment]

  6. Codification bets can be described as bets on durable structure -- Epiplexity gives codification and relaxing a sharper distinction: codify regularities that are stable learnable structure, stay cautious around apparent patterns that only current observers can cheaply see. This is useful as a conceptual check, not yet an engineering rule. [just-a-reference]
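
The heuristic in item 3, sketched under our own assumptions: run online (prequential) prediction over an artifact, record per-step losses, and integrate how far the curve sits above its final value. The function name and the trapezoidal integration are our illustration choices, not the paper's implementation.

```python
def area_above_final_loss(losses: list[float]) -> float:
    """Area under a prequential loss curve, above its final loss.

    losses[t] is the online loss on example t under a model fit only to
    examples 0..t-1. A large area means the model improved a lot before
    converging, i.e. the data held structure it could extract; a flat
    curve means little learnable structure at this compute budget.
    """
    if len(losses) < 2:
        return 0.0
    final = losses[-1]
    excess = [max(x - final, 0.0) for x in losses]
    # Trapezoidal rule over unit-width steps.
    return sum((a + b) / 2.0 for a, b in zip(excess, excess[1:]))
```

One reading of why this is a sensible cheap estimate: in prequential/MDL terms, cumulative log-loss is a codelength, so the excess over the final loss is roughly the bits the learner spent acquiring the structure it eventually found.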
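
The second sketch, for item 5: a hypothetical before/after harness reusing area_above_final_loss. Here prequential_losses stands in for whatever evaluation loop produces an online loss curve; it is an assumed interface, not an API from the paper or this KB, and which direction of change counts as "more accessible structure" is itself part of the open design question the item flags.

```python
from typing import Callable

def vocabulary_effect(
    corpus: str,
    vocabulary: str,
    prequential_losses: Callable[[str], list[float]],  # assumed interface
) -> dict[str, float]:
    """Compare extractable structure with and without a vocabulary preamble.

    Returns both raw areas rather than a verdict: the evaluation design
    (item 5) still has to decide how a "more accessible" corpus should
    move this metric.
    """
    baseline = area_above_final_loss(prequential_losses(corpus))
    primed = area_above_final_loss(
        prequential_losses(vocabulary + "\n\n" + corpus)
    )
    return {"baseline_area": baseline, "primed_area": primed}
```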

Limitations (our opinion)

The snapshot is a compressed source summary, not the full paper text, so this ingest should not lean on unquoted theorem details or empirical results beyond what the snapshot records. The strongest KB use is conceptual grounding, not direct adoption of the paper's measurements.

The central claim is hard to vary at the high level: if observers are computationally bounded, then usable information must depend on extraction cost. But several paper-to-KB transfers are easier to vary. "Use prequential coding to evaluate notes" is an analogy from dataset learning curves to knowledge-artifact usefulness; the source does not test agent note consumption, link traversal, single-session context loading, or human review. That makes the measurement proposal an experiment, not a practice.

The simpler account for many KB applications is "framing lowers search cost." Epiplexity may be the formal theory behind that account, but the KB does not need the full machinery to explain everyday cases like good descriptions, claim-shaped titles, or vocabulary lists. For operational writing guidance, simpler cost and retrieval language may be more actionable.

The paper also appears task-independent in ambition, but the KB's use cases are task-shaped. A note's value depends on what the agent is trying to do, not only on structural extractability in the abstract. This means epiplexity can ground the pattern-extraction side of information value is observer-relative, but it does not replace goal-relative value of information, decision utility, or review-specific judgment.

Recommended Action

Update information-value-is-observer-relative.md: add a section distinguishing two observer-relative mechanisms, "pattern extractability under computational bounds" and "goal-relative decision value." Use this source to ground the first mechanism, and keep the decision-theory and relevance-theory material for the second. That would let the note absorb epiplexity without overstating it as a complete theory of information value.