Decomposition heuristics for bounded-context scheduling
Type: note · Status: seedling · Tags: computational-model
These rules follow from the symbolic scheduling model. Each heuristic is a transformation between programs — and since any symbolic program with bounded calls is a select/call program, applying a heuristic keeps you within the model's program space. The rules are preliminary — we expect to discover more as the model develops.
What is being optimised
The optimisation problem is:
Choose a decomposition, a prompt-construction strategy, and a schedule of state transformations that maximises expected task utility while respecting the per-call bound M.
Even before underspecified semantics enter, there are several objective terms:
- total token traffic between scheduler state and LLM calls
- number of agent calls
- peak prompt size
- information loss from compression
- preservation of cross-item interactions needed by later synthesis
This makes the problem different from ordinary knapsack-style context packing. The scheduler must trade off:
- early filtering against the risk of discarding something that matters later
- aggressive summarisation against the risk of destroying interactions needed for synthesis
- many narrow calls against the overhead of orchestration
- loading raw state-derived material against saving task-shaped intermediate items
The first two are about optionality — paying context now to keep options open later. The latter two are about cost structure — choosing between representations and decompositions with different efficiency profiles. Both kinds of trade-off are present in every scheduling decision.
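The objective terms and trade-offs above can be scalarised into a single cost for comparison between candidate schedules. The sketch below is illustrative only: the field names, the weights, and the linear form are assumptions of this note, not part of the model.

```python
from dataclasses import dataclass

# Hypothetical summary of one candidate schedule; field names are assumptions.
@dataclass
class ScheduleStats:
    token_traffic: int       # total tokens moved between state and LLM calls
    num_calls: int           # number of agent calls
    peak_prompt: int         # largest single prompt; must respect the bound M
    info_loss: float         # estimated loss from compression, in [0, 1]
    interaction_loss: float  # cross-item interactions destroyed, in [0, 1]

def schedule_cost(s: ScheduleStats, M: int,
                  w=(1e-4, 0.5, 2.0, 3.0)) -> float:
    """Scalarised cost: lower is better. Weights are illustrative."""
    if s.peak_prompt > M:        # per-call bound is a hard constraint
        return float("inf")
    w_tok, w_call, w_info, w_inter = w
    return (w_tok * s.token_traffic
            + w_call * s.num_calls
            + w_info * s.info_loss
            + w_inter * s.interaction_loss)
```

A real scheduler would compare such costs across decompositions rather than optimise one schedule in isolation; the linear form hides that the interaction-loss term is only known after later synthesis.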
Working heuristics
These are proposed rules, not established principles. The first two have direct empirical grounding (see below); the rest are conjectured from the model's structure.
Separate selection from joint reasoning. First use cheap narrow calls to discover sparsity. Only then pay for wide calls that need multiple items together. When the relevant set is irreducibly dense — most items interact with most others — this separation has limited room to operate; the scheduler may need to fall back on hierarchical merging with explicit interface preservation rather than filtering.
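A minimal sketch of this two-phase pattern, assuming a hypothetical `llm` callable that answers one prompt per bounded call; the prompt wording and the `est_tokens` estimator are illustrative, not prescribed by the model.

```python
def select_then_reason(items, task, llm, M, est_tokens):
    # Phase 1: cheap narrow calls, one item at a time, to discover sparsity.
    relevant = [it for it in items
                if llm(f"Is this relevant to {task}? yes/no\n{it}") == "yes"]
    # Phase 2: one wide call over the surviving items, only if it fits in M.
    if sum(est_tokens(it) for it in relevant) <= M:
        return llm(f"{task}\n" + "\n---\n".join(relevant))
    # Dense case: filtering did not help enough; fall back to hierarchical
    # merging with explicit interface preservation (not sketched here).
    raise NotImplementedError("relevant set too dense for a single joint call")
```

The dense fallback is deliberately left unimplemented: when most items interact with most others, phase 1 buys little and the scheduler is in the hierarchical-merge regime described above.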
Use symbolic operations wherever exactness is available. Retrieval, thresholding, sorting, prompt assembly, and name-based routing should be performed outside the LLM window whenever possible.
Save reusable intermediate items in scheduler state. Relevance labels, extracted claims, and task-specific summaries are worth keeping when they are much cheaper to reuse than reconstructing the originals.
Delay expensive co-loading until interactions justify it. Joint loading is valuable only when the task depends on relations between items rather than independent judgments about them.
Commit low-degree-of-freedom choices first. When one decision has only a narrow feasible set and another has many workable options, decide the constrained one first. This preserves optionality and avoids consuming scarce valid placements too early. Example: in a multi-source synthesis task, the output schema (few valid options) should be fixed before selecting which sources to foreground (many workable orderings).
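The ordering rule reduces to sorting pending decisions by the size of their feasible sets; a minimal sketch, where the decision names and option counts are illustrative:

```python
def decision_order(decisions):
    """decisions: dict mapping decision name -> list of feasible options.
    Returns names sorted so the most constrained decision is committed first."""
    return sorted(decisions, key=lambda name: len(decisions[name]))
```

In the synthesis example from the text, an output schema with two valid options would be committed before a source ordering with many workable permutations.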
Do not compress away needed interfaces. If the final answer depends on tensions, contradictions, or alignments between sources, summaries should preserve pointers or extracted structures that keep those interactions recoverable.
Choose representations, not just subsets. The main optimisation variable is often not which items to load, but whether to expose bodies, extracts, summaries, or previous synthesis items to the bounded call.
Exploit clean frames recursively. When the relevant set is still too large, apply the same pattern again: filter, cluster, compress, and merge in a tree rather than a flat history. The stopping condition is when decomposition overhead (extra calls, interface loss) exceeds the compositional-complexity benefit — but this note does not yet specify how to detect that threshold.
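The recursive pattern can be sketched as a divide-and-merge over the item set. Here `filter_fn`, `compress_fn`, and `merge_fn` stand in for bounded LLM calls and `size` estimates tokens; all of these names, and the even split, are assumptions of this sketch. Note it hard-codes the stopping condition the text says is still unspecified: it recurses purely on size.

```python
def recursive_reduce(items, size, M, filter_fn, compress_fn, merge_fn):
    items = [it for it in items if filter_fn(it)]   # filter: discover sparsity
    if sum(size(it) for it in items) <= M:
        return merge_fn(items)                      # fits the bound: one joint call
    if len(items) == 1:                             # single oversized item: compress it
        return merge_fn([compress_fn(items[0])])
    mid = len(items) // 2                           # still too large: split and recurse
    halves = [recursive_reduce(items[:mid], size, M, filter_fn, compress_fn, merge_fn),
              recursive_reduce(items[mid:], size, M, filter_fn, compress_fn, merge_fn)]
    # Merge compressed subtree results; a real scheduler would preserve
    # cross-subtree interfaces here rather than compress blindly.
    return merge_fn([compress_fn(h) for h in halves])
```

Clustering before splitting (so that interacting items land in the same subtree) is where the interface-preservation heuristic above would enter; the even split here is the naive baseline.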
Empirical grounding
ConvexBench (Liu et al., 2026) validates two rules through distinct mechanisms. "Use symbolic operations wherever exactness is available" is confirmed by the design: expression structure is recovered via deterministic AST parsing rather than LLM calls. Separately, "exploit clean frames recursively" is supported by two results: scoped recursion — pruning history to retain only direct dependencies — recovers F1=1.0 at all depths from F1≈0.2 under flat accumulation, and finer decomposition (10-character sub-functions) consistently outperforms coarser decomposition. The F1 recovery comes from the focused-context scoping strategy (an LLM-call management technique), not from the symbolic parsing step; these are distinct interventions in the source. Both results hold despite trivial token counts (5,331 tokens at depth 100), confirming that the rules respond to compositional complexity, not volume.
MAKER (Meyerson et al., 2025) demonstrates the extreme case: maximal decomposition to m=1 (one step per bounded call) achieves O(s ln s) cost scaling and solves a 1,048,575-step task with zero errors. Without decomposition, cost scales exponentially. This validates "delay expensive co-loading until interactions justify it" — each step depends only on the current disk configuration, so independent calls dominate joint reasoning. The O(s ln s) scaling is a joint result of decomposition and MAKER's voting-based error correction (first-to-ahead-by-k); decomposition alone does not guarantee it. MAKER also illustrates the limit of "exploit clean frames recursively": when every sub-task is atomic, the recursive pattern degenerates to a flat sequence of maximally focused calls — though the source does not frame its architecture as recursive.
Both sources operate in the hard-oracle regime (mechanically verifiable sub-step correctness). Whether these heuristics hold equally for soft-oracle tasks — synthesis, creative work, ambiguous judgment — remains untested.
Open questions
- Which classes of lossy derived items preserve enough structure for later synthesis?
- How should a scheduler detect the stopping condition for recursive decomposition — when overhead exceeds compositional-complexity benefit?
- Do the heuristics transfer to soft-oracle tasks where per-step correctness cannot be verified mechanically?
Sources:
- Liu et al. (2026). ConvexBench: Can LLMs recognize convex functions? — finer decomposition and focused context recover full performance from compositional collapse.
- Meyerson et al. (2025). MAKER: Solving a million-step LLM task with zero errors — maximal decomposition achieves O(s ln s) cost scaling on million-step tasks.
Relevant Notes:
- symbolic scheduling over bounded LLM calls is the right model for agent orchestration — foundation: the model these rules follow from
- context efficiency is the central design concern in agent systems — cost model: context is the scarce resource these rules optimise over
- distillation — mechanism: saved intermediate items are often distillations shaped for later reuse
- solve low-degree-of-freedom subproblems first to avoid blocking better designs — extends: general ordering heuristic that explains why constraint-setting should happen before flexible synthesis choices
- topology, isolation, and verification form a causal chain for reliable agent scaling — extends: "exploit clean frames recursively" implements the topology → isolation step of a proposed causal chain
- any symbolic program with bounded calls is a select/call program — certifies: heuristic-guided transformations stay within the model's program space