Inspectable substrate, not supervision, defeats the blackbox problem

Type: note · Status: current · Tags: learning-theory, observability

The claim from ML

Chollet observes that sufficiently advanced agentic coding is essentially machine learning: an optimization process (coding agents) iterates against a goal (spec + tests) until convergence, producing a blackbox artifact (the generated codebase) that is "deployed without ever inspecting its internal logic, just as we ignore individual weights in a neural network."

He predicts classic ML failure modes will follow: overfitting to the spec, Clever Hans shortcuts that don't generalize, data leakage, concept drift. And asks: what will be the Keras of agentic coding — the optimal high-level abstractions for steering this process?

Where the framing breaks

The blackbox analogy holds only if the output substrate is opaque. Neural network weights are opaque to any inspector — human or LLM. But repo artifacts (prompts, schemas, evals, deterministic code) are inherently inspectable. They can be diffed, tested, reverted, and reviewed — by humans or by LLMs. The substrate is what matters, not who reviews it.

An LLM can review a diff and catch a Clever Hans shortcut in generated code. It can run evals and detect overfitting to the test suite. It can compare a codified function against its specification and flag edge cases. None of this is possible with weight updates — not because LLMs lack judgment, but because weights lack structure.

The failure mode mapping

Chollet's predicted ML problems map directly to codification failure modes — but with mitigations that weight-based systems can't match:

ML failure mode	Codification equivalent	Mitigation available
Overfitting to spec	Goodharting on evals	Broader eval sets, LLM-as-judge on unseen cases
Clever Hans shortcuts	Bad assumptions codified confidently	Diff review (human or LLM), property-based tests
Concept drift	Model drift breaking codified prompts	Regression evals, CI gates
Data leakage	Test/train contamination in eval suites	Held-out eval sets, adversarial test generation

Every mitigation relies on the same property: the artifact is inspectable. You can write a test for a function. You can't write a test for a weight.

The real question

Chollet asks "what will be the Keras of agentic coding?" — the abstraction layer that lets humans steer codebase "training" with minimal cognitive overhead. The verifiability gradient is a candidate answer: it tells you which grade of codification to use for each piece of your system, based on how verifiable you need it to be. The constrain/relax cycle is the steering mechanism — codify when patterns emerge, relax when new requirements appear. And crucially, neither the gradient nor the cycle requires a human in the loop. They require an inspectable substrate.

Relevant Notes:

codification — foundation: codification as system-level learning through repo artifacts
deploy-time-learning — the verifiability gradient that determines when and how to codify
Agentic Note-Taking 23: Notes Without Reasons — grounds: embedding latent spaces are opaque substrate; curated propositional links are inspectable — the adjacency-vs-connection distinction is inspectability applied to knowledge architecture
Harness Engineering (Lopopolo, 2026) — exemplifies: 1M lines of agent-generated code, fully repo-hosted, CI-gated, and PR-reviewed — inspectable substrate at production scale with zero manual code
agent runtimes decompose into scheduler context engine and execution substrate — component view: names inspectable repo artifacts and tools as the execution substrate layer of the runtime

Keys	Action
`?`	Open this help
`n`	Next page
`p`	Previous page
`s`	Search