The verifiability gradient

Type: kb/types/note.md · Tags: learning-theory

> Software 1.0 easily automates what you can specify. Software 2.0 easily automates what you can verify. — Andrej Karpathy, Verifiability

Karpathy identifies three properties that make a task verifiable: resettable (you can retry), efficient (retries are cheap), and rewardable (you can evaluate results automatically). The more verifiable a task is, the more you can hill-climb on it — through RL at training time, or through iteration at runtime.

Symbolic artifacts used in deploy-time learning sit on a gradient of verifiability:

| Grade | Example | Resettable | Efficient | Rewardable |
| --- | --- | --- | --- | --- |
| Restructured prompts | Breaking a monolithic prompt into sections | Yes | No — requires human review | No — judgment call |
| Structured output schemas | JSON schemas constraining response format | Yes | Yes — automated | Partial — shape is checked, content is not |
| Prompt tests / evals | Assertions over LLM output across test cases | Yes | Yes — automated | Mostly — statistical pass rates |
| Deterministic modules | Code that replaces what was previously LLM work | Yes | Yes — automated | Yes — pass/fail |
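The "partial" rewardability of structured output schemas can be made concrete with a minimal sketch. The schema, field names, and validator below are illustrative assumptions (real systems typically use a library such as jsonschema); the point is that the check verifies shape automatically but says nothing about whether the content is right:

```python
import json

# Hypothetical schema for an LLM extraction task: required keys and types.
# Illustrative only; not any particular library's schema format.
SCHEMA = {"name": str, "email": str, "age": int}

def validate_shape(raw: str, schema: dict) -> bool:
    """Check that an LLM response parses as JSON and matches the schema's shape.

    Only partially rewardable: this verifies structure, not whether the
    extracted values are actually correct.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return all(isinstance(data.get(key), typ) for key, typ in schema.items())

print(validate_shape('{"name": "Ada", "email": "ada@example.com", "age": 36}', SCHEMA))  # True
print(validate_shape('{"name": "Ada"}', SCHEMA))  # False: missing keys
```

A response with a well-formed shape but a wrong email address still passes, which is exactly the gap between this grade and the two below it.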

Moving down the table, verification gets cheaper and sharper. Restructured prompts need a human to judge quality; deterministic module tests run in milliseconds and return a boolean. Hardened artifacts are diffable, executable, testable, and reviewable — a memory note like "remember to validate emails" is none of those.
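The email example above can be pushed to the bottom of the table. A sketch, under the assumption that the memory note "remember to validate emails" has been hardened into a deterministic module (the regex is a deliberately simplified pattern, not full RFC 5322):

```python
import re

# The memory note "remember to validate emails", hardened into code.
# Simplified illustrative pattern; real-world email validation is messier.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def is_valid_email(s: str) -> bool:
    return bool(EMAIL_RE.fullmatch(s))

# Fully rewardable: tests run in milliseconds and return a boolean.
assert is_valid_email("ada@example.com")
assert not is_valid_email("not-an-email")
```

Unlike the prose note, this artifact is diffable, executable, testable, and reviewable.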

The gradient runs both ways

The tempting reading is that learning means climbing the gradient — prompts become schemas become tests become code. But that treats hardening as the whole game. Learning means moving in either direction based on evidence:

  • Tighten when verification holds up across runs. Repeated judgment calls become schemas; stable behavior becomes tests; settled algorithms become code.
  • Loosen when the verification itself turns out to be wrong — a test that passes while quality regresses, a schema that excludes valid outputs, an eval that fell to Goodhart's law. When the check that justified a constraint breaks, the constraint loses its warrant.
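A pass rate over a fixed case set is the kind of evidence both moves rely on. The cases, threshold, and harness below are hypothetical: a stable rate across runs is the warrant for tightening, while a rate that stays green as quality visibly regresses is the Goodhart signal for loosening.

```python
# Hypothetical prompt eval: assertions over LLM outputs across test cases.
# Each case pairs an input with a substring the output must contain.
CASES = [
    ("Summarize: the cat sat on the mat.", "cat"),
    ("Summarize: dogs bark at night.", "bark"),
]

def run_eval(outputs: list[str], cases=CASES, threshold: float = 0.9) -> bool:
    """Return True when the pass rate over all cases meets the threshold."""
    passed = sum(required in out for (_, required), out in zip(cases, outputs))
    return passed / len(cases) >= threshold

print(run_eval(["The cat sat.", "Dogs bark loudly."]))   # True: 2/2 pass
print(run_eval(["A feline sat.", "Dogs bark loudly."]))  # False: 1/2 < 0.9
```

Note the second failure: "A feline sat." may be a fine summary, yet the substring check rejects it — a small instance of a check diverging from the quality it stands in for.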

Verifiability makes the choice possible. You cannot decide whether to tighten or loosen without being able to check whether the current level is working — that is the practical payoff of the gradient's Karpathy-style properties.

The note "Codification and relaxing navigate the bitter lesson boundary" develops the bidirectional dynamics and the signals that trigger relaxation. This note specifies the ladder those movements happen along.

Placement, not climbing

The individual practices the gradient unifies — prompt versioning, eval-driven development, CI-gated prompt testing, automated prompt optimization (DSPy, ProTeGi), evaluator-guided program evolution (FunSearch) — are each well-established in LLMOps. What the gradient adds is a placement test: for any given artifact, you can ask where it currently sits and whether that grade matches its evidential maturity. A judgment call hardened into a deterministic module before its pattern stabilizes is misplaced; a stable behavior still living as loose prose is also misplaced.


Relevant Notes: