025-Complete generated indexes are build-time only; agents read curated heads plus scoped rg
Type: ../types/adr.md · Status: accepted
Status: accepted Date: 2026-06-09
Context
- Generated navigation is committed and read whole by agents: the per-collection
dir-index.mdand the per-tag## Other tagged notestail.cp-skill-connect(per 003-connect-skill-discovery-strategy) readskb/notes/dir-index.mdfirst. A full index read costs tens of KB before any candidate body is selected; under sub-agent decomposition, per-fork load then grows with total collection size rather than with the task. - These complete listings are pure denormalization: every title and description already lives in the note's own frontmatter.
- The root cause is an access-mode asymmetry: an agent reads an index whole (linear cost into bounded context, every byte counted), while a human skims, scrolls, and Ctrl-Fs (sublinear). The same complete listing is context debt for the agent and a convenience for the human — which is why we design for the first-time human except where access cost dominates.
- ADR 003 bet that index-first discovery would stay viable by keeping the collection curated and loading the index in portions. Collection growth has made the complete-index read the dominant per-fork cost, so that bet no longer holds.
Decision
Complete generated listings are build-time-only; the agent read path is curated heads plus scoped rg.
- Retire complete generated listings from git and the agent read path — both the per-collection
dir-index.mdand the per-tag## Other tagged notestail. - Regenerate them at mkdocs build time for human readers, where browser scroll and find restore sublinear access. One source of truth (note frontmatter), two materializations.
- Keep curated heads committed at every scope: directory
README.md/COLLECTION.md, and the editorial body of each tag index. The tag index is the tag's README; its generated tail is the detachable part. - Agents discover via curated heads plus scoped
rg; add no new query command, because scoped rg recovers the operative part of the retired index. Documented recipes: by tag,rg -l '^tags:.*\bTAG\b' kb/notes/ --glob '*.md' | xargs -r rg -N --no-heading '^description:\s*' -r ''(thexargs -rguard prevents an empty match from falling back to a whole-repo search); by keyword,rg '^description:' <scoped path>. Codifying these into acommonplace-*command is deferred until a recurring failure justifies it. - Colocation is conditional on weight: a small tag keeps its short generated list colocated under the curated head; a popular tag detaches it, because a curated index is an agent read surface and must stay context-feasible. (Refined by ADR 026: nothing generated is committed at any size — weight became the type contract on the committed curated head instead of a colocation dial.)
This supersedes ADR 003's primary-discovery decision (read dir-index.md first). 003's surviving element — curated focused indexes as a discovery surface — is retained as the curated-heads path; complete-index reads are removed from standard connect discovery.
Consequences
Easier: - Per-fork load scales with the task, not the collection: write, connect, and ingest stop paying tens of KB of denormalized index on every fork. - Adding a note no longer requires keeping a heavy committed index current for agents (003's maintenance burden); the human site regenerates from source.
Harder / migration:
- git rm plus gitignore the ~50 committed dir-index.md files; strip the committed generated tail from tag indexes.
- cp-skill-connect standard discovery is rewritten to drop complete-index reads (curated heads plus scoped rg; broad scans reserved for deep mode).
- Reference sweep: navigation.md, CLAUDE.md Key Indexes, and any README or curated index that links to a dir-index.md.
- The mkdocs build becomes responsible for human-facing inventories.
Risks / watch:
- Scoped rg has a footgun: a tag that matches zero files makes xargs run rg with no path argument, which searches the whole repo. Mitigated by xargs -r; recurring trips would be the signal to codify a command.
- rg yields path + description, not the human H1 title; the path stands in for the title in triage. If title-in-output proves necessary, that is the case a command would justify.
- Description quality stays load-bearing (inherited from 003).
Deferred (mechanism, not direction):
- mkdocs generation mechanism: a commonplace-refresh-indexes --for-build mode the hook calls, or a dedicated plugin (must emit both the directory inventory and the per-tag listing).
- Curated-index weight reporting plus soft/hard thresholds (bytes vs entry count; scope).
- Whether curated indexes must declare focused-routing vs archival, or whether "curated = focused" suffices.
Links
- 003-connect-skill-discovery-strategy — supersedes: this replaces 003's index-first primary discovery while retaining its curated-focused-index surface
- context-efficiency-is-the-central-design-concern-in-agent-systems — rationale: the per-fork cost of a whole-index read is what makes retiring it worthwhile
- two-context-boundaries-govern-collection-operations — rationale: per-fork load is the boundary this decision optimizes
- feasibility-is-the-heaviest-forks-net-load — rationale: feasibility is set by the heaviest fork's net load, which the complete-index read inflates
- index-curation-adds-orientation-that-generation-cannot-produce — rationale: why curated heads stay committed while generated listings move to build time
- cp-skill-connect — procedure: the discovery skill whose standard read path this decision rewrites