Notes need quality scores to scale curation

Type: note · Status: seedling

The /connect skill searches for candidate notes by keyword, then an agent evaluates each candidate for genuine connections. At current scale (~100 notes) this works — the candidate list is short enough for an agent to scan. As the KB grows, keyword searches will return dozens or hundreds of candidates, most not worth evaluating. Curation doesn't scale if every candidate gets equal attention.

The fix is a quality score per note that lets /connect filter and rank candidates before the agent evaluates them. Only the top N candidates get full attention.

Scoring dimensions

Status is the strongest signal. A current note has been reviewed and endorsed — it's worth linking to. A seedling hasn't been vetted; linking to it couples your note to something that might be pruned. An outdated note should rarely be a link target.

Status Score weight Rationale
current highest reviewed, endorsed, stable
speculative medium deliberately exploratory but acknowledged
seedling low unvetted, may be pruned
outdated lowest superseded, kept for reference only

Type reflects structural maturity. A structured-claim with Evidence/Reasoning sections is a more valuable link target than a text with no frontmatter — it's a developed argument you can reason with.

Inbound link count is a social proof signal. A note that many other notes link to has been repeatedly judged worth connecting to. This is the graph-topology version of citation count. But it must be weighted by link strength — ten footer "related" links count less than three inline "since [X]" premise links.

Recency matters differently per content type. A source snapshot from yesterday is more relevant than one from six months ago. A design principle note doesn't decay — older often means more refined. Type-dependent recency decay captures this: different content types age at different rates.

Content type Recency decay Why
Source snapshots, practitioner reports Fast Practices and tools evolve
Design notes, structured claims None Timeless if still current
Task-related, session artifacts Fast Operational, high churn
ADRs None Permanent record

Where scores get used

Connect candidate filtering. /connect searches for keywords, gets N results, filters to top M by score, agent evaluates M. This is the primary driver — without it, /connect doesn't scale past a few hundred notes.

Retrieval ranking. When qmd returns semantic search results, scores rerank them so the best notes surface first.

Quality signals. The quality signals note identifies graph-topology and content-proxy signals. Note scores are the composite of those signals — a single number that summarises "how valuable is this note as a link target?"

Implementation spectrum

At our current scale, none of this needs implementation. When it does, there's a spectrum:

  • Cheapest: /connect filters by status (rg '^status: current') and ignores seedlings. No scoring, just a hard filter. Probably enough for 200-500 notes.
  • Medium: A script computes scores from frontmatter + inbound link count, writes them to a derived index. /connect reads the index for candidate ranking.
  • Full: The SQLite index from the retrieval scoring layer proposal, with composite scores, recency decay, and per-note overrides.

The right point on this spectrum depends on when /connect starts struggling. We'll know because the agent will start producing low-quality connections or taking too long to evaluate candidates.

Open questions

  • Should scores be visible in frontmatter, or computed on demand? Visible scores are queryable but add maintenance burden. Computed scores are always fresh but need tooling.
  • How do you bootstrap scores for new notes? A fresh note with no inbound links scores low, but it might be exactly the right link target. Some grace period or "new note bonus" might be needed.
  • Does filtering by score create a rich-get-richer problem? Well-connected notes get more connections, new notes get ignored. This is the scaling version of the orphan detection problem.

Relevant Notes:

Topics: