Ars Contexta

Type: ../types/agent-memory-system-review.md · Status: current · Tags: trace-derived

Ars Contexta, from agenticnotetaking/arscontexta, is a Claude Code plugin for deriving an agent-operated knowledge system from an onboarding conversation. At the reviewed commit, it is not a standalone memory database. It is a plugin bundle containing setup/help/architecture skills, generated command templates, hook scripts, a generated CLAUDE.md context-file template, platform adapters, reference documents, presets, and a large methodology claim graph that together scaffold and operate a local Markdown vault.

Repository: https://github.com/agenticnotetaking/arscontexta

Reviewed commit: 2acfd5cc4473c4d06c46be63df748e77e00e2746

Last checked: 2026-06-04

Core Ideas

Derivation is the product boundary. /arscontexta:setup is framed as a six-phase derivation engine: detect platform, interview the user, infer configuration dimensions, propose a system, generate files, and validate kernel primitives (skills/setup/SKILL.md, README.md). The distinctive claim is not "use Markdown memory"; it is that a user's vocabulary and domain signals should generate the vault's folders, context file, templates, commands, hooks, and rationale.

The generated memory substrate is local files. The kernel requires Markdown files with YAML frontmatter, wiki links, MOCs, descriptions, topic footers, schema enforcement, self/notes/ops separation, task state, methodology self-knowledge, and session capture (reference/kernel.yaml, reference/three-spaces.md). Optional semantic search via qmd adds an embedding/MCP access layer, but the system is designed to work through plain files, wiki links, ripgrep, hooks, and Claude Code skills (README.md, skills/ask/SKILL.md).

Context efficiency is handled by generated operating discipline, not by a central retriever. The CLAUDE.md generator encodes discovery-first note design, session orientation, routing tables, pipeline compliance, operational learning, and feature blocks (generators/claude-md.md). The strongest context-control move is procedural: split work into skills and, for queue processing, force fresh Task subagents per phase rather than letting one long context degrade (skill-sources/ralph/SKILL.md). Read volume is managed by tree orientation, MOCs, descriptions, field queries, qmd when enabled, and command-specific loading.

Hooks turn some methodology into platform behavior. The plugin-level hook config registers SessionStart orientation and PostToolUse write validation plus async auto-commit (hooks/hooks.json). session-orient.sh prints the workspace tree, previous session state, goals, identity/methodology, recent methodology notes, and threshold warnings for observations, tensions, sessions, and inbox items (hooks/scripts/session-orient.sh). write-validate.sh emits schema warnings on note writes, and auto-commit.sh commits pending vault changes when git automation is enabled (hooks/scripts/write-validate.sh, hooks/scripts/auto-commit.sh).

The methodology graph is a retained design source. The plugin ships methodology/ research claims and reference files; /ask routes questions through research claims, guidance docs, examples, and structured references, using qmd tools when available (skills/ask/SKILL.md, methodology/index.md, reference/claim-map.md). That gives generated systems an inspectable lineage story, although the setup skill's actual reasoning remains an LLM execution rather than a replayable compiler.

Trace-derived learning exists as an operational self-improvement loop. /remember captures explicit corrections, contextual corrections, and mined session patterns into ops/methodology/ or ops/observations/; /rethink triages observations/tensions into promoted notes, implemented changes, methodology notes, archived items, or pending evidence (skill-sources/remember/SKILL.md, skill-sources/rethink/SKILL.md). The implementation is instruction-and-file driven rather than a deterministic transcript parser.

Artifact analysis

  • Storage substrate: files — The primary retained state is a generated local vault of Markdown/YAML, skills, hook scripts, config, queues, session files, observations, tensions, methodology notes, MOCs, and optional git history; optional qmd adds secondary semantic-search indexes but does not replace the file substrate.
  • Representational form: prose symbolic — Prose instructions, methodology claims, notes, manuals, and context files are combined with symbolic YAML frontmatter, JSON hook/plugin manifests, Bash hooks, queue/config schemas, qmd MCP config, and skill manifests.
  • Lineage: authored trace-extracted — The plugin bundle, kernel, feature blocks, templates, presets, and methodology graph are authored; generated vault artifacts are LLM-derived from user conversation plus those authored references; methodology notes and observations can be extracted from user corrections and session records.
  • Behavioral authority: knowledge instruction enforcement routing validation ranking learning — Methodology claims and notes provide knowledge; generated context files and skills instruct agents; hooks enforce or warn; MOCs, descriptions, /ask, /next, qmd, and queues route work; validators check schemas; search/graph commands rank or prioritize candidates; /remember and /rethink update learned operating rules.

Plugin-level skills and manifests. Storage substrate: .claude-plugin/, skills/, and skill-sources/ in the plugin repository. Representational form: JSON manifests and prose skill instructions. Lineage: authored plugin artifacts. Behavioral authority: instruction and routing authority in Claude Code, because slash commands determine how agents derive systems, ask methodology questions, inspect health, add domains, and upgrade (.claude-plugin/plugin.json, skills/setup/SKILL.md).

Generated context file. Storage substrate: the target vault's CLAUDE.md or fallback context file, generated from generators/claude-md.md and feature blocks. Representational form: prose instructions with embedded symbolic routing tables and command names. Lineage: derived from the user's conversation, presets, vocabulary transforms, kernel primitives, and feature-block selection. Behavioral authority: prompt-level instruction and routing authority when Claude Code auto-loads the context file (generators/claude-md.md, platforms/claude-code/generator.md).

Generated vault notes and operational spaces. Storage substrate: local directories such as self/, the domain notes directory, ops/, ops/methodology/, ops/observations/, ops/tensions/, ops/queue/, and ops/sessions/. Representational form: prose Markdown plus YAML frontmatter and wiki links. Lineage: authored by users and agents, generated during setup, or trace-extracted from corrections/session mining. Behavioral authority: ordinary notes are knowledge artifacts; self/, ops/methodology/, config, queue state, and context-related files become system-definition artifacts when the orient hook or skills read them.

Hook scripts. Storage substrate: Bash scripts under plugin hooks and generated platform hook templates. Representational form: symbolic shell code and JSON hook configuration. Lineage: authored plugin/platform artifacts, possibly installed or generated into the target vault. Behavioral authority: enforcement, validation, and push read-back because hooks run on Claude Code events rather than by explicit agent lookup (hooks/hooks.json, platforms/claude-code/hooks/session-capture.sh.template).

Methodology graph and references. Storage substrate: methodology/ and reference/ in the plugin. Representational form: prose claims/docs plus symbolic maps such as kernel.yaml. Lineage: authored source material for derivation and /ask; generated systems cite or embed pieces of it rather than treating it as user trace. Behavioral authority: knowledge, instruction, routing, and validation authority because setup, ask, architect, and generated methodology files consult it when deciding what a vault should contain.

Optional qmd semantic-search layer. Storage substrate: qmd configuration and indexes outside the Markdown note files. Representational form: parametric embeddings plus symbolic MCP configuration. Lineage: derived from the current vault collection. Behavioral authority: ranking and retrieval authority for /ask or generated systems that enable semantic search; actual precision is not verified from static code (README.md, generators/features/semantic-search.md).

Promotion path. Ars Contexta has a real authority ladder: conversation and authored methodology produce a context file, skills, hooks, schemas, templates, and a vault; runtime corrections become methodology notes or observations; repeated observations/tensions can be promoted to durable notes, implemented in system files, or elevated into context-file changes by /rethink. The ladder is powerful but partly LLM-governed: many transitions are specified as skill procedure rather than enforced by deterministic code.

Comparison with Our System

Dimension Ars Contexta Commonplace
Primary purpose Derive and operate a user/domain-specific Claude Code knowledge system Maintain a typed methodology KB and framework for agent-operated KBs
Main substrate Generated local Markdown vault plus Claude Code skills/hooks/context files Git-tracked Markdown collections, type specs, validation, review, sources, reports, generated indexes
Derivation model Conversational setup backed by claims, dimensions, presets, feature blocks, and LLM reasoning Human/agent-authored artifacts constrained by collection contracts, schemas, validation, and review gates
Read-back Coarse push at session start plus pull through commands/search/MOCs/qmd Mostly pull through rg, indexes, links, skills, reports, and explicit review/validation workflows
Governance Generated schemas, hook warnings, auto-commit, task queues, observation/tension triage, methodology drift checks Deterministic validation, semantic review bundles, explicit type specs, replacement archives, git diff discipline
Learning loop Corrections and sessions can become methodology notes, observations, proposals, and system changes Workshop/library promotion, review notes, validation warnings, explicit artifact revision workflows

Ars Contexta is closer to Commonplace than most reviewed systems because it treats the knowledge base as an agent-operated methodology environment rather than a passive memory store. Both systems value plain files, explicit collection/space contracts, validation, search-friendly descriptions, authored links, and operational procedures that agents can execute.

The main divergence is source of authority. Commonplace makes the type spec and validator primary: an artifact's role is declared in frontmatter and checked against a contract. Ars Contexta makes the generated CLAUDE.md, skills, hooks, and methodology folder primary: authority is assembled through a derivation conversation and then carried by Claude Code loading conventions. That is more adaptive for a new user's domain, but harder to audit after the fact because the LLM's derivation reasoning is not a deterministic build artifact unless the generated derivation manifest captures it well.

Ars Contexta also has more push read-back than Commonplace. A generated vault can load workspace structure, identity, goals, methodology, and maintenance conditions at session start without a user asking for them. That improves cold-start continuity, but it also creates context dilution risk: coarse startup context can become stale, over-broad, or too authoritative unless the validation and rethink loops keep it disciplined.

Borrowable Ideas

Treat setup as derivation from constraints, not template selection. Commonplace init flows could ask a few domain questions, choose a collection/profile shape, and write a derivation manifest explaining why. Ready only for consuming-project scaffolds; Commonplace's own methodology layer should stay explicitly authored.

Use a kernel primitive checklist for generated systems. Ars Contexta's kernel.yaml makes the minimum viable memory architecture inspectable. Commonplace could borrow a smaller checklist for project initialization and health reports. Ready now as reference documentation or a validation report, not as a hard gate for every KB.

Surface maintenance conditions at session start. The orient hook's counts for observations, tensions, sessions, and inbox items are a practical push channel. Commonplace could expose a lightweight session-start summary from validation/review/report state. Needs a clear budget so startup context does not become noisy.

Keep operational learning separate from durable knowledge. The ops/observations/ and ops/tensions/ split is compatible with Commonplace's workshop/library distinction. Ready as a candidate pattern for work-in-flight review evidence.

Do not borrow auto-commit-all as-is. The plugin's async hook stages all pending changes in the generated vault (hooks/scripts/auto-commit.sh). That may be acceptable for a single-agent generated vault, but it conflicts with Commonplace's multi-agent worktree discipline and explicit staging rules.

Borrow the "fresh context per phase" pressure, not mandatory subagent spawning everywhere. Commonplace already uses focused skills and review bundles. Ars Contexta's RALPH rule is a useful reminder that quality drops when a long-running agent chains too many phases in one context, but Commonplace should apply it where the task actually crosses phase boundaries.

Write side

Write agency: manual automatic — Users and agents author files directly or through skills, while setup, hooks, /remember, /rethink, /ralph, and generated processing skills can create, update, validate, commit, triage, or promote retained artifacts.

Curation operations: dedup evolve synthesize invalidate promote/remember checks existing methodology and observations before writing, /reweave and /reflect update existing notes/MOCs, /setup and /rethink synthesize new system artifacts or proposals, contradiction handling can supersede or archive older guidance, and observations/tensions can be promoted into notes or implemented system changes.

Trace-derived learning

Trace source: session-logs tool-traces — The intended loop consumes current conversation corrections, stored session files under ops/sessions/, and hook/tool-event surfaces such as SessionStart and PostToolUse.

Learning scope: per-project cross-task — Generated vault learning is project-local, but methodology notes and context-file updates influence future tasks across that vault.

Learning timing: online offline staged — Explicit/contextual /remember can run during work; session mining runs later over stored sessions; /rethink stages triage, pattern detection, proposals, and approved implementation.

Distilled form: prose symbolic — Corrections become prose methodology notes or observations with symbolic YAML; mature findings can alter prose context files, templates, config, or skill instructions.

Trace source. Ars Contexta qualifies as trace-derived learning because it defines durable artifacts derived from agent/user interaction traces. /remember can mine the current conversation for corrections or scan ops/sessions/ for user corrections, repeated redirections, workflow breakdowns, agent confusion, undocumented decisions, and escalation patterns (skill-sources/remember/SKILL.md). The plugin-level hook currently writes session identity/state at SessionStart, while the platform template describes Stop-event session capture; the strongest transcript-mining behavior is specified in the skill procedure rather than proved as a complete transcript-ingestion implementation (hooks/scripts/session-orient.sh, platforms/claude-code/hooks/session-capture.sh.template).

Extraction. The extraction oracle is mostly LLM judgment under detailed skill instructions. /remember classifies friction into methodology notes or observations, deduplicates against existing files, and can mark sessions as mined. /rethink then reads pending observations/tensions, compares them with methodology and config, and classifies each as PROMOTE, IMPLEMENT, METHODOLOGY, ARCHIVE, or KEEP PENDING (skill-sources/remember/SKILL.md, skill-sources/rethink/SKILL.md).

Scope and timing. The loop is vault-scoped and staged. Immediate corrections can become methodology notes during the current work. Stored session records can be mined offline. Accumulated observations and tensions trigger later rethink passes, which may promote insights, update methodology, implement config/context changes, or leave evidence pending.

Survey position. On the trace-derived learning survey, Ars Contexta belongs in the operational-friction-to-methodology family. It strengthens the distinction between raw trace/state files, intermediate observation/tension evidence, and high-authority distilled context or methodology rules. It also shows a weak point: a trace-derived loop can be architecturally real even when much of the extraction and promotion policy is encoded as LLM skill procedure rather than deterministic code.

Read-back

Read-back: both — Retained memory is pushed through Claude Code startup/context and hook outputs, while agents and humans also pull memory through skills, qmd/ripgrep searches, graph commands, MOCs, and file reads.

Read-back signal: coarse — SessionStart orientation loads broad vault structure, current session state, goals, identity, recent methodology notes, and maintenance counts for the whole vault rather than selecting query-specific memory.

Faithfulness tested: no — I found validation hooks and health/verification commands, but no implemented ablation or perturbation test showing that pushed memory changes future agent behavior.

The push edge is strongest in Claude Code. CLAUDE.md is auto-loaded by the host, and session-orient.sh emits workspace structure, session state, goals, identity/methodology, recent methodology snippets, and condition warnings before the agent acts (platforms/claude-code/generator.md, hooks/scripts/session-orient.sh). This is coarse push: the occasion is session start and threshold state, not semantic relevance to the user's current request.

Pull read-back remains substantial. /ask searches the methodology graph and references; /health, /graph, /next, /stats, qmd MCP tools, ripgrep, MOCs, descriptions, and explicit file reads all let an agent request memory when needed (skills/ask/SKILL.md, skill-sources/graph/SKILL.md, README.md). Optional qmd can make pull semantically broader, but it is still invoked as a search/get interface.

Selection scope is project/vault-wide at push time. The orient hook caps some content by convention, such as showing a tree to depth 3 and only the five newest methodology files, but it does not enforce a token budget or quality rank for all injected material. Effective authority is therefore structural but not quality-proven: if the host loads it, it reaches the model; whether the model follows it is untested.

Curiosity Pass

The README and implementation disagree on the hook set. The README names four hooks including Session Capture on Stop, while hooks/hooks.json registers SessionStart and PostToolUse hooks only; the SessionStart script comments that capture moved there, and a Stop hook exists as a platform template (README.md, hooks/hooks.json, hooks/scripts/session-orient.sh, platforms/claude-code/hooks/session-capture.sh.template). The architecture is clear, but the deployed hook boundary should be watched.

The validation layer is narrower than the methodology. write-validate.sh checks frontmatter, description, and topics for notes paths; the kernel and generated instructions describe richer schema, discovery, graph, and session-capture invariants. A generated vault may still rely on skills and agent behavior for many quality gates.

Auto-commit is a double-edged adoption affordance. In a single-user vault, committing after writes protects memory from loss. In a shared worktree, git add -A can sweep unrelated work. Ars Contexta is optimized for generated-vault operation, not multi-agent repository hygiene.

The methodology graph is both evidence and product payload. Because generated systems can read the plugin's methodology graph through /ask, the research claims are not just setup-time references. They remain a live knowledge substrate for architecture advice.

Mandatory fresh subagents are a strong claim about context quality. RALPH's instruction that every queue task must use a Task subagent is unusually explicit. It may be exactly right for heavy processing pipelines, but it depends on the host exposing reliable subagent machinery.

What to Watch

  • Whether hooks/hooks.json and the platform Stop hook converge. That decides how much session capture is actually automatic in installed plugin behavior.
  • Whether setup writes a durable derivation manifest with enough claim-level provenance to audit why each generated file exists.
  • Whether validation grows from warning-only shell checks toward generated schemas that can deterministically enforce the kernel.
  • Whether qmd/search results begin driving startup read-back. That would move Ars Contexta from coarse push plus pull search toward targeted inferred push.
  • Whether /remember --mine-sessions gains deterministic transcript discovery/parsing for Claude Code JSONL or other host logs. That would strengthen the trace-derived classification.
  • Whether generated systems add tests that compare agent behavior with and without pushed methodology/context. That would turn read-back faithfulness from assumed to measured.

Bottom Line

Ars Contexta is an agent-native knowledge-system generator and operating layer. Its memory design is not centered on vectors or a service database; it is centered on deriving a local file graph, generated instructions, skills, hooks, and operational learning loops that Claude Code can run. For Commonplace, the most useful lessons are the kernel checklist, session-start maintenance surfacing, and explicit trace-to-methodology loop. The main caution is that much of the intelligence is procedural LLM instruction, so auditability depends on how well generated artifacts preserve derivation rationale and on how strongly hooks/validators constrain later behavior.

Relevant Notes: