Agent Memory Systems
Type: kb/types/index.md
External systems doing similar work — knowledge management for AI agents, context engineering, structured note-taking. We track these not just to borrow ideas but to watch how they evolve. Convergence across independent projects is a stronger signal than any single design argument.
Navigation: directory index.
Two coverage tiers. Systems with open-source repos get the deep path: clone the repo, read the code, write a review note here. Systems known only from a README or paper get the lightweight path: snapshot a single page into kb/sources/, run /ingest, and optionally add a standard note under source-only/ when the system needs a stable place in this collection. The comparative review synthesises across both tiers. Database-backed memory systems (Mem0, Graphiti, Letta, A-MEM, AgeMem) currently have only lightweight coverage via ingest reports in kb/sources/.
Coverage planning: Adaptation survey review candidates maps the agentic-adaptation survey's memory and skill systems to existing reviews and likely additions.
Systems
- ACE — playbook-learning loop with generator, reflector, and curator roles; strongest nearby artifact-learning analogue to Autocontext, with bullet-level helpful/harmful counters but an append-heavy maintenance path
- Agent-R — iterative self-training pipeline that mines MCTS search trees into corrected conversation traces and fine-tuning data; clearest search-to-weights learning system in this queue
- Agent-S — Simular computer-use agent stack where S1/S2 maintain JSON experience memory, S2.5/S3 shift toward prompt-time reflection and code-agent delegation, and BBoN captions/judges benchmark trajectories
- Agent Skills for Context Engineering — skill-based context engineering reference library loaded as agent guidance; strong on operational patterns, no learning theory
- Agent Workflow Memory — web-agent workflow induction system that distills annotated examples and successful task trajectories into reusable website-scoped prompt workflows; strong trace-to-procedure baseline, but evidence and lifecycle stay outside the workflow artifact
- Agentic Harness Engineering — observability-driven outer loop that evaluates a NexAU coding agent with Harbor, distills rollout traces through Agent Debugger, and promotes evidence-backed edits into prompts, tools, middleware, skills, sub-agents, and memory files
- AgentFly — Memento planner-executor research agent that turns judged benchmark runs into JSONL case memory and optional trained case-selector weights for planner prompt reuse
- AgeMem — source-only paper coverage of an RL-trained LTM/STM memory-management policy; trace-derived trajectory-to-weights case, but no local code-inspected review
- Archie — Arch Linux config repo with Stow-managed multi-root deployment, Incus dev VMs, and agent-executable work-item docs; strong operational packaging, no real knowledge-learning loop
- AriGraph — TextWorld graph-memory agent that distills observations into an in-run triplet graph plus episodic observation store, then uses that memory for planning, retrieval, exploration, and navigation affordances
- Ars Contexta — Claude Code plugin that generates knowledge systems from conversation; ancestor of our KB, upstream source for link semantics and title-as-claim. Includes the "Agentic Note-Taking" article series (@molt_cornelius) — first-person agent testimony from inside the system
- Atomic — database-backed personal KB that stores markdown atoms in SQLite/Postgres, enriches them into embeddings/tags/semantic edges, and maintains per-tag wiki articles; strongest nearby database-first counterexample with a real derived wiki layer
- Amazon Science SAGE — AppWorld Skill Augmented GRPO codebase where generated Python functions act as transient rollout skills whose successful reuse becomes SFT/GRPO training signal
- auto-harness — minimal benchmark-driven outer-loop for improving one coding-agent file with regression-suite promotion and held-out-score gating; strongest reviewed example so far of hard-oracle workshop automation kept intentionally small
- Autocontext — closed-loop control plane for iterative agent improvement via multi-role orchestration (competitor/analyst/coach/architect), tournament evaluation, accumulated playbooks, and MLX distillation; strongest reference for automated iterative learning loops, but context "compilation" is concatenation with budget-aware trimming, not transformation
- ARIS — markdown skill-pack and helper toolkit for autonomous ML research workflows, with project-stage artifacts, a small research wiki, verifier-backed paper audits, and hook-log-derived skill patch proposals
- Awesome Agent Memory — TeleAI-UAGI curated bibliography of agent-memory products, papers, benchmarks, surveys, articles, and workshops; useful source-discovery map, not an implemented memory system
- Binder — local-first typed knowledge graph with markdown/YAML projections, schema-as-data, and immutable transactions; clearest reviewed example here of database-first structure surfaced as editable files
- browzy.ai — terminal personal knowledge base that compiles raw sources into a markdown wiki, uses SQLite FTS as a derived retrieval layer, and writes lightweight session-derived digests and insight drafts
- ByteRover CLI — source-available coding-agent CLI with file-backed .brv/context-tree, tiered retrieval, live scoring/review/manifest layers, git-like context-tree version control, and four connector modes; strongest production reference so far for packaging file-backed memory into other coding-agent environments, though automatic archiving still looks less central than the paper's broader lifecycle framing
- cass-memory — cross-agent procedural memory with three-layer cognitive architecture (episodic/working/procedural), confidence-decayed playbook bullets, and trauma guard; closest production sibling to ACE's playbook-learning loop, with genuine cross-agent session mining
- ClawVault — TypeScript memory system with scored observations, session handoffs, and reflection pipelines; has a working workshop layer where we have theory, strongest source of borrowable patterns for ephemeral knowledge
- Claude Context Guard — Claude Code continuity scaffold built from safeguard files, prompt-defined recovery skills, and light hooks; strongest reviewed example so far of workshop-state preservation without a dedicated runtime
- Cludebot — Generative Agents-inspired memory SDK with five-type taxonomy, type-specific decay, six-phase dream cycles (consolidation + compaction + contradiction resolution + action learning), entity knowledge graph, Hebbian co-retrieval reinforcement, and clinamen anomaly retrieval; richest reviewed trajectory-to-lesson learning loop, heavily oriented to a social-bot use case
- Cognee — pipeline-first knowledge engine (add/cognify/memify/search) with Pydantic-schema graph extraction, poly-store backends (graph + vector + relational), and multi-tenancy; strongest database-side counterexample to files-first architecture, but treats knowledge as a data engineering problem rather than a curation problem
- CocoIndex — Rust-backed incremental indexing framework with a Python dataflow DSL, Postgres tracking tables, and broad vector/graph target connectors; strongest reviewed example so far of derived-index maintenance as a layer below the primary knowledge substrate
- Context Constitution — Letta’s instruction-first governance corpus for agents, treating context management as identity, memory, and continuity policy; strongest reviewed case of a related system defined mainly by doctrine rather than code
- CORAL — multi-agent coding harness with per-agent git worktrees, eval-gated attempt tracking, checkpointed shared notes/skills, and heartbeat prompts; clearest lightweight open-source outer loop for collaborative code search with artifact sharing
- cq — Mozilla.ai's local-first shared agent knowledge commons with SQLite local/team stores, approval-gated team sharing, and plugin-packaged query/propose/confirm loop; strongest reviewed reference so far for lightweight cross-agent operational learning, though the richer trust and guardrails layers remain mostly conceptual
- CrewAI Memory — in-framework memory layer for crews, agents, and flows, with LLM-scoped vector records, composite recall, async save barriers, agent memory tools, and HITL lesson distillation
- Decapod — Rust governance kernel for AI coding agents with proof-gated completion, workspace isolation, and 120+ embedded constitution documents; strongest reference for hard-oracle verification in agent workflows, though constitution claims transformation where the code primarily relocates
- DocMason — repo-native document-analysis workspace with staged/published KB boundaries, multimodal evidence channels, provenance tracing, and sync-time promotion of host interaction logs into published memories
- Dynamic Cheatsheet — test-time adaptive memory with cumulative cheatsheet carryover and optional retrieval-synthesis variants; strong artifact-learning baseline, but the actual maintenance path is whole-document rewrite rather than structured mutation
- engraph — Obsidian vault server with SQLite hybrid index, wikilink graph expansion, section-level writes, and local MCP/HTTP surfaces; strongest local-first derived index over a human note substrate
- EQUIPA — multi-agent coding orchestrator with git-worktree dev/test loops, SQLite run memory, trace-derived lessons/rules/prompt tuning, and partial training-data export from the same execution traces
- ExpeL — cross-task experiential learning pipeline with separate trajectory gathering, rule extraction, prompt-time trace retrieval, and explicit ADD/EDIT/REMOVE/AGREE rule maintenance; clearest trajectory-to-rule artifact-learning example in this queue
- Exocomp — Go coding-agent harness with role-scoped tools, sandboxed execution, and file-backed bug/changelog coordination; execution controls are real, but planning and sub-agent workflows are still stubbed
- Fintool — AI agent for professional investors; S3-first with derived PostgreSQL, markdown skills with copy-on-write shadowing, ~2000 eval test cases; strongest production-scale evidence for filesystem-first at commercial grade (lightweight coverage only — ingest report, no repo review)
- GBrain — personal-brain CLI and MCP layer that indexes markdown-derived compiled-truth/timeline pages into Postgres+pgvector, with agent skillpacks for trace-to-entity enrichment and brain maintenance
- G-Memory — multi-agent memory harness with state-graph trajectory capture, task-neighborhood retrieval, and scored text insights; strongest reviewed example so far of mixed memory substrates inside one benchmark agent system
- Gnosis — repo-local Go CLI for agent-written why-memory, with JSONL entries, disposable SQLite FTS search, and doctrine-mediated live capture from coding sessions
- HALO — Context Labs trace-analysis engine that indexes OTel JSONL agent runs, gives an LLM bounded trace-inspection and sandboxed-analysis tools, and turns diagnostic reports into coding-agent-mediated harness edits
- getsentry/skills — Sentry's shared skills repo with a skill-writer meta-skill that codifies the skill creation process: source-driven synthesis with depth gates, labeled iteration, description-as-trigger optimization, and the Agent Skills cross-tool spec; strongest reference for how to systematically create and improve agent skills
- Graphiti — temporally-aware knowledge graph with bi-temporal edge invalidation; strongest counterexample to files-first architecture and strongest temporal model in the surveyed systems (lightweight coverage only — ingest report, no repo review)
- Hindsight — biomimetic agent memory with LLM-driven fact extraction, four-way parallel retrieval (semantic + BM25 + graph + temporal), auto-consolidation into observations, and agentic reflection; strongest production evidence that three-space memory separation yields measurable retrieval gains (LongMemEval SOTA)
- HippoRAG — graph-augmented RAG library that converts documents into OpenIE triples, entity/fact embeddings, and an igraph PageRank retrieval graph; strongest associative-retrieval baseline here, but not an agent-governed memory lifecycle
- Hyalo — Rust CLI for Obsidian-compatible markdown vaults with single-pass scanning, ephemeral snapshot indexes, mutation-safe link operations, and one-command Claude bootstrap
- HyperAgents — self-referential code-agent evolution harness with diff archives, Docker lineage replay, staged benchmark evaluation, and scored parent selection; strongest reference here for outer-loop self-editing over executable agent code, though the checked-in meta agent is much thinner than the framing suggests
- KBLaM — Microsoft Research model-architecture experiment that turns flat key-value KB records into trained attention key/value tensors; strongest reviewed reference so far for moving retrieval into model attention, but not a maintainable agent KB
- LACP — local agent control plane with risk-tier routing, Claude hooks, Obsidian memory automation, and provenance receipts; strongest reviewed reference for governance-heavy local agent operations around existing CLIs
- LLM Wiki (kenhuangus) — executable local-first markdown wiki pipeline with Python ingestion, LLM extraction/merge, BM25 query, monitors, FastAPI/React UI, and a partial prompt-optimization loop; strongest nearby contrast to the promptware-only LLM Wiki protocol
- LLM Wiki — Claude Code plugin and portable AGENTS protocol for topic-isolated compiled markdown wikis; strongest nearby reference for packaging a whole knowledge-system workflow as prompt artifacts rather than executable software
- Letta — agent-self-managed three-tier memory hierarchy using OS analogy (main context ≈ RAM, archival ≈ disk, recall ≈ conversation log); strongest existing exemplar of the agent-self-managed agency model (lightweight coverage only — ingest report, no repo review)
- MentisDB — hash-chained semantic memory ledger with additive ranked retrieval, agent key registry, and immutable skill versioning; strongest reviewed example here of service-shaped durable memory plus a real skill-lifecycle layer
- Mem0 — two-phase add pipeline (extract facts + LLM-judged CRUD reconciliation); purest production example of automated accretion-without-synthesis in the surveyed systems (lightweight coverage only — ingest report, no repo review)
- Memori — Python/TypeScript SDK and hosted memory layer with LLM-client interception, entity/process/session scoping, BYODB storage, conversation/agent-trace augmentation into facts/triples/summaries, and compact prompt-time recall
- MemoryOS — hierarchical conversational memory library plus MCP server that promotes user/assistant dialogue into short-term buffers, mid-term sessions, long-term profile and knowledge stores, with JSON/FAISS and ChromaDB variants
- MemPalace — local-first memory system with verbatim Chroma drawers, wing/room retrieval priors, a sidecar SQLite fact graph, and optional AAAK compression; strongest reviewed reminder so far that raw storage plus good retrieval can outrun heavier extraction stories
- Meta-Harness — Stanford IRIS Lab outer loop for optimizing task-specific harness code from raw traces, with Claude-proposed memory/scaffold candidates, benchmark-gated promotion, and executable artifacts as the learned substrate
- MiroShark — document-to-social-simulation stack with Neo4j graph extraction, cross-platform round memory, heuristic belief drift, and ReACT reporting; strongest nearby reference for graph-grounded simulation loops
- Napkin — Obsidian-vault CLI with NAPKIN.md pinned context, TF-IDF overview maps, agent-shaped search defaults, and pi-based auto-distill; strongest reference for adapting a mainstream human note substrate into an agent-facing memory interface
- nao — analytics-agent framework that compiles data context into project files, exposes it through file/SQL tools and mentions, and extracts persistent user instructions/profile memories from chat traces
- Nuggets — Pi-coupled personal memory assistant with local HRR nugget files, chat-channel scheduling, and a MEMORY.md promotion bridge; strongest reference so far for tiny file-backed scratch memory, though the promotion loop is only partially wired
- o-o — polyglot HTML/bash living-document system where each file carries its own update contract, rendering, source cache, and Claude dispatch; strongest reviewed example of the file-as-app pattern
- OpenSage — Google ADK-based agent framework with runtime subagent creation, AI-written tools, Neo4j graph memory, Docker sandbox isolation, agent ensemble coordination, and RL training integration; strongest reference for self-modifying agent topology, but knowledge structure is flat and the self-programming claims outrun the implementation
- OpenClaw-RL — live-RL framework that trains from next-state signals; TODO: repo now exists, so this should get a repo-backed review rather than source-only coverage
- OS-Copilot — FRIDAY computer-use agent that promotes judge-scored OS task executions into retrievable Python tools, with a repair-path storage caveat; clearest non-game analogue to Voyager's executable-skill learning pattern
- OpenViking — ByteDance/Volcengine's context database with filesystem-paradigm virtual directories, L0/L1/L2 tiered loading, hierarchical recursive retrieval, and session-driven memory extraction; first production system to make progressive disclosure a native storage primitive, but the "filesystem" is a metaphor over a database, not actual files
- Operational Ontology Framework — filesystem-first project runner with Pin/Spec/Facts/Handoff/Skills artifacts and a thin trace-derived loop that promotes task-output learnings into markdown facts; strongest compact convergence point for operational handoffs as cold-start artifacts
- Pal — Agno-based personal knowledge agent with a dual split between routing metadata, session-derived operational learnings, structured SQL state, and a compiled wiki; strongest reviewed example so far of "map versus compass" memory separation inside a live assistant runtime
- Phantom — Ghostwright's AI co-worker substrate with a file-and-vector memory split, multi-block prompt assembly, and a sandbox-deny reflection subprocess that mutates evolved identity files under deterministic invariant checks; strongest reviewed example so far of authority gradients (operator-locked vs evolution-managed vs agent-owned) over a single file tree
- Pi Self-Learning — pi extension with automatic task-end reflection, scored learnings index, and context injection; purest implementation of the automated mistake-extraction loop, but the reflection pipeline primarily relocates rather than transforms
- Playground — TribleSpace-backed shell-first agent runtime with branch-separated cognition/archive/memory, unified chat-log importers, and budget-adaptive temporal memory; strongest reviewed example here of append-only event storage plus synthetic memory turns
- ReasoningBank — reasoning-as-memory pipeline that extracts structured memory items from both successful and failed trajectories, retrieves by embedding similarity, and proposes test-time scaling via parallel trajectory comparison; sits between Reflexion (simpler) and ExpeL (richer lifecycle) on the artifact-learning spectrum
- Reflexion — verbal reinforcement loop that turns failed attempts into short natural-language plans; important early trajectory-to-artifact precedent, but with a much thinner memory lifecycle than newer systems
- REM — four-database episodic memory service (Postgres + Qdrant + Neo4j + Redis) with LLM-driven consolidation from episodes to scored semantic facts and temporal graph expansion at retrieval; heaviest infrastructure footprint among reviewed systems with the thinnest knowledge transformation layer
- SAGE — BFT-branded agent memory with CometBFT consensus, Ed25519 signing, application-level validators (sentinel, dedup, quality, consistency), confidence decay, and AES-256-GCM encryption; the consensus framing is ceremony around a deterministic validation pipeline in single-node mode, but the validation gate pattern and domain-scoped RBAC are genuinely useful
- Semiont — document-grounded annotation kernel with W3C annotations, git-backed events, working-tree URIs, and shared human/agent flows; strongest example here of annotation-first KB infrastructure
- Self-Training-LLM — Wikipedia factual self-training pipeline that converts generated questions, model answer samples, and NLI/SelfCheck uncertainty scores into SFT/DPO datasets and model checkpoints, then uses GPT-4o pairwise judging for evaluation; useful weight-learning contrast, not a persistent memory store
- sift-kg — LLM-powered document-to-knowledge-graph pipeline with schema discovery, human-gated entity resolution, and interactive visualization; strongest reference for extraction-first knowledge construction and confidence aggregation
- Siftly — Next.js + SQLite ingestion system with deterministic-first enrichment, resumable stage markers, and hybrid retrieval; strongest reference so far for high-volume source loading patterns
- SignetAI — local-first cross-harness memory daemon with SQLite/FTS/vector/graph recall, trace-derived fact extraction, transcript retention, MCP tools, and connector packaging
- SkillX — trajectory-derived skill-library construction pipeline that distills successful benchmark runs into plans, functional skills, and atomic tool skills, with LLM extraction, clustering, filtering, and incomplete retrieval/activation support
- SkillWeaver — web-agent exploration loop that distills successful Playwright trajectories into reusable async Python API skills and shipped SkillNet code libraries
- Sig — source-only coverage of a private-beta macOS work-memory app that turns meeting decisions, commitments, and user interpretation into local plain files AI tools can read; interesting files-first workplace packaging, but no public code or release artifact to inspect yet
- SkillNote — self-hosted skill registry with dual version tracks, live MCP projection via PostgreSQL NOTIFY, and agent-submitted ratings; strongest reviewed reference so far for skill hosting and distribution as a product, though the offline sync and install stories are narrower than the README implies
- Spacebot — Rust concurrent agent framework with code-level symbolic scheduling (cortex), context-forking branches, typed memory with graph edges and hybrid search; cleanest production implementation of the bounded-context orchestration model among reviewed systems
- Stash — Postgres/pgvector MCP memory service that turns embedded episodes into typed facts, relationships, causal links, goals, failures, hypotheses, contradiction records, and confidence decay
- Supermemory — hosted memory platform with an open integration layer (MCP server, multi-framework SDK wrappers, graph UI); strongest reference for MCP-first distribution and prompt-middleware ergonomics, with core memory logic mostly behind hosted /v3 and /v4 APIs
- Synapptic — Python CLI that mines Claude Code transcripts into weighted user/guard profiles, benchmarks guard usefulness with per-model ablations, and compiles the result into assistant-specific memory files
- Thalo — custom plain-text language with grammar, types, validation, and LSP; makes the same programming-theory bet we do but with full compiler formalization
- Thalo entity types compared to commonplace document types — detailed type mapping showing gaps (supersedes links, source status tracking) and borrowable patterns
- Tendril — Tauri/Strands agent sandbox where a stable three-tool bootstrap surface lets the model create and reuse workspace-local Deno TypeScript capabilities as persistent executable memory
- Tracecraft — S3-backed CLI coordination layer for multi-agent systems with five primitives over object storage; cleanest exemplar of coordination-by-convention, where the coordination semantics live in naming conventions and client compliance rather than enforcement mechanisms; first entry focused purely on coordination infrastructure rather than memory/knowledge management
- Trajectory-Informed Memory Generation — source-only paper coverage of trajectory-derived strategy/recovery/optimization tips; artifact-learning counterpart to AgeMem's weight-learning path
- virtual-context — proxy-owned context virtualization layer with topic summarization, fact extraction, tool-chain stubbing, and demand-paged retrieval; strongest reviewed example of managing the context window itself rather than bolting retrieval onto it
- Voiden — Git-native API workspace whose .void files combine markdown docs, structured request blocks, linked reuse, local history, extension-scoped capabilities, and installable agent skills; not an agent memory product, but a strong adjacent file-format-plus-runtime design
- WUPHF — local multi-agent office with fresh-session runners, scoped MCP, git-backed team wiki, per-agent notebooks, trace-to-fact extraction, lint, playbook execution learning, and broker-state team skills; strongest reviewed example so far of a full runtime making markdown/git memory part of an agent office rather than a standalone memory backend
- Voyager — embodied lifelong-learning loop with automatic curriculum, critic-gated retries, and promotion of successful trajectories into retrievable JavaScript skills; clearest executable-artifact learning system in this queue
- xMemory — research-code agent memory system that turns dialogue streams into episodes, semantic facts, and theme hierarchies, then uses coverage-based representative selection plus entropy gates for top-down retrieval
- Zikkaron — MCP memory server for Claude Code with 26 neuroscience-branded subsystems (Hopfield retrieval, predictive coding write gate, engram allocation, hippocampal replay) all implemented as heuristic Python without LLM calls; the neuroscience framing is vocabulary over mechanism, but the 9-signal retrieval fusion and compaction hooks are genuinely useful
Patterns Across Systems
Most systems here (ours, Ars Contexta, Thalo, ClawVault, Agent-Skills) independently converge on:
- Filesystem over databases — plain text, version-controlled, no lock-in
- Progressive disclosure — load descriptions at startup, full content on demand
- Start simple — architectural reduction outperforms over-engineering
- Trace-derived learning — trace-derived learning techniques in related systems broaden the comparison beyond pi-adjacent session mining to include artifact-learning and weight-learning systems fed by live traces and trajectories
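The progressive-disclosure convergence can be sketched in a few lines. The first-line-as-description convention below is an illustrative assumption; reviewed systems variously use frontmatter descriptions, index files, or tiered loading:

```python
from pathlib import Path

def load_descriptions(kb_dir: str) -> dict[str, str]:
    """Startup pass: read only the first line of each note.

    Assumes each note opens with a one-line summary; a real system
    might parse YAML frontmatter or a dedicated index file instead.
    """
    index = {}
    for note in Path(kb_dir).glob("**/*.md"):
        with open(note, encoding="utf-8") as f:
            index[str(note)] = f.readline().strip()
    return index

def load_full(path: str) -> str:
    """On-demand pass: pull a whole note only when the agent asks."""
    return Path(path).read_text(encoding="utf-8")
```

The agent's startup context carries only the index values; `load_full` runs for the handful of notes the model decides to open, keeping the context budget proportional to what is actually used.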
The divergences are more revealing:
- Storage model — Cognee uses a poly-store (graph + vector + relational with pluggable backends), Siftly uses SQLite, CrewAI uses LanceDB by default with optional Qdrant Edge, Hindsight uses PostgreSQL+pgvector, Zikkaron uses SQLite with FTS5+sqlite-vec, and SAGE uses SQLite+BadgerDB (personal) or PostgreSQL+pgvector (multi-node) as operational substrates, while the others keep files as the primary storage interface. OpenViking occupies a novel middle position: it presents a filesystem interface (viking:// URIs with ls/read/find operations) but the substrate is AGFS + vector index — filesystem as metaphor, not mechanism. Cludebot uses Supabase (PostgreSQL+pgvector) for its full mode but also offers a local JSON file store that is the closest a database-first system gets to filesystem-first. Cognee, Hindsight, CrewAI, Zikkaron, Cludebot, and SAGE are the furthest from filesystem-first: memories are opaque database records, not readable files
- System boundary — CocoIndex sits one layer below most systems here: it is an incremental engine for maintaining derived vector/graph/relational projections, not a primary knowledge medium. That makes it more relevant to our "operational layer beneath the KB" question than to the note/link semantics question directly
- Agent-facing UX — Napkin is the clearest example of treating CLI output itself as part of the memory architecture: hidden scores, match-only snippets, and next-step hints are all tuned for model behavior rather than human browsing. Most other systems focus on storage and retrieval internals but leave the interaction layer human-shaped
- Packaging unit — most systems distribute concerns across multiple files (notes, configs, scripts, indexes), but o-o pushes the opposite extreme: each document is a self-contained polyglot file carrying rendering, update contract, shell dispatch, source cache, and changelog. That maximizes portability and local inspectability at the cost of modularity and inter-document structure
- Grounding discipline — cognitive psychology (arscontexta) vs programming theory (commonplace, thalo) vs empirical operational patterns (Agent-Skills)
- Formalization level — custom DSL (thalo) vs YAML conventions (commonplace) vs prose instructions (Agent-Skills)
- Governance stance — most systems treat governance as advisory (instructions the agent should follow); Decapod enforces governance with hard gates (validation must pass, VERIFIED requires proof-plan); SAGE enforces with cryptographic gates (signed transactions, validator quorum, RBAC clearance levels) — two very different enforcement models, both structurally enforced rather than instructed
- Access control — SAGE has structured multi-agent RBAC (clearance levels, domain-scoped permissions, on-chain agent identity); Cognee has relational ACLs with tenant isolation and per-dataset permissions; most other systems either have no access control or rely on filesystem permissions
- Cross-agent knowledge transfer — most systems are single-agent or agent-agnostic; cass-memory is the first reviewed system to make cross-agent session mining a first-class feature, indexing logs from Claude Code, Cursor, Codex, Aider, and others into a shared playbook
- Runtime self-modification — most frameworks have fixed agent topology defined at build time; OpenSage is the first reviewed system where agents can create subagents and scaffold new tools at runtime, though without quality gates on the created artifacts
- Self-referentiality — only our KB is simultaneously a knowledge system and a knowledge base about knowledge systems
Open Questions
- Does convergence on filesystem-first indicate a durable pattern, or a phase that will be outgrown?
- Should high-volume ingestion in a file-first KB adopt a small operational database layer for stage state and indexing?
- Will the programming-theory grounding produce better systems than the psychology grounding, or will they converge?
- Are there systems we're missing that take a fundamentally different approach?
Other tagged notes
- Closure-SDK — geometric S3 memory and integrity runtime where ordered carrier streams promote into quaternion genome state, not readable notes or model weights
- Incremental Self-Improvement — source-only coverage note for Schmidhuber's incremental self-improvement paradigm, a reward-gated self-modification system for learning learning strategies
- MehmetGoekce/llm-wiki — promptware LLM Wiki bootstrap kit for Claude Code with L1/L2 memory split, Logseq/Obsidian schemas, setup script, and OpenSpec requirements
- Pratiyush/llm-wiki — multi-agent session-transcript compiler that turns Claude/Codex/Cursor/Gemini history into a redacted markdown wiki, static site, exports, MCP tools, and agent prompt workflows
- Tolaria — Mac and Linux markdown-vault app with git-backed files, type lenses, saved views, agent context snapshots, MCP tools, and managed guidance