Supermemory

Type: ../types/agent-memory-system-review.md · Status: current

Supermemory is Supermemory.ai's hosted memory and context platform with an open integration monorepo around it. The repository exposes MCP server code, browser/app surfaces, SDK wrappers, prompt middleware, API schemas, docs, and a memory-graph UI package; the central extraction, profile-building, ranking, contradiction handling, and storage engine sits behind hosted /v3 and /v4 APIs. That makes it a strong reference for packaging memory as a service, but a weak source for reviewing the learning mechanism itself.

Repository: https://github.com/supermemoryai/supermemory

Reviewed commit: b6913a0375e0faf4d765b038cff7d568a89f3487

Last checked: 2026-05-16

Core Ideas

The open repo is an integration layer around a hosted memory substrate. The README describes automatic fact extraction, profile maintenance, contradictions, forgetting, connectors, and multimodal processing, but the inspectable code mostly calls hosted endpoints or documents their shapes (README.md, validation schemas, API schemas). Locally reviewable retained artifacts are adapter code, API contracts, prompt templates, tool descriptions, UI payload types, and docs. Hosted memory records, profiles, document chunks, embeddings, graph relationships, and ranking state are only visible through API response contracts.

The public API boundary splits documents from extracted memories and profiles. /v3/documents accepts raw content, URLs, files, metadata, customId, container tags, and processing state; docs say repeated customId updates link or diff content for dedup/update behavior (add-context docs, document docs). /v4/memories addresses extracted memory entries directly, including isStatic, versioned updates, soft forget, and memory IDs; /v4/profile returns static and dynamic profile arrays and optional search results; /v4/search returns memories or chunks with similarity, optional reranking, filtering, related memories, and document context (memory docs, profile docs, search docs). In artifact terms, documents and extracted memories are hosted knowledge artifacts when consumed as evidence; profile/search endpoints become system-definition artifacts when their outputs are injected into prompts or used to rank context.

The MCP server is the clearest runtime surface. apps/mcp is a Cloudflare Workers/Durable Objects MCP app that authenticates either API keys or OAuth tokens by introspecting the hosted API, then exposes tools, resources, app UI, and a prompt (MCP entrypoint, auth, server). The memory tool saves or forgets information; recall retrieves memories plus profile; listProjects caches container tags; whoAmI reports the authenticated user and MCP session; resources expose supermemory://profile and supermemory://projects; the context prompt emits an instruction to save memory-worthy facts plus static/dynamic profile context. Those tool descriptions have explicit behavioral authority: they instruct the model when to write, recall, or prefer this memory surface.

MCP forget is implemented as exact-match then semantic fallback, but actual deletion semantics are hosted. The MCP client first calls client.memories.forget({ content, containerTag }); on 404 it searches with a high similarity threshold and then forgets the matching memory ID (MCP client). The v4 memory docs describe forget as a soft delete excluded from search but preserved in the system (memory docs). Locally, this is a useful mediation policy; the storage substrate and restoration/invalidation behavior are not inspectable.

Prompt middleware is the main behavior-shaping adapter pattern. The TypeScript tools package wraps Vercel AI SDK models with a proxy that fetches /v4/profile, builds a memory prompt, injects it into the system prompt, caches per-turn memory text, and optionally saves the conversation to /v4/conversations after generation (Vercel wrapper, middleware, shared memory client, conversation client). OpenAI, Mastra, VoltAgent, Microsoft Agent Framework, Pipecat, and Cartesia wrappers repeat the same shape: identify a user/container, retrieve profile/search context, inject formatted memory into instructions/messages, then optionally save the conversation (OpenAI middleware, Mastra wrapper, VoltAgent middleware, Agent Framework provider, Pipecat service, Cartesia agent).

Cross-language adapters converge on a small contract. TypeScript and Python wrappers expose the same fields: containerTag or container_tag for scope, customId or conversationId for grouping, mode as profile/query/full, addMemory as always/never, and optional base URL/API key. The wrappers differ in host-framework mechanics, but not in memory semantics. This is productively boring: Supermemory makes the memory service a stable external dependency and keeps framework-specific code as thin translation layers.

Local caching and deduplication are prompt hygiene, not memory learning. The TypeScript middleware uses an in-process LRU keyed by container, thread, mode, and normalized user message to avoid duplicate profile calls during a tool-call loop (cache). Prompt builders deduplicate static, dynamic, and search-result strings with priority static over dynamic over search before injection (dedup helpers, prompt builder). These are system-definition artifacts with prompt-selection authority, but they do not decide what becomes a durable memory.

The graph UI reveals the hosted memory graph contract. @supermemory/memory-graph renders documents and memories with updates, extends, and derives edges, isForgotten, forgetAfter, isStatic, version, parent/root IDs, isLatest, and source metadata fields (graph API types, graph types, graph data hook, version chains). The web app and MCP app fetch documents-with-memories payloads from /v3/documents/documents and hand them to the renderer (web graph hook, MCP graph tool). The graph UI is a knowledge-artifact inspection surface; the backend graph construction, edge extraction, and ranking are not implemented here.

Authentication mediation is part of the memory boundary. Hosted API keys are the normal developer path, with docs for container-scoped keys limited to selected endpoints and one container tag (authentication docs). The MCP server detects sm_ API keys, otherwise treats bearer tokens as OAuth tokens, asks the hosted API for user identity plus an API key, and then uses that API key for downstream memory calls (MCP auth). This makes Supermemory's storage substrate operationally hosted and account-scoped, not file-scoped or repo-scoped.

The browser extension and skill are distribution artifacts. The extension stores bearer tokens locally, saves page/chat/Twitter content to /v3/documents, searches /v4/search, and injects UI into ChatGPT/Claude/T3/Twitter pages (extension API, extension storage, background worker). The checked-in agent skill tells coding agents when to recommend Supermemory and gives integration recipes (Supermemory skill). The skill is not memory storage; it is a system-definition artifact that markets and routes future agent implementation choices.

Comparison with Our System

Dimension Supermemory Commonplace
Primary purpose Hosted memory/RAG/profile API with open integrations Repo-native methodology KB with typed artifacts, validation, review, and operational conventions
Storage substrate Hosted service records, documents, chunks, memory entries, embeddings, profiles, spaces/projects; local code has transient caches and UI state Git-tracked markdown collections plus generated indexes/reports
Inspectable artifacts Adapter code, API schemas/docs, MCP server, prompt middleware, graph UI types Canonical notes, type specs, sources, instructions, ADRs, scripts, validation reports
Representational form Hosted mixed records: prose memories/chunks, symbolic metadata/relations, distributed-parametric embeddings/rankers; local prompt strings and tool schemas Mostly prose plus frontmatter, schemas, links, commands, and review artifacts
Lineage API contracts include document IDs, custom IDs, content hashes, memory source counts, memory-document joins, parent/root/version fields; extraction lineage is hosted Source snapshots, commit-pinned reviews, authored links, git history, archive/replacement lifecycle
Behavioral authority Profile/search outputs and prompt middleware influence model context; MCP/tools instruct write/recall/forget; auth scopes enforce access Type specs, collection rules, AGENTS.md, skills, validation/review commands

Supermemory is stronger as a product integration surface. It gives application developers a direct answer for memory: use one API key, identify the user or project with a container tag, call profile/search before generation, and save conversations after generation. It handles connectors, file upload, multimodal extraction, graph payloads, SDK packaging, OAuth/API-key mediation, and multiple agent frameworks in one distribution story.

Commonplace is stronger as an inspectable knowledge system. A commonplace note is a reviewable retained artifact with local source links, status, collection rules, validation, and git history. A Supermemory memory entry can be exposed through API schemas and graph UI payloads, but the reviewer cannot inspect how it was extracted, contradicted, promoted, expired, embedded, or ranked unless the hosted implementation or run traces are separately provided.

The systems also assign authority differently. Supermemory's most important behavior-shaping artifacts are runtime surfaces: prompt templates, middleware, tool descriptions, OAuth scopes, API filtering, reranking flags, and hosted profile/search responses. Commonplace's strongest authority surfaces are file-backed system-definition artifacts: instructions, type specs, validators, review gates, generated indexes, and curated links.

Supermemory should not be marked trace-derived from this source checkout. The local code demonstrably forwards conversations, chat messages, documents, and user actions into hosted APIs, and it retrieves profiles and memories derived somewhere downstream. The derivation step that turns traces into durable memories or profile entries is not inspectable locally. Code-grounded status is therefore "hosted opaque trace intake," not an open trace-derived learning implementation.

Read-back: both — agents can call MCP recall, while SDK middleware injects hosted profile and search context into prompts before generation.

Borrowable Ideas

A tiny cross-framework memory contract. containerTag, customId, mode, and addMemory are enough to make memory adapters feel consistent across TypeScript, Python, UI, voice, and MCP. Commonplace tools should keep similarly stable option names if they expose repo memory to multiple runtimes. Ready as API ergonomics.

MCP as the default hosted-memory bridge. The MCP server's combination of memory, recall, context, profile/project resources, and app UI is a clean package for assistants that cannot or should not import an SDK. A commonplace hosted or daemonized layer could copy the surface shape without copying the hosted substrate. Needs a concrete daemon use case.

Treat prompt middleware as a first-class system-definition artifact. Supermemory's wrappers show that memory behavior often enters through small prompt mutations rather than through a full agent framework. Commonplace should review any future prompt-injection middleware with the same seriousness as instructions or skills. Ready as review vocabulary.

Expose graph payload contracts even when storage is hidden. The memory-graph package makes useful lifecycle fields visible: updates, extends, derives, version chains, latest flags, forgotten flags, expiration, and source metadata. If commonplace adds a visual graph or compiled retrieval layer, its payload should expose authority and lifecycle fields rather than only node labels. Ready as UI contract inspiration.

Scoped keys are a practical hosted analogue to collection scoping. Container-scoped API keys are not a replacement for repo permissions, but the shape is worth remembering for shared or hosted KB deployments: scope a credential to one behavioral domain, then make allowed endpoints explicit. Needs hosted/shared deployment pressure.

Do not borrow opaque extraction as methodology evidence. Supermemory's hosted extraction may be good, but this checkout cannot teach us how to implement contradiction resolution, forgetting, profile compaction, or ranking. It is borrowable as an integration and packaging reference, not as a code-grounded memory-learning design.

Curiosity Pass

The repository is unusually open around the edges and closed at the center. MCP, graph UI, browser extension, SDK adapters, docs, tests, and schemas are all reviewable, while the memory engine itself is not. That boundary is coherent for a hosted product, but it matters for this collection because the most interesting memory claims live behind it.

Memory graph is partly a UI contract. The graph package renders document-memory derivation and memory-memory relation edges, but relation extraction and persistence are hosted. The UI proves that the API exposes graph-shaped state; it does not prove how that state is built.

The local dedup layer is easy to overread. Adapter dedup removes duplicate strings before prompt injection. It is not the same as memory consolidation, contradiction handling, or duplicate-write prevention in the storage layer.

The MCP prompt has high authority for a small string. It tells the model to save informative facts and memory-worthy information. That instruction can turn ordinary user chat into write traffic, so it should be treated as a system-definition artifact even though it is just prompt text.

Conversation storage is both central and opaque. /v4/conversations is the cleanest trace-forwarding path in the wrappers, with structured messages and grouping IDs. The local code does not expose the extraction oracle, append/diff policy, profile update rule, or invalidation lifecycle after the request is accepted.

What to Watch

  • Whether the hosted extraction/profile/ranking engine becomes source-visible, making trace-derived classification reviewable from code.
  • Whether /v4/conversations gains public schemas for append detection, source lineage, message-level provenance, and memory derivation IDs.
  • Whether graph relations add confidence, source offsets, extraction prompt/model versions, human review state, or invalidation rules.
  • Whether MCP clients start using the context prompt automatically and thereby make Supermemory's prompt text a higher-authority instruction channel.
  • Whether scoped keys grow from container restriction into richer read/write/search/forget policy.
  • Whether SDK wrappers keep graceful failure behavior clear: continue without memory, fail closed, or mark outputs as memory-unavailable.

Relevant Notes:

  • Knowledge artifact - classifies: hosted documents, memory entries, chunks, profile facts, and graph payloads advise later agents when retrieved or displayed.
  • System-definition artifact - classifies: MCP tool descriptions, prompt middleware, API scopes, cache/dedup policies, and graph/ranking payload fields route, instruct, enforce, or rank behavior.
  • Behavioral authority - clarifies: Supermemory's authority mainly arrives through hosted API calls, prompt injection, tool descriptions, and auth scopes.
  • Storage substrate - contrasts: Supermemory's durable substrate is a hosted service object and database/vector/graph layer, while the open repo contains integration code and transient caches.
  • Use trace-derived extraction - contrasts: this checkout forwards traces to a hosted engine but does not expose the trace-to-memory derivation mechanism needed for a code-grounded trace-derived tag.