Supermemory Just Won the Memory Benchmarks
Supermemory is on GitHub trending today, 23.3k stars, and the reason is they grabbed the top spot on the three benchmarks that define agent memory: #1 on LoCoMo, #1 on ConvoMem, 85.4% on LongMemEval. MIT-licensed, sub-300ms retrieval on production volume, MCP integration into Claude Desktop, Cursor and VS Code. Their comparison table puts Mem0, Zep, Letta and SuperLocalMemory behind.
Step back. Agent memory has been a crowded fight for a year: Mem0 raised, Zep raised, cognee has a community, Letta open-sourced, Hippo got covered, Anthropic shipped its own context APIs. The arguments were mostly about API surface and ergonomics, because nobody had separated themselves on numbers. Now somebody has. Single-session retrieval 92.3%, knowledge updates 89.7%, temporal reasoning 82.0%. That is the kind of split where the conversation moves from "which one fits my stack" to "why am I not using the one that wins."
The spine: memory is becoming a layer the agent RENTS, not builds. Six months ago every serious agent shop was writing its own vector-store-plus-summarization-loop because the off-the-shelf options didn't beat what you could roll yourself in a weekend. That window is closing. Once one open-source memory engine measurably wins, the make-or-buy math tips for everyone except the frontier labs.
What to watch. Whether the benchmark crown holds against the next wave (MemMachine just published 0.9169 on LoCoMo with gpt-4.1-mini). Whether one of the agent frameworks ships Supermemory as default and ends the bring-your-own-memory era for hobbyists. And whether the venture-backed competitors (Mem0, Zep) respond with their own benchmark sweep or start losing customers quietly.
https://github.com/supermemoryai/supermemory