MemPalace: The Open-Source AI Memory System That Scores 96.6% — Complete Guide (2026)
By Learnia Team
📅 Last Updated: April 8, 2026 — MemPalace v3.0.0, released April 5, 2026.
📚 Related: Claude Code Complete Guide | MCP Protocol Guide | Best AI Coding Tools 2026
Every conversation you have with an AI disappears when the session ends. Six months of decisions, debugging sessions, architecture debates — gone. You start over every time.
Other memory systems try to fix this by letting AI decide what's worth remembering. The AI extracts "user prefers Postgres" and throws away the conversation where you explained why. You lose the reasoning, the context, the nuance.
MemPalace takes a different approach: store everything, then make it findable. On April 5, 2026, actress and tech entrepreneur Milla Jovovich and developer Ben Sigman released MemPalace v3.0.0 — an open-source AI memory system that has already hit 21.7K GitHub stars in three days. Here's why the developer community is paying attention.
The Problem MemPalace Solves
After six months of daily AI use, you've generated approximately 19.5 million tokens of conversations. That's every decision, every solution, every trade-off analysis. The question is: where did it all go?
| Approach | Token Cost | Annual Cost | Recall |
|---|---|---|---|
| Paste everything into context | 19.5M — doesn't fit | Impossible | N/A |
| LLM summarization | ~650K tokens | ~$507/year | Lossy |
| MemPalace wake-up | ~170 tokens | ~$0.70/year | 96.6% |
| MemPalace + 5 searches | ~13,500 tokens | ~$10/year | 96.6% |
The difference is dramatic: $10/year for 96.6% recall vs. $507/year for lossy summaries that throw away context.
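The exact totals depend on model pricing and how often you start a session, but the ordering is easy to sanity-check. A back-of-envelope sketch (the $3-per-million-token price and daily sessions are illustrative assumptions, not MemPalace's published methodology):

```python
def annual_context_cost(tokens_per_session: int,
                        sessions_per_year: int = 365,
                        price_per_million_tokens: float = 3.00) -> float:
    """Yearly cost of re-sending the same context at every session start."""
    return tokens_per_session * sessions_per_year * price_per_million_tokens / 1e6

# ~170-token wake-up context vs. reloading a ~650K-token summary daily
wake_up_cost = annual_context_cost(170)        # well under a dollar
summary_cost = annual_context_cost(650_000)    # hundreds of dollars

print(f"wake-up: ${wake_up_cost:.2f}/yr, summary: ${summary_cost:.2f}/yr")
```

Whatever price you plug in, the gap stays three to four orders of magnitude, which is the point the table is making.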
The Palace Architecture
MemPalace is inspired by the ancient Greek Method of Loci — a memorization technique where you place ideas in rooms of an imaginary building and walk through it to recall them. MemPalace applies the same principle to AI memory:
The Building Blocks
| Element | What It Is | Example |
|---|---|---|
| Wing | A person or project | wing_kai, wing_driftwood |
| Room | A specific topic inside a wing | auth-migration, graphql-switch |
| Hall | A memory type connecting rooms | hall_facts, hall_events, hall_discoveries |
| Tunnel | Cross-wing connections (same topic, different contexts) | Auth-migration appears in both Person and Project wings |
| Closet | A summary that points to original files | Plain-text summaries (AAAK coming soon) |
| Drawer | The original verbatim file | Your exact conversation, never summarized |
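In code terms, the palace is just a nested taxonomy over verbatim files. A minimal sketch of that hierarchy (the class and field names are illustrative, not MemPalace's actual schema, and halls/tunnels are omitted for brevity):

```python
from dataclasses import dataclass, field

@dataclass
class Drawer:
    """The original verbatim file -- never summarized."""
    path: str
    content: str

@dataclass
class Room:
    """A topic (e.g. 'auth-migration') holding a summary plus originals."""
    name: str
    closet: str = ""                       # plain-text summary pointing at drawers
    drawers: list[Drawer] = field(default_factory=list)

@dataclass
class Wing:
    """A person or project (e.g. 'wing_kai')."""
    name: str
    rooms: dict[str, Room] = field(default_factory=dict)

palace = Wing("wing_myapp")
palace.rooms["auth-migration"] = Room(
    name="auth-migration",
    closet="Chose Clerk over rolling our own; see drawer for the full debate.",
    drawers=[Drawer("chats/2026-03-14.txt", "...verbatim conversation...")],
)
```

The closet summarizes, but the drawer keeps the exact text, so nothing is ever lost to extraction.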
Why Structure Matters
This isn't cosmetic organization. The palace structure directly improves retrieval accuracy:
| Search mode | Recall (R@10) | Improvement (percentage points) |
|---|---|---|
| Search all closets (unfiltered) | 60.9% | Baseline |
| Search within wing | 73.1% | +12% |
| Search wing + hall | 84.8% | +24% |
| Search wing + room | 94.8% | +34% |
The architecture gives AI a navigable map instead of a flat search index. When you ask "what did we decide about auth?", MemPalace knows which wing to search, which rooms to check, and which halls to follow, instead of scanning every piece of data.
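Mechanically, that "navigable map" is metadata filtering over a vector index: each closet is stored with wing/room tags, and a scoped query only ranks the matching entries. A toy, self-contained illustration of filter-then-rank (not MemPalace's actual code; the fixed scores stand in for embedding similarities):

```python
# Each record: (text, metadata, toy similarity score for one fixed query).
CLOSETS = [
    ("Chose Clerk for auth",    {"wing": "myapp", "room": "auth-migration"}, 0.91),
    ("Switched API to GraphQL", {"wing": "myapp", "room": "graphql-switch"}, 0.40),
    ("Kai prefers Postgres",    {"wing": "kai",   "room": "db-prefs"},       0.62),
]

def search(where: dict, top_k: int = 10) -> list[str]:
    """Rank only the records whose metadata matches every filter key."""
    hits = [(score, text) for text, meta, score in CLOSETS
            if all(meta.get(k) == v for k, v in where.items())]
    return [text for _, text in sorted(hits, reverse=True)[:top_k]]

search({})                                           # unfiltered: everything competes
search({"wing": "myapp"})                            # wing-scoped
search({"wing": "myapp", "room": "auth-migration"})  # wing + room
```

Shrinking the candidate pool is why the scoped modes in the table recall more: irrelevant wings can no longer outrank the right drawer.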
The Memory Stack
MemPalace uses a 4-layer system that loads only what's needed:
| Layer | Content | Size | When Loaded |
|---|---|---|---|
| L0 | Identity — who is this AI? | ~50 tokens | Always |
| L1 | Critical facts — team, projects, preferences | ~120 tokens (AAAK) | Always |
| L2 | Room recall — recent sessions, current project | On demand | When topic comes up |
| L3 | Deep search — semantic query across all closets | On demand | When explicitly asked |
Your AI wakes up with L0 + L1 (~170 tokens) and immediately knows your world — your team members, your projects, your preferences. Searches only fire when needed, keeping costs near zero.
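The loading policy above can be pictured as eager L0/L1 plus lazy everything else. A minimal sketch (the class and method names are assumptions for illustration, not MemPalace's API):

```python
class MemoryStack:
    """Load L0 + L1 eagerly; defer room recall until a topic comes up."""

    def __init__(self, identity: str, critical_facts: str):
        self.context = [identity, critical_facts]   # L0 + L1, always present
        self.loaded_rooms = set()

    def wake_up(self) -> str:
        """The ~170-token context the AI starts every session with."""
        return "\n".join(self.context)

    def recall_room(self, room: str, summary: str) -> None:
        """L2: pull a room summary into context only once, on demand."""
        if room not in self.loaded_rooms:
            self.context.append(summary)
            self.loaded_rooms.add(room)

stack = MemoryStack("You are the myapp assistant.", "Team: Kai, Ben. DB: Postgres.")
stack.wake_up()                  # cheap default context
stack.recall_room("auth-migration", "Auth: migrated to Clerk in March.")
```

L3 deep search would work the same way, except the payload comes from a semantic query instead of a precomputed room summary.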
Getting Started
Installation
pip install mempalace
Requirements: Python 3.9+, no API key, no internet after install.
Setup
# Initialize — guided onboarding, sets up wings for your people and projects
mempalace init ~/projects/myapp
# Mine your project files (code, docs, notes)
mempalace mine ~/projects/myapp
# Mine your AI conversations (Claude, ChatGPT, Slack exports)
mempalace mine ~/chats/ --mode convos
# General mode — auto-classifies into decisions, milestones, problems
mempalace mine ~/chats/ --mode convos --extract general
Search
# Search everything
mempalace search "why did we switch to GraphQL"
# Search within a specific project
mempalace search "database decision" --wing myapp
# Search a specific topic
mempalace search "auth approach" --room auth-migration
Connect to Your AI
For Claude, ChatGPT, Cursor, Gemini (MCP-compatible tools):
claude mcp add mempalace -- python -m mempalace.mcp_server
Now your AI has 19 tools available through MCP. Ask it anything:
"What did we decide about auth last month?"
Claude calls mempalace_search automatically, gets verbatim results, and answers. You never type a search command — the AI handles it.
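Under the hood, an MCP server is essentially a registry that maps tool names to functions the model can invoke with structured arguments. A generic stand-in for that dispatch loop (this is a sketch of the pattern, not MemPalace's mcp_server module, and the stub result is made up):

```python
from typing import Callable, Optional

TOOLS: dict[str, Callable[..., dict]] = {}

def tool(name: str):
    """Register a function under an MCP-style tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("mempalace_search")
def search(query: str, wing: Optional[str] = None) -> dict:
    # A real server would run semantic search here; we return a stub hit.
    return {"query": query, "wing": wing, "hits": ["(verbatim drawer text)"]}

def dispatch(name: str, **args) -> dict:
    """What the host app does when the model emits a tool call."""
    return TOOLS[name](**args)

dispatch("mempalace_search", query="what did we decide about auth?", wing="myapp")
```

The model never runs this code; it emits a tool-call message, the host dispatches it, and the result is fed back as context.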
For local models (Llama, Mistral):
mempalace wake-up > context.txt
# Paste context.txt into your model's system prompt
MemPalace vs. The Competition
The Fundamental Difference
Most memory systems use an AI to extract what it thinks is important from your conversations — "user prefers Postgres," "team decided to use Clerk." Then they throw away the original conversation.
MemPalace stores everything verbatim and uses semantic search to find what's relevant. The 96.6% recall proves this works better than extraction-based approaches. You never lose the reasoning, the context, or the nuance behind a decision.
The 19 MCP Tools
When connected via MCP, your AI gets access to a comprehensive toolkit:
Palace (Read)
| Tool | Purpose |
|---|---|
| mempalace_status | Palace overview + AAAK spec |
| mempalace_list_wings | All wings with counts |
| mempalace_list_rooms | Rooms within a wing |
| mempalace_get_taxonomy | Full wing → room → count tree |
| mempalace_search | Semantic search with filters |
| mempalace_check_duplicate | Check before filing |
| mempalace_get_aaak_spec | AAAK dialect reference |
Palace (Write)
| Tool | Purpose |
|---|---|
| mempalace_add_drawer | File new verbatim content |
| mempalace_delete_drawer | Remove by ID |
Knowledge Graph
| Tool | Purpose |
|---|---|
| mempalace_kg_query | Entity relationships with time |
| mempalace_kg_add | Add facts |
| mempalace_kg_invalidate | End facts (temporal validity) |
| mempalace_kg_timeline | Chronological entity story |
| mempalace_kg_stats | Graph overview |
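The invalidate tool implies temporally scoped facts: rather than deleting, a fact gets an end date, so the timeline stays reconstructible. A toy model of that behavior (the schema is an assumption for illustration, not MemPalace's actual graph format):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_to: Optional[date] = None       # None = still true

facts = [Fact("myapp", "uses_db", "MySQL", date(2025, 1, 10))]

def invalidate(subject: str, predicate: str, when: date) -> None:
    """End-date matching open facts instead of deleting them."""
    for f in facts:
        if f.subject == subject and f.predicate == predicate and f.valid_to is None:
            f.valid_to = when

def current(subject: str, predicate: str, on: date) -> list[str]:
    """What was true about this entity on a given date."""
    return [f.obj for f in facts
            if f.subject == subject and f.predicate == predicate
            and f.valid_from <= on and (f.valid_to is None or on < f.valid_to)]

invalidate("myapp", "uses_db", date(2025, 8, 1))          # the team migrated
facts.append(Fact("myapp", "uses_db", "Postgres", date(2025, 8, 1)))
```

Because the old fact survives with an end date, a timeline query can still answer "what database were we on last spring?".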
Navigation & Agents
| Tool | Purpose |
|---|---|
| mempalace_traverse | Walk the graph across wings |
| mempalace_find_tunnels | Rooms bridging two wings |
| mempalace_graph_stats | Connectivity overview |
| mempalace_diary_write | Agent diary entry (AAAK) |
| mempalace_diary_read | Read recent diary entries |
Specialist Agents
MemPalace lets you create agents with their own memory:
~/.mempalace/agents/
├── reviewer.json # code quality, patterns, bugs
├── architect.json # design decisions, tradeoffs
└── ops.json # deploys, incidents, infra
Each agent has a focus, keeps a diary (in AAAK), and builds expertise by reading its own history. The reviewer remembers every bug pattern it's seen. The architect remembers every design decision. They're specialist lenses on your data.
Your CLAUDE.md only needs one line:
You have MemPalace agents. Run mempalace_list_agents to see them.
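The diary mechanism can be pictured as an append-only per-agent log that the agent re-reads before acting. A minimal sketch (the file layout mirrors the tree above; the entry format is an assumption, and this uses plain text where MemPalace uses AAAK):

```python
from pathlib import Path
import tempfile

# Stand-in for ~/.mempalace/agents (a temp dir so the sketch is self-contained)
AGENTS_DIR = Path(tempfile.mkdtemp()) / "agents"
AGENTS_DIR.mkdir(parents=True)

def diary_write(agent: str, entry: str) -> None:
    """Append one entry to the agent's diary file."""
    with open(AGENTS_DIR / f"{agent}.diary", "a") as f:
        f.write(entry + "\n")

def diary_read(agent: str, last_n: int = 5) -> list[str]:
    """Re-read recent entries so the agent 'remembers' its own history."""
    path = AGENTS_DIR / f"{agent}.diary"
    if not path.exists():
        return []
    return path.read_text().splitlines()[-last_n:]

diary_write("reviewer", "2026-04-08: flagged unchecked error in auth handler")
diary_read("reviewer")
```

Each agent accumulating its own log, scoped to its focus, is what turns a generic model into a specialist lens over your data.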
Honest Limitations
MemPalace launched with impressive numbers but also with acknowledged issues. To their credit, the creators published a detailed correction on April 7:
What Was Overstated
- AAAK compression — Originally claimed "30×" compression. Actually lossy, and small-scale examples showed no token savings
- "+34% palace boost" — Compares unfiltered to filtered search. Real but not a novel technique — standard ChromaDB metadata filtering
- "100% with Haiku rerank" — Real result, but the rerank pipeline wasn't in the public benchmark scripts
- Contradiction detection — Exists as fact_checker.py but not yet wired into knowledge graph operations
What's Real
- 96.6% LongMemEval — Raw mode, 500 questions, independently reproduced
- Local, free, no cloud — Actually runs entirely on your machine
- Palace architecture — Real and useful for retrieval, even if the boost is standard filtering
- 19 MCP tools — Functional and well-documented
Who Should Use MemPalace?
Ideal Users
- Solo developers managing multiple projects who lose context between sessions
- Team leads who need to recall what decisions were made and by whom
- Heavy AI users (Claude, ChatGPT, Cursor) who generate thousands of conversations
- Privacy-conscious organizations that can't send data to cloud memory services
Not Ideal For
- Casual AI users who have a few conversations per week — the setup overhead isn't justified
- Teams needing real-time collaboration — MemPalace is local to each machine
- Windows users — Windows support is less tested than macOS and Linux (early days), and some issues have been reported even on macOS ARM64
Conclusion
MemPalace proves that AI memory doesn't need to be expensive, cloud-dependent, or lossy. By storing everything raw and making it navigable through the palace architecture, it achieves the highest recall scores ever published — for free.
The project is three days old. It has rough edges, overstated initial claims, and open issues. But the core approach — verbatim storage, structured navigation, local-first — is sound, the benchmark results are independently verified, and the team is fixing problems in real-time.
If you use AI daily and have ever wished it could remember what you discussed last month, MemPalace is worth 5 minutes of setup. Your future self — the one looking for "why did we choose Postgres?" six months from now — will thank you.
GitHub: github.com/milla-jovovich/mempalace
FAQ
What is MemPalace?
MemPalace is a free, open-source AI memory system that stores your conversations and project data locally using ChromaDB. It scored 96.6% on LongMemEval — the highest result ever published for an AI memory system — without requiring any API key or cloud service.
How does MemPalace work?
MemPalace organizes your data into a palace architecture: wings (people/projects), rooms (topics), closets (summaries), and drawers (verbatim files). Your AI loads a tiny 170-token wake-up context, then searches the palace only when needed, keeping costs near zero.
Is MemPalace free?
Yes. MemPalace is MIT-licensed, runs entirely on your machine, and requires no API key, no cloud service, and no subscription. The total cost of operation is approximately $10/year for search queries, compared to $507/year for LLM-based summarization approaches.
How does MemPalace compare to Mem0 and Zep?
MemPalace scores 96.6% on LongMemEval vs ~85% for both Mem0 and Zep. MemPalace is free and local; Mem0 costs $19-249/month and Zep costs $25+/month. Both competitors require cloud services and API calls.
What is the AAAK dialect?
AAAK is an experimental lossy abbreviation system designed to compress repeated entities into fewer tokens. It is readable by any LLM without a decoder. Currently, AAAK scores lower than raw mode (84.2% vs 96.6% on LongMemEval) and is being improved.
Does MemPalace work with Claude, ChatGPT, and other AI tools?
Yes. MemPalace includes an MCP server with 19 tools that works with Claude, ChatGPT, Cursor, Gemini CLI, and any MCP-compatible tool. For local models (Llama, Mistral), it provides a wake-up context file and CLI search commands.
Who created MemPalace?
MemPalace was created by Milla Jovovich and Ben Sigman. It launched as v3.0.0 on April 5, 2026, and quickly reached 21.7K GitHub stars. The project has 11 contributors and is MIT-licensed.