Save and Search Memory with Echo

Echo makes exactly two promises: everything gets saved. And when you need it, it shows up.

Research note · Memory · Updated Jul 1, 2026

You know this moment.

You and your agent just worked out a conclusion worth keeping. You want to write it down.

Then you hesitate. Into CLAUDE.md? The file is already long, the agent will take forever to write it up, and you might never use it anyway. Into your notes? You would not know where to file it.

Forget it.

Some weeks later, the conversation pops back into your head. You go looking. Different keywords, different platform, scrolling back through chat history — it is gone.

This is normal. Everyone who works with AI hits both failures, every week: what you want to save will not keep, and what you need back will not turn up.

Echo makes exactly two promises: everything gets saved. And when you need it, it shows up.

Why you cannot save it

It is not that you are lazy. The file mechanism is forcing your hand.

Today, model memory runs almost entirely on files: CLAUDE.md, AGENTS.md, markdown notes.

Every line of those files gets loaded, in full, into every new session. It takes up window space, burns tokens, and splits attention. The more you write, the more every session costs before work even starts.
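To put rough numbers on that cost, here is a minimal sketch. The 4-characters-per-token ratio is a common rule of thumb for English text, not a real tokenizer, and the file names, file sizes, and session count are made-up illustrations rather than measurements.

```python
# Rough estimate of what always-loaded memory files cost.
# Assumption: ~4 characters per token (a rule of thumb, not a tokenizer).
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count for a piece of text."""
    return len(text) // CHARS_PER_TOKEN


def preload_cost(memory_files: dict[str, str], sessions: int) -> int:
    """Tokens spent loading every memory file in full, once per session."""
    per_session = sum(estimate_tokens(body) for body in memory_files.values())
    return per_session * sessions


# Hypothetical file contents: 40,000 and 8,000 characters.
files = {"project CLAUDE.md": "x" * 40_000, "user CLAUDE.md": "x" * 8_000}

print(preload_cost(files, sessions=1))   # 12000
print(preload_cost(files, sessions=50))  # 600000
```

The cost scales with file size times session count, whether or not a given session uses any of those lines.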

So you only dare to record what is most important. The rest sinks into chat history. And chat history is, in practice, unsearchable.

This is not a habit problem. It is a mechanism problem.

File-based memory: findable, at an absurd price

The problem with file-based memory is not that nothing can be found. You can always stuff the whole history in, or dig through chat logs by hand.

But a file has no judgment about recall. It does not know which three entries the current task needs, which one has expired, which version is current, or which search result was superseded by a later one.

So file-based memory works the clumsy way: search again, read again, verify again, and dump a big bag of relevant and irrelevant material back into the context window together. It finds things — slowly, messily, noisily. And even when the answer comes back, it usually drags along a crowd of things that should not have.

LongMemEval, the most widely used long-term memory benchmark (ICLR 2025, 500 cross-session memory questions), has already measured this: stuff the entire history into the window, and GPT-4o answers only about 60.6% of the questions correctly. Independent replications fall in the same range, with full-context GPT-4o at roughly 60–64%.

Our dirty-context analysis surfaced the same pattern: the agent greps again, runs git diff again, re-reads the same files — not because each of those actions produces anything fresh, but because it has no reliable working memory outside the window. To do anything, it first has to drag things back into view. And dragging things back into view is a re-feed.

Claude Code's CLAUDE.md works this way: the docs say CLAUDE.md and auto memory load as context at the start of every conversation. Codex's AGENTS.md is the same: it is read before work starts and becomes part of the instructions for that session.

These files are useful. But they are not a high-precision recall system. They are closer to reading the employee handbook aloud, from page one, at the start of every meeting: the one sentence you needed is probably in there somewhere — but you pay for the entire handbook, in tokens, in latency, and in the noise of everything in it that does not matter.

Echo reports 95.8% on the LongMemEval hard set. The setups differ, so the numbers do not compare strictly, but the magnitude makes the point: whether what you saved actually shows up when it should is now an engineering metric, not luck.

A real memory system has to be precise, fast, and cheap — all three at once. Echo does not pour the history back. It works like a laser: it cuts back only the few slices the current task actually needs.
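What "only the few slices" could look like is sketched below. This is a toy under stated assumptions, not Echo's implementation, which this note does not describe: the `Memory` record, its `superseded_by` and `expired` fields, and the word-overlap scoring are all hypothetical stand-ins. It shows only the shape of the idea: filter out stale entries, rank the rest, return a handful.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    """Hypothetical memory record; field names are illustrative."""
    id: str
    text: str
    superseded_by: Optional[str] = None  # id of the newer entry, if any
    expired: bool = False


def recall(store: list[Memory], query: str, k: int = 3) -> list[Memory]:
    """Return up to k live entries, ranked by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = []
    for m in store:
        if m.expired or m.superseded_by is not None:
            continue  # stale entries never reach the context window
        overlap = len(query_words & set(m.text.lower().split()))
        if overlap:
            scored.append((overlap, m))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]


store = [
    Memory("m1", "auth tokens rotate every 24 hours", superseded_by="m2"),
    Memory("m2", "auth tokens rotate every 12 hours"),
    Memory("m3", "staging database is read only"),
    Memory("m4", "old deploy script path", expired=True),
]

print([m.id for m in recall(store, "how often do auth tokens rotate")])  # ['m2']
```

A flat file would have handed back all four entries, including the superseded 24-hour rule. Here only the current one comes back.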

The shorter the work session, the better

Every piece in this series stands on the same ground:

The longer the session, the worse the work.

In “How dirty is your context window?” we tore down real Codex sessions: the longer a session runs, the more the window fills with stale searches, old diffs, repeated reads, and expired output. Rapid sessions keep tokens-per-commit down at 9.4M; long-haul sessions blow past 94.4M. Efficiency does not decline linearly. It falls off a cliff.
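The gap between those two figures is easy to check. The sketch below assumes tokens-per-commit means total session tokens divided by commits produced, which is the plain reading of the name; the 47M-token example session is invented for illustration.

```python
def tokens_per_commit(total_tokens: int, commits: int) -> float:
    """Total tokens a session consumed, divided by the commits it produced."""
    if commits <= 0:
        raise ValueError("a session needs at least one commit to be scored")
    return total_tokens / commits


# Figures quoted in the text above.
RAPID = 9.4e6       # tokens per commit, rapid sessions
LONG_HAUL = 94.4e6  # tokens per commit, long-haul sessions

ratio = LONG_HAUL / RAPID
print(f"{ratio:.1f}x")  # 10.0x

# Hypothetical session: 47M tokens spent, 5 commits landed.
print(tokens_per_commit(47_000_000, 5))  # 9400000.0
```

By that measure, a long-haul session pays roughly ten times as much for each commit.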

Keep sessions short. The shorter the session, the more efficiently the agent executes; the longer it runs, the closer the context window gets to that cliff.

But short has a cost: cut the session short, and there is not enough context.

So every question collapses into one: at the moment work starts, where does the context come from?

The answer: structured context, generated from precise memory. Two situations, two moves.

Situation one: brand-new task, cold start

No prior thread. First session, first context.

One sentence:

Warm up this task.

Echo pulls the repo map, hot files, relevant decisions, constraints, and dead ends out of your memory store and assembles a launch pack. The agent starts its first turn in the right place — instead of feeling its way in from git log and rg.
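As a rough picture of what such a launch pack might contain, here is a minimal sketch. The `LaunchPack` class, its field names, and the rendered layout are hypothetical; only the five ingredients (repo map, hot files, decisions, constraints, dead ends) come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class LaunchPack:
    """Hypothetical container for the context handed to a fresh session."""
    task: str
    repo_map: list[str] = field(default_factory=list)
    hot_files: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    dead_ends: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the pack as plain text, skipping empty sections."""
        sections = [
            ("Repo map", self.repo_map),
            ("Hot files", self.hot_files),
            ("Decisions", self.decisions),
            ("Constraints", self.constraints),
            ("Dead ends", self.dead_ends),
        ]
        lines = [f"Task: {self.task}"]
        for title, items in sections:
            if items:
                lines.append(f"{title}:")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


# Made-up example content.
pack = LaunchPack(
    task="Add rate limiting to the upload endpoint",
    hot_files=["api/upload.py", "api/middleware.py"],
    decisions=["Limit per API key, not per IP"],
    dead_ends=["In-process counters reset on every deploy"],
)
print(pack.render())
```

Empty sections are dropped, so the pack stays as small as the memory behind it allows.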

Situation two: picking up from an old session

The last session got heavy, got dirty, and is one turn from compaction.

Walk away. We said it in “Stop re-explaining your project to your AI every session”: do not let compaction decide what survives.

Echo signals you when the context turns dirty. On the signal, open a new session in one click — and the old session's live state carries straight over: the goal, the files you changed, the decisions, the constraints, the roads that failed.
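One way to picture the split between live state and baggage is the sketch below. It assumes the old session can be read as a log of tagged events. The event kinds and the `build_handoff` helper are hypothetical, chosen to mirror the list above, and say nothing about how Echo does it internally.

```python
# Hypothetical event kinds that count as live state; anything else is baggage.
LIVE_KINDS = ("goal", "file_changed", "decision", "constraint", "failed_path")


def build_handoff(events: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Keep live state from a session log, dropping baggage and duplicates."""
    state: dict[str, list[str]] = {kind: [] for kind in LIVE_KINDS}
    for kind, text in events:
        if kind in state and text not in state[kind]:
            state[kind].append(text)
    return state


# Made-up session log.
old_session = [
    ("goal", "Migrate billing jobs to the new queue"),
    ("search_output", "rg 'enqueue' returned 212 matches"),
    ("file_changed", "jobs/billing.py"),
    ("diff", "400-line diff of jobs/billing.py"),
    ("failed_path", "Running both queues in parallel double-charged test accounts"),
    ("file_changed", "jobs/billing.py"),
    ("decision", "Cut over per job type, not all at once"),
]

handoff = build_handoff(old_session)
print(handoff["file_changed"])  # ['jobs/billing.py']
print(handoff["constraint"])    # []
```

The search output and the raw diff never make it into the new session; the goal, the changed file, the decision, and the failed path do.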

The baggage stays behind.

Day to day: save on the spot, recall on the spot

To save, one sentence:

Save this to Echo.

Remember this decision.

Decisions, constraints, failed paths, repo understanding, your preferences, product language, research conclusions, task state: if it might save the agent one detour in the future, save it. No format to think about. No hesitation.

To recall, also one sentence:

Search Echo for how we defined the Context Rebuild Loop.

Search memories: warm-up, failed attempts.

You do not have to remember where you said it. Echo does.

Save: “Save this to Echo.”

Find: “Search Echo for the old Context Rebuild Loop.”

Start: “Start me a session with context.”

Switch: “Spin up a shorter, cleaner session.”
