Compile Your Notes, Don't Re-Read Them

Most AI knowledge setups re-derive the same understanding from raw sources on every question — a better pattern has the model distill sources into an interlinked wiki once, so knowledge compounds instead of being rediscovered each time.

May 27, 2026

AI Development
Knowledge Management
AI Agents
Workflow

There’s a default way people wire an LLM up to a pile of documents — point a retrieval system at the raw sources and let the model re-read the relevant chunks on every question. It works, and for a lot of cases it’s fine. But it has a quietly wasteful property: the model does the same comprehension work over and over, rediscovering the same understanding from scratch each time you ask. A pattern I’ve come to prefer — and that Andrej Karpathy put a clean name on — flips that around: have the model compile the raw sources into a distilled, interlinked knowledge base once, then ask your questions against the distilled notes. Comprehension happens up front, and the result compounds.

Re-deriving understanding is the waste

Reading raw source material to answer a question is real cognitive work — the model has to parse, connect, and synthesize. When that happens fresh on every query, you’re paying that cost repeatedly for understanding you already produced last time. The insight evaporates the moment the response is generated, so the next question starts over from the raw text.

If your knowledge setup re-reads the source every time, it never gets smarter. It just gets asked again.

The alternative is to treat comprehension as something you bank. Run the sources through the model once with the job of distilling them — summaries, concept pages, entity pages, the connections between them — and write that out as durable notes. Now the expensive synthesis is done, saved, and reusable. Questions read the distilled layer instead of the raw pile.

Compile, don’t re-read

The core move is the same one a compiler makes: do the heavy transformation once, ahead of time, and run against the cheap, optimized output afterward. Applied to knowledge:

1. Curate the raw sources: the inputs you trust and want to build on.
2. Compile them into notes: have the model produce distilled pages (a summary per source, a page per concept or entity), written in clear prose rather than raw excerpts.
3. Query the notes, not the sources: your questions hit the already-understood layer, which is smaller, cleaner, and pre-connected.

The pleasant surprise is how little machinery this needs. It’s not a vector database or a retrieval pipeline; it’s disciplined file organization plus a capable model keeping the notes current. Plain markdown files with links between them are enough, which is part of why it fits a notes vault so naturally.
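To make the compile-then-query shape concrete, here is a minimal sketch in Python. It assumes raw sources are `.txt` files in one folder and notes are markdown files in another. The `distill` callable is a hypothetical stand-in for whatever model call you use, and `first_sentence` is a toy replacement so the sketch runs offline. None of these names come from a real library.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical stand-in for a model call: takes a source's title and raw text,
# returns distilled prose. A real setup would call an LLM here.
Distill = Callable[[str, str], str]


def first_sentence(title: str, text: str) -> str:
    """Toy distiller so the sketch runs offline: keep only the first sentence."""
    return text.split(".")[0].strip() + "."


def compile_sources(raw_dir: Path, notes_dir: Path, distill: Distill) -> list[Path]:
    """Compile step: write one distilled markdown note per raw source file."""
    notes_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for source in sorted(raw_dir.glob("*.txt")):
        body = distill(source.stem, source.read_text(encoding="utf-8"))
        note = notes_dir / f"{source.stem}.md"
        # Each note records the source it was distilled from.
        note.write_text(
            f"# {source.stem}\n\n{body}\n\nSource: {source.name}\n",
            encoding="utf-8",
        )
        written.append(note)
    return written


def load_notes(notes_dir: Path) -> str:
    """Query step: a question reads the distilled notes, not the raw folder."""
    paths = sorted(notes_dir.glob("*.md"))
    return "\n\n".join(p.read_text(encoding="utf-8") for p in paths)


def demo() -> str:
    """Curate one source, compile it, and return the layer a question would read."""
    with tempfile.TemporaryDirectory() as tmp:
        raw, notes = Path(tmp) / "raw", Path(tmp) / "notes"
        raw.mkdir()
        (raw / "compilers.txt").write_text(
            "Compilers transform once. Then much more detail follows.",
            encoding="utf-8",
        )
        compile_sources(raw, notes, first_sentence)
        return load_notes(notes)


print(demo())
```

The point of the sketch is the separation: the expensive call happens inside `compile_sources`, and everything downstream only touches the notes folder.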

Interlink everything, or it’s just a folder of summaries

The compiling only pays off if the distilled notes are connected. A pile of unlinked summaries is only marginally better than the raw sources; a web of notes that link to related concepts and entities is something you can actually navigate and reason over. Every page should point to its related pages, so following a thread surfaces the neighboring context automatically. The links are what turn a stack of notes into a knowledge base, which is the same reason I lean on linked notes and index pages in my own vault rather than ever-deeper folders.
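One way to check whether the notes are actually connected is to build a backlink map and list the pages nothing touches. The sketch below assumes `[[Page Title]]` wiki-style links, which is a common vault convention but an assumption here, and it takes the notes as an in-memory dict of title to text. The function names are illustrative.

```python
import re

# Assumed link syntax: [[Title]], [[Title|alias]] or [[Title#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def outgoing_links(text: str) -> set[str]:
    """Titles this note links to."""
    return {match.strip() for match in WIKILINK.findall(text)}


def backlinks(notes: dict[str, str]) -> dict[str, set[str]]:
    """Map each note title to the set of notes that link to it."""
    incoming: dict[str, set[str]] = {title: set() for title in notes}
    for title, text in notes.items():
        for target in outgoing_links(text):
            if target in notes and target != title:
                incoming[target].add(title)
    return incoming


def unlinked(notes: dict[str, str]) -> list[str]:
    """Notes with no links in or out: summaries sitting in a folder."""
    incoming = backlinks(notes)
    isolated = []
    for title, text in notes.items():
        links_out = outgoing_links(text) & (notes.keys() - {title})
        if not incoming[title] and not links_out:
            isolated.append(title)
    return sorted(isolated)


notes = {
    "Compiler": "Does the heavy work once. See [[Knowledge Base]].",
    "Knowledge Base": "Produced by a [[Compiler|compile step]].",
    "Orphan": "A summary that links to nothing.",
}
print(backlinks(notes))
print(unlinked(notes))
```

Anything `unlinked` returns is a candidate for either linking into the web or merging into an existing page.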

This is also where it stays honest: each distilled note should cite the source it came from, so you can always trace a claim back to where it originated. Distillation without traceability is just confident paraphrase, and confident paraphrase is how an “intelligent” knowledge base quietly starts lying to you.
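Traceability is easy to check mechanically if the notes follow a convention. The sketch below assumes each note carries one or more `Source: <filename>` lines, a hypothetical convention and not something the pattern prescribes, and it flags notes that cite nothing or cite a source you don't actually have.

```python
import re

# Assumed convention: a line of the form "Source: some-file.txt".
SOURCE_LINE = re.compile(r"^Source:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def cited_sources(note_text: str) -> list[str]:
    """Every source named on a 'Source:' line in the note."""
    return SOURCE_LINE.findall(note_text)


def untraceable(notes: dict[str, str], known_sources: set[str]) -> dict[str, str]:
    """Map note title to the reason its claims cannot be traced back."""
    problems = {}
    for title, text in notes.items():
        cited = cited_sources(text)
        if not cited:
            problems[title] = "no Source: line"
            continue
        unknown = [name for name in cited if name not in known_sources]
        if unknown:
            problems[title] = "unknown source(s): " + ", ".join(unknown)
    return problems


notes = {
    "Grounded": "A distilled summary.\n\nSource: paper.txt\n",
    "Paraphrase": "A confident claim with no citation.\n",
    "Dangling": "Another summary.\nSource: missing.txt",
}
print(untraceable(notes, known_sources={"paper.txt"}))
```

A check like this only proves a citation exists, not that the note is faithful to it, so it complements spot-reading against the source and does not replace it.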

Append and update, don’t re-compile from zero

Knowledge isn’t static, so the compile step isn’t one-and-done — but it’s also not “throw it all out and rebuild.” New sources get folded in: distill them, link them to what’s already there, and update the pages they touch, rather than re-deriving the whole base. The notes grow and improve over time, which is exactly the compounding you wanted. The discipline is the same as good note-taking by hand: bias toward updating an existing note over creating a duplicate, so the base gets denser and more connected instead of sprawling.
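Incremental updating can be sketched with two small pieces: a content fingerprint to tell which sources are new or changed, and a fold-in step that extends an existing page when one already exists. Everything here is an assumption for illustration, including the manifest shape, the function names, and the `Source:` line. In practice the model would rewrite the touched page, not just append to it; appending keeps the sketch runnable without a model.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Stable content hash, used to notice new or edited sources."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_sources(sources: dict[str, str], manifest: dict[str, str]) -> list[str]:
    """Names of sources whose content differs from what was last compiled."""
    return sorted(
        name for name, text in sources.items()
        if manifest.get(name) != fingerprint(text)
    )


def fold_in(notes: dict[str, str], title: str, addition: str, source: str) -> None:
    """Update the existing page for a concept, or create it if it is new."""
    entry = f"{addition}\n\nSource: {source}"
    if title in notes:
        # Bias toward updating: extend the page that already exists.
        notes[title] = notes[title].rstrip() + "\n\n" + entry + "\n"
    else:
        notes[title] = f"# {title}\n\n{entry}\n"


sources = {"a.txt": "alpha", "b.txt": "beta"}
manifest = {"a.txt": fingerprint("alpha")}  # a.txt was compiled on a previous run
print(changed_sources(sources, manifest))

notes: dict[str, str] = {}
fold_in(notes, "Compilers", "Do the heavy transformation once.", "a.txt")
fold_in(notes, "Compilers", "Run against the optimized output.", "b.txt")
print(notes["Compilers"])
```

After a run, you would write the new fingerprints back into the manifest so the next pass only touches what changed.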

Where this fits with feeding an agent context

This isn’t a replacement for giving an agent durable, versioned context about how your systems work; it’s the same instinct aimed at a different target. Agent context covers the conventions and tools an agent needs to act; a compiled knowledge base turns a body of source knowledge into something queryable so the agent (or you) stops re-reading the originals. Both are versions of one idea: do the understanding once, write it down in a durable, linked form, and stop paying for the same comprehension twice. It’s documentation as infrastructure, with the model doing the first draft.

The shift is small but it changes the economics: a knowledge base that compiles gets better every time you add to it, while one that re-reads is frozen at “however well the model parses today.” Distill once, link generously, keep the provenance, and grow it by appending. If you’ve built a second brain this way and want to compare notes, I’m easy to reach.