Give Your Agent Durable Context, Not a Fresh Briefing Every Time
An AI agent that re-learns your conventions every session is wasting most of its usefulness — the leverage comes from durable, versioned context files, scoped tools, and explicit rules for what's cached versus always-live.
- AI Development
- AI Agents
- Workflow
- Automation
The difference between an AI agent that’s genuinely useful and one that’s a clever parlor trick is usually not the model. It’s the context. I spent time wiring an agent up to a large, fiddly internal system — the kind with a non-standard data model and a thousand rules you’d never guess — and the lesson that stuck wasn’t about prompting. It was that an agent re-briefed from scratch every session can never get good, and the fix is to treat its knowledge and its tools as durable, versioned artifacts you build once and maintain. This is the practical, plumbing-level companion to the broader idea of context engineering.
Re-explaining is the tax that eats the benefit
If every session starts with you pasting in “here’s how our system works, here are the field names, here’s the workflow,” you’re paying a tax that scales with how much you use the agent — and you’re paying it in the most error-prone currency there is, because you’ll paste a slightly different version each time. The agent never accumulates expertise; it’s a brilliant new hire with amnesia, every morning.
An agent that forgets your conventions between sessions isn’t a colleague, it’s a very fast intern you re-onboard daily. Write the onboarding down once.
The move is to make the context durable: conventions, data model, naming rules, and “how we do things here” live in files the agent reads automatically, not in your working memory. Once that exists, the agent starts every session already knowing the things that don’t change.
Treat context like code: versioned, in the repo, reviewed
The natural home for that durable context is the repository, as plain files under version control — the same place the code lives. That has properties that matter:
- It’s versioned. When a convention changes, the change is a diff you can see, review, and roll back. The agent’s “knowledge” has a history.
- It’s shared. Everyone (and every agent instance) reads the same source of truth, so behavior is consistent instead of depending on whose prompt you copied.
- It’s reviewable. A bad rule that makes the agent do the wrong thing gets caught the way a bad code change does, in review, not in production.
Context that lives in a chat window is gone when the tab closes. Context that lives in a versioned file is an asset that compounds. The discipline is exactly the same one behind treating documentation as infrastructure — durable, owned, kept honest — just pointed at an agent instead of a human.
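As a minimal sketch of what "reads automatically" could look like, assume the conventions live as Markdown files in a directory in the repo. The directory layout, the `load_context` name, and the heading format are illustrative assumptions here, not a fixed convention of any particular agent framework.

```python
from pathlib import Path


def load_context(context_dir: Path) -> str:
    """Concatenate every Markdown file under context_dir into one preamble.

    Files are read in sorted path order, so the same commit always
    produces the same preamble.
    """
    sections = []
    for path in sorted(context_dir.rglob("*.md")):
        # Label each section with its repo-relative path so a bad rule
        # can be traced back to the file (and the diff) it came from.
        relative = path.relative_to(context_dir).as_posix()
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {relative}\n\n{body}")
    return "\n\n".join(sections)
```

Because the output depends only on the files, changing what the agent "knows" means changing a file in a reviewed commit, not editing a prompt someone keeps in a notes app.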
Give it tools, not just text
Knowledge is half of it; the other half is capability. There’s a real difference between telling an agent about a system and giving it scoped tools to act on that system through a defined interface. A tool layer (the emerging standard is the Model Context Protocol, or MCP, in which the agent calls tools exposed by a server) is better than letting the agent fire raw API calls, for the same reasons a good API is better than raw database access:
- The tool can enforce limits — scoping, result caps, rate limits — so a confused agent can’t do unbounded damage or get the account throttled.
- The interface is stable and legible, so the agent’s actions are constrained to a known surface rather than the entire API.
- It’s the natural place for least privilege: the agent gets exactly the verbs it needs and no more.
And the corollary rule I hold firmly: don’t let the agent bypass the governed tool with raw calls. The whole point of the controlled interface evaporates the moment something routes around it, and in a rate-limited or access-controlled system, a bypass is how you get tokens revoked. The tool is the contract; honor it.
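To make the three properties concrete, here is a rough sketch of a tool layer in plain Python. It is not an MCP server and uses no MCP library; `ScopedTools`, `ToolError`, and the verb names are hypothetical, and the limits are arbitrary defaults chosen for illustration.

```python
import time
from typing import Any, Callable


class ToolError(Exception):
    """Raised when a call falls outside what the tool layer permits."""


class ScopedTools:
    """Exposes only registered verbs, with a result cap and a rate limit."""

    def __init__(self, max_results: int = 50, max_calls_per_minute: int = 30,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._verbs: dict[str, Callable[..., list[Any]]] = {}
        self._max_results = max_results
        self._max_calls = max_calls_per_minute
        self._clock = clock
        self._call_times: list[float] = []

    def register(self, verb: str, handler: Callable[..., list[Any]]) -> None:
        """Add one verb to the allowed surface (least privilege)."""
        self._verbs[verb] = handler

    def call(self, verb: str, **params: Any) -> list[Any]:
        # Anything not explicitly registered is refused outright.
        if verb not in self._verbs:
            raise ToolError(f"verb not allowed: {verb}")
        # Sliding 60-second window: drop timestamps older than a minute.
        now = self._clock()
        self._call_times = [t for t in self._call_times if now - t < 60.0]
        if len(self._call_times) >= self._max_calls:
            raise ToolError("rate limit exceeded; retry later")
        self._call_times.append(now)
        # Cap the result size no matter what the handler returns.
        return self._verbs[verb](**params)[: self._max_results]
```

The agent only ever sees `call`. A verb that was never registered fails loudly instead of falling through to a raw request, which is the "tool is the contract" rule expressed in code.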
Cache the stable stuff; never cache the live stuff
The subtlety that made the whole thing reliable was being explicit about freshness. Not all context is equal: some of it almost never changes, and some of it is different every second. So I tier it:
- Always cached: the structural stuff, such as the data model, field definitions, and workflow states. It changes rarely, it’s expensive to re-fetch, and the agent should read it from local files and never call out for it.
- Cache-first, fetch on miss: semi-stable things. The agent checks the local cache first, goes to the source only when an entry is missing, and appends what it fetched to the cache rather than re-pulling everything wholesale.
- Always live: anything that mutates state or reflects the current world, such as creating and updating records, searches, and “what’s the status right now.” Never cache these; always go to the source.
Getting this wrong in either direction hurts: cache the live stuff and the agent acts on stale reality; fetch the stable stuff every time and you burn your rate limit and your latency on data that didn’t change. Naming which tier each kind of data is in, up front, is what keeps the agent both fast and correct.
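One way to name the tiers up front is a small lookup table that every read goes through. This is a sketch under assumptions: the kinds in `TIERS` (`data_model`, `user_directory`, `record_status`) are made-up examples, and `fetch_live` stands in for whatever governed tool call reaches the real system.

```python
from enum import Enum
from typing import Any, Callable


class Tier(Enum):
    ALWAYS_CACHED = "always_cached"
    CACHE_FIRST = "cache_first"
    ALWAYS_LIVE = "always_live"


# Hypothetical classification; every kind of data must be listed here.
TIERS = {
    "data_model": Tier.ALWAYS_CACHED,
    "user_directory": Tier.CACHE_FIRST,
    "record_status": Tier.ALWAYS_LIVE,
}


def read(kind: str, key: str, cache: dict,
         fetch_live: Callable[[str, str], Any]) -> Any:
    """Read one value according to the tier its kind was assigned."""
    tier = TIERS[kind]  # KeyError for an unclassified kind forces a decision
    if tier is Tier.ALWAYS_LIVE:
        # Never read from or write to the cache for live data.
        return fetch_live(kind, key)
    bucket = cache.setdefault(kind, {})
    if key in bucket:
        return bucket[key]
    if tier is Tier.ALWAYS_CACHED:
        # Structural data must come from local files; do not call out.
        raise LookupError(f"{kind}/{key} missing from local cache")
    # Cache-first: fetch only the missing entry and append it.
    value = fetch_live(kind, key)
    bucket[key] = value
    return value
```

The useful property is that an unclassified kind fails immediately, so nobody gets a default freshness policy by accident.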
On a mismatch, stop — don’t silently self-heal
The most important rule in the whole setup is what happens when the cache and live reality disagree. The tempting behavior is to have the agent quietly re-sync and carry on. Don’t. Report the mismatch and stop. A silent auto-correction hides the fact that your source of truth drifted — which is exactly the moment a human should look, because either the cache is stale or something changed underneath you, and both are worth knowing. An agent that papers over discrepancies is an agent you can’t trust, because you can’t see when it was wrong.
This is the same instinct as a dry run before a real run: make the agent’s uncertainty visible rather than letting it confidently guess. Speed comes from the cache; trust comes from surfacing the moments the cache might be lying.
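The stop-and-report rule is small enough to sketch directly. The names `CacheMismatch` and `verify_against_live` are illustrative, and the sketch assumes the caller already holds both the cached value and a freshly fetched one.

```python
from typing import Any


class CacheMismatch(Exception):
    """The cached value and the live source disagree; a human should look."""

    def __init__(self, kind: str, key: str, cached: Any, live: Any) -> None:
        super().__init__(
            f"cache mismatch for {kind}/{key}: cached={cached!r}, live={live!r}"
        )
        self.kind = kind
        self.key = key
        self.cached = cached
        self.live = live


def verify_against_live(kind: str, key: str, cached: Any, live: Any) -> Any:
    """Return the value when cache and source agree; otherwise stop."""
    if cached != live:
        # Deliberately no re-sync here: overwriting the cache would hide
        # the drift. Raising carries both values up to whoever is watching.
        raise CacheMismatch(kind, key, cached, live)
    return cached
```

Whether the right fix is refreshing the cache or investigating the source is left to the person who reads the error, which is the point.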
Put it together and a useful agent is less about the prompt and more about the scaffolding around it: durable versioned context so it doesn’t relearn the basics, scoped tools so it acts through a safe interface, explicit cache tiers so it’s fast without being stale, and a hard rule to surface mismatches instead of hiding them. The model is the engine; this is the chassis. If you’re building agent workflows against a gnarly internal system and want to compare scaffolding, I’m easy to reach — it’s most of what the AI automation lab is about.