01 The problem
AI agents have a memory problem, and so do agencies. An agent solves a hard problem at 2pm and has forgotten the solution by the next session. An engineer hits a deployment trap, works around it, and the workaround lives in one person's head until they leave. Knowledge accumulates in chat logs, which is to say it evaporates.
For a company that runs on AI orchestration, this is not a papercut. Every forgotten fix is a fix that gets paid for twice, and every undocumented decision is a future argument. We needed our agents and our people reading from, and writing to, the same brain. So we built one, and because it is the pattern we sell to clients as an internal knowledge brain, we run the reference implementation on ourselves.
02 What we built
A single source of truth with three tiers, machine-readable and human-readable at once.
The substrate is an Obsidian vault in version control: 257 notes at launch, organized into bands for company facts, operating procedures, client records, per-project knowledge, engineering patterns, and central memory. Basis for the count: the vault index at launch, June 2026.
The three tiers govern what knowledge means:
- Tier 1: agency SOPs. Universal procedures, shared across every project.
- Tier 2: project directives. Per-project rules, canonical in each project's repository and mirrored read-only into the vault, so the vault always reflects what is actually deployed.
- Tier 3: outputs. Real work products, kept for optimization and as worked examples.
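As a rough illustration of the tiers as data, here is a minimal sketch. The names `Tier`, `Note` and `is_read_only_mirror` are hypothetical and not the vault's actual schema; the only rule encoded is the one stated above, that Tier 2 notes in the vault are read-only mirrors of the project repository.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    SOP = 1        # Tier 1: universal agency procedures
    DIRECTIVE = 2  # Tier 2: per-project rules, canonical in the project repo
    OUTPUT = 3     # Tier 3: real work products kept as worked examples


@dataclass(frozen=True)
class Note:
    path: str                      # location of the markdown file in the vault
    tier: Tier
    project: Optional[str] = None  # None for agency-wide notes


def is_read_only_mirror(note: Note) -> bool:
    """Tier 2 notes are copies; the canonical text lives in the project repo."""
    return note.tier is Tier.DIRECTIVE
```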
Over the vault sits a retrieval layer: notes are embedded into a vector store (pgvector on an isolated database instance) and exposed through an API our agents call as tools. The key call is agentic: an agent passes its full task, and the pipeline decomposes it into targeted searches, retrieves across all tiers in parallel, and returns roughly 1,500 tokens of synthesized context instead of the 6,000 to 18,000 tokens manual searching burns. Basis: retrieval comparisons documented in our agent instructions, June 2026.
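The shape of that agentic call (decompose, retrieve across tiers in parallel, return a bounded context) can be sketched as follows. This is a toy under stated assumptions, not the production pipeline: `decompose` splits on sentences where the real system uses a model, `search` matches shared words where the real system queries pgvector, and the token count is approximated by whitespace-separated words. All function names and the sample notes are hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Toy corpus keyed by tier, standing in for the embedded vault.
NOTES = {
    1: ["Deploy SOP: run migrations before switching traffic."],
    2: ["Project directive: bulk sends require a dry run first."],
    3: ["Output: last deploy report, migrations took four minutes."],
}
TOKEN_BUDGET = 1500  # rough size of the context handed back to the agent


def decompose(task: str) -> list[str]:
    """Stand-in for the model step: one sub-query per sentence of the task."""
    return [part.strip() for part in task.split(".") if part.strip()]


def words(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))


def search(query: str, tier: int) -> list[str]:
    """Stand-in for vector search: keep notes that share a word with the query."""
    return [note for note in NOTES[tier] if words(query) & words(note)]


def retrieve_context(task: str) -> str:
    # Every (sub-query, tier) pair is an independent search, so run them in parallel.
    jobs = [(query, tier) for query in decompose(task) for tier in NOTES]
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda job: search(*job), jobs))

    seen, context, used = set(), [], 0
    for hits in results:
        for note in hits:
            cost = len(note.split())  # crude token estimate
            if note in seen or used + cost > TOKEN_BUDGET:
                continue
            seen.add(note)
            context.append(note)
            used += cost
    return "\n".join(context)
```

The point of the sketch is the architecture: the caller hands over one task and receives one deduplicated, budget-capped block of context, rather than issuing and reading many searches itself.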
Agents also carry memory across sessions: observations and session summaries persist, searchable, with a timeline. The 2pm solution is still there next week, for every agent.
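A minimal sketch of cross-session memory, assuming a single SQLite table as the store. The article does not say how observations are actually persisted, so the `MemoryStore` class, its schema and its method names are all hypothetical; the sketch only shows the three properties named above: persistence, search, and a timeline shared by every agent.

```python
import sqlite3
from datetime import datetime, timezone


class MemoryStore:
    """Hypothetical store for observations and session summaries."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path instead of ":memory:" to persist across processes.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS observations ("
            "at TEXT NOT NULL, agent TEXT NOT NULL, "
            "session TEXT NOT NULL, text TEXT NOT NULL)"
        )

    def record(self, agent: str, session: str, text: str, at: datetime = None) -> None:
        at = at or datetime.now(timezone.utc)
        with self.db:  # commits on success
            self.db.execute(
                "INSERT INTO observations VALUES (?, ?, ?, ?)",
                (at.isoformat(), agent, session, text),
            )

    def search(self, term: str) -> list:
        """Substring search across every agent's observations, oldest first."""
        rows = self.db.execute(
            "SELECT at, agent, text FROM observations WHERE text LIKE ? ORDER BY at",
            (f"%{term}%",),
        )
        return rows.fetchall()

    def timeline(self) -> list:
        """All observations in chronological order."""
        return self.db.execute(
            "SELECT at, agent, session, text FROM observations ORDER BY at"
        ).fetchall()
```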
03 The error ritual
The part of this system we are most opinionated about is not retrieval. It is what happens when something breaks.
Every resolved failure gets logged with a severity (production-down, work-blocked, or friction) and must declare exactly one resolution: a prevention rule, a check added, or an accepted risk. No vague postmortems, no "we'll be careful next time". Before any risky operation (a deploy, a migration, a bulk send), agents are required to check the error log first, so a system that has its own incident history on file never repeats a known mistake.
Durable discoveries get appended to the relevant directive's learnings section. We call the loop self-annealing: fix it, update the script, update the directive, log the lesson. The knowledge base is not documentation that drifts from reality. It is an organ that metabolizes failure.
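The logging rule and the pre-flight check from this section can be sketched as a small data model. This is an assumption about shape, not the actual log format: `ErrorEntry`, `known_errors` and the field names are hypothetical. Making `resolution` a single required enum field is one way to enforce "exactly one resolution" by construction.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    PRODUCTION_DOWN = "production-down"
    WORK_BLOCKED = "work-blocked"
    FRICTION = "friction"


class Resolution(Enum):
    PREVENTION_RULE = "prevention rule"
    CHECK_ADDED = "check added"
    ACCEPTED_RISK = "accepted risk"


@dataclass(frozen=True)
class ErrorEntry:
    operation: str          # e.g. "deploy", "migration", "bulk-send"
    summary: str
    severity: Severity
    resolution: Resolution  # one required field, so exactly one resolution
    detail: str             # the rule, check, or risk in concrete terms

    def __post_init__(self) -> None:
        # Reject free-text stand-ins such as "we'll be careful next time".
        if not isinstance(self.severity, Severity):
            raise TypeError("severity must be a Severity")
        if not isinstance(self.resolution, Resolution):
            raise TypeError("resolution must be a Resolution")
        if not self.detail.strip():
            raise ValueError("resolution detail must not be empty")


def known_errors(log: list, operation: str) -> list:
    """Pre-flight check: past failures recorded for this kind of operation."""
    return [entry for entry in log if entry.operation == operation]
```

An agent about to deploy would call `known_errors(log, "deploy")` and read the returned entries before proceeding.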
04 Governance: two lanes
Writes follow a two-lane rule, because memory and law are different things.
Lane A, memory, writes automatically. Errors, learnings, session summaries and reviews go straight into the vault's memory bands. High volume, low ceremony, append-only in spirit.
Lane B, normative, goes through review. Changes to directives and SOPs, the rules that govern behavior, travel as pull requests in the owning repository, then mirror into the vault. An agent can remember anything it likes; it cannot quietly rewrite the rules it operates under.
That split is the design insight we would offer anyone building an organizational brain: the failure mode is not too little writing, it is ungoverned writing. Separate the lanes and both can run fast.
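A minimal sketch of the two-lane rule as a write router. The kind labels, `lane_for` and `submit` are hypothetical, and the two lists stand in for the vault's memory bands and the owning repository's pull-request queue; the sketch only shows that memory writes land immediately while normative changes are queued for review.

```python
from enum import Enum


class Lane(Enum):
    MEMORY = "A"     # written automatically
    NORMATIVE = "B"  # goes through review

MEMORY_KINDS = {"error", "learning", "session-summary", "review"}
NORMATIVE_KINDS = {"directive", "sop"}


def lane_for(kind: str) -> Lane:
    if kind in MEMORY_KINDS:
        return Lane.MEMORY
    if kind in NORMATIVE_KINDS:
        return Lane.NORMATIVE
    raise ValueError(f"unknown write kind: {kind!r}")


def submit(kind: str, body: str, vault: list, pull_requests: list) -> Lane:
    """Route a write: Lane A appends to the vault, Lane B waits for review."""
    lane = lane_for(kind)
    if lane is Lane.MEMORY:
        vault.append((kind, body))
    else:
        pull_requests.append((kind, body))
    return lane
```

Rejecting unknown kinds, rather than defaulting them to Lane A, keeps a mislabeled rule change from slipping into the vault unreviewed.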
05 The technologies behind it
- Obsidian (markdown in version control) as the human-facing substrate; plain text outlives every tool.
- Supabase with pgvector for embeddings and retrieval, on a dedicated instance isolated from all client data and from our own operational database.
- Vercel hosts the knowledge API that agents call as tools.
- Claude runs the agentic retrieval pipeline and the agents that read and write the vault.
Stack reasoning is documented at /stack.
06 What running it taught us
Retrieval quality is an architecture problem, not a model problem. The jump in usefulness came from query decomposition and tiering, not from a bigger model.
Isolation is non-negotiable. Agency knowledge, operational data and client data live in three different places by construction. The vault can be brilliant without ever being a leak.
The ritual matters more than the tooling. A mediocre knowledge base with a mandatory error ritual beats a beautiful one nobody writes to. The forcing functions are the system: check errors before risky work, and declare a resolution after every failure.
This is the same pattern we deploy for clients as an internal knowledge brain, most recently inside the five-system build for a design firm, and the same retrieval discipline behind the PE deal desk.
07 Related work
The agents that read this vault run the agency: the internal operations system and the reply-to-deck pipeline. The client-facing version is part of custom agents. If your organization's knowledge currently lives in chat logs and one veteran's head, the free audit will tell you what a brain would change.