01 The problem
A full-service Canadian law firm with 100+ lawyers was facing margin pressure from an uncomfortable direction: its own timesheets. Associates were spending more than 40 percent of their time on document preparation, precedent research and contract drafting. The work was necessary. It was also work that clients increasingly refused to pay full rates for, and they were not wrong to refuse.
The firm had tried legal technology before. Those attempts failed for the usual reasons: tools that did not integrate with how lawyers actually work, and lawyers who, reasonably, declined to change how they work to serve a tool. So the mandate this time was strict. Quality could not move. Privilege could not be put at risk. And nobody was going to learn new software.
We built four AI agents inside the firm's existing workflows: client intake, precedent research, contract drafting and court documents. Document preparation time fell 60 percent. Precedent research time fell 75 percent. No lawyer opened a new tool to get those numbers.
02 The firm, and why this mattered
It helps to be specific about what document work means inside a firm like this. A new matter arrives, and someone assembles the intake: client details, case information, a conflict check, an engagement letter. A deal needs an agreement, and someone finds the closest precedent the firm has, copies it, and adapts it clause by clause. A filing is due, and someone formats it to the court's requirements and checks every citation by hand. None of that is legal judgment. All of it consumes lawyers.
For a long time the billable hour absorbed the cost. Assembly was billed like everything else, so nobody had a reason to separate it from the law. That arrangement has been ending for years. Sophisticated clients now read legal bills line by line and push back on hours that look like assembly. More work moves to fixed fees, and on a fixed-fee matter every hour of assembly comes straight out of the firm's profit instead of going onto an invoice. The same hours that used to be revenue quietly became cost, without anything about the work itself changing.
The firm's earlier technology purchases made this more frustrating, not less. The tools were capable enough. They failed at the point of use. Each one was a separate destination with its own login, its own format and its own habits to learn, and a lawyer with a deadline does not go to a second place to do work she can already do in the first place. The software sat unused, the subscriptions lapsed, and the partners drew the reasonable conclusion that legal tech does not stick.
That skepticism was not an obstacle to this project. It was the most useful requirements document we had, because it described exactly how the next attempt would fail if nothing changed.
03 What had to be true first
Before any build started, we agreed on four conditions with the firm. They are the spec behind every engineering decision in section 06.
Quality could not move. A draft that needs heavy rework saves nothing and burns trust. The standard was plain: output had to land at the level an associate would produce, and every output had to pass through a lawyer before it went anywhere. Efficiency had to come from removing assembly, never from lowering the bar.
Privilege had to stay protected. Solicitor-client privilege is not a policy preference. It is the foundation of what the firm sells. The system was built to respect it: privileged material lives in firm-controlled systems, model calls run under no-training, no-retention terms, and nothing privileged persists anywhere the firm does not control. Those terms were settled in writing before a single document moved.
Zero new tools for lawyers. The failure mode was already documented in the firm's own history: a destination app nobody visits. So the bar was set at zero. No new login, no new interface, no training course as a precondition for value. The output had to appear inside the workflow lawyers already used, or the project was not worth doing.
The firm's own precedents as the knowledge base. A generic legal corpus produces generic drafts. The firm's competitive asset is the work it has already done: its agreements, its clauses, its past answers to hard questions. The agents had to draw on that body of work first, with public databases as the supplement rather than the source.
04 What we built
Four agents, each owning one legal workflow end to end, all of them reading and writing the same matter file.
Client intake agent. When an inquiry comes in, the agent gathers case information, runs a conflict check against the firm's records, and prepares the engagement letter. By the time a lawyer first opens the matter, it is already a matter: facts collected, conflicts flagged, the letter drafted and waiting for review. This is the same pattern we productize as client onboarding automation, pointed at legal intake.
Precedent mining agent. Ask it a question and it searches the firm's own knowledge base and public databases together, in real time, and returns a ranked list of relevant precedents. That search used to be an associate's afternoon. The ranking favors the firm's own work product, which is the point: the firm has answered most questions before, and now those answers are findable.
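A ranking that favors the firm's own work can be pictured with a minimal sketch. Everything named here is an assumption for illustration: the `Precedent` type, the `FIRM_BOOST` weight and the sample records are hypothetical, and the relevance scores stand in for whatever retrieval step runs upstream. It is not the firm's actual ranking logic.

```python
from dataclasses import dataclass

# Hypothetical weight: how much a firm-authored precedent is boosted
# over a public one with the same relevance score.
FIRM_BOOST = 1.5


@dataclass(frozen=True)
class Precedent:
    title: str
    source: str       # "firm" or "public"
    relevance: float  # 0.0 to 1.0, assumed to come from an upstream search step


def rank_precedents(results: list[Precedent]) -> list[Precedent]:
    """Sort merged firm and public results, boosting the firm's own work."""
    def score(p: Precedent) -> float:
        return p.relevance * (FIRM_BOOST if p.source == "firm" else 1.0)

    return sorted(results, key=score, reverse=True)


# Made-up sample data, only to show the effect of the boost.
results = [
    Precedent("Public appellate decision", "public", 0.80),
    Precedent("Firm memo on the same question", "firm", 0.60),
    Precedent("Public commentary", "public", 0.50),
]
ranked = rank_precedents(results)
for p in ranked:
    print(p.source, p.title)
```

With these sample scores the firm memo moves ahead of the more "relevant" public decision, which is the behavior the paragraph describes. A multiplicative boost is one simple choice; a real system might weight sources differently.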
Contract drafting agent. Deal parameters in, first draft out. The agent assembles common agreements from the firm's own templates and clauses, so lawyers review and refine instead of starting from a blank page. First drafts that took hours now take minutes.
Court document agent. Prepares court-ready documents with proper formatting and citations. This is the least glamorous of the four and the one with the least room for error, which is why formatting is handled the way section 06 describes.
Four agents, four workflows, one rule: the agent does the assembly, the lawyer does the law. In our catalog these are custom agents, and this build is a fair picture of what that term means in practice. Not a chatbot beside the work. A set of narrow, reliable systems inside it.
05 The technologies behind it
Three choices carried the build. The reasoning behind everything we ship, including what we refuse to use, is documented on our page "The stack we run."
Claude for document reasoning. Legal work is long-document work: agreements, precedent sets, filings that run long and reference each other. Claude is our default model because it is the strongest long-document reasoner we have benchmarked, and because its API terms support the no-training, no-retention posture the privilege requirements demanded.
Supabase for the matter store and access control. Matter state, agent outputs and review status live in Supabase, with row-level security enforcing per-matter access in the database itself. Access control that lives in the database cannot be talked out of its rules by a clever prompt. Near privileged material, that is the only kind worth having.
The firm's own document store, left in charge. Documents stay in the firm's infrastructure, under the firm's permissions. The agents work around that store rather than replacing it, because the fastest way to kill adoption in a law firm is to ask it to migrate decades of documents into your product.
06 The engineering decisions that made it reliable
Agents embedded in the workflow, not a destination app. Every earlier failure was a place lawyers had to go. So nothing in this system is a place. Drafts, precedent lists and engagement letters arrive inside the matter file and the systems the firm already runs. Integration consumed more engineering effort than generation did, and that ratio was correct. Generation is what the model does. Integration is what makes anyone use it.
Nothing retained by model providers. Model calls cross the firm's boundary under no-training, no-retention terms, and the firm's store keeps every document, every draft and every version. What crosses the line is a request. What comes back is a draft. Nothing stays behind.
Formatting precision as a first-class requirement. Court documents fail on formatting before anyone reads the law in them: registry requirements, citation style, numbering, page layout. We treated layout as code, not as model output. Deterministic templates enforce the formatting, and the model supplies content into them. The agent is never trusted with the part a registry clerk will measure with a ruler.
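The split between deterministic layout and model-supplied content can be sketched roughly as follows. This is an illustration under stated assumptions, not the production templates: `FilingContent`, `render_filing` and the `LINE_WIDTH` rule are hypothetical, and real registry requirements differ from court to court.

```python
from dataclasses import dataclass

# Hypothetical layout rule; actual registry requirements vary by court.
LINE_WIDTH = 72


@dataclass(frozen=True)
class FilingContent:
    """The only fields the model is allowed to supply."""
    court: str
    title: str
    paragraphs: list[str]


def render_filing(content: FilingContent) -> str:
    """Lay out a filing with fixed rules; the model never emits layout."""
    if not content.paragraphs:
        raise ValueError("a filing needs at least one paragraph")
    lines = [
        content.court.upper().center(LINE_WIDTH).rstrip(),
        "",
        content.title.upper().center(LINE_WIDTH).rstrip(),
        "",
    ]
    # Numbering is computed here, never taken from model text.
    # Whitespace inside each paragraph is normalized to single spaces.
    for number, text in enumerate(content.paragraphs, start=1):
        lines.append(f"{number}. {' '.join(text.split())}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


draft = FilingContent(
    court="Example Superior Court",
    title="Notice of Motion",
    paragraphs=["The applicant   seeks an order.", "The grounds are set out below."],
)
print(render_filing(draft))
```

Because headings, numbering and spacing are produced by code, the same content always renders the same way, and a formatting rule changes in one place rather than in a prompt.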
Per-matter access control. A firm of 100+ lawyers runs on information barriers. The agents inherit the same rules: an agent working a matter sees that matter and nothing else, and the database enforces it rather than the prompt. The intake agent's conflict check writes to the same record, so the access rules and the conflicts process share one source of truth.
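In Supabase, a rule like this would be written as a Postgres row-level security policy. The sketch below does not show that policy. It only models the logic in memory so it can run without a database, and `MatterStore`, `grants` and the `agent:M-1` principal are hypothetical names. The point it illustrates is that the filter lives in the store, so a caller cannot widen its own view.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    matter_id: str
    body: str


class MatterStore:
    """Stand-in for the database: filtering happens here, not in the caller."""

    def __init__(self, rows: list[Row], grants: dict[str, set[str]]):
        self._rows = rows
        self._grants = grants  # principal -> matter ids it may read

    def select(self, principal: str, requested_matter: Optional[str] = None) -> list[Row]:
        allowed = self._grants.get(principal, set())
        # The grant filter always applies; a request can only narrow it.
        return [
            row for row in self._rows
            if row.matter_id in allowed
            and (requested_matter is None or row.matter_id == requested_matter)
        ]


store = MatterStore(
    rows=[Row("M-1", "intake notes"), Row("M-2", "draft agreement")],
    grants={"agent:M-1": {"M-1"}},
)
own = store.select("agent:M-1")            # only rows for matter M-1
other = store.select("agent:M-1", "M-2")   # empty: asking does not grant access
unknown = store.select("agent:unlisted")   # empty: no grant, no rows
```

However the request is phrased, the agent scoped to one matter gets that matter's rows or nothing, which is the property the paragraph relies on.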
Every output is a draft. No agent in this system files, sends or signs anything. Output lands as a draft with a lawyer as the decision point, every time, by architecture rather than by policy. The system never asks the firm to trust machine judgment, so the firm never had to hold a meeting about how much to trust it.
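"By architecture rather than by policy" can be pictured as a state model in which the agent's only write path creates drafts and the only exit from the draft state takes a named lawyer. This is a minimal sketch with hypothetical names (`AgentOutput`, `agent_submit`, `lawyer_review`). Python itself does not stop other code from constructing an approved record, so in a real system the boundary would have to sit in the agent's credentials and API surface.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    RETURNED = "returned"


@dataclass(frozen=True)
class AgentOutput:
    matter_id: str
    body: str
    status: Status = Status.DRAFT
    reviewed_by: Optional[str] = None


def agent_submit(matter_id: str, body: str) -> AgentOutput:
    """The only write path an agent gets: it creates drafts, nothing else."""
    return AgentOutput(matter_id=matter_id, body=body)


def lawyer_review(output: AgentOutput, lawyer: str, approve: bool) -> AgentOutput:
    """A named lawyer is the only way out of the DRAFT state."""
    if not lawyer.strip():
        raise ValueError("a review needs a named lawyer")
    if output.status is not Status.DRAFT:
        raise ValueError("only drafts can be reviewed")
    new_status = Status.APPROVED if approve else Status.RETURNED
    return replace(output, status=new_status, reviewed_by=lawyer)


draft = agent_submit("M-1", "Engagement letter text")
approved = lawyer_review(draft, "A. Lawyer", approve=True)
```

There is deliberately no function here that files, sends or signs. Those actions stay with the lawyer, outside anything the agent can call.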
07 Before and after
Two things changed: where the work happens, and how long it takes.
Before, using technology meant leaving the workflow. Open the other tool, sign in, reformat the inputs, carry the output back by hand. Under deadline pressure, nobody made that trip, and the firm's earlier purchases died of it. After, there is no trip to make. The first draft, the precedent list, the engagement letter: each is simply there, where the work already happens. The earlier tools asked lawyers to come to them. This system arrives.
The time change is the published outcome. Document preparation time fell 60 percent. Precedent research time fell 75 percent. The lanes below are drawn to those figures.
What did not change matters just as much. The standard of the work product, the lawyer's name on it, and the place the work happens are all untouched. The system removed hours, not judgment.
08 How the firm runs it
There was no adoption program, because there was almost nothing to adopt. A lawyer's first contact with the system was not an onboarding session. It was opening a new matter and finding the file already assembled, the conflict check already run, the engagement letter already drafted. The orientation a lawyer actually needed fit in one sentence: this is a draft, treat it like a junior's work.
Review is the firm's quality control, and it works because it was already the firm's habit. Lawyers review agent output the way they have always reviewed associate output, and when something misses the standard, the correction flows back into the agents' instructions, which are plain text the firm can read and change. The system improves the way the firm has always improved: by review.
Resistance did not repeat because the trade on offer was different. Nothing asked a lawyer to trust a machine's legal judgment. The system asked lawyers only to stop doing formatting, searching and first-pass assembly. That trade sells itself. When a rollout does need a full change program, with training and champions and dashboards, it looks like our enterprise AI adoption work. This one needed subtraction, not a program.
09 Results
- Document preparation time down 60 percent
- Precedent research time down 75 percent
- Contract first drafts: hours down to minutes
- Associates reallocated to higher-value client work
- Improved margins on fixed-fee matters
The margin point matters most. On fixed-fee work, every hour of document assembly comes straight out of profit. Cutting that time by more than half did not just speed the firm up. It changed the economics of how the firm prices, because the same fee now carries far less cost inside it. The hours that came back went where the rates are defensible: client work that actually requires a lawyer.
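The arithmetic behind that point is simple enough to work through. Every figure in the sketch below is hypothetical and chosen only to make the mechanics visible. None of it is the firm's data, apart from the 60 percent reduction, which is the document-preparation result reported above.

```python
def fixed_fee_margin(fee: float, other_cost: float,
                     assembly_hours: float, hourly_cost: float) -> float:
    """Profit on a fixed-fee matter: the fee minus everything it has to carry."""
    return fee - other_cost - assembly_hours * hourly_cost


# All figures below are hypothetical illustrations, not client data.
FEE = 10_000
OTHER_COST = 4_000    # cost of the lawyer time that is actual legal judgment
ASSEMBLY_HOURS = 20
HOURLY_COST = 150     # internal cost of an hour, not the billing rate
REDUCTION_PCT = 60    # document preparation time reduction from the results

before = fixed_fee_margin(FEE, OTHER_COST, ASSEMBLY_HOURS, HOURLY_COST)
hours_after = ASSEMBLY_HOURS * (100 - REDUCTION_PCT) / 100
after = fixed_fee_margin(FEE, OTHER_COST, hours_after, HOURLY_COST)

print(before, after)
```

With these made-up inputs, assembly drops from 20 hours to 8 and the margin on the same fee rises from 3,000 to 4,800. The fee did not change. Only the cost inside it did.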
10 What to know before building this
For legal teams weighing the same build, the transferable lessons are these.
Remove assembly. Never ask for trust in machine judgment. Adoption in a law firm is a trade, and only one trade closes. The moment a system asks a lawyer to delegate judgment, the lawyer is right to refuse, and the system dies in review. Buy back the hours spent on formatting, searching and first drafts, leave the law entirely alone, and the system gets used without anyone mandating it.
Integration is the whole game. Capability is table stakes. Every tool this firm had abandoned was capable. Before buying or building anything, ask one question: where does the output land? If the answer is "in our app," you are buying the failure you already own. If the answer is "inside the file your lawyers already have open," you are buying something they will use.
Settle privilege before anything else. Where documents live, what crosses to model providers, under what terms, and what persists afterward. Get those answers in writing before the first document moves, because they decide the architecture. "Built to respect solicitor-client privilege" is a design input, not a compliance memo written after the fact.
Start where assembly is heaviest and judgment lightest. Intake and precedent research are first-pass collection and search, the easiest places for lawyers to accept help and the fastest to show time savings. Drafting earns its place after the first two have built trust. Court documents demand the most engineering discipline, so they benefit most from a system that already works.
The wider picture for this industry, including the regulatory questions firms ask us before they start, is on our "AI for law firms" page.
11 Related work
This build sits inside a family of systems.
- Enterprise AI adoption: the companion problem, when the hard part is people rather than plumbing.
- Custom agents: the service behind all four agents in this build.
- Client onboarding automation: the intake pattern, productized for any firm that chases documents.
- AI for law firms: what we build for legal teams and the rules we design around.
- The stack we run: every tool in this build, with the reasons attached.