Two developers. Same codebase. Same LLM. Different code. That’s not a bug in the LLM – it’s a missing brief.
After the BDD work, Tom and Priya are both leaning hard on LLMs. The Feature files make it easy: hand the LLM a .feature file, ask for an implementation, get code back. Tom noticed the LLM generates code faster than he can review it. That’s true. But he’s about to notice something else.
The code review that took an hour
Tom opens Priya’s pull request. The code is correct – tests pass, behaviour matches the feature file. But it looks nothing like his code. Her handler functions return early on errors. His use if-else chains. Her test names read like sentences: TestPausedSubscription_CannotChangeBoxSize. His read like labels: TestChangeBoxSizePaused. Her structs have unexported fields with constructor functions. His have exported fields.
None of this is wrong. It’s all defensible. But the review takes an hour because Tom keeps stopping to ask: “Is this a style choice or a behaviour choice?” Every difference is a potential bug he has to investigate.
He brings it up at standup. “Priya’s code and my code look like they were written by different teams.”
Priya frowns. “We’re using the same LLM. Same model, same tool.”
“But not the same prompts,” Lee says. He’s been listening. “You’re each telling it something different about how you want the code to look. The LLM doesn’t have opinions – it reflects whatever you give it.”
The experiment
Lee suggests they test this. Same task, both developers, compare the results. The task: write a function that calculates the next delivery date, skipping public holidays. Same requirements. Same language. Same LLM.
Tom prompts: “Write a Go function that calculates the next delivery date after a given date, skipping any dates in a public holidays list.”
Priya prompts: “In our Greenbox codebase we use custom types for dates and guard clauses for validation. Write a Go function that calculates the next delivery date after a given date, skipping public holidays. Return an error if the input date is in the past.”
Tom gets back a clean function. It takes time.Time and []time.Time, returns time.Time. No error handling. No validation. Works fine.
Priya gets back a function that takes a DeliveryDate type and a HolidayCalendar interface. Guard clause at the top rejects past dates. Returns (DeliveryDate, error). The generated code matches the patterns in the rest of the codebase because she described those patterns in the prompt.
“You gave it context,” Tom says.
“I gave it the same context I’d give a new developer on their first day,” Priya says. “Here’s how we do things. Here’s what the conventions are. Here’s what the types look like.”
“But you had to type all of that every time.”
“Right. And that’s the problem.” Priya pulls up the Claude Code documentation on her screen. “There’s a way to make it permanent.”
The brief
Lee draws a parallel to his consulting work. “When I join a new client, the first thing I look for is a brief: how the team works, what they’ve decided, what they’ve explicitly rejected. When the brief exists, I’m productive in days. When it doesn’t, I spend weeks asking ‘why did you do it this way?’”
“The LLM needs the same thing,” Priya says. “And there’s a file for it.”
In Claude Code, this brief is a file called CLAUDE.md. It lives in the root of the repository. Every time the LLM starts a task, it reads this file first. The file becomes the persistent context that Tom was missing and Priya was typing out by hand.
“Think of it as the onboarding document for your AI pair programmer,” Priya says. “Everything you’d tell a new hire in their first week goes in this file.”
What goes in the brief
The team sits down and writes their first CLAUDE.md together. Lee facilitates – he’s good at drawing out the things people know but haven’t said aloud. He asks three questions:
“What patterns have you settled on?”
Priya lists what she’s been pushing for over the past few months: guard clauses for early returns, table-driven tests, custom types for IDs and dates instead of raw strings, unexported struct fields with constructor functions, error wrapping with fmt.Errorf("context: %w", err). Tom nods along. He’s not sold on all of it – the typed IDs still feel like boilerplate to him – but he can’t argue with the consistency.
“What patterns have you explicitly rejected?”
This one surprises Tom. He hadn’t thought about anti-patterns as something to document. But Priya points out: “The LLM keeps generating interface{} parameters. We never use those. It keeps creating utility packages. We don’t have a utils package and we don’t want one.”
Lee nods. “Telling the LLM what not to do is as important as telling it what to do. Same as onboarding. A new developer who’s told ‘we don’t use global state’ won’t introduce global state. An LLM that’s told the same thing won’t either.”
“What does someone need to know about the domain?”
This is where Maya’s language matters. The LLM shouldn’t call it an “order” – it’s a “subscription.” It shouldn’t call it a “product” – it’s a “box.” The delivery happens on a “delivery day,” not a “shipping date.” The team has been building a shared vocabulary, and the LLM needs to speak it too.
The first version
They write a CLAUDE.md that fits on one screen. Lee insists on this. “If it’s longer than a page, nobody will maintain it. Not the developers, and not the LLM – it’ll dilute the important stuff with noise.”
The file covers:
- Project structure: where things live, what each package does.
- Coding conventions: guard clauses, error handling, test naming, no utils package.
- Domain language: subscription not order, box not product, delivery day not shipping date.
- Build and test commands: go test ./..., go vet ./..., how to run the linter.
- What not to do: no interface{}, no global state, no utility packages.
Tom commits it. The next morning, he prompts the LLM with the same delivery date task. Without changing his prompt at all, the generated code comes back with a DeliveryDate type, a guard clause, and the domain terminology.
“It read the brief,” he says.
“It read the brief,” Priya confirms.
When the team grows
A month later, Kai joins the project. He’s a contractor, less familiar with the codebase. His first day, he sets up Claude Code, opens the repo, and starts working. His first PR looks like it was written by someone who’s been on the project for months. The naming is right. The patterns match. The test structure follows the team’s convention.
Tom reviews it in fifteen minutes. No style questions. No “we don’t do it that way” comments. Just a review of the logic.
“This is the real win,” Lee says. “The CLAUDE.md isn’t just for the LLM. It’s for every developer who works with the LLM. When the brief is right, the generated code teaches the patterns to new team members faster than any onboarding document.”
Kai reads the CLAUDE.md himself, separate from the LLM. “This is the best onboarding doc I’ve ever seen,” he says. “And it’s thirteen lines of conventions.”
Beyond the project root
The team discovers that some conventions are package-specific. The subscription package has rules about status transitions that don’t apply elsewhere. The billing package has rules about how invoice amounts are stored (cents, not dollars).
Claude Code supports CLAUDE.md files in subdirectories. A CLAUDE.md in subscription/ applies when working in that package. The root CLAUDE.md applies everywhere. The specificity model is the same as .gitignore – closest file wins for its scope, with the root as the baseline.
Tom adds a CLAUDE.md to the subscription package:
Status transitions: Pending → Active → Paused → Active (resume) or Cancelled.
Paused subscriptions cannot change box size.
Cancelled subscriptions cannot be modified at all.
NewSubscription starts in StatusPending.
Four lines. The LLM generates subscription code that respects the status rules every time.
Specialised agents
Priya finds the next piece. “What if the LLM could behave differently depending on the task? When it’s writing tests, it should be thorough and consider edge cases. When it’s reviewing code, it should check for convention drift. When it’s writing migration code, it should be conservative and prefer backwards compatibility.”
This is what AGENTS.md does. Where CLAUDE.md is the general brief, AGENTS.md defines specialised roles – agents with specific instructions, tools, and constraints.
The team starts with two:
A test writer agent that knows about the team’s test conventions – table-driven tests, descriptive names, the distinction between hard constraints and soft expectations in test naming.
A reviewer agent that checks PRs for convention drift – exported fields that should be unexported, missing error handling, deep nesting that could be a guard clause.
Priya sets these up. When she asks the LLM to write tests, it applies the test writer’s conventions automatically. When Tom asks for a code review, the reviewer checks for the patterns the team has agreed on.
“The agents encode what we’ve learned,” Tom realises. “If someone new joins, they don’t just get the conventions – they get the reasoning built into the tool.”
Lee smiles. “That’s the best kind of process. The kind that outlives the person who set it up.”
What the team learned
Three months later, the CLAUDE.md has been updated fourteen times. Each update is small – a line added when a new convention is agreed, a line removed when a pattern is abandoned. The file is a living document of the team’s coding standards, maintained not by discipline but by self-interest: when the CLAUDE.md is accurate, the LLM generates better code, and reviews go faster.
Tom, who started out typing bare prompts and getting inconsistent results, now treats the CLAUDE.md as seriously as the test suite. “Tests tell you if the code is correct. The CLAUDE.md tells the LLM how to write code that’s correct and consistent.”
The insight that sticks: the style of your codebase is a few-shot prompt. When the codebase is consistent, the LLM generates consistent code. When the conventions are explicit, the LLM follows them. CLAUDE.md is just making that implicit prompt explicit – and shareable across a team.
What the files look like
The team’s actual CLAUDE.md and AGENTS.md files – what goes in them, how they’re structured, and how they shape the LLM’s output – are worth seeing in detail. Next: CLAUDE.md and AGENTS.md in practice.