Continuous Discovery: Making It Stick

GreenBox is a produce-box company that grew from five people and a wrong assumption to 5,000 subscribers across three Australian cities. Along the way, the team learned fifteen discovery techniques. The question now is how to make that learning stick when the daily pressure of shipping never lets up.

This is the last post in the GreenBox story.

It started with five people in a room, a good idea, and four weeks of building the wrong thing. It ends with 5,000 subscribers across three cities, three squads, twenty-five people, and a product that removes weeknight dinner stress for thousands of Australian families.

The journey between those two points wasn’t a straight line. It was a series of hard-won lessons about the gap between having an idea and understanding it well enough to build it. Every technique in this series – Event Storming, Example Mapping, Impact Mapping, User Story Mapping, JTBD, Assumption Mapping, Business Model Canvas, Domain-Driven Design, Decision Tables, ADRs, Value Stream Mapping, Wardley Mapping, Ensemble Programming, Cynefin, Threat Modelling – exists to close that gap. To get from “we think we know what to build” to “we actually know what to build.”

But techniques don’t sustain themselves. People learn them, use them intensely for a while, and then slowly stop. The urgency of shipping crowds out the discipline of understanding. New team members join who weren’t there for the workshops. The photographs of the Event Storm wall fade. The muscle memory atrophies.

The question for any growing organisation is: how do you make discovery stick?

The drift

Charlotte notices it gradually.

In the Perth squad, the fortnightly retro has slipped to monthly. Then it gets cancelled because of a release deadline. Then it doesn’t get rescheduled. Tom mentions in passing that they haven’t done an Example Mapping session in three weeks because “the stories are all Clear.” Maybe they are. Maybe the team is pattern-matching on familiarity and mistaking comfort for clarity.

In Melbourne, Anika’s squad is shipping fast. Their velocity metrics look great. But their customer satisfaction scores have dipped, and nobody can explain why. The squad hasn’t talked to a customer in six weeks. They’re building what the backlog says, but the backlog was written two months ago, and nobody has checked whether it still reflects what customers actually need.

The remote squad, which handles exploratory work, is doing fine – they’re running experiments and JTBD interviews for the Brisbane expansion. But their learnings aren’t reaching the other squads. The insights from Brisbane customer interviews are sitting in a shared document that nobody in Perth or Melbourne has read.

The squads are drifting apart. Not dramatically – nobody’s building the wrong thing the way Tom did in week one. But the edges are fraying. Decisions are being made in isolation. Context is being lost. The shared understanding that was the team’s superpower at five people is thinning at twenty-five.

On a Saturday morning in early September, Maya drives down to the Margaret River farmers’ market. She goes every few weeks – partly for produce, partly because the market is where she first met Dave Morrison, and partly because the two-hour drive through the jarrah forest is the only time she’s truly unreachable.

Dave is at his usual stall, between the honey seller and a woman who makes goat’s cheese. He’s selling surplus zucchini and the last of the season’s tomatoes. His son Ben handles the GreenBox supply now, but Dave still comes to the market because he’s been coming to the market since before Ben was born.

They get coffee from the van at the end of the row and sit on an upturned crate behind Dave’s stall. The morning is cold – Margaret River cold, the kind that sits in your bones until the sun gets above the trees.

Dave tells her about the frost of 2019. She’s heard pieces of it before, but never the full story. He woke at 4am to find ice on the inside of the greenhouse plastic. By dawn he knew the entire tomato crop was gone. Three months of work. He didn’t tell Helen until that evening because he spent the day walking the rows, pulling up dead plants, trying to find something salvageable. There was nothing.

“I didn’t call anyone,” he says. “Didn’t tell the other farmers. Didn’t tell the buyer who was expecting ten crates that Friday. Just kept going. Replanted the next week.”

Maya is quiet for a while. Then she tells him something she’s never told anyone except Nadia. During Series 2 – after the unit economics session, after Charlotte showed her the numbers that didn’t work – Maya sat at her laptop and drafted an email to subscribers. “Dear subscribers, we’ve made the difficult decision to pause operations.” She got three sentences in before Nadia came into the room and told her to come to bed.

“I never sent it,” Maya says. “But I never deleted it either. It’s been sitting in my drafts for six months.”

Dave looks at her. “You don’t farm for the good years,” he says. “You farm so the bad ones don’t kill you.”

They sit with that. The market fills up around them. A woman with two kids stops at Dave’s stall and buys a bag of zucchini. Dave waves her off when she tries to pay for a second bag. “Take it. They’ll go to waste otherwise.”

Maya watches him. This is what Freshly will never have. Not the produce. Not the logistics. This.

Charlotte raises it with Lee over coffee a few days later. “The techniques work when people use them. The problem is that people stop using them. Not because they don’t believe in them – because the daily pressure of shipping is relentless and discovery feels optional when things are going well.”

Lee nods. “That’s the trap. Discovery feels most optional when it’s most needed. When everything is running smoothly, the team assumes they understand the problem. Then the market shifts, or a new competitor appears, or customer behaviour changes, and the assumptions they built on are suddenly wrong. The correction is expensive because nobody noticed the drift until it compounded.”

Teresa Torres and continuous discovery

Charlotte has been reading Teresa Torres’ work on Continuous Discovery Habits. The core argument is one that the GreenBox story has been illustrating all along: discovery isn’t a phase you do before building. It’s a weekly habit embedded in the rhythm of delivery.

Torres’ model centres on three ideas:

Talk to customers every week. Not once a quarter. Not when something goes wrong. Every week, at least one conversation with a real customer (or potential customer). The conversations don’t need to be hour-long research sessions. Fifteen minutes. What’s working? What’s frustrating? What changed in your life recently that affects how you use the product?

Map your assumptions. At any given time, the team is building on a stack of assumptions – about customers, about the market, about the technology, about the business model. Most of those assumptions are invisible. Continuous discovery makes them explicit, ranks them by risk, and tests the riskiest ones first.

Connect everything to outcomes. Every piece of work should trace a line back to a measurable outcome. Not a feature shipped. Not a story completed. An outcome for a customer or the business. Impact Mapping does this at the strategic level. Continuous discovery does it at the weekly level.
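The ranking step in "Map your assumptions" can be sketched in a few lines of Python. This is an illustration only: the Assumption class, the 1 to 5 scales for impact and uncertainty, and the scores themselves are assumptions made for the example, not something Torres or the GreenBox team prescribes. The statements are borrowed from elsewhere in the GreenBox story.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assumption:
    statement: str
    impact: int       # hypothetical scale: 1 (minor if wrong) to 5 (fatal if wrong)
    uncertainty: int  # hypothetical scale: 1 (well evidenced) to 5 (pure guess)

    @property
    def risk(self) -> int:
        # Simple product: high impact and high uncertainty float to the top.
        return self.impact * self.uncertainty


def riskiest_first(assumptions: List[Assumption]) -> List[Assumption]:
    """Return the assumptions ordered from most to least risky."""
    return sorted(assumptions, key=lambda a: a.risk, reverse=True)


# Scores below are invented for illustration.
stack = [
    Assumption("Subscribers want recipe suggestions", impact=2, uncertainty=3),
    Assumption("Brisbane customers want the same meal planning tools as Perth", impact=4, uncertainty=4),
    Assumption("Melbourne customers will pay the same delivery surcharge as Perth", impact=5, uncertainty=3),
]

ranked = riskiest_first(stack)
for a in ranked:
    print(a.risk, a.statement)
```

The product of impact and uncertainty is only one possible scoring rule. The point is that the stack becomes explicit and ordered, so the team knows which assumption to test first.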

Charlotte takes these principles and adapts them for GreenBox’s three-squad structure.

The GreenBox weekly cadence

Charlotte establishes a rhythm. Not a rigid process – a set of practices that happen at predictable intervals. Each squad adapts the details to their context, but the skeleton is the same.

Monday: assumptions check. Five minutes added to the Monday standup. Each squad reviews the assumptions underlying their current work. “We’re building the Brisbane meal plan feature because we assume Brisbane customers want the same meal planning tools as Perth customers. Have we validated that? What’s the riskiest assumption we’re carrying right now?”

Most Mondays, the answer is “our assumptions are fine.” That’s OK. The point isn’t to find problems every week. The point is to build the habit of asking. When an assumption does turn out to be wrong, the team catches it in a five-minute standup check instead of a two-week post-mortem.

Tom adds a weekly “system health” check to the same Monday standup – deploy frequency, any incidents, any alerts that fired. The first one reveals something: the uptime monitor says everything is fine, but three subscribers complained about missing box previews last week. The system was up but the preview feature was broken – a silent failure that monitoring didn’t catch. Tom: “We’re monitoring whether the front door is open. We’re not checking whether anyone’s home.” He adds checks for key user journeys: can a subscriber see their box preview? Can they pause? Can they pay? Not just “is the site up” but “are the important things working.” Charlotte notes it: “You’ve been measuring discovery practices for months. Now you’re measuring delivery practices. Same principle – if you don’t measure it, you can’t improve it.”
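Tom's distinction between "is the site up" and "are the important things working" could look something like the sketch below. Everything here is hypothetical: the check functions are stand-ins that return fixed values, and a real version would exercise the actual preview, pause, and payment flows. The preview check is hard-coded to fail so the example reproduces the silent failure described above.

```python
from typing import Callable, Dict, List


def run_journey_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of the journeys that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a check that crashes counts as a broken journey
        if not ok:
            failed.append(name)
    return failed


# Hypothetical stand-ins for real probes against the product.
def site_is_up() -> bool:
    return True


def box_preview_renders() -> bool:
    return False  # simulates the silent failure: site up, preview broken


def subscriber_can_pause() -> bool:
    return True


def subscriber_can_pay() -> bool:
    return True


checks = {
    "site up": site_is_up,
    "box preview": box_preview_renders,
    "pause": subscriber_can_pause,
    "pay": subscriber_can_pay,
}

failed = run_journey_checks(checks)
print(failed)
```

With only the "site up" check in place, the report would have come back clean. The per-journey checks are what surface the broken preview.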

Tuesday: one customer interview per squad per week. This is the non-negotiable. Charlotte schedules interview slots for each squad on Tuesdays. The interviews are short – fifteen to twenty minutes. The team uses a rotating interviewer so that everyone gets practice and nobody treats it as someone else’s job.

The remote squad interviews Brisbane prospects. The Melbourne squad interviews Melbourne subscribers. The Perth squad interviews a mix of subscribers and churned customers.

The LLM helps here. It transcribes the interviews in real time. After the conversation, the interviewer feeds the transcript to the LLM and asks for a summary of key insights, surprises, and potential implications for current work. The summary goes into a shared channel that all three squads can see.

This is where the Brisbane insights start reaching Perth and Melbourne. Not through a document nobody reads, but through a weekly stream of customer conversations that anyone can follow.

Wednesday: Example Mapping for the week’s stories. Only for stories classified as Complicated under the Cynefin framework. Clear stories skip the session. Complex work gets a different treatment – experiment design rather than Example Mapping.

The Wednesday sessions are tight. Twenty-five minutes per story. Coloured cards. Timer running. The same discipline the team learned in month two, but applied selectively to the work that genuinely needs it.
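The Wednesday routing rule is simple enough to write down. The sketch below assumes only the three Cynefin domains the text mentions, and the story titles are invented for illustration. It picks out the stories that need an Example Mapping session and totals the timebox at twenty-five minutes each.

```python
from enum import Enum
from typing import List, Tuple

MINUTES_PER_STORY = 25  # the timebox named in the text


class Domain(Enum):
    CLEAR = "clear"
    COMPLICATED = "complicated"
    COMPLEX = "complex"


def treatment(domain: Domain) -> str:
    """Map a story's Cynefin domain to its Wednesday treatment."""
    if domain is Domain.CLEAR:
        return "skip"
    if domain is Domain.COMPLICATED:
        return "example mapping"
    return "experiment design"


def plan_wednesday(stories: List[Tuple[str, Domain]]) -> Tuple[List[str], int]:
    """Return the stories to Example Map and the total minutes needed."""
    to_map = [title for title, domain in stories
              if treatment(domain) == "example mapping"]
    return to_map, len(to_map) * MINUTES_PER_STORY


# Hypothetical stories for one week.
stories = [
    ("Update footer copy", Domain.CLEAR),
    ("Allergen rules for mixed boxes", Domain.COMPLICATED),
    ("Pause and resume across billing cycles", Domain.COMPLICATED),
    ("Single-person box for Brisbane", Domain.COMPLEX),
]

to_map, minutes = plan_wednesday(stories)
print(to_map, minutes)
```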

Thursday and Friday: build, ship, measure. The second half of the week is delivery. Write the code, run the tests, deploy, check the metrics. The LLM is embedded throughout – generating code from decision tables, drafting ADRs for architectural decisions, producing first-pass threat models for features that touch system boundaries.

Fortnightly: retrospective. Every two weeks, as Lee taught them in the very first retro. What’s working? What’s not? What will we do differently? The retro is where the cadence itself gets reviewed and adjusted.

Monthly: Impact Map review. Are we still building toward the right goal? Has the goal changed? Have the assumptions about which impacts matter most shifted? The monthly review is a thirty-minute session where each squad checks their current work against the Impact Map and flags anything that’s drifted.

Quarterly: Wardley Map review. Has the competitive landscape changed? Are there components that should move from build to buy (or the reverse)? Has new technology – a new LLM capability, a new third-party service, a competitor’s move – changed the strategic picture?

The weekly rhythm at a glance: Monday, assumptions check; Tuesday, customer interview; Wednesday, Example Mapping; Thursday and Friday, build, ship, measure.

Periodic: fortnightly retrospective; monthly Impact Map review; quarterly Wardley Map review.

How LLMs fit the rhythm

The LLMs are embedded in every part of the cadence now. Not as the centrepiece – as the infrastructure.

Transcription. Tuesday customer interviews are transcribed in real time. The team used to spend twenty minutes writing up notes after each interview. Now the LLM produces a summary within seconds of the conversation ending.

Synthesis. At the end of each month, Charlotte feeds the month’s interview summaries to an LLM and asks: “What patterns do you see across these conversations? What are customers consistently saying? What’s changed from last month?” The LLM is good at spotting patterns across large volumes of qualitative data. It catches themes that individual interviewers miss because they only see one conversation at a time.

Drafting ADRs. When the team makes an architectural decision during an ensemble session, one person captures the context and the LLM drafts the ADR. The draft is never perfect – it misses nuance, overstates certainty, and sometimes gets the trade-offs wrong. But it’s faster to edit a draft than to write from scratch, and the existence of a draft means the ADR actually gets written instead of being deferred indefinitely.

Code generation from decision tables. The substitution engine still runs on decision tables that Maya maintains. When the tables change – a new seasonal rule, a new allergen combination – the LLM generates the updated code. The ensemble reviews it. The code is consistent because it comes from the same source every time.

First-pass threat models. Before any feature that touches a boundary, the developer feeds the design to the LLM with a STRIDE prompt. The LLM produces a first pass. The team reviews and adds context. Thirty minutes instead of two hours.

The LLM isn’t making decisions. It’s reducing the friction in the team’s thinking processes. Every minute the team doesn’t spend on mechanical work – transcription, drafting, enumeration – is a minute they can spend on judgement, context, and domain understanding.

There’s a pattern here worth making explicit. In every case, the LLM handles the mechanical work and the humans handle the judgement. The LLM transcribes; the human decides what matters. The LLM spots patterns; the human decides which patterns are meaningful. The LLM drafts; the human decides what’s accurate. The LLM enumerates threats; the human decides which threats are realistic.

This division of labour is the thread that runs through the entire GreenBox story. The LLM is a force multiplier for human thinking. It amplifies whatever the team brings to it. When the team brings clear domain understanding, the LLM produces excellent code. When the team brings vague assumptions, the LLM produces plausible-looking code that’s wrong. The quality of the output is bounded by the quality of the thinking that feeds it.

That’s why the discovery cadence matters. The weekly interviews, the assumption checks, the Example Mapping sessions – they’re not just about building the right thing. They’re about making sure the team’s thinking is sharp enough to get good outputs from the LLMs. Discovery feeds understanding. Understanding feeds prompts. Prompts feed code. The chain is only as strong as its first link.

The hard part: keeping it going

Establishing the cadence took Charlotte about six weeks. Keeping it going is the harder job.

People skip the Monday assumptions check when they’re under deadline pressure. The Tuesday interviews get rescheduled when a production incident eats the morning. The Wednesday Example Mapping sessions get cancelled when there’s nothing Complicated in the week’s stories.

Charlotte doesn’t fight every skip. She watches for patterns. A skipped interview one week is fine. Three skipped interviews in a row is a signal. She brings it to the retro. “We’ve gone three weeks without talking to a customer. What’s getting in the way?”
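Charlotte's rule of thumb, that one skip is noise and three in a row is a signal, can be expressed as a small check. This is a sketch under stated assumptions: the history is a hypothetical list of booleans, one per week, oldest first, where True means the interview happened. The threshold of three comes from the text.

```python
from typing import List


def trailing_skips(history: List[bool]) -> int:
    """Count consecutive skipped weeks at the end of the history."""
    streak = 0
    for happened in reversed(history):
        if happened:
            break
        streak += 1
    return streak


def raise_at_retro(history: List[bool], threshold: int = 3) -> bool:
    """True when the current run of skips has reached the threshold."""
    return trailing_skips(history) >= threshold


# Hypothetical histories, oldest week first.
one_skip = [True, True, False]
three_skips = [True, False, False, False]

print(raise_at_retro(one_skip))
print(raise_at_retro(three_skips))
```

Only the current streak counts here. A skip from two months ago followed by regular interviews does not trigger anything, which matches the idea of watching for patterns rather than fighting every skip.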

Usually the answer is time pressure. The squads feel like they don’t have time for discovery because they’re behind on delivery. Charlotte’s response is always the same: “The last time this team went six weeks without talking to customers, we built the Brisbane meal planning feature that nobody wanted. That cost us three weeks of rework. Which is more expensive – fifteen minutes on a Tuesday or three weeks of building the wrong thing?”

The argument lands because it’s grounded in their own history. The GreenBox team has lived through the cost of skipping discovery. They’ve rebuilt Tom’s subscription model. They’ve thrown away Jas’s customisation designs. They’ve watched the Brisbane meal planner get quietly shelved. The abstract case for continuous discovery is unconvincing. The concrete case – their own experience – is irrefutable.

Cross-squad alignment

At twenty-five people and three squads, keeping everyone aligned is Charlotte’s biggest challenge.

The Perth squad understands the Perth market. The Melbourne squad understands Melbourne. The remote squad understands Brisbane. But nobody understands all three simultaneously. Decisions made in one squad affect the others – a change to the substitution engine in Perth affects Melbourne’s produce rules, a new box size in Brisbane creates a pricing precedent that Perth customers might expect.

Charlotte introduces two practices to keep the squads connected.

Fortnightly cross-squad sync. Thirty minutes, one representative from each squad. Not a status update – a context share. “Here’s the most interesting thing we learned from customers this fortnight. Here’s the biggest assumption we’ve invalidated. Here’s a decision we’re about to make that might affect you.” It’s a lightweight mechanism for surfacing the information that would otherwise stay siloed.

Shared customer insight feed. The Tuesday interview summaries go to a channel that all three squads see. Charlotte adds a weekly digest – a five-minute read that highlights the most important customer insights from across the organisation. The LLM drafts it. Charlotte edits it. It goes out every Friday.

These practices don’t eliminate drift. They slow it down. The squads still develop their own cultures, their own shorthand, their own assumptions. But the regular touchpoints create opportunities for those assumptions to be challenged before they harden into silos.

Ravi, who works across the Perth squad and the remote squad, becomes an informal bridge. He notices when Brisbane learnings are relevant to Perth and flags them. He spots when a Perth architectural decision will create problems for Melbourne. He’s not a formal liaison – he’s just someone who works across boundaries and pays attention. Charlotte values this enormously and makes sure the cross-pollination is visible in his peer feedback.

What the cadence catches

Three months after establishing the continuous discovery rhythm, Charlotte reviews what the cadence has caught that wouldn’t have been caught otherwise.

The Tuesday interview that changed the Brisbane strategy. A Brisbane prospect – the same Jen from New Farm who had mentioned Freshly during the Cynefin-era interviews – came back for a follow-up conversation during a routine Tuesday slot. She mentioned that she’s still unhappy with her current box because it’s too large for a single person and half the produce goes to waste. The remote squad had been planning Brisbane launch boxes at the same sizes as Perth. The interview confirmed what the earlier data had hinted at: a “single person” box at 60% of the small size, at a lower price point, would open an entirely new market segment. They pivoted the Brisbane pilot. The new size outsold the original plan three to one.

If the team hadn’t been interviewing every Tuesday, that conversation wouldn’t have happened until after launch. The box sizes would have been wrong. The pilot would have underperformed. And the team would have concluded that “Brisbane doesn’t want produce boxes” when the real conclusion was “Brisbane wants different produce boxes.”

The Tuesday interview that made the team channel go quiet. Mrs Patterson – eighteen months subscribed, the customer whose name Maya first mentioned during an Example Mapping session back in month two – was one of the Perth squad’s Tuesday interviews. Sam conducted it. Fifteen minutes, routine. At the end, Sam asked the standard closing question: “Is there anything else you’d like to tell us?”

Mrs Patterson was quiet for a moment. Then she said: “I’ve never met any of you, but I feel like you know me.”

Sam thanked her, ended the call, and sat at her desk for a while. Then she screenshotted the transcript line and posted it in the team channel. No commentary. No emoji. Just the words.

Nobody replied for twenty minutes. Then Priya reacted with a single thumbs-up. Then Maya. Then Tom. Then Kai, from Melbourne. Then Ravi. One by one, every person in the company.

The Monday assumptions check that prevented a pricing mistake. During a routine Monday standup, Anika’s squad flagged an assumption: “We believe Melbourne customers will pay the same delivery surcharge as Perth customers.” Someone asked, “Have we checked?” Nobody had. A quick look at competitor pricing in Melbourne revealed that delivery expectations are different – Melbourne customers expect free delivery on subscription services because most competitors offer it. The surcharge would have caused immediate churn.

The monthly Impact Map review that paused a feature. The Perth squad had been building a “recipe suggestions” feature for three weeks. At the monthly Impact Map review, Charlotte asked the standard question: “Which impact does this serve?” The squad traced it back to “subscribers stay subscribed.” The hypothesis: if we suggest recipes based on the box contents, subscribers will use more of their produce and feel they’re getting more value.

Reasonable hypothesis. But the Tuesday interviews over the previous month had consistently shown something different. Subscribers who stayed weren’t asking for recipes. They were asking for more control over allergen handling and easier pause/resume. The recipe feature was solving a problem that active subscribers didn’t report having. Charlotte didn’t kill the feature – she paused it and redirected the squad to the allergen handling improvements. Three weeks of work saved.

None of these catches are dramatic. There’s no single moment where the cadence prevented a catastrophe. It’s more like a series of small course corrections that keep the organisation pointed in the right direction. Each one is worth a few days or weeks of saved effort. Over a quarter, they compound into a fundamentally different trajectory.

Lee and Charlotte

On a Wednesday afternoon in late September, Lee and Charlotte meet for coffee at a cafe in Subiaco, near the GreenBox office. They haven’t sat down together properly in months. Lee has been stepping back – his involvement has shifted from weekly coaching to occasional deep-dive sessions when the team enters genuinely Complex territory. He drove up from Margaret River this morning, his surfboard strapped to the roof of his station wagon. He surfs poorly but persistently; the board is an optimistic gesture rather than evidence of skill.

Charlotte stirs her flat white. “When you first came in, what did you think the problem was?”

Lee considers. “Speed without understanding. The team was moving fast and building the wrong thing. The LLMs made it worse because they removed the natural friction that used to force conversations. In the old days, implementation was slow enough that developers had to talk to each other. The LLMs made implementation so fast that people stopped talking.”

“And now?”

“Now the team talks first and builds second. The discovery practices are embedded. But the challenge has shifted.” He pauses. “The challenge now is entropy. At five people, shared understanding happens naturally. You’re all in the same room. You overhear conversations. You absorb context by proximity. At twenty-five people across three cities, shared understanding has to be engineered. It doesn’t happen by accident any more.”

Charlotte nods. “That’s what the cadence is for. The weekly interviews, the cross-squad syncs, the monthly reviews. It’s all maintenance. Maintaining the shared understanding that used to be free when we were small.”

“How’s it going?”

Charlotte is honest. “Some weeks it works beautifully. Everyone’s aligned, the customer insights are flowing, the squads are building on solid ground. Other weeks it falls apart. Someone skips the interview. The retro gets cancelled. A squad goes two weeks without checking their assumptions and ships something that doesn’t quite land.”

She looks at her coffee. “You know, I coached a meal kit company before GreenBox. Three years ago. They had good people, good product, good funding. They went under. I’ve been carrying that around. I keep checking for the same patterns. Sometimes I push too hard on process because I think if the process is right, the outcome is guaranteed.” She looks up. “It’s not. I know it’s not. But I still check.”

“That’s not a flaw,” Lee says. “That’s experience with scar tissue.”

They’re quiet for a moment.

“What about you?” Charlotte asks. “You’ve been pulling back. How does that feel?”

Lee takes a long time to answer. He turns his coffee cup in his hands. “My wife – my ex-wife, Mei – she said something when we split up. She said I was always coaching other people’s lives. She wasn’t wrong. Twenty years of consulting. I’d fly into a company, help them see what they couldn’t see, and fly out. I was good at it. But I was doing the same thing at home – treating my family like a project I could optimise from the outside.”

Charlotte listens.

“GreenBox is the closest I’ve come to building something since I stopped trying to build a marriage. Maya reminds me of… not me. A better version of what I could have been if I’d committed to something instead of consulting about commitment.”

“How’s your daughter?”

Lee looks up, surprised. He’s mentioned Yuki so rarely that Charlotte asking about her feels almost intrusive.

“She’s at university. Environmental science, in Sydney. I’ve started calling her every Sunday.” A half-smile. “She doesn’t always answer. When she does, we talk about her research. Carbon sequestration in coastal wetlands. She’s trying to save something.” He pauses. “I know the feeling.”

Charlotte leans back. “What’s different about coaching teams that have LLMs? Because that’s the thread through all of this – every technique we’ve used, the LLM has been part of the picture.”

Lee thinks about it. “The techniques are the same. Event Storming is Event Storming whether you’re writing code by hand or generating it with an LLM. Example Mapping doesn’t change. The principles are the same: understand the problem before you solve it, make assumptions explicit, test your hypotheses.”

He takes a sip. “But the speed is different. Everything happens faster. The code arrives in minutes instead of days. Which means the consequences of misunderstanding arrive faster too. A wrong assumption in week one used to take two weeks to manifest as wrong code. Now it takes an afternoon. The feedback loop is tighter, which is good – but only if you’re paying attention.”

“And the risk?”

“The risk of moving fast without understanding is higher than it’s ever been. Because the LLM is confident. It generates code that looks professional, passes tests, handles edge cases. It looks like someone who knows what they’re doing wrote it. But the LLM doesn’t understand the domain. It doesn’t know that sweet potato isn’t a good substitute for pumpkin in July. It doesn’t know that logging credit card numbers will end the business. It doesn’t know that Brisbane customers care about organic certification more than Perth customers do.”

He sets down his cup. “The LLM can build anything you describe. The question is whether what you described is right. That’s always been the question. LLMs just made it more expensive to get wrong, because you can get wrong faster and at greater scale.”

Charlotte smiles. “So the answer is still the same. Invest in understanding.”

“The answer is always the same. Invest in understanding. The tools change. LLMs, new frameworks, new techniques. But the underlying principle doesn’t: shared understanding of the problem is the rarest and most valuable thing in software development. Everything else – the code, the architecture, the scale – follows from that.”

They sit quietly for a moment. The cafe is half empty on a Wednesday afternoon. Outside, a delivery van with the GreenBox logo pulls up to the building next door.

Charlotte watches it. “That’s Liam doing the Subiaco run. Two hundred and forty boxes every Thursday. When I started, it was sixty.”

Lee watches the van too. Something in his expression shifts – the consultant’s distance gives way to something more personal. “That’s something,” he says quietly. “That’s actually something.”

His phone buzzes. He glances at it. A message from Yuki: Dad, did you know mangroves sequester carbon 4x faster than terrestrial forests? Lee smiles and types back: I did not. Tell me more on Sunday.

He puts the phone away and looks at Charlotte. “Do you remember the retro? The first one. When Tom had rebuilt the subscription model twice and Jas had thrown away a full set of designs. The frustration in that room was incredible.”

“I remember Maya reading the sticky notes and saying ‘everyone is frustrated with me.’”

“And then Tom saw the pattern. ‘Maya understands the business. We don’t. And building stuff without that deep understanding isn’t working.’ That was the moment. Everything after that – the Event Storm, the techniques, the cadence – flows from that one insight.”

Charlotte nods. “We’ve scaled the insight. That’s what the cadence is. It’s the same insight – understand before you build – but institutionalised. Repeatable. Not dependent on one person having a good day or one retro going well. It happens every week whether anyone is feeling insightful or not.”

“That’s the trick,” Lee says. “Making the right behaviour automatic. Not heroic. Not dependent on willpower. Just… what we do. The cadence makes discovery feel like gravity rather than a choice.”

The cafe door opens. Tom walks in, spots them, and hesitates for a moment – as if he’s not sure he belongs in this conversation. Then he pulls up a chair.

“Sorry I’m late. The remote squad had a question about the substitution pipeline.”

Lee nods. Charlotte pours him water from the table carafe.

Tom is quiet for a while, listening to them talk. Then, during a pause, he says: “Can I say something?”

They wait.

“The retro was the turning point. Not for the process. For me.” He looks at his hands. “I’d been building things alone my whole career. I was good at it. I thought being good at it was the point. The retro – that first one, where Lee asked us to just stop and talk – it was the first time I understood that the things I was building weren’t good enough because I was building them alone. The code was fine. The thinking behind the code was incomplete.”

He looks at Lee. “I’ve been thinking about the ensemble session. When Charlotte said ‘week one vibes.’ She was right. I did the same thing in month eighteen that I did in week one. Built something beautiful by myself and it was wrong in three different ways.” He pauses. “But this time I heard it. The first time, it took me four weeks. This time it took me a day.”

Lee smiles. “That’s growth.”

“It doesn’t feel like growth. It feels like I should have known.”

“Knowing and feeling are different timelines,” Lee says. “You knew after the first retro. You felt it after the ensemble. Both count.”

They sit for a while longer. The afternoon light comes through the cafe window and makes long shadows on the floor.

That evening, Maya is at her desk in the Perth office. Everyone else has gone home. She’s looking at the photo of her parents’ farm that she keeps next to her monitor – the one from before it was a subdivision, when the paddocks still had cattle and her mother grew vegetables in the kitchen garden.

She opens her email. Clicks on Drafts.

The unsent email is still there. Six months old. “Dear subscribers, we’ve made the difficult decision to pause operations…” Three sentences. She never finished it. She never told anyone she wrote it, except Dave, that morning at the farmers’ market.

Maya reads it once. Then she deletes it. The draft disappears.

She closes her laptop and looks out the window. Somewhere across Perth, two hundred and forty GreenBox subscribers are deciding what to cook for dinner. They won’t have to think about it. The box on their doorstep already decided.

The principle

If there’s one thing the GreenBox story demonstrates, it’s this: the hard part of building software has never been writing code.

The hard part is knowing what to write. Understanding the domain. Surfacing the assumptions. Getting multiple people to share the same mental model of a complex problem. Figuring out which problems are worth solving. Testing whether your solution actually helps someone.

LLMs have made the easy part trivially easy. Code generation, test scaffolding, boilerplate, migrations, API integrations – the LLM handles all of it. Faster than any human developer, often cleaner, certainly more consistently.

But LLMs haven’t touched the hard part. They can’t tell you whether your business model works. They can’t feel the frustration in a customer’s voice when the box arrives with the wrong substitutions. They can’t sense the tension in a room when two developers disagree about a domain concept. They can’t notice that the Melbourne squad hasn’t talked to a customer in six weeks and the backlog is drifting.

Every technique in this series exists because the hard part is hard. Event Storming gets the domain out of one person’s head. Example Mapping makes stories concrete. Impact Mapping connects features to goals. User Story Mapping shows the whole journey. JTBD reveals what customers actually need. Assumption Mapping makes invisible beliefs visible. Business Model Canvas checks whether the numbers add up. Domain-Driven Design draws boundaries that match the business. Decision Tables codify expert knowledge. ADRs record why decisions were made. Value Stream Mapping shows where effort is wasted. Wardley Mapping shows the strategic landscape. Ensemble Programming puts all the thinking in one room. Cynefin matches the approach to the problem. Threat Modelling asks what could go wrong.

None of these techniques write code. All of them make the code worth writing.

The GreenBox team learned this the expensive way in month one, when they built the wrong subscription model, the wrong customisation interface, and the wrong farm portal. They learned it again in month four, when they shipped features that didn’t move the subscriber number. They learned it again during the scaling phase, when new team members built on assumptions that the original team had long since invalidated. They’ll learn it again next quarter, when something they’re certain about turns out to be wrong.

Discovery isn’t a vaccine against building the wrong thing. It’s a practice that reduces how often you build the wrong thing and how much it costs when you do. The cadence – the weekly rhythm of interviews, assumption checks, Example Mapping, building, measuring, and reflecting – is how you sustain that practice beyond the initial enthusiasm.

The teams that do discovery well aren’t the ones with the best techniques or the cleverest frameworks. They’re the ones that show up every Tuesday for the customer interview, even when they’re tired. They’re the ones that ask “what are we assuming?” on Monday morning, even when the answer is usually “nothing new.” They’re the ones that run the retro every fortnight, even when there’s a release to push out.

The discipline is unglamorous. The payoff is enormous.

But there’s one more gap. The weekly cadence keeps each squad aligned with customers. The fortnightly retro keeps the process healthy. The monthly Impact Map check catches strategic drift. Yet nobody has connected these layers to a yearly vision. Each squad has great weekly habits and no shared strategy.

Perth is optimising features. Melbourne is building B2B. Brisbane is piloting. Are these the right three bets? When the board asks Maya where GreenBox will be in twelve months, she can describe what each squad is doing this sprint. She can’t explain how those sprints add up to a strategy.

The weekly habits are working. Now the team needs to zoom out (coming 27 October).