The Workshop: Event Storming an Architecture

For the story in action, see Event Storming: Building Shared Understanding — the Greenbox team running a Process Level session that produced the map an Architecture session later builds on.

About Event Storming

Event Storming is a sticky-note technique invented by Alberto Brandolini in 2013 for getting people with different knowledge of a process to build one shared mental model of it. Everyone stands in front of a long wall; each person grabs a pad of orange sticky notes and writes things that happened — in past tense, one per note — and sticks them on the wall. “Payment Submitted.” “Box Packed.” “Alert Fired.” “Deployment Rolled Back.” The wall becomes the shared surface. The sticky notes become the forcing function: you can’t hide behind vague hand-waving when you have to write a specific thing on a specific note and stick it where everyone can see.

Event Storming comes in three levels. This post covers Event Storming an Architecture — the zoom-in you run after the flow is clear, to turn it into code. The other two:

Big Picture — a whole domain or large subsystem, half a day to several days, mixed business and IT. Run it when you need cross-team alignment on how an entire product works.
Process Level (default) — one specific process, 2–3 hours, focused team. Most teams, most of the time, want this.

An Architecture session requires a prior Process Level map as input. Running one without a Process Level map is almost guaranteed to fail — you’ll end up designing software for a flow you haven’t agreed on yet. If you don’t have a Process Level map, run one first.

Intent

Take a Process Level event map and turn it into a software design — clusters of events become aggregates, clusters of aggregates become bounded contexts, and the commands that cross boundaries become the explicit APIs between components.

Also Known As

Brandolini calls this Software Design EventStorming or sometimes just Design Level. I use “Event Storming an Architecture” in this series because the session’s real output is the architecture — the aggregates, the bounded contexts, and the APIs between them — and naming it that way makes the purpose obvious. Occasionally shortened to “Aggregate Storming” when the emphasis is squarely on finding aggregate boundaries. Sometimes confused with general “Domain-Driven Design modelling sessions,” which may or may not use events as the backbone.

A few terms before we start

Event Storming an Architecture borrows vocabulary from Domain-Driven Design. If this is your first Domain-Driven Design session, the terms below will show up in every phase of the playbook. Read them now and you’ll follow along; skip them and the session will feel like it’s being run in a foreign language.

The clearest way to explain these concepts is with a shopping cart. Everyone has seen one, and it touches every idea we need.

Event — a thing that already happened, written in past tense. “Item added to cart.” “Item removed.” “Cart checked out.” The wall from the Process Level session is made of these. Events describe facts, not intentions.

Command — a request to make something happen, written in the imperative. “Add item.” “Remove item.” “Check out.” Every event has a command somewhere behind it — the command is how the event came to be. Commands can fail (“Add item” might fail because the item is out of stock); events always succeed, because they only exist once the thing has already happened.

Invariant — a rule about your data that must always be true. A shopping cart has an obvious one: the cart total must equal the sum of the line items. You should never, ever be able to look at a cart and see a total of $50 when the line items add up to $47. That would mean your database is lying, and somewhere a customer is about to get charged the wrong amount. The invariant is the rule; enforcing it is your job.

Aggregate — a piece of your system that enforces an invariant, all at once, with no half-states.

Here’s the concrete version. When a customer adds an item to the cart, two things have to happen: the line item gets inserted, and the total gets updated. These are not two separate things you do one after the other — they’re one atomic change. Either both happen, or neither does. You never leave the database in a state where the line item is there but the total hasn’t caught up yet, because if you did, the invariant would be broken and the cart would be wrong.

The shopping cart is the aggregate. It’s the smallest chunk of your system that you save as a single transaction, because its internal rules have to stay true at every moment. Inside the aggregate boundary, the rules are never broken. Outside, it’s a different story — more on that in a second.

A useful way to spot an aggregate: ask “if I changed this half of the data without changing the other half, would the business be wrong?” If yes, the two halves are in the same aggregate. If no, they’re probably not.
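To make the shopping-cart version concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a reference implementation: the class names (Cart, ItemAddedToCart, OutOfStock) and the in_stock flag are invented for this example, and the "atomic" step is simulated in memory where a real system would use a database transaction.

```python
from dataclasses import dataclass, field
from decimal import Decimal


class OutOfStock(Exception):
    """Raised when the Add Item command is rejected."""


@dataclass(frozen=True)
class ItemAddedToCart:
    """Event: a fact in the past tense, immutable once it exists."""
    cart_id: str
    sku: str
    unit_price: Decimal
    quantity: int


@dataclass
class Cart:
    """Aggregate: line items and total only ever change together."""
    cart_id: str
    lines: list = field(default_factory=list)
    total: Decimal = Decimal("0")

    def add_item(self, sku, unit_price, quantity, in_stock=True):
        """Command handler: may fail; on success it returns the event."""
        if not in_stock:
            raise OutOfStock(sku)  # command rejected, state untouched
        # Build the new state to one side, then assign both fields together.
        new_lines = self.lines + [(sku, unit_price, quantity)]
        new_total = sum((p * q for _, p, q in new_lines), Decimal("0"))
        self.lines, self.total = new_lines, new_total
        self._check_invariant()
        return ItemAddedToCart(self.cart_id, sku, unit_price, quantity)

    def _check_invariant(self):
        # The invariant from the text: total equals the sum of line items.
        expected = sum((p * q for _, p, q in self.lines), Decimal("0"))
        assert self.total == expected, "cart total must equal sum of line items"


cart = Cart("cart-1")
event = cart.add_item("tea", Decimal("4.50"), 2)
```

The command can be refused (OutOfStock) and leaves the cart unchanged; the event only exists once both the line and the total have moved.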

Now the interesting part: different aggregates can drift.

Imagine the cart references a product catalogue, which is a different aggregate. The catalogue has the canonical price of every product. What happens when a catalogue price changes?

In a naive design, you’d want every cart in the system to instantly update. But that’s often impossible at scale, and usually unnecessary. Instead, the business accepts a brief inconsistency: the catalogue changes now, the carts catch up when they’re next viewed. For a moment, the cart shows an old price that doesn’t match the catalogue. Nobody cares. When the customer checks out, the cart re-fetches the current price and reconciles.

That’s the aggregate boundary in action. Inside the cart aggregate, the total and the line items are always in sync — they change together, atomically. Between the cart and the catalogue, there’s a moment of drift, and the business tolerates it.

So when you’re looking at a wall of events in an Architecture session and trying to find aggregates, you’re really asking: which groups of events must stay instantly consistent, and which can drift? The instant-consistency groups are aggregates. The boundaries between them are where drift becomes acceptable — which is also where events (instead of direct calls) become the right way for one aggregate to tell another what happened.
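The cart-and-catalogue drift can be sketched in a few lines. This is a toy model under assumptions: Catalogue, PricedCart, and reconcile are hypothetical names, and the cart stores a snapshot of the price it saw when the item was added.

```python
from decimal import Decimal


class Catalogue:
    """Separate aggregate: owns the canonical price of each product."""

    def __init__(self):
        self._prices = {}

    def set_price(self, sku, price):
        self._prices[sku] = price

    def price_of(self, sku):
        return self._prices[sku]


class PricedCart:
    """Cart aggregate holding a price snapshot per line."""

    def __init__(self):
        self.lines = {}  # sku -> (snapshot_price, quantity)

    def add(self, sku, quantity, catalogue):
        self.lines[sku] = (catalogue.price_of(sku), quantity)

    @property
    def total(self):
        # Derived from the lines, so it can never disagree with them.
        return sum((p * q for p, q in self.lines.values()), Decimal("0"))

    def reconcile(self, catalogue):
        """At checkout: re-fetch current prices for every line."""
        self.lines = {
            sku: (catalogue.price_of(sku), qty)
            for sku, (_, qty) in self.lines.items()
        }


catalogue = Catalogue()
catalogue.set_price("tea", Decimal("10"))
cart = PricedCart()
cart.add("tea", 2, catalogue)

catalogue.set_price("tea", Decimal("12"))  # the catalogue changes now...
stale_total = cart.total                   # ...the cart still shows the old price
cart.reconcile(catalogue)                  # checkout closes the gap
```

Inside the cart, lines and total never disagree. Between the cart and the catalogue, stale_total is the tolerated drift.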

Bounded context — a larger area of the system with its own internal language. One or more aggregates live inside a bounded context. Inside a bounded context, the word “customer” has one specific meaning. In a different bounded context — say, a support ticketing system — “customer” might mean something subtly different. That’s fine; bounded contexts are how you let those differences coexist without one team’s model polluting another’s. They’re also the natural unit of team ownership and service deployment.

There’s one more term — anti-corruption layer — that shows up in Phase 5. It’s the translator you put between your bounded context and something external (a third-party API, a legacy system) so their shape doesn’t bleed into your domain model. You’ll see it in context.

That’s enough to follow the playbook. If you want the fuller story, Eric Evans’ Domain-Driven Design (2003) is the canonical source, and Vaughn Vernon’s Implementing Domain-Driven Design (2013) is the more practical follow-up.

Motivation

You have a Process Level map of a flow, the team agrees on how it works, and now you need to turn it into code — specifically, into components that can be built, deployed, and changed independently.

A team has just run a Process Level session on their subscriber lifecycle. The wall is clear: Account Created, Payment Method Added, Plan Selected, Subscription Activated, Invoice Raised, Payment Captured, Access Granted, Plan Upgraded, Subscription Cancelled. Everyone agrees this is what happens. Now the lead developer is staring at the wall with one question: where does this split into services? One big service that owns all of it? A “billing service” and an “access service”? Four microservices, one per noun? An architect can guess, but a guess now is an expensive mistake later — services built on the wrong boundaries leak their insides into each other, and six months from now every feature touches every service.

Or: a team is splitting a monolith. The Process Level work has made the flows explicit; the question now is which parts move first, where the seams run, and what the contract looks like on each side of the seam. You can draw boundaries on a whiteboard in twenty minutes, but the boundaries that survive contact with the code are the ones drawn by the people who’ll implement them, in front of the wall of events they’re carving up.

Or: a new service is being designed into an existing landscape. It will emit events the billing service consumes and consume events the warehouse service emits. Someone needs to decide what those events look like, who owns them, and what invariants each side enforces — before the first commit, not after three sprints of discovery.

Event Storming an Architecture is the workshop you reach for when the what is clear and the how to split it into code is not.

Applicability

Use when:

A Process Level session has clarified a flow and you’re ready to model the code
A team is splitting a monolith into bounded contexts and needs the seams defined
You’re designing a new service that interacts with existing ones and need the command/event API before anyone writes the first line
Two teams are arguing about which one should own a piece of behaviour — the boundary is probably wrong
You’re inheriting a system whose internal boundaries nobody can explain and you want to redraw them from the events up

Don’t use when:

There’s no Process Level map yet. If the domain is still unclear, you’re designing on sand — run Process Level first
The scope is a single CRUD screen or an isolated utility
The decision has already been made politically and the session is being used to ratify it — everyone will feel it, and the output will be worthless
You’re trying to map a whole organisation — use Big Picture (publishes 13 April) instead

Participants

Facilitator. Ideally someone with both Event Storming experience and enough design background to spot when a proposed aggregate is going to fall over. They don’t model, but they have to know what a healthy model looks like.

Architects / tech leads. The people who will carry the design forward. Usually one or two. More than two and you end up with competing architectural philosophies eating the whole session.

Developers who will build it. At least two. Not “future developers” — the people who will actually write the first commit. Designs drawn by people who won’t build them get thrown away the first time they meet reality.

One domain expert. The same person (or one of the people) from the Process Level session. Their job is to answer “would the business really do it that way?” when a boundary proposal implies something the business wouldn’t agree to. You do not need a crowd of domain experts here; you’ve already done that work.

Group size: 3–6. Smaller than Process Level because this is a design activity — design-by-committee at 8+ people grinds to a halt. Fewer than 3 and you lose the argument that pressure-tests the boundaries.

Who to leave out:

Product and design. They were essential for Process Level; they’re a distraction here. This session is about code structure, not scope.
The rest of the Process Level group. The wider group’s job was done when the map went up. Inviting them back turns a design session into a committee review.
Anyone whose job title is “architect” but who won’t write code. Architectural pronouncements from outside the work produce designs nobody builds.

Structure

Phase	Duration	Materials	Key question
Review the Process Level map	10 min	The existing wall	“Do we all still see the same thing?”
Identify aggregate candidates	25 min	Marker, coloured dots	“Which events belong together?”
Draw boundaries	20 min	Thick marker on the paper	“Where is the line?”
Break	10 min	—	—
Classify commands	25 min	Blue notes, arrows	“Internal or crossing?”
External vs internal systems	15 min	Yellow notes, ring them	“Who owns this side of the line?”
Hotspot review	25 min	Pink notes	“Where is the design still fragile?”
Wrap-up and owners	15 min	—	“Who owns what next?”
Buffer	15 min	—	—
Total	2h 40min inside a 3-hour block

The six working phases are 120 minutes. The other 40 minutes are the unglamorous stuff: the mid-session break, the wrap-up, and a buffer for the conversations that inevitably run long when boundaries are contested. That leaves twenty minutes of slack in the 3-hour block; don’t try to fill it.
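If you adjust the timings, it is worth re-checking the arithmetic. A quick sketch, using the durations from the table; splitting the rows into "working" and "other" is my reading of the table rather than something it labels.

```python
# Durations in minutes, copied from the Structure table.
working_phases = {
    "Review the Process Level map": 10,
    "Identify aggregate candidates": 25,
    "Draw boundaries": 20,
    "Classify commands": 25,
    "External vs internal systems": 15,
    "Hotspot review": 25,
}
other = {"Break": 10, "Wrap-up and owners": 15, "Buffer": 15}

working_minutes = sum(working_phases.values())
other_minutes = sum(other.values())
scheduled = working_minutes + other_minutes
slack = 3 * 60 - scheduled  # what is left of the 3-hour block

print(working_minutes, other_minutes, scheduled, slack)
```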

Budget realistically. Phase 2 (Identify aggregate candidates, 25 minutes) and Phase 3 (Draw boundaries, 20 minutes) are the hardest-thinking phases of the session, and first-time facilitators almost always over-run them. If it’s your first Architecture session, budget 35 minutes for Phase 2 and 30 minutes for Phase 3, let the whole thing run to three and a quarter hours, and plan the room for that. Phase 2 is where the aggregate tests get applied and arguments land; Phase 3 is where the arguments at the edges of a cluster are worked out. Rushing either one produces a design that falls apart in the first sprint.

A note on materials. Architecture sessions work on top of an existing Process Level wall — so most of what’s on it (orange events, blue commands, yellow actors, pink hotspots) is already there from the earlier session. What you add is physical — a thick marker on the paper to draw aggregate boundaries, coloured dots to group event clusters into aggregate candidates, and arrows to show commands and events crossing lines. You’ll also introduce two new note colours in Phase 4: purple for policies (the “when this event arrives, this aggregate reacts like this” rules that tie subscribing aggregates to the events they listen for) and pale green for read models (the information a policy consults before firing). These come from Brandolini’s original notation and earn their place when you start drawing cross-boundary events and asking what the receiving aggregate actually does with them. The artefact has to survive being photographed, so use markers heavy enough to show up at distance.

Collaborations

The session alternates between small-group clustering and whole-group argument. Clustering is where two or three people decide which events feel like they belong together; argument is where the whole group challenges each cluster and moves the lines. Developers should lead the clustering — they’re the ones who’ll live with the consequences. The domain expert’s job is to veto clusters that don’t match business reality, not to propose them.

The rhythm is cluster first, argue second, draw third. Don’t let anyone draw a boundary on the paper until the group has agreed the cluster. Boundaries drawn too early get defended instead of examined.

Facilitator Playbook

Before the phases, here’s a running example the rest of this section will reference — a deliberately ordinary domain so you can see the moves without being distracted by novelty. This is the same subscription service the Process Level post uses, and the wall below is the exact output of that session. The Big Picture post zooms out one level further and shows this same lifecycle inside the larger subscription business (acquisition, growth, retention, winback). The three sessions form a zoom chain: Big Picture → Process Level → Architecture.

The wall as the Process Level session left it — sixteen actor-command-event triplets in rough time order. Small yellow stickies are actors (who or what kicks the motion off), blue stickies are commands (the intention), orange stickies are the resulting facts. There are no arrows within a triplet: spatial adjacency does the talking. The Architecture session will work on top of this wall and start drawing boundaries around groups of stickies.

Sixteen triplets, one flow, mostly understood by the team. The Architecture session asks: where are the code boundaries?

Phase 1 — Review the Process Level map (10 min)

“The wall is already up” — that’s the ideal. In practice, the Process Level session might have been two weeks ago, the original wall is long gone, and the only thing you have is photographs, a transcribed event list in a shared doc, and the team’s memory. That’s enough. Before the Architecture session starts, redraw the Process Level wall: print the event list big, cut it into sticky-note-sized strips, and stick them up on fresh paper in the same rough order. Ten minutes of preparation by the facilitator the day before. The Architecture session assumes a wall is in the room; it doesn’t care whether the wall was built an hour ago or a fortnight ago, as long as it’s physically present and the events on it are the ones the team agreed on.

The wall is up. Walk it end to end. Re-familiarise, re-agree, catch the few things that have drifted since the Process Level session.

Open with:

“We’re not redoing the Process Level work. We’re taking the flow on this wall as given, and turning it into a design. If something on the wall is wrong, say so now — we’ll fix it in the first five minutes and then the wall is frozen for the rest of the session.”

Walk the events in order yourself, out loud, pointing at each one. Invite corrections. Make them on the wall. Then draw the line:

“Good. From this point on, the events on the wall are the ground truth. If you disagree with the flow, that’s a different session. Today we’re asking where the code boundaries go.”

What to watch for:

Someone re-opening Process Level decisions. Let small corrections happen; block major re-litigation. “That’s a Process Level conversation — let’s book it and move on.”
Silent consent that isn’t real consent. If the room is quiet, name one person directly: “You were at the Process Level session — does this flow still match what you remember, or has your model moved since?”
Events that have moved in people’s heads since the session. Worth updating the wall before you design on top of it.

Phase 2 — Identify aggregate candidates (25 min)

This is the heart of the session. An aggregate is a cluster of events that share a consistency boundary: they change together, they enforce their own invariants, and a single command usually affects only one aggregate.

Give the framing clearly:

“An aggregate is a group of events that belong together because they share the same rules. If these two events can never be out of sync without the business being wrong, they’re in the same aggregate. If they can drift for a second without anyone caring, they’re probably in different ones. We’re looking for the natural seams.”

Then teach the tests. People in their first Architecture session have nothing to go on except the framing, and the framing alone is too abstract. Give them five concrete tests they can point at two events and apply out loud:

The crash test. “If the server crashed between these two events, would the business be in an invalid state?” If yes, they must happen together — same aggregate. If no, a brief gap between them is survivable — probably different aggregates. Same aggregate: Cart Line Added and Cart Total Updated — if a crash left the line in but the total not updated, the cart is lying. Different aggregates: Payment Captured and Receipt Emailed — if the email didn’t send, the customer is still correctly charged and you can retry the email later.
The shared-rule test. “Is there a rule that depends on both of these events being in sync?” If you can write a rule like “the total must equal the sum of line items” or “you can’t cancel a paused subscription without resuming it first,” the events the rule touches are in the same aggregate. Same aggregate: Subscription Paused and Subscription Resumed — governed by the rule “a subscription is exactly one of: active, paused, cancelled.” Different aggregates: Subscription Cancelled and Refund Issued — the refund follows from the cancellation, but the rules governing refunds (amount, eligibility, accounting) are separate from the rules governing subscription state.
The same-ID test. “Do these events carry the same identifier — the same cart ID, subscription ID, invoice ID?” It’s a hint, not a rule. Events that share an ID are usually in the same aggregate, because they’re changes to the same underlying thing. Events about different IDs almost never are. Hint toward same aggregate: every event carrying subscription_id=42 probably belongs in the Subscription aggregate. Hint toward different: a subscription_id and an invoice_id travelling together usually means you’re looking at a crossing, not an aggregate.
The Conway’s-Law test. “If someone had a question about this event, which team or role would they ask?” Events that share an owner are usually in the same aggregate; events owned by different teams almost never are. Same aggregate: Invoice Raised and Payment Failed — the Finance team owns both; questions about either go to the same people. Different aggregates: Payment Captured (Finance) and Access Granted (Platform) — different questions, different teams, different aggregates. Caveat this one in your head as you use it. This test is backward-looking: it catches aggregates the org already reflects, which is useful to confirm a cluster but dangerous to define one. The “org-chart design” failure mode in Phase 3 below is exactly what happens when you let Conway’s Law lead the conversation — you end up with boundaries drawn on team lines that’ll dissolve the next time the org re-shuffles. Use this test to double-check a cluster the invariants already picked out, never to pick the cluster in the first place.
The long-running-process test. “Is there a stateful wait — a human decision, an approval, a scheduled delay — between these two events?” If so, you almost certainly have a long-running policy (also called a process manager or saga) in between: a persistent state that sits around waiting for the next input. That’s not automatically an aggregate boundary — sometimes the long-running state lives inside an aggregate as a sub-process. But the presence of a wait is a signal that you’ve found a piece of the model that will need somewhere to store its in-between state, and that’s often either a small aggregate in its own right or a hint that the aggregate on one side of the wait has a state machine you haven’t drawn yet. Signal, not answer: Refund Requested (by customer or support) and Refund Issued (after approval). These might be two aggregates (if Support owns the request and Billing owns the refund), or they might be one Billing aggregate with a pending-approval state on the Refund entity. The pause doesn’t tell you which — it tells you to look closer at where the state lives while nobody is touching it.

Tell the participants: “Try two or three of these tests on every pair of events you’re not sure about. The tests will disagree sometimes — that’s fine, it means it’s a genuinely tricky boundary and worth a conversation.”
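The same-ID test is the easiest of the five to mechanise, which makes it a handy illustration of how a test behaves as a hint instead of a verdict. A rough sketch, assuming events are plain records whose identifier fields end in "_id"; the field names and the wording of the hints are invented for the example.

```python
# Hypothetical event records; only the "_id" fields matter here.
events = {
    "Subscription Paused": {"subscription_id": 42},
    "Subscription Resumed": {"subscription_id": 42},
    "Invoice Raised": {"subscription_id": 42, "invoice_id": 7},
    "Payment Captured": {"invoice_id": 7},
}


def same_id_hint(a, b):
    """Return a hint (never a ruling) about two events' relationship."""
    ids_a = {k for k in a if k.endswith("_id")}
    ids_b = {k for k in b if k.endswith("_id")}
    shared = {k for k in ids_a & ids_b if a[k] == b[k]}
    if not shared:
        return "probably different aggregates"
    if len(ids_a) > 1 or len(ids_b) > 1:
        # Two kinds of ID travelling together: likely a crossing.
        return "possible crossing"
    return "probably same aggregate"


hint = same_id_hint(events["Subscription Paused"], events["Subscription Resumed"])
```

Whatever it returns, the answer still has to survive the crash test and the shared-rule test before a dot goes on the wall.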

One edge case worth naming out loud: external systems. All five tests assume both events are in systems you own. Real architectures have events that live on the other side of Stripe, SendGrid, HMRC, the accounting platform, or the bank. The crash test doesn’t apply cleanly to “Payment Captured (Stripe) and Invoice Paid (our ledger)” because you don’t own the Stripe side and you can’t make them atomic even if you wanted to. When a test points at an event on the other side of a boundary you don’t control, the honest answer is: these are in different aggregates and they’re separated by an anti-corruption layer; the invariant you need is eventual-consistency-plus-reconciliation, not same-aggregate-atomicity. Phase 5 of this session is when you mark those external events explicitly — until then, it’s enough to pink-note them and move on.
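For readers who want to picture "anti-corruption layer plus reconciliation", here is a small sketch. Everything in it is assumed for illustration: the payload keys ("ref", "amount_minor", "ccy") are made up and are not any real provider's schema, and the function names are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PaymentCaptured:
    """Domain event, expressed in our own vocabulary."""
    invoice_id: str
    amount: Decimal
    currency: str


def translate_provider_payload(payload):
    """Anti-corruption layer: the only code that knows the external shape.

    The keys read here are invented for this sketch.
    """
    return PaymentCaptured(
        invoice_id=payload["ref"],
        amount=Decimal(payload["amount_minor"]) / 100,  # minor units to major
        currency=payload["ccy"].upper(),
    )


def unreconciled(captured_events, ledger_paid_invoice_ids):
    """Invoices the provider reports as paid that our ledger has not caught up on."""
    return [
        e.invoice_id
        for e in captured_events
        if e.invoice_id not in ledger_paid_invoice_ids
    ]


event = translate_provider_payload({"ref": "inv-7", "amount_minor": 1999, "ccy": "gbp"})
pending = unreconciled([event], ledger_paid_invoice_ids=set())
```

The two sides are never atomic. The translator keeps the external shape out of the domain model, and the reconciliation pass is what eventually closes the gap.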

Hand out coloured dots — one colour per candidate aggregate. Let small groups put dots on events they think belong together. Don’t draw boundaries yet. The dots are cheap and disposable; boundaries are not.

Aggregate vs bounded context — two levels at once. In DDD there are actually two nested levels here, and a first Architecture session finds them together without always separating them. An aggregate is the smallest unit of transactional consistency — the events that must commit together to keep an invariant true. A bounded context is a larger linguistic and design boundary — the area in which a ubiquitous language (Evans’ term: one shared vocabulary used in code, conversation, and stories without drift) stays consistent, and inside which several aggregates may live. “Billing” in the running example below is named as one aggregate for teaching simplicity, but in a real system it would usually be a Billing bounded context containing multiple aggregates (Invoice, Payment, Refund, Dunning Schedule), each with its own consistency boundary. For a first pass in this session, don’t agonise over the distinction — most clusters you find will be either a single aggregate or a bounded context with two or three closely-related aggregates inside it. In Phase 3 you can ask, for each cluster: “is this one aggregate, or is it a bounded context with a few aggregates I haven’t separated yet?” and split where the answer is the latter.

What you’re looking for, in the running example. The sixteen cells on the wall probably cluster into four or five aggregates:

Subscription — Subscription Activated, Plan Selected, Plan Upgraded, Subscription Paused, Subscription Resumed, Subscription Suspended, Subscription Cancelled. The lifecycle. These events share one invariant: the subscription can only be in one state at a time, and the transitions between states have rules (you can’t cancel a paused subscription without resuming it first, you can’t upgrade a cancelled one). If any two of these got out of sync, the business would be broken.
Billing — Invoice Raised, Payment Captured, Payment Failed, Payment Retry Failed, Refund Issued. Money state. These share a different invariant: every charge traces to an invoice, every refund traces to a charge, and the ledger has to balance. Billing doesn’t care about plan features; it cares about amounts and obligations.
Access — Access Granted, Usage Recorded. Whether the customer can use the product and how much they’ve used. Separate from both Subscription (which is about the commercial relationship) and Billing (which is about money). A subscription can be active with access revoked (while payments are still being retried), or cancelled with access still granted until the end of the period.
Customer — Account Created, Payment Method Added. Identity and stored payment credentials. Often touched by support, rarely by the lifecycle.
Support — Ticket Raised, Ticket Resolved. A parallel aggregate that reads from all the others but changes on its own rhythm.

Five aggregates from sixteen wall cells. The test that each one passes: if you look at only the events inside it, you can describe a consistency rule that governs them, and the rule doesn’t need to know about any other aggregate to make sense.
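The Subscription rule ("one state at a time, with rules on the transitions") can be written down as a small state machine that needs no knowledge of any other aggregate. A sketch under assumptions: only the two rules quoted above (no cancelling while paused, no upgrading once cancelled) come from the text; the rest of the transition table is a guess for illustration.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    SUSPENDED = "suspended"
    CANCELLED = "cancelled"


# (current state, command) -> next state. Anything not listed is refused.
# Assumed table: the real one comes out of the session, not out of this post.
TRANSITIONS = {
    (State.ACTIVE, "pause"): State.PAUSED,
    (State.PAUSED, "resume"): State.ACTIVE,
    (State.ACTIVE, "upgrade"): State.ACTIVE,
    (State.ACTIVE, "suspend"): State.SUSPENDED,
    (State.ACTIVE, "cancel"): State.CANCELLED,
}


class InvalidTransition(Exception):
    pass


class Subscription:
    def __init__(self):
        self.state = State.ACTIVE

    def handle(self, command):
        key = (self.state, command)
        if key not in TRANSITIONS:
            raise InvalidTransition(f"cannot {command} while {self.state.value}")
        self.state = TRANSITIONS[key]
        return self.state


sub = Subscription()
sub.handle("pause")
sub.handle("resume")
sub.handle("cancel")
```

If you can write a table like this for a cluster without mentioning another aggregate's data, the cluster has passed the test described above.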

What to watch for:

Noun-driven clustering. Someone proposes “everything about the Subscription goes together” and sweeps Billing events into Subscription. Push back: the noun is a hint, not a rule. “Does Plan Upgraded need to know anything about Payment Captured to decide what to do? Or does it just need to tell Billing it happened?”
The god aggregate. One cluster absorbing most of the wall — usually the one someone calls “Subscription” or “Order.” Almost always wrong. “What invariant forces all of this into one place?” In the running example, the god aggregate would be a Subscription that owns billing, access, and support. Split it.
Too many tiny aggregates. One event per aggregate means you’re building a distributed monolith. “If these two always change together, they probably belong together.” Subscription Paused and Subscription Resumed don’t each need their own aggregate.
Clustering along team lines. “Team A owns these because they work on billing.” That’s an ownership decision, not a design one. Pink note it and design the boundaries on the domain, not the org chart.

Phase 3 — Draw boundaries (20 min)

Now the marker comes out. For each agreed cluster, draw a thick line around it on the paper. The act of drawing is deliberate: it makes the commitment visible, and it forces the room to look at the edges.

(A note on notation: Brandolini’s original specification for Software Design EventStorming uses large pale yellow rectangles as the aggregate shapes — the events and commands belonging to an aggregate physically sit on top of the large yellow card, and the card itself is the aggregate. It’s a lovely visual, and if you happen to have large pale-yellow cards to hand, use them. In practice the marker-boundary approach works just as well, takes less sourcing, and the argument at the edges is the same whichever you use. The point is that the boundary is visible and committed — not the shape of the paper.)

Open with:

“We’re going to draw the lines now. Once the line is on the paper, we argue at the edges — which events are inside, which are outside, and why. Nothing gets drawn without the group agreeing.”

Work one cluster at a time. As each boundary goes down, ask:

“Is there anything inside this line that doesn’t belong? Anything outside it that does?”

The arguments at the edges are where the session earns its keep. Boundaries that nobody argues about are usually either obvious or uninspected.

In the running example, the argument that matters is probably about Subscription Suspended. Does it belong in Subscription (it’s a state change on the subscription) or in Billing (it happened because payment retries were exhausted)? The answer is Subscription — the state lives there. But the subtler question is how Billing tells Subscription to move. Two options:

Billing sends a command. “Suspend this subscription.” Billing is reaching across the boundary and telling Subscription what to do. Direct, simple, and couples Billing to Subscription’s interface — Billing now has to know Subscription exists, how to call it, and what a “suspend” looks like.
Billing publishes an event. “Payment Retries Exhausted.” A fact about what happened in Billing. Subscription (and anyone else who cares) subscribes to that event and decides for itself what to do — in this case, transition to Suspended. Billing doesn’t know or care who’s listening.

The second is usually better. It keeps Billing and Subscription loosely coupled: Billing’s job is to honestly report what happened in its world, and Subscription’s job is to decide how to react. If six months from now you add a “notify the customer” step, you add another subscriber to the event; Billing doesn’t change. This is the core trade-off at every aggregate boundary — commands are tighter and more direct, events are looser and more extensible — and Event Storming an Architecture is where you make the call.
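Here is a minimal sketch of the event option, using an in-memory bus as a stand-in for whatever messaging the real system uses. EventBus, Billing, SubscriptionPolicy, and the event name string are all hypothetical names chosen for the example.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for a real message bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._subscribers[event_name].append(handler)

    def publish(self, event_name, payload):
        for handler in self._subscribers[event_name]:
            handler(payload)


class Billing:
    def __init__(self, bus):
        self._bus = bus

    def retries_exhausted(self, subscription_id):
        # Billing reports a fact. It does not know who is listening.
        self._bus.publish("PaymentRetriesExhausted",
                          {"subscription_id": subscription_id})


class SubscriptionPolicy:
    """When Payment Retries Exhausted arrives, move to Suspended."""

    def __init__(self, bus):
        self.states = {}
        bus.subscribe("PaymentRetriesExhausted", self.on_retries_exhausted)

    def on_retries_exhausted(self, payload):
        self.states[payload["subscription_id"]] = "suspended"


bus = EventBus()
subscriptions = SubscriptionPolicy(bus)

# Six months later: a new subscriber is added, and Billing is not touched.
notified = []
bus.subscribe("PaymentRetriesExhausted",
              lambda payload: notified.append(payload["subscription_id"]))

Billing(bus).retries_exhausted(42)
```

With the command option, Billing would hold a reference to Subscription and call a suspend method directly; adding the notification step would then mean editing Billing.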

A second, subtler edge case: Usage Recorded. Is it Access (it’s about product use) or Billing (it feeds invoice generation)? Probably Access — usage is captured continuously and Access is the aggregate that owns the running total. Billing reads the total at invoice time. If Usage lived in Billing, every usage event would have to cross the boundary, and you’d be building a distributed monolith just to count API calls.

What to watch for:

The straight line. A boundary that goes cleanly around a cluster with no arguments at the edge. Suspicious. “Has anyone looked at what’s just outside it?”
The moving line. A boundary that shifts every time someone walks past. Usually means two candidate aggregates are really one, or one is really two. Pair two people to propose one answer and defend it.
Boundaries drawn along team lines rather than domain lines. “If Team A disbanded tomorrow, would this line still make sense?” If no, it’s the wrong line.
Overlapping boundaries. Two aggregates can’t overlap. If they do, you have a third aggregate in the overlap, or one of them is wrong.
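
Once the wall is transcribed, the overlap rule is easy to check mechanically. The sketch below assumes each boundary is recorded as a set of event names; the specific grouping of events is illustrative, not the session’s actual output.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Hypothetical transcription: aggregate name -> events inside its line.
boundaries: Dict[str, Set[str]] = {
    "Subscription": {"Subscription Activated", "Plan Upgraded", "Subscription Cancelled"},
    "Billing": {"Invoice Raised", "Payment Captured", "Payment Retries Exhausted"},
    "Access": {"Access Granted", "Usage Recorded"},
}


def find_overlaps(bounds: Dict[str, Set[str]]) -> List[Tuple[str, str, Set[str]]]:
    """Return every pair of aggregates that claim at least one common event."""
    overlaps = []
    for name_a, name_b in combinations(sorted(bounds), 2):
        shared = bounds[name_a] & bounds[name_b]
        if shared:
            overlaps.append((name_a, name_b, shared))
    return overlaps


clean = find_overlaps(boundaries)

# Suppose someone also draws Usage Recorded inside Billing's line.
contested = dict(boundaries, Billing=boundaries["Billing"] | {"Usage Recorded"})
overlaps = find_overlaps(contested)
```

Any pair that comes back is a prompt for the room: either there is a third aggregate hiding in the shared events, or one of the two lines is wrong.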

Phase 4 — Classify commands (25 min)

Now look at every blue command note from the Process Level wall, and for each one ask: is this command handled inside an aggregate, or does it cross a boundary?

Commands that stay inside an aggregate are internal — they’re method calls, function invocations, things the aggregate does to itself. Commands that cross boundaries are the API between components — they become HTTP calls, messages on a queue, events on a bus, or scheduled jobs in another service.

Frame it:

“Every blue command is either internal to one aggregate or crosses a boundary. Mark the crossing ones with an arrow from the source to the target. Those arrows are the component APIs. If we have too many arrows, our boundaries are wrong — we’re building a distributed monolith. If we have none, we haven’t split anything, which is also fine if that’s what we meant.”

Draw the arrows with a different coloured marker so the crossing commands stand out from the internal ones.

In the running example, the commands split roughly like this:

Internal to one aggregate:

Pause Subscription, Resume Subscription, Upgrade Plan — Subscription aggregate, all of them
Raise Invoice, Capture Payment, Retry Payment, Issue Refund — Billing aggregate
Record Usage, Grant Access, Revoke Access — Access aggregate
Raise Ticket, Resolve Ticket — Support aggregate

Crossing boundaries — as commands (one aggregate telling another to do something specific, usually because a human authorised it):

Support → Billing: “Issue refund for invoice X.” A support agent has decided to issue a refund; Billing has to execute it and report back. This is a command because the sender needs the receiver to do a specific thing and needs to know it was done.
Support → Subscription: “Cancel subscription on behalf of customer.” Same shape — a human has authorised a specific action and needs confirmation.

Crossing boundaries — as events (one aggregate announcing a fact, others subscribing):

Subscription → anyone who cares: “Subscription Activated.” Billing subscribes and starts the billing cycle. Access subscribes and grants initial entitlements. Notifications subscribes and sends a welcome email. Subscription doesn’t know or care who’s listening.
Subscription → anyone who cares: “Plan Upgraded.” Billing subscribes and generates a prorated invoice. Access subscribes and updates entitlements. Notifications subscribes and sends a confirmation.
Subscription → anyone who cares: “Subscription Cancelled.” Billing subscribes and stops future charges. Access subscribes and schedules revocation for period-end.
Billing → anyone who cares: “Payment Captured.” Access subscribes and activates entitlements. Notifications subscribes and sends a receipt.
Billing → anyone who cares: “Payment Retries Exhausted.” Subscription subscribes and transitions to Suspended. Support subscribes and raises a ticket.

Seven crossing relationships from the sixteen-triplet wall — two commands and five events. That’s the ratio you want. Commands cross boundaries when a specific sender needs a specific receiver to do a specific thing and know it was done — almost always because a human decision is driving it (the support agent, the admin, the customer clicking a button). Events cross boundaries when an aggregate is just announcing a fact — the fact happened, it’s published, anyone who cares can subscribe. As a rule of thumb, tilt the ratio toward events: they’re looser, more extensible, and they let new consumers show up without the publisher changing. If your design has more commands than events at the boundaries, look at each command and ask whether it could be re-shaped as “aggregate A publishes fact, aggregate B subscribes and reacts.” Usually it can.
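
If the crossing list is transcribed after the session, the ratio check can be made explicit. A small sketch, assuming a hypothetical `Crossing` record and using the seven crossings from the running example:

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Crossing:
    source: str
    kind: str  # "command" or "event"
    name: str


crossings: List[Crossing] = [
    Crossing("Support", "command", "Issue Refund"),
    Crossing("Support", "command", "Cancel Subscription"),
    Crossing("Subscription", "event", "Subscription Activated"),
    Crossing("Subscription", "event", "Plan Upgraded"),
    Crossing("Subscription", "event", "Subscription Cancelled"),
    Crossing("Billing", "event", "Payment Captured"),
    Crossing("Billing", "event", "Payment Retries Exhausted"),
]


def tally(items: List[Crossing]) -> Counter:
    """Count boundary crossings by kind."""
    return Counter(item.kind for item in items)


def commands_to_review(items: List[Crossing]) -> List[str]:
    """If commands outnumber events, list the commands worth re-shaping as events."""
    counts = tally(items)
    if counts["command"] <= counts["event"]:
        return []
    return [item.name for item in items if item.kind == "command"]


counts = tally(crossings)
review = commands_to_review(crossings)
```

The threshold used here (commands must not outnumber events) is just one reading of the rule of thumb above; a team could reasonably pick a stricter one.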

Showing which aggregate receives the event — and what it does with it.

An event is a fact; a subscribing aggregate has to do something in response. If you only show the publishing side on the wall, you’ve drawn half the story. For this phase, introduce one more sticky note colour: purple, for policies. A policy is the rule that says “when this event happens, this aggregate reacts like this.” It’s the subscribing aggregate’s half of the conversation, and it’s the missing piece if you’re only showing commands and events.

Here’s the notation the session uses:

The event on the wall has an arrow leaving its publishing aggregate, labelled with the event name.
The arrow lands in the receiving aggregate’s territory. Put a small marker where it lands so there’s a visible endpoint, not an arrow into the void.
Next to the landing marker, add a purple policy note describing what the receiver does when the event arrives. Write it in the form “when X, then Y” — “when Plan Upgraded arrives, generate a prorated invoice”. The purple note is the receiving aggregate’s contract with the event: it says “if you’re in this territory and this event shows up, here’s what I’m obligated to do.”
The policy triggers an internal command inside the subscriber — that’s the aggregate’s own work, done under its own rules. Draw the internal command as a blue note inside the subscriber’s territory (because at that point it’s an internal command, which is exactly what blue is already for). The chain looks like: publisher aggregate emits event (orange) → crosses boundary → subscriber aggregate territory → policy fires (purple) → internal command (blue) → subscriber’s own state changes → (possibly) subscriber emits its own event onwards.
If the policy needs information to decide, add a pale green read model note next to it. A read model is the data the policy consults before firing — “when Plan Upgraded arrives, check the customer’s current plan and remaining billing period, then generate a prorated invoice based on the difference.” That “current plan and remaining billing period” bit is a read model. Not every policy needs one; many just react directly. But when a policy’s decision depends on information that lives somewhere (another aggregate’s state, a stored calculation, a cached lookup), naming the read model explicitly makes the dependency visible.

Each colour has one job. Orange: events. Blue: commands, whether internal or crossing. Yellow: actors. Pink: hotspots. Purple: policies — the reactions that tie an event to a command in the receiving aggregate. Green: read models — the information a policy consults before deciding what to do.

Two rules to state out loud, in the room, as the purple and blue notes go up. These are the rules that stop the wall from becoming a flowchart, and they are load-bearing:

Events don’t cause events. An event is always followed by something that reads it — a policy, a human, a clock — which in turn issues a command. If someone asks “so does Plan Upgraded cause Invoice Raised?”, the answer is “no — Billing reads Plan Upgraded, a policy fires, that policy issues Raise Invoice, and Raise Invoice produces Invoice Raised. Events are facts, not causes.” Once the rule is named, people catch themselves.
Commands don’t cause commands. A command produces exactly one event (on success) or fails. If the next command needs to happen, it’s triggered by a second policy that subscribes to the first command’s event. If the team tries to draw an arrow from one blue note directly to another blue note, stop them and insert the event that goes between. There is always an event between two commands, even if nobody bothered to name it yet.

If the team in the room is drawing arrows between orange notes or between blue notes, pause and re-teach these. They take ten seconds to say and they’re the difference between an Event Storming wall and a Visio diagram.
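
The two rules can also be checked on a transcribed wall. The sketch below assumes notes are recorded by colour and arrows as pairs of note names; the policy labels are invented for illustration.

```python
from typing import Dict, List, Tuple

# Colours follow the wall: orange = event, blue = command, purple = policy.
notes: Dict[str, str] = {
    "Plan Upgraded": "orange",
    "Prorate on upgrade": "purple",   # hypothetical policy label
    "Raise Invoice": "blue",
    "Invoice Raised": "orange",
    "Capture on invoice": "purple",   # hypothetical policy label
    "Capture Payment": "blue",
    "Payment Captured": "orange",
}

arrows: List[Tuple[str, str]] = [
    ("Plan Upgraded", "Prorate on upgrade"),
    ("Prorate on upgrade", "Raise Invoice"),
    ("Raise Invoice", "Invoice Raised"),
    ("Invoice Raised", "Capture on invoice"),
    ("Capture on invoice", "Capture Payment"),
    ("Capture Payment", "Payment Captured"),
]


def rule_violations(notes: Dict[str, str], arrows: List[Tuple[str, str]]) -> List[str]:
    """Flag arrows that join two events or two commands directly."""
    problems = []
    for src, dst in arrows:
        pair = (notes[src], notes[dst])
        if pair == ("orange", "orange"):
            problems.append(f"event causes event: {src} -> {dst}")
        elif pair == ("blue", "blue"):
            problems.append(f"command causes command: {src} -> {dst}")
    return problems


ok = rule_violations(notes, arrows)
# Someone draws a shortcut from one blue note straight to another.
shortcut = rule_violations(notes, arrows + [("Raise Invoice", "Capture Payment")])
```

The shortcut arrow is reported because it skips Invoice Raised, the event that sits between the two commands.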

A note on reactive vs long-running policies. Most purple notes on a wall are reactive — “when X, do Y”, done in one step, no state held between. But some policies carry state across multiple events. A refund approval is an example: the policy starts when Refund Requested fires, waits for a human approval or rejection, then issues or denies the refund. Brandolini calls the stateful kind a long-running policy; Vernon calls it a process manager; the generic DDD term is saga. Both kinds are purple on the wall. But if you find a policy whose “when” and “then” are separated by hours, days, or a human decision, flag it as long-running — it needs its own persistent state, and that state is often either its own aggregate or a sub-entity inside an existing one.
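
To make the difference concrete, here is a minimal sketch of the refund approval as a stateful policy. The event handler names and the command strings are assumptions, and the in-memory dictionary stands in for the persistent state a real long-running policy would need.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RefundApprovalPolicy:
    """Long-running policy: remembers refunds that await a human decision."""
    pending: Dict[str, str] = field(default_factory=dict)  # refund_id -> invoice_id
    issued_commands: List[str] = field(default_factory=list)

    def on_refund_requested(self, refund_id: str, invoice_id: str) -> None:
        # The "when": state is stored and nothing else happens yet.
        self.pending[refund_id] = invoice_id

    def on_refund_approved(self, refund_id: str) -> None:
        # The "then", possibly hours or days later.
        invoice_id = self.pending.pop(refund_id)
        self.issued_commands.append(f"Issue Refund for {invoice_id}")

    def on_refund_rejected(self, refund_id: str) -> None:
        invoice_id = self.pending.pop(refund_id)
        self.issued_commands.append(f"Deny Refund for {invoice_id}")


policy = RefundApprovalPolicy()
policy.on_refund_requested("r-1", "inv-9")
waiting = dict(policy.pending)   # state held between the two events
policy.on_refund_approved("r-1")
```

A reactive policy would have no `pending` field at all: it would receive one event and issue its command in the same step.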

Here’s the whole shape on a wall — two aggregates, one crossing event, and the policy/read-model/command chain the subscriber runs in response:

One cross-boundary event, two policies, one read model, two commands, two events — one internal, one emitted. The policies chain through the event: Policy 1 reacts to Plan Upgraded and triggers Raise Invoice, which produces Invoice Raised; Policy 2 subscribes to that internal event and triggers Capture Payment, which produces Payment Captured. Commands never point to commands — there is always an event in between. Every colour earns its place.

In the running example, “Plan Upgraded” travels like this:

Subscription emits Plan Upgraded (orange, fact).
Billing subscribes. Purple policy 1: “when Plan Upgraded arrives, raise a prorated invoice for the difference.” That policy reads a green read model — Current Billing Period — to work out the proration, then triggers the internal blue command Raise Invoice, which produces an internal orange event Invoice Raised. A second purple policy lives inside Billing and subscribes to its own internal event: “when Invoice Raised, capture payment against the card on file.” That policy triggers Capture Payment, which, if the card clears, produces Payment Captured — the event that matters to the rest of the system. Note what’s happening: two policies chained via an internal event, not one policy firing two commands in parallel. That’s important. Commands never point directly to other commands; there is always an event in between, even if it only matters inside the aggregate that produced it.
Access subscribes to Plan Upgraded separately. Purple policy: “when Plan Upgraded arrives, update entitlements to match the new plan’s limits.” It reads a green read model — Plan Entitlements — to know what the new plan allows. Internal blue commands: Revoke Old Entitlements, Grant New Entitlements.
Notifications subscribes. Purple policy: “when Plan Upgraded arrives, send an upgrade confirmation to the customer.” It reads a green read model — Customer Contact Details — to know where to send it. One internal blue command, one outbound email. Note that Notifications isn’t really an aggregate in the same sense as Billing or Access — it has no invariant it enforces and no persistent domain state of its own. It’s a side-effect subscriber (or integration adapter): a listener that reacts to domain events by talking to an external service (SendGrid, SES, Mailgun). In a real design it would usually live in its own bounded context but as an adapter, not an aggregate. I’ve included it in the fan-out diagram because a lot of cross-boundary subscribers are shaped like this, and it’s worth seeing that the pattern accommodates them — not every subscriber has to be a full aggregate with invariants and state.
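
Billing’s part of that journey can be sketched as code. This is an illustration under stated assumptions: the dispatcher, the function names, the price difference, and the 0.5 “fraction of period remaining” are all invented, and proration is reduced to a single multiplication. Access and Notifications would register their own policies against Plan Upgraded in the same way.

```python
from typing import Callable, Dict, List

log: List[str] = []        # every event emitted, in order
invoices: List[float] = [] # amounts invoiced by Billing
policies: Dict[str, List[Callable[[dict], None]]] = {}

# Green read model (hypothetical): fraction of the billing period remaining.
current_billing_period = {"cust-1": 0.5}


def emit(event: str, data: dict) -> None:
    """Record an event, then let each subscribed policy react to it."""
    log.append(event)
    for policy in policies.get(event, []):
        policy(data)


def raise_invoice(data: dict) -> None:       # blue command
    invoices.append(data["amount"])
    emit("Invoice Raised", data)             # internal orange event


def capture_payment(data: dict) -> None:     # blue command
    emit("Payment Captured", data)           # assumes the card clears


def prorate_on_upgrade(data: dict) -> None:  # purple policy 1
    remaining = current_billing_period[data["customer"]]
    amount = round(data["price_difference"] * remaining, 2)
    raise_invoice({**data, "amount": amount})


def capture_on_invoice(data: dict) -> None:  # purple policy 2
    capture_payment(data)


policies["Plan Upgraded"] = [prorate_on_upgrade]
policies["Invoice Raised"] = [capture_on_invoice]

emit("Plan Upgraded", {"customer": "cust-1", "price_difference": 20.0})
```

Note that `raise_invoice` never calls `capture_payment`. The second command only runs because a second policy is subscribed to Invoice Raised.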

The whole choreography on a wall:

One event, three subscribers, three independent reactions. Subscription doesn’t know or care who’s listening; each subscriber decides for itself what Plan Upgraded means. The diagram is wide — zoom in if you need to read the fine print.

On the wall, this looks like an orange event note, an arrow into each subscriber’s territory, a purple policy stuck near the boundary, a green read model next to the policy where the policy needs to consult something, and one or more blue internal commands attached below the policy. By the end of Phase 4, every cross-boundary event should have at least one subscriber drawn with its policy visible. If an event has no subscriber, either you don’t need the event (delete it) or you’ve forgotten who cares (pink note it and find out).

Green read models are often the place first-time architects discover a hidden dependency. “Wait, Billing needs to know the customer’s current billing period — where does that live?” If the answer is “Subscription owns it and Billing queries Subscription at policy-fire time,” you’ve just uncovered a cross-boundary read that might want its own event, or might need a denormalised copy inside Billing. Either way, the question is visible because the green note made it visible.

The important rule: each subscriber decides independently what the event means to it. Subscription doesn’t know that Billing is generating an invoice when Plan Upgraded fires; it just publishes the fact. Billing doesn’t know that Access is revoking and re-granting entitlements; it just reacts on its own terms. If a subscriber’s reaction depends on another subscriber having already reacted (“Billing can only do its thing after Access has done its thing”), you’ve got a hidden ordering dependency that breaks the event-driven model. Pink note it — it’s either a misplaced boundary or a missed command.

What to watch for:

Commands crossing three boundaries in a row. A chatty API. Either the boundary is wrong, or the command is really a sequence that should be refactored into an event-driven flow. Example from the running scenario: a customer upgrades their plan, and the naive design sends a command chain — Support → Subscription (“upgrade plan”) → Billing (“generate prorated invoice, then charge it”) → Access (“update entitlements”). Three boundary crossings in a row, each waiting on the next, each able to fail mid-chain and leave the system in a half-upgraded state. The event-driven refactor: Subscription handles the upgrade command, emits Plan Upgraded, and returns. Billing subscribes to Plan Upgraded, generates the prorated invoice, and emits Invoice Raised. Billing charges, emits Payment Captured. Access subscribes to Payment Captured and updates entitlements. Four aggregates, one command crossing, three events. If Billing is slow, the user still gets a fast response from Subscription. If Access is down, the event sits in the queue and catches up when it’s back. You’ve turned a fragile synchronous chain into an async choreography that each aggregate can reason about in isolation.
A boundary with no commands crossing in or out. Either it’s a leaf aggregate (fine), or it’s isolated from the rest of the system for no reason (not fine).
Bidirectional arrows between the same two aggregates. They’re tightly coupled. Worth asking whether they’re really one aggregate.
Commands that used to be internal becoming crossing. “Did we just make this slower and more fragile? What do we gain in exchange?”
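
The “event sits in the queue and catches up” behaviour from the upgrade example can be sketched as follows. The subscriber class and its inbox are hypothetical; in practice the queue would usually be provided by a broker rather than held in the subscriber’s memory.

```python
from collections import deque
from typing import Deque, List


class AccessSubscriber:
    """Hypothetical subscriber with its own inbox for Payment Captured events."""

    def __init__(self) -> None:
        self.inbox: Deque[str] = deque()
        self.available = True
        self.entitled: List[str] = []

    def deliver(self, customer_id: str) -> None:
        """Called once per Payment Captured event; never blocks the publisher."""
        self.inbox.append(customer_id)
        self.drain()

    def drain(self) -> None:
        """Process queued events while the service is up."""
        while self.available and self.inbox:
            self.entitled.append(self.inbox.popleft())


access = AccessSubscriber()
access.available = False            # Access is down
access.deliver("cust-1")            # the event waits in the inbox
queued_while_down = len(access.inbox)

access.available = True             # Access comes back
access.drain()                      # and catches up
```

In the synchronous chain, the same outage would have failed the whole upgrade partway through.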

Phase 5 — External vs internal systems (15 min)

The yellow actor notes from the Process Level wall include a mix of things: people, scheduled jobs, external systems (Stripe, SendGrid, HMRC, the bank), and internal services you own. For a design session, the distinction matters. External systems are contracts you don’t control; internal services are designs you’re making right now.

Ring the externals with one colour, the internals with another.

Prompt:

“For each yellow note, is this something we own or something we integrate with? If we own it, it’s a bounded context we’re defining. If we don’t, it’s a contract we’re living with — and we probably need an anti-corruption layer between us and them.”

In the running example, the yellow notes break down like this:

External:

Stripe — payment processing. Every time money moves, Stripe is involved. Our Billing aggregate talks to an anti-corruption layer that translates between our paymentId and Stripe’s payment_intent_id. We don’t let Stripe’s shape bleed into our domain model.
SendGrid — transactional email for invoices, payment retry notices, receipts. Not an aggregate; just a dependency. One adapter.
A tax API (Avalara, TaxJar) — calculating VAT/GST at invoice time. Billing calls it but doesn’t own it.
The customer’s bank — implicit. Stripe abstracts it, so for design purposes it’s behind the Stripe boundary.

Internal (things we own and get to design):

Subscription service — owns the Subscription aggregate.
Billing service — owns Billing.
Access service — owns Access, and is typically the hot path for most production traffic.
Support tooling — owns Support. Often a separate product (Zendesk, Intercom) with a thin integration, which means it’s partly external — worth being explicit.

The interesting case is Support. If we’re using Zendesk, it’s external — a contract we live with. If we’re building our own ticketing, it’s internal. The yellow note doesn’t tell you which; the team does. And the answer changes the architecture: building on Zendesk means an anti-corruption layer for tickets; building our own means another aggregate to design.

What to watch for:

External systems being treated as internal. “We can change Stripe’s API any time we want.” No, you can’t.
Internal services being treated as external. A sign the team owning them has become an external dependency. Worth a pink note — it may be an organisational smell more than a technical one.
External integrations without anti-corruption layers. “What happens when their schema changes?” If the answer is “our whole model breaks,” you need a translator at the boundary.
Missing externals. The Process Level session may have left implicit systems out. “Where does this data actually come from?”
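
As a rough picture of what a translator at the boundary looks like, here is a sketch of an anti-corruption layer in front of the payment provider. The payload field names are assumptions made for illustration and should not be read as Stripe’s actual schema; the class and event names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PaymentCaptured:
    """Domain event in Billing's own vocabulary."""
    payment_id: str
    amount_minor: int
    currency: str


class StripeAntiCorruptionLayer:
    """Translates a provider payload into Billing's model and keeps the id mapping."""

    def __init__(self) -> None:
        self._payment_ids: Dict[str, str] = {}  # payment_intent_id -> paymentId

    def register(self, payment_id: str, payment_intent_id: str) -> None:
        self._payment_ids[payment_intent_id] = payment_id

    def translate(self, payload: dict) -> Optional[PaymentCaptured]:
        # Field names below are illustrative assumptions, not the provider's schema.
        if payload.get("status") != "succeeded":
            return None
        return PaymentCaptured(
            payment_id=self._payment_ids[payload["payment_intent_id"]],
            amount_minor=payload["amount"],
            currency=payload["currency"].upper(),
        )


acl = StripeAntiCorruptionLayer()
acl.register("pay-42", "pi_hypothetical_1")
event = acl.translate(
    {"payment_intent_id": "pi_hypothetical_1", "status": "succeeded",
     "amount": 1200, "currency": "gbp"}
)
```

If the provider renames a field, only `translate` changes; the `PaymentCaptured` event and everything subscribed to it stay as they are.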

Phase 6 — Hotspot review (25 min)

Walk the wall one more time and put pink notes on the places the design is still fragile: aggregates you’re not sure about, boundaries that keep moving, commands you couldn’t classify cleanly, externals without anti-corruption layers, anywhere two people still disagree.

Cluster the pinks into themes:

Aggregates whose invariants aren’t clear yet
Boundaries that may need to split or merge
APIs that will need careful design
External integrations that need their own investigation
Places where the design depends on information the Process Level session didn’t capture

For each cluster: is this something to resolve before we can build? Is this something to defer? Who owns finding the answer?

What to watch for:

Trying to solve a hard aggregate question in-session. Time-box 5 minutes per cluster. If it’s not clear in 5, it needs a time-boxed investigation outside the room.
Pink notes that reveal Process Level gaps. Go back to the Process Level wall, not to brainstorming.
Energy dropping. The room has been thinking hard for two hours. Keep this phase brisk.

Steering When It Goes Sideways

Named failure modes. Each has a symptom, a recovery move, and a threshold where you stop rather than limp through.

The god aggregate. One cluster is absorbing most of the wall. Everyone keeps adding events to it. Recovery: Pick one invariant and ask what really depends on it. “If Access Granted changed in isolation, would Payment Captured need to know?” Split where the answers diverge. Stop if: The group genuinely cannot find a splitting invariant after 20 minutes. You may be in a domain that really is one aggregate, or the Process Level scope was wrong.

The distributed monolith. Every command crosses a boundary. Every aggregate depends on every other. Recovery: Stop drawing boundaries. Ask: “If we had one service, what would actually need to split?” Redraw from zero. Stop if: A second attempt produces the same pattern. You may be designing at the wrong scope, or the Process Level flow is one indivisible thing.

The org-chart design. Boundaries are landing exactly on team lines and nobody will move them. Recovery: Name it. “We’re drawing the org chart, not the domain. Let’s draw the domain and argue about ownership afterwards.” Stop if: The team lines are fixed by a decision outside the room and everyone knows it. Make the constraint explicit in the output; don’t pretend you designed freely.

The solution architect. One person has a design in their head before the session starts and is steering the wall towards it. Recovery: Ask them to park their proposal on a separate sheet. “Write it down, we’ll compare at the end.” Free the rest of the room to work without it. Stop if: They can’t hold the distinction. Either the session is a ratification exercise (which isn’t a session), or you need to reset the group.

The frozen wall. The Process Level map turns out to be wrong in a load-bearing place. Recovery: Pause the design session. Fix the wall with whoever can. If the fix is small, resume. If it’s large, stop. Stop if: The Process Level wall has significant gaps. Reschedule after a follow-up Process Level session.

The silent veto. The domain expert keeps saying “sure” to things that clearly violate the business rules they’d defend in any other room. Recovery: Name it out loud. “You just agreed to something I’ve heard you push back on before. Is the business really going to accept that, or are you being polite in front of the architects?” Give them explicit permission to veto. Stop if: The dynamic doesn’t shift. A design session without a genuine domain veto produces designs that don’t survive contact with reality.

Consequences

Benefits

Aggregate boundaries chosen by the people who will implement them, in front of the events they’re splitting
Bounded context boundaries made explicit, argued at the edges, and drawn on paper
Command and event APIs between components surfaced before the first line of code
External integrations identified as contracts, not as “things we can change”
A design artefact that’s directly traceable to the Process Level model the whole team already agrees on

Costs

9–18 person-hours for a 3-hour session
The session is useless without a prior Process Level map
Political risk when team ownership has to move to match domain boundaries
Reputational cost if the first session produces a design nobody implements

Failure modes

Designing without the domain expert’s real voice
Drawing boundaries on the org chart instead of the domain
One architect’s design getting rubber-stamped by a polite room
Ending with a distributed monolith and nobody noticing until three sprints in
The Process Level map turning out to be wrong and the design collapsing

Stop-the-session signals

The Process Level wall has significant gaps that only surface once you try to design on it
A second attempt at boundaries produces the same distributed-monolith pattern
One person is steering every decision and pairing hasn’t shifted it

Ending early is not failure. Shipping a design the room doesn’t really believe in is worse.

Worked Example

See Event Storming: Building Shared Understanding for the Greenbox team’s first session — the Process Level map that later became the input for their Architecture session, where clusters of events turned into aggregates and the first real boundaries between subscription, fulfilment, and billing got drawn on the wall.

Outputs & Follow-up

Facilitator’s close-out (same day, or within 24 hours)

Panoramic photographs of the wall, including the boundary markers and crossing arrows.
Transcribed aggregate list, bounded context list, command/event API list, and hotspot list in a shared document.
A short summary to participants: “here’s the design we landed on, here’s what’s still open, here’s what happens next.”

The product owner’s week

This looks unusual for a product owner — the output is architectural, not scope — but the product owner still has real work here.

Turn the bounded contexts into vocabulary. Each context has a name the team will use for the next year. Pin them down; don’t let them drift.
Triage the hotspots. Each cluster becomes a time-boxed investigation, a follow-up session, or a deferred question. The design work usually sits with the architects, but the product owner makes sure it doesn’t vanish.
Reshape the backlog along the boundaries. Stories that cross three bounded contexts are a smell; split them where the boundaries say to. If you can’t split a story cleanly, the boundary may still be wrong.
Book the follow-up conversations. External integrations that need anti-corruption layers are their own small workshops. Don’t let them wait until the first sprint blows up on them.
Walk the design with anyone who couldn’t attend. Tech leads on adjacent teams especially — their reaction tells you whether the boundaries will hold up at the edges of the system.

Ongoing

Keep the photographs visible while the work is active. A bounded context map on a wall is worth a hundred Confluence pages.
Update the design when the domain changes. An architecture model is a snapshot; snapshots go stale faster than you’d like.
Revisit if a sprint keeps producing stories that cross the same boundary. That’s the wall telling you the line is wrong.

Related Patterns

Process Level Event Storming — the prerequisite. You can’t design the code until you’ve modelled the flow.
Big Picture Event Storming (publishes 13 April) — the zoom-out, when you need to find which flows deserve a Process Level session in the first place.
Example Mapping (publishes later this week) — for aggregates whose invariants aren’t clear yet, Example Mapping turns the question into rules and examples before you commit to code.
Decision Tables (publishes later) — for policies inside aggregates that involve multiple conditions, decision tables are the follow-up artefact.
Threat Modelling (publishes later) — the crossing-command API you’ve just defined is one of the inputs to threat modelling the system.
