Hidden in Plain Sight: The Events You Forgot to Model - EventSourcingDB


There's a very specific moment that most teams working with Event Sourcing eventually run into. Someone asks a seemingly simple question about the past: why did this happen, how often has that occurred, what would have been the case if things had gone differently. You open the event store, expecting the answer to be right there, because that's the promise, after all. The full history, nothing lost, everything reconstructible. And then you realize the event you'd need was never written. The information existed once, for a brief moment, and slipped away before anyone thought to catch it.

It's tempting to treat this as a checklist problem. Just write down the events you tend to forget, keep the list handy, refer to it during the next Event Storming. But that approach misses something important. The problem isn't that teams are lazy or careless. It's that the way we're taught to think about events quietly steers us away from certain kinds of events in the first place. If you want to stop forgetting them, it helps to understand why they vanish from the model to begin with.

The Question That Couldn't Be Answered

Let's make this concrete. Imagine a library. It runs a well-designed event-sourced system, modest in scope, clean in its modeling. When a reader checks out a book, a BookBorrowed event is written. When they bring it back, a BookReturned event joins it. If the return is late, the system automatically calculates a fine based on the current rules and writes a FeeCharged event with the amount. Three clean event types, each tied to a meaningful business moment. Commands in, events out. Nothing obviously missing.
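As a minimal TypeScript sketch of this model: the article only names the three event types, so the payload fields (reader, book, timestamps, amount) are assumptions. Note that `FeeCharged` carries nothing but the result of the calculation:

```typescript
// Illustrative event types for the library example; field names are assumptions.
type BookBorrowed = {
  type: "BookBorrowed";
  readerId: string;
  bookId: string;
  borrowedAt: string; // ISO-8601 timestamp
};

type BookReturned = {
  type: "BookReturned";
  readerId: string;
  bookId: string;
  returnedAt: string;
};

type FeeCharged = {
  type: "FeeCharged";
  readerId: string;
  bookId: string;
  amount: number; // the computed fine -- and, as written, the only trace of the calculation
};

type LibraryEvent = BookBorrowed | BookReturned | FeeCharged;

// Example instance: a fine whose inputs and rules appear nowhere in the payload.
const fee: FeeCharged = {
  type: "FeeCharged",
  readerId: "reader-42",
  bookId: "book-7",
  amount: 3.5,
};
```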

The system runs smoothly for a year. Then the library's management starts planning a reform of the fine structure. They're considering progressive tiers, grace periods during holidays, discounts for certain reader groups. Before committing, they want data. The question they ask the team sounds innocent: how would last year's revenue have looked under the new rules?

The team opens the event store, confident. FeeCharged is right there, thousands of entries, each with a reader, a book, an amount. But the amount is where the trail ends. There's no reference to which rule was applied, no indication of the inputs that went into the calculation, no trace of the configuration that was in effect when the fine was computed. And the old rules? They're gone. The code was rewritten three months ago. The repository still has fragments of them in its git history, but the exact runtime configuration, the edge cases, the overrides, all of that is lost to time.

Here's the painful part. The information existed at the exact moment the fine was calculated. It lived in a variable, was briefly the most important thing in the process, and then disappeared. Capturing it would have cost next to nothing, a handful of characters in the event payload. But no one thought to.

And this is not a story about a sloppy team. The team did everything by the book. They had clean event types, one event per command, sensible names, proper bounded contexts. The miss was invisible from where they were standing. That's what makes it worth writing about.

So Why Do We Keep Missing Them?

The honest answer is that there isn't a single reason. There are several, and they tend to reinforce each other. What follows isn't meant to be exhaustive or to apply to every team in equal measure. But if you've ever found yourself staring at an event store and wishing it contained one more field, one more event, one more piece of context, there's a good chance at least two of the patterns below played a role.

The Ghost of CRUD Is Still in the Room

Most of us didn't grow up with Event Sourcing. We grew up with tables, rows, columns, and the gentle assumption that the world fits into them. Even after years of writing event-sourced systems, that shape of thinking lingers. It whispers at every modeling session, suggesting that an event is really just a change to some state, and that anything that doesn't correspond to a change doesn't belong.

That's the subtle trap. Events aren't state changes. They're observations about the world. Sometimes an observation happens to coincide with a state change, which is why the two get confused. But plenty of things worth knowing don't map neatly onto a column somewhere. A decision was made under certain assumptions. A signal arrived from an external system. A deadline quietly passed. None of these look like UPDATE table SET column = ..., so the CRUD ghost in our heads waves them off as not being events at all.

A close relative of this pattern is flattening the vocabulary. Domain experts rarely speak in CRUD terms. They use rich, specific language: "the reader returned the book damaged", "the loan was rolled over after a courtesy call", "the book was set aside for inspection". A team under CRUD influence collapses all of these into a generic BookReturned, because that's what shows up in the database schema. The nuance isn't lost in translation. It's lost in the act of translating in the first place. If you want to dig deeper into this specific failure mode, we've written about it before in Naming Events Beyond CRUD.
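To make the flattening visible, here is a TypeScript sketch that keeps the experts' vocabulary as distinct event types; all names and fields beyond those quoted above are illustrative:

```typescript
// Distinct event types that preserve the domain experts' language.
// Names follow the phrases quoted in the text; fields are assumptions.
type BookReturnedDamaged = {
  type: "BookReturnedDamaged";
  bookId: string;
  damageNote: string;
};

type LoanRolledOverAfterCourtesyCall = {
  type: "LoanRolledOverAfterCourtesyCall";
  loanId: string;
  newDueDate: string;
};

type BookSetAsideForInspection = {
  type: "BookSetAsideForInspection";
  bookId: string;
  reason: string;
};

// The CRUD-flattened alternative: one generic type, nuance gone at write time.
type BookReturnedGeneric = { type: "BookReturned"; bookId: string };
```

The point of the design choice: once the damage note or the courtesy call is collapsed into `BookReturned`, no later projection can get it back, because it was never written.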

Commands Are Not the Whole Story

Here's a belief that's quietly baked into a lot of Event Sourcing literature: every event is caused by a command. A user clicks, a command is dispatched, an event is written. Clean, symmetric, easy to teach. It's also incomplete in a way that costs teams dearly.

Events are observations, not echoes of commands. Plenty of things happen in a system without anyone sending a command. A scheduler notices that a grace period has expired. A background job detects that an expected external signal never arrived. A periodic inventory check discovers that a book is missing from the shelf. None of these start with a user-facing command. If your mental model only accepts events that follow commands, all of these observations slip through without a trace.
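A sketch of such a command-less observation, assuming a hypothetical scheduler that scans loans for expired grace periods; the names and shapes are invented for illustration:

```typescript
// An event that originates from an observation, not from a user command.
type GracePeriodExpired = {
  type: "GracePeriodExpired";
  loanId: string;
  expiredAt: string; // ISO-8601 timestamp
};

// A scheduler tick that emits the observation -- no command anywhere in sight.
function checkGracePeriods(
  loans: { loanId: string; graceEndsAt: Date }[],
  now: Date
): GracePeriodExpired[] {
  return loans
    .filter((loan) => loan.graceEndsAt < now)
    .map((loan): GracePeriodExpired => ({
      type: "GracePeriodExpired",
      loanId: loan.loanId,
      expiredAt: now.toISOString(),
    }));
}
```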

The same mental reflex produces another blind spot, the tendency to model only the happy path. Event Storming sessions are full of arrows that represent things going right. Orders get fulfilled, books get returned, readers pay their fines. What gets much less attention is the space around these flows: the delays, the retries, the partial failures, the cases where something was supposed to happen and didn't. These are often the events that turn out to be most valuable later, because they're the ones you can't just observe in aggregate totals.

It's worth saying out loud during the next modeling session: events aren't only things the user does, and they aren't only things that go well. The system observes, and every observation worth making is a candidate for being captured.

The Reconstruction Fallacy

Of all the causes on this list, this one might be the most seductive. It sounds like: "We don't need to store that. We can reconstruct it later from the events we already have." And sometimes that's true. Many derived facts really are reconstructible from what's already in the store, which is one of the great strengths of Event Sourcing. The trouble is that the reconstruction only works as long as the context of the original decision stays stable.

Go back to the library for a moment. The fines could, in theory, have been recomputed from BookBorrowed and BookReturned alone, by measuring how many days overdue each return was and multiplying by the daily rate. That recomputation would have worked perfectly on the day the code was written. It fell apart a year later, not because the events changed, but because the rules around them did. The inputs were still there. The interpretation of those inputs had quietly moved.
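The reconstruction the team could have leaned on might look like the sketch below, under an assumed daily rate and loan period. It is correct exactly as long as the constants match what ran in production at the time, and nothing in the event store records whether they do:

```typescript
// Naive fine reconstruction from borrow/return dates alone.
// DAILY_RATE and LOAN_PERIOD_DAYS are today's values -- last year's values
// are recorded nowhere, which is precisely the trap described in the text.
const DAILY_RATE = 0.5;
const LOAN_PERIOD_DAYS = 14;

function reconstructFine(borrowedAt: Date, returnedAt: Date): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  const daysKept = Math.ceil((returnedAt.getTime() - borrowedAt.getTime()) / msPerDay);
  const daysOverdue = Math.max(0, daysKept - LOAN_PERIOD_DAYS);
  return daysOverdue * DAILY_RATE;
}
```

Change either constant and the function keeps returning plausible numbers, just not the ones that were actually charged.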

This failure mode is especially cruel because it doesn't announce itself. A reconstruction that's stopped working won't throw an error. It'll just silently produce numbers that look plausible and are wrong. By the time anyone notices, the decisions built on those numbers are already in the world.

There's a second layer to this cause, and it's mostly about time horizons. When you're modeling for the next two sprints, everything feels derivable, because the world hasn't had a chance to shift yet. When you're modeling for the next two years, the picture changes. The question stops being "can I derive this today?" and becomes "will I still be able to derive it after everything around the event has moved on?"

Keep the Recipe, Not Just the Cake

At this point it's easy to draw the wrong conclusion. If events can't safely rely on reconstruction, and if context matters, then surely every event needs to carry its full computational history? Every input, every intermediate value, every branch taken? That sounds exhausting, and it is. It's also unnecessary.

The actual fix is much smaller and much more elegant. You don't need to store the recipe in full detail. You need to store which recipe was used. A short, stable identifier that points to the logic, the version, the configuration that produced the value. Not the computation itself. Just a name you can look up later.

Back to the library. If FeeCharged had carried a single additional field, something like feeRuleSet: "2025-summer-v2", the entire problem would have dissolved. A year later, when management asked their question, the team could have looked up what 2025-summer-v2 meant, compared it to the new rules, and produced the answer in an afternoon. The old rules wouldn't need to live in the running code. They'd need to live somewhere that the reference could point to, which is a much easier problem to solve.
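In code, the intervention is one field plus somewhere for it to point. The rule-set registry below is an assumption (it could equally be a document, a config archive, or a table); the point is that the event names its recipe instead of only its result:

```typescript
// FeeCharged carrying a reference to the rule set that produced the amount.
type FeeCharged = {
  type: "FeeCharged";
  readerId: string;
  bookId: string;
  amount: number;
  feeRuleSet: string; // e.g. "2025-summer-v2" -- the one extra field
};

// Somewhere the reference can resolve -- not necessarily running code.
// The registry contents here are invented for illustration.
const ruleSets: Record<string, { dailyRate: number; graceDays: number }> = {
  "2025-summer-v2": { dailyRate: 0.5, graceDays: 2 },
};

function describeRuleSet(event: FeeCharged): string {
  const rules = ruleSets[event.feeRuleSet];
  if (rules === undefined) {
    return `unknown rule set: ${event.feeRuleSet}`;
  }
  return `${event.feeRuleSet}: ${rules.dailyRate}/day after ${rules.graceDays} grace days`;
}
```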

The same pattern shows up in a completely different corner of software. Imagine a system that stores a SHA-256 hash of a JSON document to detect changes over time. Looks fine on the surface. But which canonicalization was used? Were the keys sorted alphabetically, or left in insertion order? Was whitespace preserved or stripped? Were line endings normalized? Without these details, the hash is an opaque string. And the moment the serialization logic gets upgraded, even for reasons that have nothing to do with the hash, nothing lines up anymore. Two documents that are logically identical produce different hashes, and there's no way to tell which pipeline produced which value.

The fix is the same as for the fines. Store a small reference alongside the hash, something like sha256:canonical-json-v2:<hex>. Three tokens, two problems solved. You know which hash algorithm was used, and you know which normalization was applied to the input before hashing. If either ever changes, old and new entries can coexist peacefully, and every consumer can decide deterministically which code path applies.
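A runnable sketch of this idea in TypeScript on Node.js. The canonicalization here (recursively sorted keys, compact separators) is one plausible reading of a "canonical-json-v2", not a standard; what matters is that both the algorithm and the normalization are named in the stored value:

```typescript
import { createHash } from "node:crypto";

// One possible canonical JSON serialization: object keys sorted recursively,
// no whitespace. This specific scheme is an assumption for illustration.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => (a < b ? -1 : 1))
      .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Store the recipe next to the value: algorithm, normalization, then the hash.
function taggedHash(doc: unknown): string {
  const hex = createHash("sha256").update(canonicalize(doc)).digest("hex");
  return `sha256:canonical-json-v2:${hex}`;
}
```

Two logically identical documents with different key order now hash identically, and any future consumer can tell from the first two tokens which pipeline produced the value.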

We use this exact pattern in EventSourcingDB itself. Every signed event carries a signature with the prefix esdb:signature:v1: in front of the hexadecimal value. Eighteen bytes. The v1 is the critical bit. The day we ever introduce a new signing scheme, whether that's a different curve, a different serialization of the signed payload, or a different encoding, both versions will coexist without ambiguity, and verification will know exactly which path to take for each signature it encounters. Cost today: negligible. Benefit on the day we need it: enormous. This is the same mindset we explored in Versioning Events Without Breaking Everything, just applied one level deeper, to the values inside an event rather than to the event shape itself.

Generalized, the lesson reads like this: whenever an event carries a value that was produced by some piece of logic, put a reference to that logic right next to the value. Not the logic itself. Not the inputs in full. Just a name. One extra field. That's the entire intervention.
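On the consumer side, the pattern pays off as a deterministic dispatch. A sketch, using the colon-separated tag shape from the hash example in the text (the parser itself is an assumption, not an EventSourcingDB API):

```typescript
// Split a tagged value like "sha256:canonical-json-v2:<hex>" into its parts,
// so a consumer can decide which code path applies to each stored value.
function parseTagged(value: string): {
  algorithm: string;
  normalization: string;
  hex: string;
} {
  const parts = value.split(":");
  if (parts.length !== 3) {
    throw new Error(`malformed tagged value: ${value}`);
  }
  const [algorithm, normalization, hex] = parts;
  return { algorithm, normalization, hex };
}
```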

The One Question Worth Asking Every Time

If there's a single habit that prevents more forgotten events than any other, it's the habit of asking one specific question during modeling: "What would I want to know about this event in two years that I don't know I want to know today?"

It sounds abstract, but in practice it cuts through most of the causes we've walked through. It breaks the CRUD reflex, because it forces you to think about questions rather than state. It weakens the happy-path bias, because the most interesting future questions are usually about the edges. It punctures the reconstruction fallacy, because it pushes you to imagine a world in which today's code is long gone. And it stretches the time horizon, because the question only makes sense over years, not sprints.

You won't get it right every time. No one does. There will always be events you wish you'd written and didn't, and there will always be fields you wish you'd added and didn't. The goal isn't perfection. It's to avoid the category of misses where a single extra field, decided in ten seconds, would have made the difference between an answerable question and a permanent gap. Most of the time, that's all it takes. A field. A prefix. A name.

If you'd like to explore Event Sourcing modeling more deeply, and especially the mindset shift away from CRUD that sits under so many of these questions, cqrs.com is where we gather our thinking on Event Sourcing, CQRS, and Domain-Driven Design. And if you've been through your own version of the story above, or if you're staring at an event store right now wishing it contained one more thing, we'd love to hear about it. Write to us at hello@thenativeweb.io. Every conversation about forgotten events teaches us something new.