I've been reading about event sourcing, and although I found it quite a natural approach for several problems, I didn't quite understand how to store the events in practice.
Searching a little on the internet, I found this article by Vaughn Vernon about a simple approach to storing aggregates in DDD. Although it is not specifically about event sourcing, he proposes a way to store domain events using PostgreSQL.
In his approach, we have a table Events with one id and a JSON data field. This gives a lot of freedom, since we can store any JSON data and hence we can store a variety of events.
But having all the events for all the aggregates in a single table makes me a little worried.
So, when we store events to use event sourcing, how should we proceed? I can see three options:
1. Follow the idea used for domain events in the article and store everything in a single table.
2. Create one table per event. The drawback here is that we need to track the events for each aggregate, and each aggregate can have various kinds of events, so this would easily lead to a huge number of tables.
3. Create one table per aggregate and store all the events for that aggregate there. Although we end up with different kinds of events brought together in the same table, they are all related to the same aggregate.
Which of these three options would be the most reasonable? If none, what would be the correct way to store events when using event sourcing?
Sounds like FUD.
All events look the same, right? A blob of data, and some columns of metadata that are useful for placing the blob in context. You don't have any particularly clever relations to run; find all events in a stream, find all events caused by a command (which are all going to be in the same stream anyway), that's about it.
Events probably all belong in the same logical view.
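To make that concrete, here is a minimal sketch of a single events table with a blob plus a few metadata columns, and the two queries mentioned above. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL so that it runs on its own; the column names (stream_id, version, event_type, command_id, data) and the helper functions are my own assumptions, not taken from the article.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        stream_id  TEXT    NOT NULL,  -- aggregate the event belongs to
        version    INTEGER NOT NULL,  -- position of the event in its stream
        event_type TEXT    NOT NULL,  -- tells the reader how to interpret data
        command_id TEXT,              -- command that caused the event
        data       TEXT    NOT NULL,  -- the blob, stored as JSON text
        PRIMARY KEY (stream_id, version)
    )
""")

def append(stream_id, version, event_type, command_id, payload):
    """Append one event; the primary key rejects a duplicate version."""
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
        (stream_id, version, event_type, command_id, json.dumps(payload)),
    )

def load_stream(stream_id):
    """Find all events in a stream, in order."""
    rows = conn.execute(
        "SELECT event_type, data FROM events "
        "WHERE stream_id = ? ORDER BY version",
        (stream_id,),
    )
    return [(event_type, json.loads(data)) for event_type, data in rows]

def caused_by(command_id):
    """Find all events caused by a command."""
    rows = conn.execute(
        "SELECT stream_id, version, event_type FROM events "
        "WHERE command_id = ? ORDER BY stream_id, version",
        (command_id,),
    )
    return rows.fetchall()

# Two aggregates of different kinds share the same table.
append("order-1", 1, "OrderPlaced", "cmd-a", {"total": 30})
append("order-1", 2, "OrderShipped", "cmd-b", {"carrier": "ups"})
append("customer-7", 1, "CustomerRegistered", "cmd-c", {"name": "Ann"})

history = load_stream("order-1")
```

Events of different types and from different aggregates sit side by side, and the (stream_id, version) key is all the structure the queries need. It also gives a simple optimistic-concurrency check, since two writers cannot both append the same version to the same stream.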
Physically, you might want to goof around so that you can scale. You may want to review what Udi Dahan had to say in CQRS but different slides. But the basic idea here is that sharding/partitioning is a problem that database vendors are already in the business of solving, so let them do it.
Discussions of Postgres event stores: