Question:
What is the best way to handle consistency between aggregates? Take the example from Vaughn Vernon's book: you have a BacklogItem aggregate and a Sprint aggregate. When a BacklogItemEvent is raised, the event handler catches it and tries to update the Sprint aggregate. What if this operation fails? How should this situation be handled? As far as I understand, there are 3 options:
1) Update all aggregates in one transaction. We lose scalability, but we gain consistency.
2) Do nothing. Just log an error and wait for manual intervention.
3) Use a saga. This complicates the design and forces us to implement each use case that has to enforce invariants between aggregates in a separate object (a saga). If the Sprint update fails, the saga will try to uncommit the BacklogItem (compensate).
Which of these options would you choose, and what criteria do you base that on?
Answer 1:
What is the best way to handle consistency between aggregates?
If your aggregates are correctly designed, then you handle "consistency" between aggregates over time (aka: eventual consistency).
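To make "consistency over time" concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the `BacklogItemCommitted` event name, the `outbox` queue, and the two functions are hypothetical and not taken from Vernon's book. The point is only that each transaction touches one aggregate, and the Sprint catches up later.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BacklogItem:
    item_id: str
    sprint_id: Optional[str] = None


@dataclass
class Sprint:
    sprint_id: str
    committed_items: set = field(default_factory=set)


@dataclass(frozen=True)
class BacklogItemCommitted:  # hypothetical event name
    item_id: str
    sprint_id: str


# Stand-in for a durable queue or outbox table (assumption).
outbox = deque()


def commit_to_sprint(item: BacklogItem, sprint_id: str) -> None:
    """Transaction 1: modify only the BacklogItem and record an event."""
    item.sprint_id = sprint_id
    outbox.append(BacklogItemCommitted(item.item_id, sprint_id))


def drain_outbox(sprints: dict) -> None:
    """Later, one transaction per event: modify only the Sprint."""
    while outbox:
        event = outbox.popleft()
        sprints[event.sprint_id].committed_items.add(event.item_id)


item = BacklogItem("item-1")
sprints = {"sprint-1": Sprint("sprint-1")}

commit_to_sprint(item, "sprint-1")
stale = set(sprints["sprint-1"].committed_items)      # Sprint has not caught up yet
drain_outbox(sprints)
caught_up = set(sprints["sprint-1"].committed_items)  # now the two agree
```

Between `commit_to_sprint` and `drain_outbox` the two aggregates disagree, and that window is exactly what the domain has to be willing to tolerate.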
What if this operation fails?
Take a careful read through "Race Conditions Don't Exist"; Udi Dahan makes an argument that operations in collaborative domains should not fail.
Update all aggregates in one transaction.
You can do that, but what that effectively means is that the two entities are really part of a single implicit aggregate. In other words, it strongly suggests that you haven't got your aggregate boundaries in the right place.
Trying to modify multiple aggregates in a single transaction is effectively two-phase commit, with all of the additional complications that arise from that.
Do nothing. Just log an error and wait for manual intervention.
Yup; see, for instance, what Greg Young has to say about warehouse systems and exception reports.
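As a rough sketch of what an exception report could look like in code (the `ExceptionReport` class, `dispatch` function, and the simulated failure are all hypothetical names made up for this example): the failed event is recorded along with the reason, nothing is compensated automatically, and a human works through the report later.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class BacklogItemCommitted:  # hypothetical event name
    item_id: str
    sprint_id: str


@dataclass
class ExceptionReport:
    """Collects failed events so an operator can review them later."""
    entries: list = field(default_factory=list)

    def record(self, event, error: Exception) -> None:
        self.entries.append({"event": event, "reason": str(error)})


def dispatch(event, handler: Callable, report: ExceptionReport) -> bool:
    """Run the handler; on failure, report the event instead of compensating."""
    try:
        handler(event)
        return True
    except Exception as error:
        report.record(event, error)
        return False


def update_sprint(event: BacklogItemCommitted) -> None:
    # Simulated failure, purely for the example.
    raise LookupError(f"sprint {event.sprint_id} not found")


report = ExceptionReport()
ok = dispatch(BacklogItemCommitted("item-1", "sprint-9"), update_sprint, report)
```

In a real system the report would presumably live somewhere durable (a table, a dead-letter queue) rather than in a list, but the shape is the same.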
Use a saga. This complicates the design and forces us to implement each use case that has to enforce invariants between aggregates in a separate object (a saga).
These days, you'll normally see "process manager" rather than "saga", because "saga" has a more specific meaning. But yes, if the domain model needs orchestration between aggregates, then you are going to need to describe the orchestration logic somewhere.
You might want to review Rinat Abdullin's discussion of Evolving Business Processes; he makes a pretty good argument that the automation is just replicating the actions the human operator would take.
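For a feel of where that orchestration logic ends up, here is a small sketch of a process manager as a state machine: events go in, commands come out. All of the event, command, and state names (`SprintUpdateFailed`, `UncommitBacklogItem`, `"awaiting_sprint"`, and so on) are invented for this example, and a real one would also need persistence and timeouts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BacklogItemCommitted:   # event (hypothetical)
    item_id: str
    sprint_id: str


@dataclass(frozen=True)
class SprintUpdated:          # event (hypothetical)
    item_id: str


@dataclass(frozen=True)
class SprintUpdateFailed:     # event (hypothetical)
    item_id: str
    reason: str


@dataclass(frozen=True)
class AddItemToSprint:        # command (hypothetical)
    item_id: str
    sprint_id: str


@dataclass(frozen=True)
class UncommitBacklogItem:    # compensating command (hypothetical)
    item_id: str


class CommitItemProcess:
    """Tracks one 'commit item to sprint' process and decides the next command."""

    def __init__(self) -> None:
        self.state = "new"

    def handle(self, event) -> list:
        if self.state == "new" and isinstance(event, BacklogItemCommitted):
            self.state = "awaiting_sprint"
            return [AddItemToSprint(event.item_id, event.sprint_id)]
        if self.state == "awaiting_sprint" and isinstance(event, SprintUpdated):
            self.state = "completed"
            return []
        if self.state == "awaiting_sprint" and isinstance(event, SprintUpdateFailed):
            self.state = "compensated"
            return [UncommitBacklogItem(event.item_id)]
        return []  # duplicates and out-of-order events are ignored


process = CommitItemProcess()
first = process.handle(BacklogItemCommitted("item-1", "sprint-1"))
second = process.handle(SprintUpdateFailed("item-1", "sprint is closed"))
```

The failure branch does what an operator reading an exception report would do by hand, which is the point Abdullin makes.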
Which of these options would you choose, and what criteria do you base that on?
I strongly prefer simple to easy. So I would aim for exception reporting, on the argument that (a) these failures should be rare anyway, so we don't want to be investing a lot of design capital in work far off the happy path, and (b) if we have failing commands in the system, then we ought to have a mechanism for reporting failed commands anyway, so I'm just leveraging what's already present.
If I were squeezed for time, if the project hadn't yet become successful enough to need to scale, if I didn't have the reporting pieces needed at hand, I might prefer instead to sneak the changes into a single transaction, and then raise an exception report in the development process itself to call attention to the fact that more work needed to be done later.
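If you do take the single-transaction shortcut, it might look roughly like this with Python's built-in `sqlite3`. The table names and the `commit_item` function are made up for the example, and it assumes both aggregates live in the same database, which is the only reason a single local transaction is possible here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE backlog_items (id TEXT PRIMARY KEY, sprint_id TEXT)")
conn.execute(
    "CREATE TABLE sprint_items (sprint_id TEXT, item_id TEXT, "
    "UNIQUE (sprint_id, item_id))"
)
conn.execute("INSERT INTO backlog_items VALUES ('item-1', NULL)")
conn.execute("INSERT INTO backlog_items VALUES ('item-2', NULL)")
# A conflicting row that will make the Sprint-side write fail for item-2.
conn.execute("INSERT INTO sprint_items VALUES ('sprint-1', 'item-2')")
conn.commit()


def commit_item(conn: sqlite3.Connection, item_id: str, sprint_id: str) -> bool:
    """Update both aggregates' rows in one transaction: all or nothing."""
    try:
        with conn:  # commits on success, rolls back if the block raises
            conn.execute(
                "UPDATE backlog_items SET sprint_id = ? WHERE id = ?",
                (sprint_id, item_id),
            )
            conn.execute(
                "INSERT INTO sprint_items VALUES (?, ?)", (sprint_id, item_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok_first = commit_item(conn, "item-1", "sprint-1")   # both writes succeed
ok_second = commit_item(conn, "item-2", "sprint-1")  # second write fails, first is rolled back
```

It is short, which is the appeal; the cost is that the two aggregates now lock and scale together.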
Answer 2:
Which of these options would you choose, and what criteria do you base that on?
Domain expert input. If they demand extremely strict correctness at all times, chances are eventual consistency won't make the cut. There are other times when compensating actions require manual interventions that are hardly feasible in a given domain. Or, it could be extremely simple and beneficial to include a human in the loop. Talking with a business person will teach you about the broader domain process and uncover or rule out some options.
Transactional analysis. If the aggregates involved are not under heavy concurrent access, maybe updating 2 aggregates in a single transaction is not that problematic. In contrast, identifying "hot" aggregates allows you to leverage looser consistency where it matters.
Use case complexity. Not all eventual consistency scenarios require a Saga. If the operation is as simple as updating an aggregate as a consequence of an event and rolling the original change back in the unlikely event that the update fails, chances are you don't need such a complex, long-lived pattern.
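For that simple case, a plain event handler with an inline rollback may be all you need. The sketch below is illustrative only; the `SprintClosed` exception, the `closed` flag, and the handler name are assumptions, not part of any framework or of the original example.

```python
from dataclasses import dataclass, field
from typing import Optional


class SprintClosed(Exception):
    """Hypothetical failure: the Sprint no longer accepts items."""


@dataclass
class BacklogItem:
    item_id: str
    sprint_id: Optional[str] = None

    def commit_to(self, sprint_id: str) -> None:
        self.sprint_id = sprint_id

    def uncommit(self) -> None:
        self.sprint_id = None


@dataclass
class Sprint:
    sprint_id: str
    closed: bool = False
    items: set = field(default_factory=set)

    def add(self, item_id: str) -> None:
        if self.closed:
            raise SprintClosed(self.sprint_id)
        self.items.add(item_id)


def on_backlog_item_committed(item: BacklogItem, sprint: Sprint) -> bool:
    """Update the Sprint; if that fails, roll the original change back."""
    try:
        sprint.add(item.item_id)
        return True
    except SprintClosed:
        item.uncommit()  # compensating action, no long-lived saga state needed
        return False


item = BacklogItem("item-1")
item.commit_to("sprint-1")               # the original change
sprint = Sprint("sprint-1", closed=True)
handled = on_backlog_item_committed(item, sprint)
```

There is no state to persist between steps here, which is what separates it from a full process manager.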