just wondering about the guarantee of the event notification delivery from the persistent event sourced actor to the read processor in lagom , is there any or there is no message durability for event notification to the read processor which will update the query side ?
I understand there is eventual consistency which is fine but i am talking about the event handler notification to the Cassandra read processor.
Event processing is guaranteed by using event sourcing in the persistent entity and offset tracking in the read-side processing.
When your persistent entity command handler persists events, each of the events is stored with an ordered "offset".
Read-side processors work by polling the database for events with offsets greater than the last offset that has already been processed. Because all events and each read-side processor's latest offset are all persisted in the database, this ensures that events won't be missed even if a read-side processor crashes and restarts.
Lagom's Cassandra read-side processors return a CompletionStage
or Future
that produce a list of Cassandra BoundStatement
instances, and these are executed in an atomic batch update along with the offset update. As long as all of the effects of your read-side event handler are captured in the list of updates it produces, this ensures that the event will be handled effectively once: if part of the update fails, it will be automatically retried.
If you're doing anything else in your event handler, you'll need to be sure that the offset update only happens if your event handler is successful. The CompletionStage
or Future
the event handler returns must only complete after your side-effect does, and the success or failure of your operation should be propagated. Be aware that your event handler will be retried if the offset is not updated, so if your event handler interacts with an external service, for example, you'll need to be sure it is idempotent.
You should also be aware of how eventual consistency can affect things. The akka-persistence-cassandra
configuration reference has some details:
The returned event stream is ordered by the offset (timestamp), which corresponds
to the same order as the write journal stored the events, with inaccuracy due to clock skew
between different nodes. The same stream elements (in same order) are returned for multiple
executions of the query on a best effort basis. The query is using a Cassandra Materialized
View for the query and that is eventually consistent, so different queries may see different
events for the latest events, but eventually the result will be ordered by timestamp
(Cassandra timeuuid column). To compensate for the the eventual consistency the query is
delayed to not read the latest events, the duration of this delay is defined by this
configuration property.
However, this is only best effort and in case of network partitions
or other things that may delay the updates of the Materialized View the events may be
delivered in different order (not strictly by their timestamp).
The important consequence is that if the latency of eventual consistency is longer than the configured eventual consistency delay (possibly due to a network partition between Cassandra nodes), there is a possibility of events being "lost". A read-side handler may have already processed a newer event and stored its offset before an older event has been delivered to the node that it is reading from. You might need to tune your configuration accordingly.