How to handle message that failed to response succ

Posted 2019-06-05 13:00

Question:

I have a question about handling a message whose success response failed, even though its data was successfully committed to the database.

The design I have in mind is meant to guarantee that each message is processed only once.
The handling steps are listed below.
My questions are on the lines marked with ★.

1) Fetch a message from the message queue (MQ)
-> If processing fails after this step, the MQ will time out and retry

2) Cache.SetIfNotExist(MessageId, MyId, Timeout)
ProcessingTime < Cache.Timeout < MQ.Timeout
* This establishes ownership of the message
-> If processing fails after this step, the cache entry will time out, and the MQ will time out and retry

3) Process the data, including reads from storage
* All data should carry optimistic locking information
-> If processing fails after this step, the cache entry will time out, and the MQ will time out and retry

4) Cache.Get(MessageId) == MyId
* This confirms that this processor still owns the message
-> If processing fails after this step, the cache entry will time out, and the MQ will time out and retry

5) Commit data
* This commits all data to storage
* If you update multiple documents, optimistic locking alone does not guarantee consistency (if an all-or-nothing feature exists, you do get a consistency guarantee)
* If processing depends on documents that were only read, both the documents read and the documents committed should be checked by optimistic locking
* In an RDBMS, you should use a transaction to guarantee consistency
★ A failure after this step is the problem. If the MQ retries this transaction, there is no way to detect it, so the transaction will be processed twice or more.
★ If the cache entry times out while the data is being committed, the same problem occurs.

6) Cache.Set(MessageId, MyId, Timeout)
* This prevents a retry caused by an MQ timeout before the message is deleted.

7) Ack the message
* This signals completion and deletes the message from the message queue
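
The steps above can be sketched roughly as follows. This is only an illustration, not the actual system: `TtlCache`, `handle`, the `storage` list, and the `ack` callback are hypothetical in-memory stand-ins for the real cache, storage, and MQ client, and the "processing" in step 3 is a placeholder.

```python
import time


class TtlCache:
    """Hypothetical in-memory stand-in for the ownership cache."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def _live(self, key):
        # Return the entry if it exists and has not expired, else None.
        entry = self._items.get(key)
        if entry is None or entry[1] <= self._clock():
            self._items.pop(key, None)
            return None
        return entry

    def set_if_not_exist(self, key, value, ttl):
        if self._live(key) is not None:
            return False
        self._items[key] = (value, self._clock() + ttl)
        return True

    def get(self, key):
        entry = self._live(key)
        return None if entry is None else entry[0]

    def set(self, key, value, ttl):
        self._items[key] = (value, self._clock() + ttl)


def handle(message_id, payload, my_id, cache, storage, ack, ttl=30.0):
    """Steps 2-7 for one fetched message. Returns True if this call committed."""
    if not cache.set_if_not_exist(message_id, my_id, ttl):  # step 2: take ownership
        return False
    result = payload.upper()                                # step 3: placeholder processing
    if cache.get(message_id) != my_id:                      # step 4: still the owner?
        return False
    storage.append((message_id, result))                    # step 5: commit
    cache.set(message_id, my_id, ttl)                       # step 6: extend ownership
    ack(message_id)                                         # step 7: ack (may fail)
    return True
```

With this sketch, the ★ case is easy to reproduce: if `ack` raises after step 5 and the cache entry later expires, a retry by another worker passes steps 2 and 4 and commits the same message a second time, because nothing in `storage` itself records that the message was already handled.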

  • Question: How should this problem be handled?
    1) A failure after the data commit is the problem. If the MQ retries this transaction, there is no way to detect it, so the transaction will be processed twice or more.
    2) If the cache entry times out while the data is being committed, the same problem occurs.

Thank you for reading.

Answer 1:

I found that there is no way to resolve this problem unless the database itself holds the message ownership data described in the question.
After downtime, the database's recovery mechanism cleans up data left behind by the crash, such as a transaction that had started to commit but never finished.
One way to handle that situation is to add a phase flag to the ownership cache entry and to check the database when a failure happened at the commit stage.
Another way is to have each database update also record the published message ID with an expiry, and to remove expired IDs with a periodic batch job.
Only a single-threaded write job can guarantee consistency; otherwise some other mechanism is required. If the service talks to the database from two threads, consistency cannot be guaranteed.
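
As a minimal sketch of the second approach (recording the message ID in the database together with the update), the following uses Python's built-in `sqlite3` module. The table names `accounts` and `processed_messages`, the functions `open_db`, `apply_once`, and `purge_expired`, and the one-hour retention period are all assumptions made for illustration. The idea is that the message ID and the data change are written in one transaction, so a retried message hits the primary-key constraint and changes nothing.

```python
import sqlite3
import time


def open_db():
    """Create an in-memory database with hypothetical example tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE processed_messages ("
        "message_id TEXT PRIMARY KEY, expires_at REAL NOT NULL)"
    )
    conn.execute("INSERT INTO accounts VALUES ('acct-1', 0)")
    conn.commit()
    return conn


def apply_once(conn, message_id, account_id, amount, keep_seconds=3600.0, now=None):
    """Apply the update and record the message ID in a single transaction.

    Returns True if the update was applied, False if the message ID was
    already recorded (the caller can then simply ack the message).
    """
    now = time.time() if now is None else now
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO processed_messages VALUES (?, ?)",
                (message_id, now + keep_seconds),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate message ID: already processed


def purge_expired(conn, now=None):
    """Periodic batch job: delete message IDs whose expiry has passed."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM processed_messages WHERE expires_at <= ?", (now,)
        )
    return cur.rowcount
```

For this to work, the retention period would presumably need to be longer than the longest time the MQ may keep redelivering a message. Otherwise an ID could be purged while a retry is still possible.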