Azure inter-role synchronization

Posted 2019-05-02 09:40

Question:

I was wondering about best practices for synchronizing multiple Azure instances that run the same role. More precisely, I want to prevent several worker roles from working on the same work unit.

Azure queues do not seem to help here. One option is to use a SQL table with locks and stored procedures, but SQL-based synchronization in Azure seems a bit awkward.

Any ideas?

Edit: my detailed (but simplified) problem is as follows:

  • There are n targets.
  • A unit of work must be done on each target at a specified interval (say 30 seconds - but it is different for each target).
  • I have m workers (hosted in h instances).
  • Processing a unit of work could take anything between 10 seconds and 1 hour.

The idea is that I have a scheduler that puts units of work in an Azure queue, and each of the m workers will read these and process them.

The problem:

  • worker1 starts working on unit1 (which is regarding target1) - this one will take long, say 10 minutes
  • 30 seconds pass
  • the scheduler puts another unit of work for target1, say unit13
  • worker2 starts working on unit13, against the same target1 - not good

I have some ideas, but they don't seem cloudy enough, so I am interested to see what solutions you would apply to this problem.

Answer 1:

I've just written a couple blog posts about using blob leases to do this sort of thing. See http://blog.smarx.com/posts/managing-concurrency-in-windows-azure-with-leases and http://blog.smarx.com/posts/building-a-task-scheduler-in-windows-azure.
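
To make the lease idea concrete, here is a minimal sketch of one lease per target. It is not the code from the posts above and it does not call the Azure SDK: InMemoryLeaseStore, try_process and the 60-second duration are all hypothetical stand-ins that only imitate the "at most one unexpired lease per name" behaviour.

```python
import uuid


class InMemoryLeaseStore:
    """Hypothetical stand-in for blob leases: one unexpired lease per name."""

    def __init__(self):
        self._leases = {}  # name -> (lease_id, expires_at)

    def acquire(self, name, duration, now):
        """Return a new lease id, or None if an unexpired lease exists."""
        current = self._leases.get(name)
        if current is not None and current[1] > now:
            return None
        lease_id = str(uuid.uuid4())
        self._leases[name] = (lease_id, now + duration)
        return lease_id

    def release(self, name, lease_id):
        """Release the lease only if the caller still holds it."""
        current = self._leases.get(name)
        if current is not None and current[0] == lease_id:
            del self._leases[name]
            return True
        return False


def try_process(store, target, now, work, lease_seconds=60):
    """Run work(target) only if this caller wins the lease for the target."""
    lease_id = store.acquire(target, lease_seconds, now)
    if lease_id is None:
        return False  # another worker is busy with this target; skip the unit
    try:
        work(target)
    finally:
        store.release(target, lease_id)
    return True
```

In the scenario from the question, worker2 would get False for unit13 while worker1 still holds the lease on target1. With a real lease, long-running work would also have to keep renewing the lease before it expires, which this sketch leaves out.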



Answer 2:

dunnry is spot-on: queues work great for preventing multiple instances from working on the same work item. When you call GetMessage, the message you retrieve is now invisible for the timespan you specify (default: 30 seconds). In that timespan, no other reader can retrieve this queue message.

Having said that: you need to ensure your processing is idempotent. If your processing takes longer than the invisibility timespan, the message becomes visible again. At that point some other reader can read the message (making it invisible once again), and the original reader can no longer delete it. In this case, it's possible that you re-process the same message. As a general rule, set your timeout window carefully to avoid this.
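
The timeline is easier to follow with a small simulation. This is only a sketch under assumptions: InMemoryQueue is a hypothetical in-memory stand-in, not the Azure storage client, and its method names and the integer "receipt" are made up for illustration. It imitates just two behaviours: a retrieved message stays hidden until its timeout passes, and a delete only succeeds for the most recent reader.

```python
import itertools


class InMemoryQueue:
    """Hypothetical stand-in for a queue with visibility timeouts."""

    def __init__(self):
        self._messages = []
        self._ids = itertools.count(1)
        self._receipts = itertools.count(1)

    def put(self, body):
        self._messages.append({
            "id": next(self._ids),
            "body": body,
            "visible_at": 0,
            "receipt": None,
            "dequeue_count": 0,
        })

    def get_message(self, now, visibility_timeout=30):
        """Return a copy of the first visible message and hide the original."""
        for msg in self._messages:
            if msg["visible_at"] <= now:
                msg["visible_at"] = now + visibility_timeout
                msg["receipt"] = next(self._receipts)  # older receipts go stale
                msg["dequeue_count"] += 1
                return dict(msg)
        return None

    def delete_message(self, received):
        """Delete only if the caller holds the latest receipt."""
        for msg in self._messages:
            if msg["id"] == received["id"] and msg["receipt"] == received["receipt"]:
                self._messages.remove(msg)
                return True
        return False


queue = InMemoryQueue()
queue.put("unit1")
first = queue.get_message(now=0)        # reader A takes the message
hidden = queue.get_message(now=10)      # reader B sees nothing yet
second = queue.get_message(now=31)      # timeout passed, reader B gets it
late_delete = queue.delete_message(first)   # reader A finishes too late
final_delete = queue.delete_message(second)
```

Here hidden is None, late_delete is False and final_delete is True, so "unit1" was handled by both readers. That double handling is why the processing has to be idempotent.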

Note: Each CloudQueueMessage has a DequeueCount property, so you can determine whether the message has been seen more than once (and so you can also deal with poison messages).
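
As a rough illustration of the poison-message check, the sketch below uses a plain dict with a dequeue_count key in place of a real CloudQueueMessage. The threshold of 3, the handle function and the poison list are assumptions made up for this example, not part of any Azure API.

```python
MAX_DEQUEUE_COUNT = 3  # hypothetical threshold; tune for your workload


def handle(message, process, poison):
    """Process a message unless it has already been dequeued too many times.

    message: dict with "body" and "dequeue_count" (stand-in for a queue message)
    process: callable that does the real work on the body
    poison:  list collecting messages that are set aside for inspection
    """
    if message["dequeue_count"] > MAX_DEQUEUE_COUNT:
        poison.append(message)  # set aside instead of retrying forever
        return "poisoned"
    process(message["body"])
    return "processed"


processed, poison = [], []
fresh = handle({"body": "unit1", "dequeue_count": 1}, processed.append, poison)
stuck = handle({"body": "unit13", "dequeue_count": 4}, processed.append, poison)
```

In a real worker, the poisoned message would still have to be deleted from the queue after being recorded somewhere, otherwise it keeps coming back.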



Answer 3:

CloudFX has a PrimaryInstanceManager class that can be used for some of these scenarios.