Extremely uneven cloud service load-balancing with

Posted 2019-05-10 00:07

Question:

I'm utilizing Azure for hosting a cloud service, which I recently modified to be scalable across multiple instances, including a session caching worker role. My question is, why would I be seeing extreme load (upwards of 90%) on one instance, but not on other instances (15-20% across all other instances)? Should I be worried?

Before I set up load balancing and when my single instance hit upwards of 95% load, it would slow to a crawl --- becoming unusable. Is there any way to ensure that I don't have any users experiencing this because they're somehow round-robin'd onto the overloaded instance?

Answer 1:

We found we had a similar type of situation when one load-balanced instance failed over; what we were seeing is that all the load transferred, but wouldn't balance out again. We found that turning off keep-alive for a couple of minutes let the load spread again, after which we could turn it back on.

http://technet.microsoft.com/en-us/library/cc772183(v=ws.10).aspx



Answer 2:

Well, Azure load balancing is based on round robin, so the distribution should be almost equal (something like 60-40 or even 70-30 is still acceptable). Just to be sure: are you sure you're not using the IIS "redirect" feature (I forgot its name) that would set up sticky sessions?

I must say that without further details about what your site actually does, and how, it's quite hard to advise. This behavior is strange, but it's not clear that it is the load balancer's fault.

Edit1: I would suggest that you examine further what the 90% instance is doing by tracing its activities. Maybe you're out of luck and the requests that cause heavy load are landing on that machine, while the ones that are handled quickly are going to the other instances. Another possibility is that something is stuck (maybe an infinite loop). If you implemented a scalable architecture, I would recommend that you provision another machine and kill the one that is suffering.

Edit2: A simple way to verify that the load balancer is working: log in remotely to the service machines and replace something like an image displayed on the main page (something you can easily spot just by looking at the page). On server 1 put, let's say, a yellow image and on server 2 a red image (OK, maybe nothing this drastic, but you get the point). Then keep loading the page again and again and watch which image comes back.
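The same check can be scripted instead of reloading by hand. The following is only a rough sketch under assumptions that are not part of the original answer: it supposes each instance has been configured to return its own name in a response header, here called X-Instance-Id (a made-up name), and the function names are hypothetical too. Python's urlopen opens a new connection for every call, so keep-alive should not pin the test to one instance.

```python
from collections import Counter
from urllib.request import urlopen


def fetch_instance_id(url, header="X-Instance-Id"):
    """Request url once and return the instance marker from the response.

    ASSUMPTION: every instance adds a header (hypothetical name
    "X-Instance-Id") identifying itself; this plays the role of the
    yellow/red image in the manual test.
    """
    with urlopen(url, timeout=10) as resp:
        return resp.headers.get(header, "unknown")


def tally_instances(fetch, attempts=100):
    """Call fetch() `attempts` times and count how often each marker appears."""
    return Counter(fetch() for _ in range(attempts))


def instance_shares(counts):
    """Convert raw counts into the fraction of requests each instance served."""
    total = sum(counts.values())
    return {name: hits / total for name, hits in counts.items()}


# Example usage (placeholder URL, replace with your own service):
#   counts = tally_instances(lambda: fetch_instance_id("http://example.cloudapp.net/"))
#   print(instance_shares(counts))
```

If one instance ends up with a share far beyond the 60-40 or 70-30 range mentioned above, the requests themselves are unevenly distributed; if the shares look even while one instance still sits at 90% CPU, the cause is more likely the work that instance is doing than the load balancer.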