When you set up an Auto Scaling group in AWS EC2, the Min and Max bounds seem to make sense:
- The minimum number of instances to scale down to based on policies
- The maximum number of instances to scale up to based on policies
However, I've never been able to wrap my head around what the heck Desired is intended to affect.
I've always just set Desired equal to Min, because generally I want to pay Amazon the minimum tithe possible, and unless you need an instance to handle load, the group should be at the Min number of instances.
I know that if you use ElasticBeanstalk and set Min to 1 and Max to 2, it sets Desired to 2 (of course!), and you can't choose a value for Desired.
What would be the use case for a different Desired number of instances, and how does it differ? When would you expect AWS to scale lower than your Desired, if Desired is larger than Min?
Here are the explanations for the "min, desired and max" values from AWS support:
Think about it like a sliding range UI element.
With min and max, you are setting the lower and upper bounds of your instance scaling. With desired capacity, you are setting where you'd currently like the instance count to hover.
Example: You know your application will have heavy load due to a marketing email or product launch... simply scale up your desired capacity beforehand.
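A minimal sketch of that pre-scaling step in Python. It assumes `client` is a boto3 Auto Scaling client (for example `boto3.client("autoscaling")`) and uses its `set_desired_capacity` call; the helper name `prescale` and the group name `my-web-asg` are made up for illustration.

```python
def prescale(client, group_name, desired, honor_cooldown=False):
    """Set the group's desired capacity ahead of an expected load spike.

    `client` is assumed to be a boto3 Auto Scaling client. The requested
    value has to lie between the group's Min and Max sizes.
    """
    return client.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=desired,
        HonorCooldown=honor_cooldown,
    )

# Hypothetical usage before a product launch:
#   import boto3
#   prescale(boto3.client("autoscaling"), "my-web-asg", 4)
```

Scaling policies can still move the value afterwards, so this only sets the starting point for the busy period.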
This happens when you set a CloudWatch alarm that triggers an Auto Scaling policy. Whenever that alarm is triggered, it updates the DesiredCount according to the policy's config.
e.g., if an AutoScalingGroup config has Min=1, Desired=3, Max=5, and there is an alarm set on an AutoScalingPolicy which says "if CPU usage is <50% for 10 consecutive minutes, then remove 1 instance", then it will keep reducing the instance count by 1 whenever the alarm is triggered, until DesiredCount = MinCount.
Lessons learnt: set the MinCount to be > 0 or = DesiredCount. This makes sure that the application is not brought down when MinCount=0 and CPU usage goes down.
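The Min=1, Desired=3, Max=5 example can be traced with a small simulation. This is only a model of the behaviour described above, not AWS code; `apply_adjustment` is a hypothetical helper that applies a "remove 1 instance" step and clamps the result to the group's bounds.

```python
def apply_adjustment(desired, change, min_size, max_size):
    """Apply a capacity change and clamp the result to [min_size, max_size]."""
    return max(min_size, min(max_size, desired + change))

min_size, desired, max_size = 1, 3, 5
history = [desired]
for _ in range(4):  # the low-CPU alarm fires four times in a row
    desired = apply_adjustment(desired, -1, min_size, max_size)
    history.append(desired)

print(history)  # [3, 2, 1, 1, 1]
```

With `min_size=0` the same loop would end at 0 instances, which is the outage the lesson above warns about.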
Desired capacity simply means the number of instances that are launched when you start the Auto Scaling group. That means if desired capacity = 4, then 4 instances will keep running unless a scale-up or scale-down event is triggered. If a scale-up event occurs, the number of instances can go up to the maximum capacity, and if a scale-down event occurs, it can go down to the minimum capacity.
Correct me if wrong, thanks.
Based on my reading, in layman's terms, the DesiredCapacity value is automatically updated on scale-in and scale-out events. In other words, scale-in and scale-out are done by decreasing or increasing the DesiredCapacity value.
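To make that concrete, here is a toy model (my own sketch, not how AWS implements it) in which a scaling event only moves the desired value, and the group then launches or terminates instances to match it. The `Group` class and its method names are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Group:
    min_size: int
    max_size: int
    desired: int
    running: int = 0

    def scale(self, change):
        # A scale-in/scale-out event only moves the target, within the bounds.
        self.desired = max(self.min_size, min(self.max_size, self.desired + change))

    def reconcile(self):
        # The group launches (+) or terminates (-) instances to reach the target.
        delta = self.desired - self.running
        self.running = self.desired
        return delta

g = Group(min_size=1, max_size=5, desired=4)
print(g.reconcile())             # 4 instances launched at start-up
g.scale(+2)
print(g.desired, g.reconcile())  # 5 1  (capped at Max)
g.scale(-3)
print(g.desired, g.reconcile())  # 2 -3
```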