ClouDigest: Digesting the Power of Autoscaling
Auto-Scaling is a critical feature in AWS that allows you to automatically adjust the number of EC2 instances in response to changes in demand. It ensures your application remains responsive and cost-efficient by scaling resources up or down as needed.
Components of Auto-Scaling:
Launch Configuration:
A Launch Configuration defines the configuration settings for the instances that Auto-Scaling launches. It includes information such as the Amazon Machine Image (AMI), instance type, security groups, and user data. Once created, the Launch Configuration cannot be modified. If you need to make changes, you'll have to create a new one.
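As a rough illustration, a Launch Configuration can be created with the AWS SDK for Python (boto3) roughly as follows; the names, AMI ID, and security group ID are placeholders, not values from this article.

import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Create an immutable Launch Configuration: the AMI, instance type,
# security groups, and user data are fixed once it exists.
autoscaling.create_launch_configuration(
    LaunchConfigurationName="web-launch-config-v1",   # placeholder name
    ImageId="ami-0123456789abcdef0",                  # placeholder AMI ID
    InstanceType="t3.micro",
    SecurityGroups=["sg-0123456789abcdef0"],          # placeholder security group
    UserData="#!/bin/bash\nyum install -y httpd\nsystemctl start httpd\n",
)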
Auto-Scaling Groups:
An Auto-Scaling Group is a logical grouping of EC2 instances that share similar characteristics. It is associated with a Launch Configuration, and its primary function is to manage and maintain the desired number of instances based on your scaling policies. You can specify the minimum, desired, and maximum number of instances in the Auto-Scaling Group.
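A minimal sketch of creating such a group with boto3, assuming the Launch Configuration from the previous example and placeholder subnet IDs:

import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",                 # placeholder group name
    LaunchConfigurationName="web-launch-config-v1", # from the previous sketch
    MinSize=1,          # never run fewer than 1 instance
    DesiredCapacity=2,  # normal operating level
    MaxSize=4,          # never run more than 4 instances
    VPCZoneIdentifier="subnet-0aaa1111,subnet-0bbb2222",  # placeholder subnets
)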
Scaling Options:
a. Manual Scaling:
With manual scaling, you can manually adjust the number of instances in the Auto-Scaling Group according to your requirements. For instance, you can increase the number of instances during peak traffic periods and decrease them during low-traffic times.
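In practice a manual scaling action can be a single API call that changes the group's desired capacity; a sketch with boto3, using the placeholder group name from above:

import boto3

autoscaling = boto3.client("autoscaling")

# Manually raise the desired capacity ahead of an expected traffic peak.
autoscaling.set_desired_capacity(
    AutoScalingGroupName="web-asg",   # placeholder group name
    DesiredCapacity=4,
    HonorCooldown=False,  # apply the change immediately, ignoring any cooldown
)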
b. Dynamic Scaling (Scaling Policies):
Dynamic scaling, also known as scaling policies, automatically adjusts the number of instances based on predefined conditions. Two common approaches are:
Scaling based on CPU utilization:
This policy scales the Auto-Scaling Group based on the average CPU utilization of the instances. You can set thresholds for scaling up or down when CPU utilization crosses certain levels.
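One way to express CPU-based scaling in boto3 is a target tracking policy, which keeps the group's average CPU near a chosen value rather than reacting to a single threshold; the group and policy names below are placeholders:

import boto3

autoscaling = boto3.client("autoscaling")

# Keep average CPU utilization of the group around 50 percent:
# instances are added when the average rises above the target
# and removed when it falls below it.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",   # placeholder group name
    PolicyName="cpu-target-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)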
Scaling based on demand or schedule:
This policy allows you to scale based on specific time-based schedules or custom metrics. For example, you can set a schedule to scale up during weekdays and scale down during weekends.
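A sketch of that weekday/weekend schedule using boto3 scheduled actions; the capacities are illustrative and the Recurrence field is a cron expression evaluated in UTC:

import boto3

autoscaling = boto3.client("autoscaling")

# Scale up every weekday morning...
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="web-asg",   # placeholder group name
    ScheduledActionName="weekday-scale-up",
    Recurrence="0 8 * * 1-5",         # 08:00 UTC, Monday through Friday
    MinSize=2,
    DesiredCapacity=4,
    MaxSize=8,
)

# ...and scale down for the weekend.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="web-asg",
    ScheduledActionName="weekend-scale-down",
    Recurrence="0 20 * * 5",          # 20:00 UTC on Friday
    MinSize=1,
    DesiredCapacity=1,
    MaxSize=2,
)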
Scaling Policy and Alarm:
The scaling policy in AWS Auto-Scaling determines how many instances to add or remove when specified conditions are met. For example, a scaling policy might dictate launching two instances when the CPU utilization of existing instances goes above 80 percent.
An alarm acts as a trigger for the scaling policy. It monitors specific metrics, such as CPU utilization or network traffic, and initiates scaling actions when the defined thresholds are breached.
In an Auto-Scaling Group, you specify the minimum, desired, and maximum number of instances. The scaling policy operates within these boundaries to add or remove instances as required.
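Putting these pieces together, here is a rough boto3 sketch of the 80 percent example: a simple scaling policy that adds two instances, and a CloudWatch alarm that triggers it when average CPU stays above 80 percent (all names are placeholders):

import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Scaling policy: add two instances each time the policy fires.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",   # placeholder group name
    PolicyName="add-two-on-high-cpu",
    PolicyType="SimpleScaling",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=2,
    Cooldown=300,  # wait 5 minutes before another simple scaling action
)

# Alarm: watch the group's average CPU and trigger the policy above 80 percent.
cloudwatch.put_metric_alarm(
    AlarmName="web-asg-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "web-asg"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],
)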
Types of Scaling Policy:
Simple Scaling:
Simple Scaling involves a single trigger that results in a straightforward scaling action. For example, when CPU utilization exceeds 70 percent, the scaling policy might add two instances to the group.
Step Scaling:
Step Scaling allows for more complex scaling adjustments based on multiple thresholds. For instance, at 60 percent CPU utilization, two instances are added. At 80 percent, three more instances are added, and at 85 percent, four additional instances are launched.
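A possible boto3 sketch of that step policy; the step bounds are offsets from an alarm threshold of 60 percent CPU, and the triggering alarm itself would be created much like in the earlier example:

import boto3

autoscaling = boto3.client("autoscaling")

# Step adjustments are relative to the alarm threshold (60 percent here):
#   60-80%  average CPU -> add 2 instances
#   80-85%  average CPU -> add 3 instances
#   85%+    average CPU -> add 4 instances
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",   # placeholder group name
    PolicyName="stepped-cpu-scale-out",
    PolicyType="StepScaling",
    AdjustmentType="ChangeInCapacity",
    MetricAggregationType="Average",
    StepAdjustments=[
        {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 20.0,
         "ScalingAdjustment": 2},
        {"MetricIntervalLowerBound": 20.0, "MetricIntervalUpperBound": 25.0,
         "ScalingAdjustment": 3},
        {"MetricIntervalLowerBound": 25.0,
         "ScalingAdjustment": 4},
    ],
)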
Termination Policy for Instances:
When Auto-Scaling decides to terminate instances, it follows a termination policy to determine the order of termination. Instances are selected based on the following criteria (a configuration sketch follows the list):
Oldest Instance:
The instance with the longest running time will be terminated first.
Oldest Launch Configuration:
Instances launched from an older Launch Configuration are prioritized for termination.
Closest to Next Instance Hour:
The instance closest to completing its current billing (instance) hour is chosen for termination, so compute time that is already paid for is used as fully as possible.
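As a sketch, the evaluation order of these criteria can be set explicitly on the group with boto3 (the group name is a placeholder):

import boto3

autoscaling = boto3.client("autoscaling")

# Consider termination candidates in this order: oldest instance first,
# then oldest launch configuration, then closest to the next instance hour.
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="web-asg",   # placeholder group name
    TerminationPolicies=[
        "OldestInstance",
        "OldestLaunchConfiguration",
        "ClosestToNextInstanceHour",
    ],
)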
Instance Protection:
Instances can be protected from termination using the Instance Protection feature. When enabled, this ensures that specific instances remain operational even during scaling actions, safeguarding critical resources from accidental termination.
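A minimal sketch of protecting a specific instance from scale-in with boto3; the instance ID and group name are placeholders:

import boto3

autoscaling = boto3.client("autoscaling")

# Mark one instance as protected: scale-in events will skip it
# until the protection is removed.
autoscaling.set_instance_protection(
    AutoScalingGroupName="web-asg",           # placeholder group name
    InstanceIds=["i-0123456789abcdef0"],      # placeholder instance ID
    ProtectedFromScaleIn=True,
)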
Best Practice: Avoid Direct Connection of Auto-Scaling Group to Load Balancer
Attaching an Auto-Scaling Group directly to a load balancer that already has running instances registered should be avoided. Such a connection can lead to unintended instance terminations, disrupting your application's availability and performance.