The Cluster Autoscaler is a Kubernetes component that automatically adjusts the size of a Kubernetes cluster so that all Pods have a place to run and there are no unneeded nodes. It works with the major cloud providers: GCP, AWS and Azure. In this short tutorial we will explore how to install and configure the Cluster Autoscaler in your Amazon EKS cluster. The Cluster Autoscaler will automatically modify your node groups so that they scale out when you need more resources and scale in when you have underutilized resources.
You should have a working EKS cluster before you can use this guide. Our guide below should help you get started.
Easily Setup Kubernetes Cluster on AWS with EKS
If you’re looking for Pod autoscaling instead, refer to the guide below.
Using Horizontal Pod Autoscaler on Kubernetes EKS Cluster
Enable Cluster Autoscaler in an EKS Kubernetes Cluster
The Cluster Autoscaler requires some additional IAM policies and resource tagging before it can manage autoscaling in your cluster.
Step 1: Create additional EKS IAM policy
The Cluster Autoscaler requires the following IAM permissions to make calls to AWS APIs on your behalf.
Create the IAM policy JSON file:
cat >aws-s3-eks-iam-policy.json<<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
EOF
Apply the policy:
aws iam create-policy --policy-name EKS-Node-group-IAM-policy --policy-document file://aws-s3-eks-iam-policy.json
This is my policy creation output:
{
    "Policy": {
        "PolicyName": "EKS-Node-group-IAM-policy",
        "PolicyId": "ANPATWFKCYAHACUQCHO3D",
        "Arn": "arn:aws:iam::253750766592:policy/EKS-Node-group-IAM-policy",
        "Path": "/",
        "DefaultVersionId": "v1",
        "AttachmentCount": 0,
        "PermissionsBoundaryUsageCount": 0,
        "IsAttachable": true,
        "CreateDate": "2020-09-04T12:26:20+00:00",
        "UpdateDate": "2020-09-04T12:26:20+00:00"
    }
}
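If you need the policy ARN again later, for example when attaching it from the CLI, you can look it up by name. The command below assumes the policy name used above:
aws iam list-policies --query "Policies[?PolicyName=='EKS-Node-group-IAM-policy'].Arn" --output text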
Step 2: Attach policy to EKS Node group
If you used eksctl to create your node groups with the --asg-access option, the required permissions are automatically provided and attached to your node IAM roles.
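For reference, creating an extra managed node group with that option might look like the command below; the cluster and node group names here are just placeholders:
eksctl create nodegroup --cluster my-eks-cluster --name ng-autoscaled --asg-access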
Log in to the AWS Console and go to EC2 > EKS Instance > Description > IAM role.
Click on the IAM role link to add permissions under Attach Policies
Attach the policy we created earlier.
Confirm settings.
Do the same on EKS > ClusterName > Details
Take note of the IAM ARN used by the cluster, then go to IAM > Roles and search for it.
Attach the policy we created to the role.
Confirm the policy is in the list of attached policies.
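If you prefer the CLI over the console, the same attachment can be done with aws iam attach-role-policy. The role name and account ID below are placeholders; substitute your node instance role and the policy ARN printed in Step 1:
aws iam attach-role-policy \
  --role-name <your-node-instance-role> \
  --policy-arn arn:aws:iam::<account-id>:policy/EKS-Node-group-IAM-policy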
Step 3: Add Node group tags
The Cluster Autoscaler requires the following tags on your node Auto Scaling groups so that they can be auto-discovered.
Key | Value
---|---
k8s.io/cluster-autoscaler/<cluster-name> | owned
k8s.io/cluster-autoscaler/enabled | true
Navigate to EKS > Clusters > Clustername > Compute
Select Node Group and click Edit
Add the Tags at the bottom and save the changes when done.
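The same tags can also be applied to the node Auto Scaling group directly with the AWS CLI; the Auto Scaling group and cluster names below are placeholders:
aws autoscaling create-or-update-tags --tags \
  "ResourceId=<asg-name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/<cluster-name>,Value=owned,PropagateAtLaunch=true" \
  "ResourceId=<asg-name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true"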
Step 4: Deploy the Cluster Autoscaler in EKS
Log in to the machine from which you run kubectl commands and deploy the Cluster Autoscaler:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
You can also download the YAML file before applying it:
wget https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
kubectl apply -f ./cluster-autoscaler-autodiscover.yaml
Run the following command to add the cluster-autoscaler.kubernetes.io/safe-to-evict annotation to the deployment:
kubectl -n kube-system annotate deployment.apps/cluster-autoscaler cluster-autoscaler.kubernetes.io/safe-to-evict="false"
Edit the Cluster Autoscaler deployment:
kubectl edit deploy cluster-autoscaler -n kube-system
Set your cluster name in the --node-group-auto-discovery flag and add the following options to the container command:
--balance-similar-node-groups
--skip-nodes-with-system-pods=false
The edited section of my setup is sketched below.
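Here is roughly what the edited container command looks like; the cluster name my-eks-cluster is a placeholder for your own, and the remaining flags are the defaults shipped in the upstream example manifest:
    spec:
      containers:
        - command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-eks-cluster
            - --balance-similar-node-groups
            - --skip-nodes-with-system-pods=false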
Open the Cluster Autoscaler releases page in a web browser and find the latest Cluster Autoscaler version that matches your cluster’s Kubernetes major and minor version. For example, if your cluster’s Kubernetes version is 1.17, find the latest Cluster Autoscaler release that begins with 1.17. Record the semantic version number (1.17.n) of that release to use in the next step.
Since my cluster is v1.17, I’ll use the latest container image version available for 1.17 which is 1.17.3:
kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=eu.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.17.3
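To confirm the image was updated, you can print the deployment’s container image:
kubectl -n kube-system get deployment cluster-autoscaler -o jsonpath='{.spec.template.spec.containers[0].image}'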
Check to see if the Cluster Autoscaler Pod is running:
$ kubectl get pods -n kube-system -w
NAME READY STATUS RESTARTS AGE
aws-node-glfrs 1/1 Running 0 23d
aws-node-sgh8p 1/1 Running 0 23d
cluster-autoscaler-6f56b86d9b-p9gc7 1/1 Running 5 21m # It is running
coredns-6987776bbd-2mgxp 1/1 Running 0 23d
coredns-6987776bbd-vdn8j 1/1 Running 0 23d
efs-csi-node-p57gw 3/3 Running 0 18d
efs-csi-node-z7gh9 3/3 Running 0 18d
kube-proxy-5glzs 1/1 Running 0 23d
kube-proxy-hgqm5 1/1 Running 0 23d
metrics-server-7cb45bbfd5-kbrt7 1/1 Running 0 23d
You can also stream the logs:
kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler
Output:
I0904 14:28:50.937242 1 scale_down.go:431] Scale-down calculation: ignoring 1 nodes unremovable in the last 5m0s
I0904 14:28:50.937257 1 scale_down.go:462] Node ip-192-168-138-244.eu-west-1.compute.internal - memory utilization 0.702430
I0904 14:28:50.937268 1 scale_down.go:466] Node ip-192-168-138-244.eu-west-1.compute.internal is not suitable for removal - memory utilization too big (0.702430)
I0904 14:28:50.937333 1 static_autoscaler.go:439] Scale down status: unneededOnly=false lastScaleUpTime=2020-09-04 13:57:03.11117817 +0000 UTC m=+15.907067864 lastScaleDownDeleteTime=2020-09-04 13:57:03.111178246 +0000 UTC m=+15.907067938 lastScaleDownFailTime=2020-09-04 13:57:03.111178318 +0000 UTC m=+15.907068011 scaleDownForbidden=false isDeleteInProgress=false scaleDownInCooldown=false
I0904 14:28:50.937358 1 static_autoscaler.go:452] Starting scale down
I0904 14:28:50.937391 1 scale_down.go:776] No candidates for scale down
Step 5: Testing EKS Cluster Autoscaler
Now that the installation is complete, let’s put it to the test.
I have two nodes in the cluster, and the maximum number of nodes set in the node group is 3.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-192-168-138-244.eu-west-1.compute.internal Ready <none> 23d v1.17.9-eks-4c6976
ip-192-168-176-247.eu-west-1.compute.internal Ready <none> 23d v1.17.9-eks-4c6976
We’ll deploy a large number of Pods to see whether the cluster autoscales to the maximum number of nodes set in the node group.
vim nginx-example-autoscale.yml
Add:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 100
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
Apply the YAML file:
kubectl apply -f nginx-example-autoscale.yml
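The autoscaler only adds nodes when Pods cannot be scheduled on the existing ones, so you can watch for Pending Pods to confirm that a scale-out is being triggered:
kubectl get pods -l app=nginx --field-selector=status.phase=Pending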
Watch as new nodes are created.
$ watch kubectl get nodes
You should see a new node created and added to the cluster.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-192-168-119-255.eu-west-1.compute.internal Ready <none> 26m v1.17.9-eks-4c6976
ip-192-168-138-244.eu-west-1.compute.internal Ready <none> 23d v1.17.9-eks-4c6976
ip-192-168-176-247.eu-west-1.compute.internal Ready <none> 23d v1.17.9-eks-4c6976
Delete the deployment and its Pods, and the cluster should scale back down.
$ kubectl delete -f nginx-example-autoscale.yml
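You can follow the scale-down decisions in the Cluster Autoscaler logs while the extra node is drained and removed:
kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler | grep -i scale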
That’s all you need to configure Cluster Autoscaling in an EKS Kubernetes Cluster.
More on EKS:
Install Istio Service Mesh in EKS Kubernetes Cluster
Install CloudWatch Container Insights on EKS | Kubernetes
Deploying Prometheus on EKS Kubernetes Cluster