AKS Auto scaling

We have two properties in AKS autoscaling

  1. Cluster Autoscaler

  2. Horizontal Pod Autoscaler

Cluster Autoscaler

If pod demand changes, he Kubernetes cluster Autoscaler adjusts the number of nodes based on the requested compute resources in the node pool.

#create a resource group
az group create --name myResourceGroup --location eastus

# create a AKS cluster with autoscaler enabled
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-count 1 \ # 3 is recomended for production grade cluster
  --vm-set-type VirtualMachineScaleSets \
  --load-balancer-sku standard \
  --enable-cluster-autoscaler \ #here we are enabling cluster Autoscaler
  --min-count 1 \
  --max-count 3

# updating a existing cluster with cluster Autoscaler enabled
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 3

in portal, go to your aks-cluster > settings > Node pools > select Autoscale

Notes:

  1. By default, the cluster Autoscaler checks the Metrics API server every 10 seconds for any required changes in node count.

  2. The cluster Autoscaler works with Kubernetes RBAC-enabled AKS clusters that run Kubernetes 1.10.x or higher.

  3. cluster Autoscaler is typically used alongside the horizontal pod autoscaler. When combined, the horizontal pod autoscaler increases or decreases the number of pods based on application demand, and the cluster autoscaler adjusts the number of nodes to run more pods.

Horizontal Pod Autoscaler

We can set a target CPU utilization % and HPA scales in or out as per to meet the requirement.

HPA need Kubernetes metrics server to verify the CPU metrics of a pod.

1) HPA checks data from metrics server in every 15 seconds

2) HPA Calculating the replicas

3) HPA asks to scale the Application replicas in or out

HPA requirements:

metrics data
CPU %
min replicas
max replicas

kubectl autoscale deployment mydeployment --cpu-percentage --min=2 --max=10

HPA manifest

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: store-front-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3  # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: store-front # this should be your deployment name
    metrics: 50 # target CPU utilization

##############################################################
kubectl describe hpa/store-front-hpa

Notes:

By default, the HPA checks the Metrics API in every 15 seconds for any required changes in replica count, and the Metrics API retrieves data from the Kubelet every 60 seconds.