Use Horizontal Pod Autoscaler on Kubernetes EKS cluster

The Horizontal Pod Autoscaler (HPA) is a Kubernetes resource and controller that scales the number of Pods based on observed CPU utilization or, with custom metrics support, on other metrics. Horizontal Pod autoscaling only applies to objects that can be scaled. Objects that cannot be scaled, such as DaemonSets, cannot be used with it.

Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and controller. This resource determines the behavior of the controller. The controller periodically adjusts the number of replicas in the replication controller or deployment to match the observed average CPU usage to the target specified by the user.
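
As a rough illustration of how the controller matches observed usage to the target, the Kubernetes documentation describes the replica estimate as the current replica count scaled by the ratio of observed to target utilization, rounded up. The sketch below reproduces only that arithmetic. The function and argument names are illustrative and are not part of kubectl or the Kubernetes API.

```shell
# Sketch of the documented estimate:
#   desired = ceil(current_replicas * observed_utilization / target_utilization)
# Names are illustrative, not part of any Kubernetes tool.
desired_replicas() {
  current_replicas=$1
  observed_utilization=$2   # observed average CPU, percent of request
  target_utilization=$3     # target CPU, percent of request
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (current_replicas * observed_utilization + target_utilization - 1) / target_utilization ))
}

# Two replicas averaging 100% CPU against a 70% target
desired_replicas 2 100 70   # prints 3
```

The real controller also applies a tolerance and respects the configured minimum and maximum, which this sketch leaves out.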

Use Horizontal Pod Autoscaler on Kubernetes EKS cluster

You must install Metrics Server before you can use Horizontal Pod Autoscaler on EKS Cluster. Please follow the guide below for complete installation steps.

Install Kubernetes Metrics Server on Amazon EKS cluster

Use the following command to verify that Metrics Server is running properly.

$ kubectl get apiservice v1beta1.metrics.k8s.io -o yaml

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{},"name":"v1beta1.metrics.k8s.io"},"spec":{"group":"metrics.k8s.io","groupPriorityMinimum":100,"insecureSkipTLSVerify":true,"service":{"name":"metrics-server","namespace":"kube-system"},"version":"v1beta1","versionPriority":100}}
  creationTimestamp: "2020-08-12T11:27:13Z"
  name: v1beta1.metrics.k8s.io
  resourceVersion: "130943"
  selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  uid: 83c44e41-6346-4dff-8ce2-aff665199209
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
status:
  conditions:
  - lastTransitionTime: "2020-08-12T11:27:18Z"
    message: all checks passed
    reason: Passed
    status: "True"
    type: Available
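
If you would rather check the Available condition from a script than read the YAML by eye, a small helper along these lines may work. The function name is hypothetical. It relies only on kubectl's jsonpath output, and assumes kubectl is already configured for the cluster.

```shell
# Hypothetical helper: succeeds only if the metrics APIService reports Available=True.
metrics_api_available() {
  status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
  [ "$status" = "True" ]
}

# Example use (requires access to the cluster):
#   metrics_api_available && echo "Metrics API is available"
```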

Deploy a sample application to test HPA

Let’s deploy a test application and use it to demonstrate how Horizontal Pod Autoscaler works.

Create a demo namespace:

$ kubectl create ns demo
namespace/demo created

$ kubectl get ns
NAME              STATUS   AGE
default           Active   2d20h
demo              Active   22s
kube-node-lease   Active   2d20h
kube-public       Active   2d20h
kube-system       Active   2d20h

Deploy the sample Apache web server application by running the following commands in the terminal.

$ kubectl apply -f https://k8s.io/examples/application/php-apache.yaml -n demo
deployment.apps/php-apache created
service/php-apache created

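On older kubectl releases you can also use the kubectl run command to start the application and create a service. Some of the flags shown here are deprecated in recent kubectl releases, and with --generator=run-pod/v1 the command creates a bare Pod rather than a Deployment, so the kubectl autoscale deployment command used later needs the manifest-based deployment above.

$ kubectl run php-apache \
  --generator=run-pod/v1 \
  --image=k8s.gcr.io/hpa-example \
  --requests=cpu=200m \
  --limits=cpu=500m \
  --expose \
  --port=80 \
  -n demo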

Check your application status.

$ kubectl get pods -n demo
NAME                          READY   STATUS    RESTARTS   AGE
php-apache-79544c9bd9-wccnj   1/1     Running   0          40s

Create Kubernetes HPA resources

Once the application is running, we can create the HPA resource.

$ kubectl autoscale deployment php-apache --cpu-percent=70 --min=1 --max=5 -n demo
horizontalpodautoscaler.autoscaling/php-apache autoscaled

The above command creates an autoscaler that adds Pods when average CPU utilization exceeds 70% of the requested CPU. The minimum number of Pods is 1 and the maximum is 5.
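
The same autoscaler can also be kept as a manifest, which is easier to put under version control. The sketch below writes what should be a roughly equivalent autoscaling/v1 object to a file. The file name hpa.yaml is an arbitrary choice, and the values simply mirror the flags of the kubectl autoscale command.

```shell
# Write a manifest roughly equivalent to the kubectl autoscale command.
# The file name hpa.yaml is an arbitrary choice.
cat > hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 70
EOF

# Apply it with (requires cluster access):
#   kubectl apply -f hpa.yaml
```

Use either the manifest or kubectl autoscale, not both, since they would manage the same object.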

Use the following command to get detailed information about the autoscaler:

$ kubectl get hpa -n demo
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/70%    1         5         1          80s

$ kubectl describe hpa -n demo
Name:                                                  php-apache
Namespace:                                             demo
Labels:                                                
Annotations:                                           
CreationTimestamp:                                     Fri, 14 Aug 2020 21:38:12 +0300
Reference:                                             Deployment/php-apache
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  0% (1m) / 70%
Min replicas:                                          1
Max replicas:                                          5
Deployment pods:                                       1 current / 1 desired
Conditions:
  Type            Status  Reason               Message
  ----            ------  ------               -------
  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation
  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange   the desired count is within the acceptable range
Events:           
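
The line "0% (1m) / 70%" reads as current usage (1 millicore) expressed as a percentage of the Pod's CPU request, compared with the 70% target. The sketch below shows that arithmetic, assuming the 200m CPU request used for the sample application. The function name is illustrative, and the integer truncation is an assumption about how the figure is rounded.

```shell
# Illustrative only: express average CPU usage as a percentage of the CPU request.
# Both values are in millicores; integer division truncates, which is assumed
# here to approximate how the displayed figure is rounded.
cpu_utilization_percent() {
  usage_millicores=$1
  request_millicores=$2
  echo $(( usage_millicores * 100 / request_millicores ))
}

cpu_utilization_percent 1 200     # idle Pod using 1m of a 200m request: prints 0
cpu_utilization_percent 166 200   # hypothetical 166m of usage: prints 83
```

Because utilization is relative to the request, a Deployment without CPU requests gives the autoscaler nothing to compute a percentage from.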

Increase the load

Now, let’s increase the load by sending continuous requests to the service we deployed. For this, we use a busybox container to generate load.

$ kubectl run -it --rm load-generator --image=busybox --restart=Never -n demo -- /bin/sh

You are now in the container's shell. Run the following command to start a while loop that repeatedly requests the service endpoint at http://php-apache

/ # while true; do wget -q -O - http://php-apache; done

Open a separate terminal and see how the autoscaler creates more pods in the deployment as the load increases.

$ kubectl get hpa -n demo
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   83%/70%   1         5         5          9m

As long as the actual CPU percentage is higher than the target percentage, the replica count increases, up to the maximum of 5. In this case utilization is 83% against a 70% target, so the REPLICAS count keeps growing until it reaches that maximum.
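
To see why the count stops at 5 even though utilization is still above target, here is a minimal sketch of the replica estimate bounded by the configured minimum and maximum. The names are illustrative, and the real controller also applies a tolerance and stabilization logic that is not modelled here.

```shell
# Sketch: ceil(current * observed / target), bounded by min and max replicas.
# Names are illustrative; tolerance and stabilization are not modelled.
bounded_replicas() {
  current=$1; observed=$2; target=$3; min=$4; max=$5
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * observed + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# 5 replicas at 83% against a 70% target: the raw estimate is 6, capped at 5
bounded_replicas 5 83 70 1 5   # prints 5
```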

To stop generating load, press Ctrl + C in the load generator terminal.

Watch the autoscaler scale down the deployment:

$ kubectl get hpa -n demo -w 

It may take a few minutes for the replica count to drop back to 1. When you are finished, clean up the test resources.
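
Scale-down is deliberately slower than scale-up: by default the controller waits for a stabilization window (five minutes unless configured otherwise) before removing Pods. If you want a script to wait for that instead of watching the output, a polling helper like the following could be used. The function name and the polling interval are hypothetical choices.

```shell
# Hypothetical helper: poll the HPA until it reports a single replica.
# Arguments: HPA name, namespace, seconds between polls (default 15).
wait_for_scale_down() {
  name=$1; namespace=$2; interval=${3:-15}
  while :; do
    replicas=$(kubectl get hpa "$name" -n "$namespace" \
      -o jsonpath='{.status.currentReplicas}')
    echo "current replicas: $replicas"
    if [ "$replicas" = "1" ]; then
      return 0
    fi
    sleep "$interval"
  done
}

# Example use (requires cluster access):
#   wait_for_scale_down php-apache demo
```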

$ kubectl delete -f https://k8s.io/examples/application/php-apache.yaml -n demo
deployment.apps "php-apache" deleted
service "php-apache" deleted

Delete the autoscaler.

$ kubectl delete hpa php-apache -n demo
horizontalpodautoscaler.autoscaling "php-apache" deleted

Finally, delete the demo namespace.

$ kubectl delete ns demo
namespace "demo" deleted

You can use the same method to autoscale your own applications with HPA and Metrics Server.

