8 October 2024

This article explains how to use HPA (Horizontal Pod Autoscaler) in the ArvanCloud Container. HPA in the ArvanCloud Container works much like HPA in Kubernetes.

Prerequisites

The only prerequisite is an ArvanCloud account with access to the Cloud Container. First, log in to your account. Then go to the profile section, create a new API key in the Machine User section, and save it somewhere safe.

The steps in this article use the ArvanCloud command line. If necessary, add it to your PATH and give it administrative access. Then log in through the command line:

arvan login

After running the command, paste the API key you received from the site.

What Is HPA?

In some systems, a rise in requests can increase the load so much that the application cannot keep up, and some requests fail. You can handle the extra load either by giving the application more resources or by running additional copies of the application and load balancing between them.

In the ArvanCloud Container, you increase the number of copies of an application by raising the replicas parameter of its deployment.
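
As a minimal illustration, the fragment below shows where this parameter sits in a deployment definition. The value 3 is only an example, and the rest of the deployment (selector, template, and so on) is omitted.

```yaml
# Fragment of a Deployment definition; not a complete manifest.
spec:
  replicas: 3   # example value: run three identical copies (Pods) of the application
```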

If the increase in load is temporary, or occurs at certain times of the day and then returns to normal, changing the number of replicas manually is tedious, and forgetting to do so can cause the service to return errors. On the other hand, if you permanently allocate a large amount of resources to the application, some of them sit unused during periods of low load, and you pay for capacity you do not need.

HPA, or Horizontal Pod Autoscaler, takes a few initial settings and then automatically adds replicas when the load on your application exceeds a specified value, so that your service does not run short of resources when responding to requests. When the load decreases, it automatically reduces the number of replicas to avoid wasted resources and extra cost.

Note: HPA in the ArvanCloud Container can only be applied to deployments that do not have a persistent disk (so-called stateless services).

What Is a Readiness Probe?

Before explaining how to use HPA, we need to introduce the concept of the Readiness Probe. A running program can fail for unpredictable reasons. In the ArvanCloud Container Service, when a program run by a deployment fails in certain ways, its Pod is restarted automatically. However, some failure situations are not detected by the ArvanCloud Container Service: the container is still up but does not serve requests properly. The Readiness Probe addresses this problem.

Note: In addition to the Readiness Probe, there is another concept called the Liveness Probe. Both are discussed in detail in another article; only a brief description of the Readiness Probe is given here.

By defining a Readiness Probe, you specify conditions that the ArvanCloud Container Service checks automatically. If the conditions are not met, the service stops traffic from reaching the Pod by removing the Pod’s IP from the Endpoints of all services.

Note: To use HPA, it is mandatory to define Readiness Probe in Deployment.

Using HPA

To use HPA, you must first define a Readiness Probe for the Deployment. For example, the following file defines an Nginx deployment together with a Readiness Probe.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        imagePullPolicy: IfNotPresent
        name: nginx
        ports:
        - containerPort: 80
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - nginx
            - -v
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            cpu: '1'
            ephemeral-storage: 0.5G
            memory: 1G
          requests:
            cpu: '1'
            ephemeral-storage: 0.5G
            memory: 1G

Note: Indentation is significant in YAML files; even a small shift can cause an error or produce unintended settings.

spec.template.spec.containers.readinessProbe: This section contains the Readiness Probe definition. A Readiness Probe can work in three ways: executing a command (as in the example above), checking an HTTP endpoint, or checking a TCP socket. Each of these methods is explained in another article. In this example, the ArvanCloud Container Service checks the container’s health by regularly running the nginx -v command inside the container and inspecting its exit code.

spec.template.spec.containers.readinessProbe.initialDelaySeconds: Sometimes a container needs time to become fully operational, and during that period the check may not return the expected result. With this setting, the ArvanCloud Container Service waits the given number of seconds after the container starts before running the first check.

spec.template.spec.containers.readinessProbe.timeoutSeconds: The time, in seconds, that the ArvanCloud Container Service waits for a probe response before it considers the check failed.
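
For comparison, the other two probe types mentioned above could look roughly like the sketch below. The field names (httpGet, tcpSocket, periodSeconds, failureThreshold) come from the upstream Kubernetes probe definition. The path /, port 80 (the default listening port of the stock nginx image), and the timing values are assumptions chosen for illustration, so adjust them to your own container. A container takes a single readinessProbe, so use only one of the two variants.

```yaml
# Variant 1: HTTP check. The probe passes when the response status is 2xx or 3xx.
readinessProbe:
  httpGet:
    path: /            # assumed path; use an endpoint your application actually serves
    port: 80           # assumed port; must match the port the container listens on
  initialDelaySeconds: 15
  timeoutSeconds: 1
  periodSeconds: 10    # assumed value (also the upstream default): check every 10 seconds
  failureThreshold: 3  # assumed value (also the upstream default): not ready after 3 failures in a row
---
# Variant 2: TCP check. The probe passes when a connection to the port can be opened.
readinessProbe:
  tcpSocket:
    port: 80           # assumed port, as above
  initialDelaySeconds: 15
  timeoutSeconds: 1
```

Either block would replace the exec-based readinessProbe section in the container definition, at the same indentation level.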

Save the lines above in a file named nginx-deployment.yaml. Then submit your deployment to the ArvanCloud Container Service from the command line with the following command.

arvan paas apply -f nginx-deployment.yaml

Then, with the following command, you can check the status of your deployment on the ArvanCloud Container Service.

arvan paas get deployment nginx-deployment

Defining HPA

HPA in the ArvanCloud Container Service uses CPU consumption as its autoscaling metric. You specify a limit for the Pods’ CPU consumption, and the ArvanCloud Container Service increases the number of Deployment Pods when that limit is exceeded.

To define HPA for a Deployment, just enter the following command.

arvan paas autoscale deploy nginx-deployment --max 10 --min=1 --cpu-percent=50

The above command enables HPA for the deployment we defined before.

In this command, --max sets the maximum number of Pod replicas when the load increases.

--min sets the minimum number of Pod replicas when the load decreases.

--cpu-percent sets the target average CPU consumption of the Pods. If the average across the current Pods exceeds this value, the ArvanCloud Container Service automatically adds Pods until the average drops below the target or the number of Pods reaches --max. Conversely, when the load decreases, it removes Pods as long as the average stays below the target, until the number of Pods reaches --min.
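
As a rough guide to how many Pods the autoscaler aims for, upstream Kubernetes documents the rule below. Whether the ArvanCloud Container Service applies exactly the same rule, including the small tolerance band that upstream uses to avoid reacting to minor fluctuations, is an assumption here. The symbol names are illustrative, and utilization is measured as a percentage of each Pod's CPU request (1 CPU in the deployment above).

```latex
% Replica count targeted by the autoscaler, clamped to the --min / --max range.
\[
\text{desiredReplicas} =
\left\lceil
  \text{currentReplicas} \times
  \frac{\text{currentCPUUtilization}}{\text{targetCPUUtilization}}
\right\rceil,
\qquad
\text{min} \le \text{desiredReplicas} \le \text{max}
\]

% Scale up: 2 Pods averaging 90% against a 50% target.
\[
\left\lceil 2 \times \frac{90}{50} \right\rceil = \lceil 3.6 \rceil = 4
\]

% Scale down: 4 Pods averaging 20% against a 50% target.
\[
\left\lceil 4 \times \frac{20}{50} \right\rceil = \lceil 1.6 \rceil = 2
\]
```

Under these assumptions, with a 50% target, two Pods averaging 90% CPU would be scaled to four, and four Pods averaging 20% would be scaled back to two, always staying between the configured minimum and maximum.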

Executing the above command activates HPA for the deployment. You can view the status of the defined HPA with the following command.

$ arvan paas get hpa
NAME               REFERENCE                     TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   1%/50%    1         10        1          1h

Now, if the amount of load (CPU consumption) on the Pod increases, the number of replicas will automatically increase, as shown below.

$ arvan paas get hpa
NAME               REFERENCE                     TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   50%/50%   1         10        2          1h

As the load decreases, the number of replicas returns to the previous value.
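
The autoscale command used earlier can also be expressed declaratively. The sketch below uses the upstream Kubernetes autoscaling/v1 HorizontalPodAutoscaler resource, whose minReplicas, maxReplicas, and targetCPUUtilizationPercentage fields correspond to --min, --max, and --cpu-percent. That the ArvanCloud Container Service accepts this resource through arvan paas apply is an assumption, and the file name nginx-hpa.yaml is hypothetical.

```yaml
# nginx-hpa.yaml (hypothetical file name)
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deployment
spec:
  scaleTargetRef:                       # the deployment this HPA scales
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 1                        # corresponds to --min=1
  maxReplicas: 10                       # corresponds to --max 10
  targetCPUUtilizationPercentage: 50    # corresponds to --cpu-percent=50
```

If the service accepts it, such a file would be submitted the same way as the deployment, with arvan paas apply -f nginx-hpa.yaml. Keeping the HPA in a file makes it easier to review and version alongside the deployment definition.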

With the following command, you can also view the details of the HPA.

$ arvan paas describe hpa nginx-deployment
Name:                                                  nginx-deployment
Namespace:                                             example-project
Labels:                                                
Annotations:                                           
CreationTimestamp:                                     Sat, 30 May 2020 11:08:21 +0430
Reference:                                             Deployment/nginx-deployment
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  50% (251m) / 50%
Min replicas:                                          1
Max replicas:                                          10
Deployment pods:                                       2 current / 2 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    the last scale time was sufficiently old as to warrant a new scale
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type     Reason                        Age                From                       Message
  ----     ------                        ----               ----                       -------
  Normal   SuccessfulRescale             1h (x3 over 1h)    horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale             57m (x4 over 58m)  horizontal-pod-autoscaler  New size: 2; reason: Current number of replicas below Spec.MinReplicas
  Normal   SuccessfulRescale             52m                horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
  Warning  FailedGetResourceMetric       51m (x2 over 51m)  horizontal-pod-autoscaler  did not receive metrics for any ready pods
  Warning  FailedComputeMetricsReplicas  51m (x2 over 51m)  horizontal-pod-autoscaler  failed to get cpu utilization: did not receive metrics for any ready pods
  Normal   SuccessfulRescale             46m (x2 over 1h)   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target

For more information, refer to the OKD and Kubernetes documentation.