Testing memory-based horizontal pod autoscaling on OpenShift

Red Hat OpenShift offers horizontal pod autoscaling (HPA) primarily based on CPU utilization, but it can also autoscale on memory, which is useful for applications that are more memory-intensive than CPU-intensive. In this article, I demonstrate how to use OpenShift's memory-based horizontal pod autoscaling feature (tech preview) to autoscale your pods when the demand on memory increases. The test in this article does not necessarily reflect a real application; it only aims to demonstrate memory-based HPA in the simplest way possible.

I use a simple PHP application (index.php) that creates a large array in memory every time a request is made. The code looks like this:

<?php

$arr = array();
$arr_size = 100000;

for ($i = 1; $i <= $arr_size; $i++) {
    $arr[] = $i;
}

echo "created an array of $arr_size entries";

?>

You can perform this test with any language; I chose PHP for personal convenience. The load test works by sending multiple parallel curl requests to the application.

Check if `v2beta1` is enabled

Check if v2beta1 is enabled in your cluster:

# oc get --raw /apis/autoscaling/v2beta1
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"autoscaling/v2beta1","resources":[{"name":"horizontalpodautoscalers","singularName":"","namespaced":true,"kind":"HorizontalPodAutoscaler","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["hpa"],"categories":["all"]},{"name":"horizontalpodautoscalers/status","singularName":"","namespaced":true,"kind":"HorizontalPodAutoscaler","verbs":["get","patch","update"]}]}
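If you want to script this check, one option is to search the raw API response for the expected group version and resource kind. The following is a minimal sketch: the function name `has_hpa_v2beta1` is hypothetical, and it uses plain text matching rather than real JSON parsing, so it assumes the compact output format shown above.

```shell
# has_hpa_v2beta1: reads an APIResourceList JSON document on stdin and
# succeeds only if it describes autoscaling/v2beta1 and lists the
# HorizontalPodAutoscaler kind. Hypothetical helper, text matching only.
has_hpa_v2beta1() (
  response=$(cat)
  printf '%s' "$response" | grep -q '"groupVersion":"autoscaling/v2beta1"' &&
    printf '%s' "$response" | grep -q '"kind":"HorizontalPodAutoscaler"'
)

# Usage (assumes a logged-in oc session):
# oc get --raw /apis/autoscaling/v2beta1 | has_hpa_v2beta1 && echo "v2beta1 is enabled"
```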

If it isn’t enabled, follow the documentation for enabling it on your respective OpenShift version.

Set up your testbed

Before you can test, you need to set up the environment you're testing in and the application you're testing.

Set up your environment

To set up your new environment:

Create a new project:

# oc new-project memhpa
Now using project "memhpa" on server "https://console.ocp.mylab:8443".

Create an image stream (this assumes that you have your authentication set up correctly):

# oc import-image myphp --insecure --from=registry.redhat.io/openshift3/php-55-rhel7:latest --confirm
imagestream.image.openshift.io/myphp imported

Apply limits to the namespace before creating your first application:

# cat limits.yml 
apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "myphp-resource-limits"
spec:
  limits:
    - type: "Pod"
      max:
        cpu: "2"
        memory: "120Mi"
      min:
        cpu: "200m"
        memory: "6Mi"
    - type: "Container"
      max:
        cpu: "2"
        memory: "120Mi"
      min:
        cpu: "100m"
        memory: "4Mi"
      default:
        cpu: "300m"
        memory: "100Mi"
      defaultRequest:
        cpu: "200m"
        memory: "100Mi"
      maxLimitRequestRatio:
        cpu: "10"

# oc create -f limits.yml
limitrange/myphp-resource-limits created
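These values matter for the autoscaler: for a resource metric with a utilization target, the HPA expresses each pod's memory usage as a percentage of its memory request, not its limit. With the `defaultRequest` of 100Mi above, a pod using 75Mi would report 75% utilization. Here is a small sketch of that arithmetic; the function name is hypothetical, and integer division truncates the result.

```shell
# memory_utilization_pct USAGE_MI REQUEST_MI
# Prints memory usage as a whole-number percentage of the memory request.
# Hypothetical helper for illustration; values are in Mi.
memory_utilization_pct() (
  usage_mi=$1
  request_mi=$2
  echo $(( usage_mi * 100 / request_mi ))
)

memory_utilization_pct 75 100   # prints 75
memory_utilization_pct 94 100   # prints 94
```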

Set up the application

To set up your application:

Create the application:

# oc new-app --name app1 myphp:latest~https://gitlab.mylab/myproject/phpapp.git --build-env "GIT_SSL_NO_VERIFY=true"

Wait until your application is built and running:

# oc get pods
NAME READY STATUS RESTARTS AGE
app1-1-build 0/1 Completed 0 2m
app1-1-plchw 1/1 Running 0 27s
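Rather than rerunning the command by hand, you could filter its output for pods that are both Running and fully ready. This is only a sketch: `running_app_pods` is a hypothetical helper, and it assumes the default column layout shown above (NAME, READY, STATUS, ...).

```shell
# running_app_pods: reads `oc get pods` output on stdin and prints the names
# of pods whose STATUS is Running and whose READY counts match (for example 1/1).
running_app_pods() {
  awk 'NR > 1 && $3 == "Running" {
         split($2, ready, "/")
         if (ready[1] == ready[2]) print $1
       }'
}

# Usage (assumes a logged-in oc session and the app name from this article):
# until oc get pods | running_app_pods | grep -q '^app1-'; do sleep 5; done
```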

Create a route:

# oc expose svc app1
route/app1-memhpa.apps.ocp.mylab created

Test your application

To test your application:

Send a request to the application through its route:

# curl http://app1-memhpa.apps.ocp.mylab/
created an array of 100000 entries

If your application fails, try reducing the size of the PHP array.

Create a memory-based HPA definition:

# cat hpa.yml 
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-resource-metrics-memory
spec:
  scaleTargetRef:
    apiVersion: apps.openshift.io/v1
    kind: DeploymentConfig
    name: app1
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 90

# oc create -f hpa.yml
horizontalpodautoscaler.autoscaling/hpa-resource-metrics-memory created

# oc get hpa
NAME                          REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
hpa-resource-metrics-memory   DeploymentConfig/app1   <unknown>/90%   1         10        0          31m

Wait until the <unknown> value shown above changes to a percentage, which means the autoscaler has started receiving memory metrics:

# oc get hpa
NAME                          REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-resource-metrics-memory   DeploymentConfig/app1   75%/90%   1         10        1          33m
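To detect this transition in a script, you can read the TARGETS column and treat a numeric percentage as "metrics available". The helper below is a hypothetical sketch that assumes the column layout shown above, where TARGETS is the third column.

```shell
# hpa_current_utilization: reads `oc get hpa` output on stdin and prints the
# current utilization (without the % sign) for each HPA whose metric is known.
# Prints nothing while the TARGETS column still shows <unknown>.
hpa_current_utilization() {
  awk 'NR > 1 {
         split($3, targets, "/")
         if (targets[1] ~ /^[0-9]+%$/) {
           sub(/%$/, "", targets[1])
           print targets[1]
         }
       }'
}

# Usage (assumes a logged-in oc session):
# until [ -n "$(oc get hpa | hpa_current_utilization)" ]; do sleep 10; done
```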

Create a load. I use a simple script that loops through a curl command:

# cat loadphp.sh
#!/bin/bash

while true; do curl http://app1-memhpa.apps.ocp.mylab/; done

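Run the following command a few times until you notice the load increasing. Each run starts another copy of the script in the background, so the requests are sent in parallel: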

# nohup ./loadphp.sh &
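Because loadphp.sh loops forever, you have to stop each background copy yourself. As an alternative, here is a sketch of a bounded load generator that starts a fixed number of parallel workers and exits when they finish. The name `run_load` and its parameters are hypothetical, not part of the original test.

```shell
# run_load WORKERS REQUESTS CMD [ARGS...]
# Starts WORKERS background loops; each runs CMD REQUESTS times, then the
# function waits for all of them to finish. Command output is discarded
# and individual command failures are ignored.
run_load() (
  workers=$1
  requests=$2
  shift 2
  w=0
  while [ "$w" -lt "$workers" ]; do
    (
      r=0
      while [ "$r" -lt "$requests" ]; do
        "$@" >/dev/null 2>&1 || true
        r=$((r + 1))
      done
    ) &
    w=$((w + 1))
  done
  wait
)

# Usage with the route from this article (5 workers, 200 requests each):
# run_load 5 200 curl -s http://app1-memhpa.apps.ocp.mylab/
```

Passing the command as arguments keeps the function independent of curl, so you can try it with any harmless command before pointing it at the application.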

Observe the HPA

You will start to notice an increase in memory utilization and corresponding autoscaling:

# oc get hpa
NAME                          REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-resource-metrics-memory   DeploymentConfig/app1   75%/90%   1         10        1          33m

# oc get hpa
NAME                          REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-resource-metrics-memory   DeploymentConfig/app1   94%/90%   1         10        2          39m

# oc get hpa
NAME                          REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-resource-metrics-memory   DeploymentConfig/app1   90%/90%   1         10        2          48m

# oc get hpa
NAME                          REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-resource-metrics-memory   DeploymentConfig/app1   85%/90%   1         10        3          52m

Next, stop the load, and then watch the HPA. Several minutes after the load stops, the autoscaler eventually downscales the pods to one:

# oc get hpa
NAME                          REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-resource-metrics-memory   DeploymentConfig/app1   29%/90%   1         10        1          1h
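For intuition about these replica counts, Kubernetes documents the core HPA calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below implements only that formula plus the minReplicas and maxReplicas clamp. It is a simplification: the real controller also applies a tolerance around the target and delays scale-down, both of which are ignored here. The function name and the example inputs are hypothetical and are not taken from the test run.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_PCT TARGET_PCT MIN MAX
# Prints ceil(CURRENT_REPLICAS * CURRENT_PCT / TARGET_PCT), clamped to MIN..MAX.
# Ignores the controller's tolerance and scale-down delay.
desired_replicas() (
  current=$1
  current_pct=$2
  target_pct=$3
  min=$4
  max=$5
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * current_pct + target_pct - 1) / target_pct ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
)

desired_replicas 1 120 90 1 10   # prints 2
desired_replicas 3 29 90 1 10    # prints 1
```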

Conclusion

In this article, my aim was to provide the simplest method to set up and test memory-based horizontal pod autoscaling. I demonstrated the process with a single PHP web page that creates a large array in memory, built using a basic Red Hat S2I PHP image, and deployed in a namespace with limits and an HPA.

Once the environment was set up, I used a basic bash script to put a load on the application and watched memory utilization rise until the autoscaler added pods. A few minutes after I stopped the load, the autoscaler reduced the number of pods back to one.

Special thanks to Damon Hatchett for peer-reviewing this article.

Last updated: June 29, 2020
