Use kube-burner to measure Red Hat OpenShift VM and storage deployment at scale

September 4, 2024
Jenifer Abrams
Related topics:
Automation and management, Virtualization
Related products:
Red Hat OpenShift, Red Hat OpenShift Container Platform

    Scale testing is critical for understanding how a cluster will hold up under production load. Generally, you may want to scale test to reach a certain maximum density as the end goal, but it is often also useful to scale up from smaller batch sizes to observe how performance changes as the overall cluster becomes more loaded. Those of us who work in performance analysis know there are many ways to measure a workload, and standardizing on a tool can help provide more comparable results across different configurations and environments.

    This article walks through using the Red Hat performance and scale team’s workload tool, kube-burner (which has been accepted as a CNCF sandbox project), to test deployments at scale. While you can learn more about all the ways kube-burner can be used for scalability testing, including how the tool is extended for egress coverage, this guide focuses on customizing a kube-burner workload for virtual machine (VM) deployment at scale on Red Hat OpenShift, with an additional focus on storage attachments and cloning.

    The goal of this workload example is to exercise a few different areas of the OpenShift cluster and virtualization control plane, as well as stress storage functionality provided by the Container Storage Interface (CSI) and underlying StorageClass through the use of DataVolume cloning and multiple disk attachments in each VM. This guide will walk through all the details of running a custom workload so that the examples provided can be extended in many different ways by changing the underlying templates and job actions.

    Workload tool overview

    The kube-burner tool can be built from source, or the latest release binary can be downloaded from the repository.
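
    For those building from source, a minimal sketch (assuming a recent Go toolchain and make are installed and that the repository's default build target is used; the prebuilt release binary used later in this guide is the simpler option):

    # Clone the repository and build the binary with its Makefile:
    git clone https://github.com/kube-burner/kube-burner.git
    cd kube-burner
    make build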

    The tool can be executed in a number of ways, for instance by using the OpenShift wrapper functionality, which can run predefined workload types and index the results into an Elasticsearch instance. Here is one example:

    ./kube-burner-ocp cluster-density-v2 --qps=20 --burst=20 --gc=true --iterations=50 --churn=false --es-server=https://my-es.com --es-index=my-kube-burner

    However, to simplify the infrastructure needs for the custom examples provided in this guide, we will instead use the local indexer type, which saves the run summary and metric data to a local directory.

    The examples below focus on running a custom workload, which can be launched using the init action and the -c configuration flag pointing to a custom YAML file defining the workload, for example:

    ./kube-burner init -c my-workload.yml

    The tool provides many configuration options for the workload as described in the reference section, including what measurements and metrics to collect, submission options that control how the objects are created, and even some options around "churning" jobs. This guide will walk through some of the workload configuration options in more detail with a full custom workload example. 

    Kube-burner jobTypes can be used to measure the deletion of certain objects as part of the workload, and users can also clean up all objects created by the benchmark using the destroy action by passing the uuid of the run to be deleted, for example:

    ./kube-burner destroy --uuid=<run_uuid>

    Finally, users can control the measurements and metrics collected during the run, and the thresholds to be considered a "passing" run. In these examples our main focus is vmiLatency, reported in milliseconds, which enables the collection of latency measurements for various stages of the pod, VM, and VMI objects throughout their lifecycle. This measurement includes a breakdown of latency for the virt-launcher pod creation, scheduling, initialization, and ready stages, as well as the VMI (running instance of the VM object) creation, scheduling, pending, and ready stages until the full “VMReady” state is reached. 

    The rest of this guide will go through the workload customization details and finally bring it all together with an example full run and result analysis. 

    Custom workload

    In this case, the goal is to customize a workload to stress both VM and storage deployments. 

    Note that for clusters where VM deployment at scale is the goal, enabling the "highBurst" tuning policy in OpenShift Virtualization is recommended for best performance:

    oc patch -n openshift-cnv hco kubevirt-hyperconverged --type=json -p='[{"op": "add", "path": "/spec/tuningPolicy", "value": "highBurst"}]' 

    See the Tuning and Scaling guide to learn more. 
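
    To confirm the tuning policy was applied, the field set by the patch above can be read back directly:

    oc get hco -n openshift-cnv kubevirt-hyperconverged -o jsonpath='{.spec.tuningPolicy}{"\n"}'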

    Templates

    In order for the custom workload configuration to deploy the desired objects, the YAML definitions should be created with any appropriate overrides. In this case all the files are stored in a local templates directory (this path can be customized in the workload configuration file).

    This example defines three types of objects. Variables are used to provide convenient user overrides, but they are not required in the definitions, other than the Replica variable for objects that should be scaled up (the VM definition in this case).

    DataVolume Source

    In this workload configuration, the goal is to create a VM boot source that can easily be cloned to start many VMs in parallel. In this case, a qcow2 image referenced by an HTTP URL provides the source; an example URL is given later in the workload configuration.

    Here is the full templates/dv-source.yml example:

    apiVersion: cdi.kubevirt.io/v1beta1
    kind: DataVolume
    metadata:
      name: dv-source
    spec:
      source:
        http:
          ## source image url:
          url: "{{.url}}"
      pvc:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: {{.storage}}
        volumeMode: {{.volumemode}}
        storageClassName: {{.storageclass}}

    DataVolume VolumeSnapshot

    Refer to any recommendations provided by your StorageClass provisioner on how to efficiently clone volumes. This example uses OpenShift Data Foundation, which supports VolumeSnapshots; each VM disk clone will reference this snapshot as the source.
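
    The class names used in the workload configuration later in this guide depend on the cluster's storage setup; the available classes can be listed up front to confirm what to reference:

    # List the StorageClasses and VolumeSnapshotClasses available on the cluster:
    oc get storageclass
    oc get volumesnapshotclass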

    Here is the full templates/dv-volsnap.yml example:

    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshot
    metadata:
      name: dv-volsnap
    spec:
      volumeSnapshotClassName: {{.volsnapclass}}
      source:
        persistentVolumeClaimName: dv-source

    VirtualMachine

    There is plenty of flexibility in the exact VM definition, but the key for the scale testing goal covered here is to exercise source image cloning, achieved by using a snapshot source in the dataVolumeTemplates section, and to stress additional PVC attachments, achieved by creating an additional blank disk (more blank disks could be added as needed to test more attachments per VM). Note that in this configuration the dataVolumeTemplates are defined inside the VM definition, meaning that the clone and the blank PVC follow the lifecycle of the VM object, which is the goal of this particular scale test; however, the DataVolume definitions could be separated from the VM if desired.

    Here is the full templates/vm-dv.yml example:

    apiVersion: kubevirt.io/v1
    kind: VirtualMachine
    metadata:
      labels:
        kubevirt-vm: vm-{{.name}}-{{.Replica}}
      name: {{.name}}-{{.Replica}}
    spec:
      ## controls if the VM boots up on creation:
      running: {{.VMIrunning}}
      template:
        metadata:
          labels:
            kubevirt-vm: vm-{{.name}}-{{.Replica}}
            kubevirt.io/os: {{.OS}}
        spec:
          domain:
            ## vcpus:
            cpu:
              cores: {{.cpuCores}}
            ## create 3 VM disks (rootdisk clone, cloudinit, blank):
            devices:
              disks:
              - disk:
                  bus: virtio
                name: rootdisk
              - disk:
                  bus: virtio
                name: cloudinitdisk
              - disk:
                  bus: virtio
                name: blank-1
              ## default network interface:
              interfaces:
              - masquerade: {}
                model: virtio
                name: default
              networkInterfaceMultiqueue: true
              rng: {}
            ## VM memory size:
            resources:
              requests:
                memory: {{.memory}}
          ## default pod network:
          networks:
          - name: default
            pod: {}
          ## volume definitions for the VM disks:
          volumes:
          - dataVolume:
              name: dvclone-{{.Replica}}
            name: rootdisk
          - cloudInitNoCloud:
              userData: |-
                #cloud-config
                password: fedora
                chpasswd: { expire: False }
            name: cloudinitdisk
          - dataVolume:
              name: blank-1-{{.Replica}}
            name: blank-1
      ## Data Volume population of the volumes:
      dataVolumeTemplates:
      ## rootdisk:
      - metadata:
          name: dvclone-{{.Replica}}
        spec:
          source:
            snapshot:
              name: dv-volsnap
          storage:
            accessModes:
              - ReadWriteMany
            resources:
              requests:
                storage: {{.storage}}
            storageClassName: {{.storageclass}}
            volumeMode: {{.volumemode}}
      ## blank disk:
      - metadata:
          name: blank-1-{{.Replica}}
        spec:
          source:
            blank: {}
          pvc:
            accessModes:
              - ReadWriteMany
            resources:
              requests:
                storage: 500Mi
            storageClassName: {{.storageclass}}
            volumeMode: {{.volumemode}}

    Workload configuration

    Configuring the workload definition and job order is important to deploy the objects in a functional manner and to measure the intended operations. 

    Here is the full vm-dvclone-density.yml workload configuration example. Each section is explained in more detail below:

    metricsEndpoints:
      - indexer:
          type: local
    global:
      measurements:
      - name: vmiLatency
    jobs:
      - name: dv-source
        jobType: create
        jobIterations: {{.ITERATIONS}}
        namespacedIterations: true
        namespace: vm-density
        qps: {{.QPS}}
        burst: {{.BURST}}
        maxWaitTimeout: 1h
        jobPause: 5m
        objects:
        - objectTemplate: templates/dv-source.yml
          replicas: 1
          inputVars:
            url: https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Cloud/x86_64/images/Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2
            storage: 10Gi
            storageclass: ocs-storagecluster-ceph-rbd
            volumemode: Block
        - objectTemplate: templates/dv-volsnap.yml
          replicas: 1
          inputVars:
            volsnapclass: ocs-storagecluster-rbdplugin-snapclass
      - name: vm-density
        jobType: create
        jobIterations: {{.ITERATIONS}}
        qps: {{.QPS}}
        burst: {{.BURST}}
        namespacedIterations: true
        namespace: vm-density
        maxWaitTimeout: 1h
        jobPause: 1m
        objects:
        - objectTemplate: templates/vm-dv.yml
          replicas: {{ .OBJ_REPLICAS }}
          inputVars:
            name: vm-dv-density
            OS: fedora
            cpuCores: 1
            memory: 1G
            storage: 10Gi
            storageclass: ocs-storagecluster-ceph-rbd
            volumemode: Block
            VMIrunning: true
      - name: delete-job
        jobType: delete
        waitForDeletion: true
        qps: {{.QPS}}
        burst: {{.BURST}}
        jobPause: 1m
        objects:
        - kind: VirtualMachine
          labelSelector: {kube-burner-job: vm-density}
          apiVersion: kubevirt.io/v1
        - kind: VirtualMachineInstance
          labelSelector: {kube-burner-job: vm-density}
          apiVersion: kubevirt.io/v1
        - kind: Pod
          labelSelector: {kubevirt.io: virt-launcher}
          apiVersion: v1
        - kind: Namespace
          labelSelector: {kube-burner-job: vm-density}

    First, the configuration starts by defining a "local" indexer type (i.e., save to local directory) and the desired vmiLatency measurement. Then each jobType is defined in order. Note that each job could be further configured by changing any of the default parameters.

    Some general parameters to understand for these jobs:

    • namespacedIterations: true means that each set of object replicas (i.e., an "iteration") will be created in a new namespace; the total number of namespaces is controlled by jobIterations.
    • qps and burst control how quickly kube-burner submits the object creation requests. The example run below sets both to 1,000, which allows a high creation rate so that the benchmark itself does not introduce throttling.
    • The objects section describes which objects are associated with that job (based on the template definitions).
    • maxWaitTimeout controls how long the benchmark waits before it may determine a failure.
    • jobPause is a pause introduced after the job is complete; pausing after each job type can be useful to help separate actions for observability and measurement.
    • Optionally, a jobIterationDelay could also be added to control the run in smaller "batch" increments; this delay runs between each job iteration and each delete operation.

    The first job creates the DataVolume source and VolumeSnapshot; in this case it is configured to create a single source and snapshot per benchmark namespace (i.e., replica 1). The storageclass and volsnapclass are set to the OpenShift Data Foundation defaults in this configuration, and an example Fedora qcow2 URL is provided. Note that if the data source import takes a very long time, the jobPause may need to be increased before moving on to the next stage.
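
    Import and snapshot progress can be watched from another terminal while this job runs; if the sources are still importing or the snapshots are not yet ready, increase the jobPause:

    # Check DataVolume import status and VolumeSnapshot readiness across the benchmark namespaces:
    oc get dv -A | grep dv-source
    oc get volumesnapshot -A | grep dv-volsnap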

    The second job creates the VMs, which in turn (based on the example object definition) create the snapshot clones and empty PVCs for the VM disks. This job is also namespaced; in this case, $OBJ_REPLICAS VMs are created in each of the $ITERATIONS namespaces. This job specifically starts the VMs with running: true and waits for the VMI (VirtualMachineInstance) Running state for the latency measurements.
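
    Progress of this job can also be followed from another terminal, for example by counting how many VMIs have reached the Running phase and how many DataVolume clones have completed:

    # Count running VMIs and completed DataVolumes across all namespaces:
    oc get vmi -A --no-headers 2>/dev/null | grep -c Running
    oc get dv -A --no-headers 2>/dev/null | grep -c Succeeded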

    Finally, this configuration measures the deletion of objects with the delete-job. Note that this currently does not delete the DataVolume sources and VolumeSnapshots, although the kube-burner destroy action can be used after a run, and the default behavior of cleanup: true will delete previously created namespaces on the next run.

    VM and storage deployment at scale

    Below is a full walkthrough of running the custom workload:

    wget https://github.com/kube-burner/kube-burner/releases/download/v1.10.1/kube-burner-V1.10.1-linux-x86_64.tar.gz
    tar zxvf kube-burner-V1.10.1-linux-x86_64.tar.gz
    # create the following files in the local directory, as shown above:
    ## vm-dvclone-density.yml  (workload config file)
    ## templates/dv-source.yml
    ## templates/dv-volsnap.yml
    ## templates/vm-dv.yml

    Run the workload using the example configurations detailed above and define any variables. In the example below, the benchmark creates 10 namespaces, each with 50 VMs, in parallel (i.e., 500 VMs total; each VM has 1 boot disk PVC and 1 blank PVC).

    Keep in mind that the maximum object density depends on each cluster environment, including total schedulable resources, storage capacity, and total pod capacity as determined by the maxPods setting on each worker.
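
    The allocatable pod and compute capacity of each worker can be checked up front to help size ITERATIONS and OBJ_REPLICAS (allocatable pods reflect each node's maxPods setting):

    oc get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.allocatable.pods,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory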

    Note that the benchmark is executed against the cluster defined by the $KUBECONFIG environment variable, $HOME/.kube/config, or in-cluster config, in that order:

    # ITERATIONS = number of namespaces to create
    # OBJ_REPLICAS = number of VMs to create per namespace
    QPS=1000 BURST=1000 ITERATIONS=10 OBJ_REPLICAS=50 ./kube-burner init -c vm-dvclone-density.yml

    The tool will log status as the run progresses, and, assuming all VMIs successfully reach the Running state, the workload will end with a message similar to the following:

    level=info msg="👋 Exiting kube-burner <run_uuid>" file="kube-burner.go:85"

    Since the configuration uses the local indexing type, the results are stored in a local directory called collected-metrics. This includes some general information about the run in jobSummary.json, detailed per-object measurements in podLatencyMeasurement-vm-density.json, and finally a summary of the latency performance in podLatencyQuantilesMeasurement-vm-density.json, which includes max, avg, and P99/P95/P50 measurements (in ms) for the various "quantileName"s covering the different stages of the pod and VM lifecycle. In this case the "VMReady" measurement is the most interesting in terms of total VM deployment performance; however, the other measurements can also be useful for understanding the overall behavior.
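
    As a quick way to pull out the headline numbers, the "VMReady" entry can be extracted from the quantiles summary with jq (a sketch, assuming the file is a JSON array containing the quantileName, P50, P99, avg, and max fields described above; verify the field names against your output):

    jq '.[] | select(.quantileName=="VMReady") | {quantileName, P50, P99, avg, max}' \
      collected-metrics/podLatencyQuantilesMeasurement-vm-density.json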

    Adding metrics

    Optionally, metric data can also be captured during a workload by adding a metric endpoint definition.

    The following section walks through how to add metrics collection to the example run shown above. In this case we will use the kubevirt-metrics.yaml metric definitions provided by the kube-burner repository, but any set of metric definitions can be configured as needed:

    # Save the metric definition file to the local directory:
    wget https://raw.githubusercontent.com/kube-burner/kube-burner/main/examples/metrics-profiles/kubevirt-metrics.yaml
    # Create the metric endpoint definition, referencing the metric file:
    cat <<EOF > metrics-endpoint.yml
    - endpoint: {{ .PROM }}
      token: {{.PROM_TOKEN}}
      metrics:
        - kubevirt-metrics.yaml
      indexer:
        type: local
    EOF
    # Run the workload:
    ## provide the PROM url and PROM_TOKEN for the cluster as shown
    ## add the -e flag for the metrics endpoint configuration as shown
    PROM=https://$(oc get route -n openshift-monitoring prometheus-k8s -o jsonpath="{.spec.host}") PROM_TOKEN=$(oc create token -n openshift-monitoring prometheus-k8s) QPS=1000 BURST=1000 ITERATIONS=10 OBJ_REPLICAS=50 ./kube-burner init -c vm-dvclone-density.yml -e metrics-endpoint.yml

    Since the local indexer type is used in these examples, after the run completes there will be multiple JSON files in the collected-metrics directory, alongside the run results, detailing all the data captured from the metric queries.
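
    These files can be inspected locally, for example by listing the output and peeking at the first few documents of one metric file (a sketch; the exact file names follow the configured metric definitions, and the document structure should be verified against your output):

    ls collected-metrics/
    jq '.[0:3]' collected-metrics/<metric_file>.json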

    Conclusion

    Hopefully this guide has provided some useful end-to-end examples walking through the creation of a custom kube-burner workload to test VM deployments at scale while also exercising storage scalability. The examples can be modified to suit other specific testing needs, but the general workflow is to determine which objects need to be created, the order of the job actions, and the metrics and measurements of interest. The same sort of workload configuration could be created for a pod and PVC scale case.

    In many VM and CSI scale cases it is also important to exercise scaling in a multi-worker environment and, ideally, to further stress the underlying compute and storage resources by running a workload in the VMs so that they perform some amount of I/O to the attached PVCs. Also keep in mind that evaluating scaling performance during OpenShift cluster, virtualization operator, and storage operator upgrades remains an important use case to harden for your environment.

    These scale testing methods can be used to develop best practices and recommendations (such as batch sizing) for deploying many VMs and storage attachments in parallel, depending on the cluster configuration, desired density, and underlying storage provider. This can help cluster administrators determine which density and deployment rate achieve efficient VM startup times based on user needs.

    Happy benchmarking! 
