Skip to main content
Redhat Developers  Logo
  • AI

    Get started with AI

    • Red Hat AI
      Accelerate the development and deployment of enterprise AI solutions.
    • AI learning hub
      Explore learning materials and tools, organized by task.
    • AI interactive demos
      Click through scenarios with Red Hat AI, including training LLMs and more.
    • AI/ML learning paths
      Expand your OpenShift AI knowledge using these learning resources.
    • AI quickstarts
      Focused AI use cases designed for fast deployment on Red Hat AI platforms.
    • No-cost AI training
      Foundational Red Hat AI training.

    Featured resources

    • OpenShift AI learning
    • Open source AI for developers
    • AI product application development
    • Open source-powered AI/ML for hybrid cloud
    • AI and Node.js cheat sheet

    Red Hat AI Factory with NVIDIA

    • Red Hat AI Factory with NVIDIA is a co-engineered, enterprise-grade AI solution for building, deploying, and managing AI at scale across hybrid cloud environments.
    • Explore the solution
  • Learn

    Self-guided

    • Documentation
      Find answers, get step-by-step guidance, and learn how to use Red Hat products.
    • Learning paths
      Explore curated walkthroughs for common development tasks.
    • Guided learning
      Receive custom learning paths powered by our AI assistant.
    • See all learning

    Hands-on

    • Developer Sandbox
      Spin up Red Hat's products and technologies without setup or configuration.
    • Interactive labs
      Learn by doing in these hands-on, browser-based experiences.
    • Interactive demos
      Click through product features in these guided tours.

    Browse by topic

    • AI/ML
    • Automation
    • Java
    • Kubernetes
    • Linux
    • See all topics

    Training & certifications

    • Courses and exams
    • Certifications
    • Skills assessments
    • Red Hat Academy
    • Learning subscription
    • Explore training
  • Build

    Get started

    • Red Hat build of Podman Desktop
      A downloadable, local development hub to experiment with our products and builds.
    • Developer Sandbox
      Spin up Red Hat's products and technologies without setup or configuration.

    Download products

    • Access product downloads to start building and testing right away.
    • Red Hat Enterprise Linux
    • Red Hat AI
    • Red Hat OpenShift
    • Red Hat Ansible Automation Platform
    • See all products

    Featured

    • Red Hat build of OpenJDK
    • Red Hat JBoss Enterprise Application Platform
    • Red Hat OpenShift Dev Spaces
    • Red Hat Developer Toolset

    References

    • E-books
    • Documentation
    • Cheat sheets
    • Architecture center
  • Community

    Get involved

    • Events
    • Live AI events
    • Red Hat Summit
    • Red Hat Accelerators
    • Community discussions

    Follow along

    • Articles & blogs
    • Developer newsletter
    • Videos
    • Github

    Get help

    • Customer service
    • Customer support
    • Regional contacts
    • Find a partner

    Join the Red Hat Developer program

    • Download Red Hat products and project builds, access support documentation, learning content, and more.
    • Explore the benefits

Use kube-burner to measure Red Hat OpenShift VM and storage deployment at scale

September 4, 2024
Jenifer Abrams
Related topics:
Automation and managementVirtualization
Related products:
Red Hat OpenShiftRed Hat OpenShift Container Platform

    Scale testing is critical for understanding how a cluster will hold up under production load. Generally, you may want to scale test to reach a certain max density as the end goal, but it is often also useful to scale up from smaller batch sizes to observe how performance may change as the overall cluster becomes more loaded. Those of us that work in the area of performance analysis know there are many ways to measure a workload and standardizing on a tool can help provide more comparable results across different configurations and environments. 

    This article will take users through the process of using the Red Hat performance and scale team’s workload tool called kube-burner (which has been accepted as a CNCF sandbox project) to test deployments at scale. While you can learn more about all the ways kube-burner can be used for scalability testing, including how the tool is extended for egress coverage, this guide will focus on customizing a kube-burner workload for virtual machine (VM) deployment at scale on Red Hat OpenShift, with an additional focus on storage attachments and cloning. 

    The goal of this workload example is to exercise a few different areas of the OpenShift cluster and virtualization control plane, as well as stress storage functionality provided by the Container Storage Interface (CSI) and underlying StorageClass through the use of DataVolume cloning and multiple disk attachments in each VM. This guide will walk through all the details of running a custom workload so that the examples provided can be extended in many different ways by changing the underlying templates and job actions.

    Workload tool overview

    The kube-burner tool can be built from source or users can download the latest release binary from the repository.

    The tool can be executed in a number of ways, for instance by using the OpenShift wrapper functionality that can run predefined workload types and index the results into an ElasticSearch instance. Here is one example:

    ./kube-burner-ocp cluster-density-v2 --qps=20 --burst=20 --gc=true --iterations=50 --churn=false --es-server=https://my-es.com --es-index=my-kube-burner

    However, in order to simplify the infrastructure needs for the custom examples provided in this guide, we will instead use the local indexer type which will save the run summary and metric data to a local directory.  

    The examples below will focus on running a custom workload which can be launched using the init action and the -c configuration flag pointing to a custom YAML defining the workload, for example:

    ./kube-burner init  -c my-workload.yml

    The tool provides many configuration options for the workload as described in the reference section, including what measurements and metrics to collect, submission options that control how the objects are created, and even some options around "churning" jobs. This guide will walk through some of the workload configuration options in more detail with a full custom workload example. 

    Kube-burner jobTypes can be used to measure the deletion of certain objects as part of the workload, and users can also clean up all objects created by the benchmark using the destroy action, by passing the uuid of the run to be deleted, for example:

    ./kube-burner destroy  --uuid=<run_uuid>

    Finally, users can control the measurements and metrics collected during the run, and the thresholds to be considered a "passing" run. In these examples our main focus is vmiLatency, reported in milliseconds, which enables the collection of latency measurements for various stages of the pod, VM, and VMI objects throughout their lifecycle. This measurement includes a breakdown of latency for the virt-launcher pod creation, scheduling, initialization, and ready stages, as well as the VMI (running instance of the VM object) creation, scheduling, pending, and ready stages until the full “VMReady” state is reached. 

    The rest of this guide will go through the workload customization details and finally bring it all together with an example full run and result analysis. 

    Custom workload

    In this case, the goal is to customize a workload to stress both VM and storage deployments. 

    Note that for clusters where VM deployment at scale is the goal, for best performance it is recommended to enable the "HighBurst" setting in OpenShift Virtualization:

    oc patch -n openshift-cnv hco kubevirt-hyperconverged --type=json -p='[{"op": "add", "path": "/spec/tuningPolicy", "value": "highBurst"}]' 

    See the Tuning and Scaling guide to learn more. 

    Templates

    In order for the custom workload configuration to deploy the desired objects, the YAML definitions should be created with any appropriate overrides. In this case all the files are stored in a local templates directory (this path can be customized in the workload configuration file).

    This example is defining 3 types of objects. Variables are used to provide convenient user overrides, however aren’t necessary in the definitions other than the Replica variable for objects that should be scaled up (the VM definition in this case).

    DataVolume Source

    In this workload configuration, the goal is to create a VM boot source that can easily be cloned to start many VMs in parallel. In this case a qcow2 is being referenced by an HTTP URL to provide this source image, an example URL is provided later in the workload configuration. 

    Here is the full templates/dv-source.yml example:

    apiVersion: cdi.kubevirt.io/v1beta1
    kind: DataVolume
    metadata:
      name: dv-source
    spec:
      source:
        http:
            ## source image url:
            url: "{{.url}}"
      pvc:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: {{.storage}}
        volumeMode: {{.volumemode}}
        storageClassName: {{.storageclass}}

    DataVolume VolumeSnapshot

    Refer to any recommendations provided by your StorageClass provisioner on how to efficiently clone volumes. In this example case we are using OpenShift Data Foundation which supports VolumeSnapshots, each VM disk clone will reference this snapshot as the source.

    Here is the full templates/dv-volsnap.yml example:

    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshot
    metadata:
      name: dv-volsnap
    spec:
      volumeSnapshotClassName: {{.volsnapclass}}
      source:
        persistentVolumeClaimName: dv-source

    VirtualMachine

    There is plenty of flexibility in the exact VM definition, but the key for the scale testing goal covered here is to exercise source image cloning, which is achieved by using a snapshot source in the dataVolumeTemplates section, and to stress additional PVC attachments, which is achieved by creating an additional blank disk—more blank disks could be added as needed to test more attachments per VM. Note that in this configuration, the dataVolumeTemplates are defined inside the VM definition, meaning that the clone and blank PVC follows the lifecycle of the VM object which is the goal of this particular scale test, however the DataVolume definitions could be separated from the VM if desired. 

    Here is the full templates/vm-dv.yml example:

    apiVersion: kubevirt.io/v1
    kind: VirtualMachine
    metadata:
      labels:
        kubevirt-vm: vm-{{.name}}-{{.Replica}}
      name: {{.name}}-{{.Replica}}
    spec:
      ## controls if the VM boots up on creation:
      running: {{.VMIrunning}}
      template:
        metadata:
          labels:
            kubevirt-vm: vm-{{.name}}-{{.Replica}}
            kubevirt.io/os: {{.OS}}
        spec:
          domain:
            ## vcpus:
            cpu:
              cores: {{.cpuCores}}
           ## create 3 VM disks (rootdisk clone, cloudinit, blank)
            devices:
              disks:
              - disk:
                  bus: virtio
                name: rootdisk
              - disk:
                  bus: virtio
                name: cloudinitdisk
              - disk:
                  bus: virtio
                name: blank-1
              ## default network interface:
              interfaces:
              - masquerade: {}
                model: virtio
                name: default
              networkInterfaceMultiqueue: true
              rng: {}
            ## VM memory size:
            resources:
              requests:
                memory: {{.memory}}
          ## default pod network:
          networks:
          - name: default
            pod: {}
          ## volume definitions for the VM disks:
          volumes:
          - dataVolume:
              name: dvclone-{{.Replica}}
            name: rootdisk
          - cloudInitNoCloud:
              userData: |-
                #cloud-config
                password: fedora
                chpasswd: { expire: False }
            name: cloudinitdisk
          - dataVolume:
              name: blank-1-{{.Replica}}
            name: blank-1
      ## Data Volume population of the volumes:
      dataVolumeTemplates:
      ## rootdisk:
      - metadata:
          name: dvclone-{{.Replica}}
        spec:
         source:
           snapshot:
            name: dv-volsnap
         storage:
           accessModes:
            - ReadWriteMany
           resources:
            requests:
              storage: {{.storage}}
           storageClassName: {{.storageclass}}
           volumeMode: {{.volumemode}}
      ## blank disk:
      - metadata:
          name: blank-1-{{.Replica}}
        spec:
         source:
           blank: {}
         pvc:
           accessModes:
            - ReadWriteMany
           resources:
            requests:
              storage: 500Mi
           storageClassName: {{.storageclass}}
           volumeMode: {{.volumemode}}

    Workload configuration

    Configuring the workload definition and job order is important to deploy the objects in a functional manner and to measure the intended operations. 

    Here is the full vm-dvclone-density.yml workload configuration example. Each section is explained in more detail below:

    metricsEndpoints:
      - indexer:
          type: local
    global:
      measurements:
      - name: vmiLatency
    jobs:
      - name: dv-source
        jobType: create
        jobIterations: {{.ITERATIONS}}
        namespacedIterations: true
        namespace: vm-density
        qps: {{.QPS}}
        burst: {{.BURST}}
        maxWaitTimeout: 1h
        jobPause: 5m
        objects:
        - objectTemplate: templates/dv-source.yml
          replicas: 1
          inputVars:
            url: https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Cloud/x86_64/images/Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2
            storage: 10Gi
            storageclass: ocs-storagecluster-ceph-rbd
            volumemode: Block
        - objectTemplate: templates/dv-volsnap.yml
          replicas: 1
          inputVars:
            volsnapclass: ocs-storagecluster-rbdplugin-snapclass
      - name: vm-density
        jobType: create
        jobIterations: {{.ITERATIONS}}
        qps: {{.QPS}}
        burst: {{.BURST}}
        namespacedIterations: true
        namespace: vm-density
        maxWaitTimeout: 1h
        jobPause: 1m
        objects:
        - objectTemplate: templates/vm-dv.yml
          replicas: {{ .OBJ_REPLICAS }}
          inputVars:
            name: vm-dv-density
            OS: fedora
            cpuCores: 1
            memory: 1G
            storage: 10Gi
            storageclass: ocs-storagecluster-ceph-rbd
            volumemode: Block
            VMIrunning: true
      - name: delete-job
        jobType: delete
        waitForDeletion: true
        qps: {{.QPS}}
        burst: {{.BURST}}
        jobPause: 1m
        objects:
        - kind: VirtualMachine
          labelSelector: {kube-burner-job: vm-density}
          apiVersion: kubevirt.io/v1
        - kind: VirtualMachineInstance
          labelSelector: {kube-burner-job: vm-density}
          apiVersion: kubevirt.io/v1
        - kind: Pod
          labelSelector: {kubevirt.io: virt-launcher}
          apiVersion: v1
        - kind: Namespace
          labelSelector: {kube-burner-job: vm-density}

    First, the configuration starts by defining a "local" indexer type (i.e., save to local directory) and the desired vmiLatency measurement. Then each jobType is defined in order. Note that each job could be further configured by changing any of the default parameters.

    Some general parameters to understand for these jobs:

    • namespacedIterations: true means that each set of object replicas (i.e., an "iteration") will be created in a new namespace, the total number of namespaces is controlled by jobIterations.
    • qps and burst settings control how quickly kube-burner will submit the object creation queries—the example run below will set these parameters to 1,000 which allows a high creation rate so that the benchmark is not introducing throttling itself.
    • The objects section describes what objects are to be associated with that job (based on the template definitions).
    • maxWaitTimeout controls how long the benchmark waits before it may determine a failure.
    • jobPause is a pause introduced after the job is complete, pausing after each job type can be useful to help separate actions for observability and measurement.
    • Optionally, a jobIterationDelay could also be added to control the run in smaller "batch" increments, this delay runs between each job iteration and each delete operation.

    The first job creates the DataVolume source and VolumeSnapshot, in this case it is configured to create a single source and snapshot per benchmark namespace (i.e., replica 1). The storageclass and volsnapclass are set to use OpenShift Data Foundation defaults in this configuration and an example Fedora qcow2 URL is provided. Note that if there is a very long delay in data source import actions, the jobPause may need to be increased before moving on to the next stage. 

    The second job creates the VMs, which in turn (based on the example object definition) create the snapshot clones and empty PVCs for the VM disks. This job is also namespaced, and in this case $OBJ_REPLICAS number of VMs are created per $ITERATIONS number of namespaces. This job specifically starts the VMs as running: true and waits for the VMI (VirtualMachineInstance) Running state for the latency measurements. 

    Finally, this configuration measures the deletion of objects with the delete-job. Note that currently this doesn’t delete the DataVolume sources and VolumeSnapshots, although the kube-burner destroy action can be used after a run, and the default behavior is cleanup: true which will delete previously created namespaces on the next run.

    VM and storage deployment at scale

    Below is a full walk through of running the custom workload:

    wget https://github.com/kube-burner/kube-burner/releases/download/v1.10.1/kube-burner-V1.10.1-linux-x86_64.tar.gz
    tar zxvf kube-burner-V1.10.1-linux-x86_64.tar.gz
    # create the following files in the local directory, as shown above:
    ## vm-dvclone-density.yml  (workload config file)
    ## templates/dv-source.yml
    ## templates/dv-volsnap.yml
    ## templates/vm-dv.yml

    Run the workload using the example configurations detailed above and define any variables, in the example below the benchmark will create 10 namespaces, each with 50 VMs, in parallel (i.e., 500 total VMs, each VM has 1 boot drive PVC and 1 blank PVC). 

    Keep in mind that max object density is dependent on each cluster environment including total schedulable resources, storage capacity, and total pod capacity as determined by the maxPods setting on each worker. 

    Note that the benchmark is executed against the cluster defined by the $KUBECONFIG environment variable, $HOME/.kube/config, or in-cluster config, in that order:

    # ITERATIONS = number of namespaces to create
    # OBJ_REPLICAS = number of VMs to create per namespace
    QPS=1000 BURST=1000 ITERATIONS=10 OBJ_REPLICAS=50 ./kube-burner init  -c vm-dvclone-density.yml

    The tool will log status as the run progresses, and assuming all VMIs successfully reach Running state the workload will end with a message similar to the following:

    level=info msg="👋 Exiting kube-burner <run_uuid>" file="kube-burner.go:85"

    Since the configuration is using the local indexing type, the results will be stored in a local directory called collected-metrics. This will include some general information about the run in jobSummary.json, detailed per object measurements in podLatencyMeasurement-vm-density.json and finally a summary of the latency performance in podLatencyQuantilesMeasurement-vm-density.json which includes max, avg, and P99/P95/P50 measurements (in ms) for various "quantileName"s for the different stages of the pod and VM lifecycle. In this case the "VMReady" measurement is the most interesting in terms of total VM deployment performance, however the other measurements may be of interest to understanding the overall behavior as well. 

    Adding Metrics

    Optionally, metric data can also be captured during a workload by adding a metric endpoint definition.

    The following section walks through how to add metrics collection to the example run shown above, in this case we will use the kubevirt-metrics.yaml metric definitions provided by the kube-burner repository, but any set of metric definitions can be configured as needed:

    # Save the metric definition file to the local directory:
    wget https://raw.githubusercontent.com/kube-burner/kube-burner/main/examples/metrics-profiles/kubevirt-metrics.yaml
    # Create the metric endpoint definition, referencing the metric file:
    cat <<EOF >>  metrics-endpoint.yml
    - endpoint: {{ .PROM }}
      token: {{.PROM_TOKEN}}
      metrics:
        - kubevirt-metrics.yaml
      indexer:
        type: local
    EOF
    # Run the workload:
    ## provide the PROM url and PROM_TOKEN for the cluster as shown
    ## add the -e flag for the metrics endpoint configuration as shown
    PROM=https://$(oc get route -n openshift-monitoring prometheus-k8s -o jsonpath="{.spec.host}") PROM_TOKEN=$(oc create token -n openshift-monitoring prometheus-k8s) QPS=1000 BURST=1000 ITERATIONS=10 OBJ_REPLICAS=50 ./kube-burner init  -c vm-dvclone-density.yml -e metrics-endpoint.yml

    Since the local indexer type is used in these examples, after the run completes it will produce multiple JSON files in the collected-metrics directory along with the run results detailing all the data captured from the metric queries. 

    Conclusion

    Hopefully this guide has provided some useful end-to-end examples walking through creation of a custom kube-burner workload to test VM deployments at scale while also exercising storage scalability. The examples can be modified as needed to adjust to other specific testing needs, but the general workflow is to determine which objects need to be created, the order of the job actions, and the metrics and measurements of interest. The same sort of workload configuration could be created for a pod and PVC scale case. 

    In many VM and CSI scale cases it is also important to exercise scaling in a multi-worker environment and, ideally, to further stress the underlying compute and storage resources by running a workload in the VMs so that they perform some amount of I/O to the attached PVCs. Also keep in mind that scaling performance evaluations during OpenShift cluster, virtualization operator, and storage operator upgrades remains an important use case to harden for your environment. 

    These scale testing methods can be used to develop best practices and recommendations (such as batch sizing) when deploying many VMs and storage attachments in parallel, depending on the cluster configuration, desired density, and underlying storage provider. This can help cluster administrators determine what density and deployment rate achieves efficient VM startup times, depending on user needs. 

    Happy benchmarking! 

    Related Posts

    • Test Kubernetes performance and scale with kube-burner

    • Egress IP Scale Testing in OpenShift Container Platform

    • Scale testing image-based upgrades for single node OpenShift

    • Enable OpenShift Virtualization on Red Hat OpenShift

    Recent Posts

    • Debugging image mode with Red Hat OpenShift 4.20: A practical guide

    • EvalHub: Because "looks good to me" isn't a benchmark

    • SQL Server HA on RHEL: Meet Pacemaker HA Agent v2 (tech preview)

    • Deploy with confidence: Continuous integration and continuous delivery for agentic AI

    • Every layer counts: Defense in depth for AI agents with Red Hat AI

    What’s up next?

    Learn how to create and manage your virtual machines (VMs) using Red Hat OpenShift and the Developer Sandbox.

    Start the activity
    Red Hat Developers logo LinkedIn YouTube Twitter Facebook

    Platforms

    • Red Hat AI
    • Red Hat Enterprise Linux
    • Red Hat OpenShift
    • Red Hat Ansible Automation Platform
    • See all products

    Build

    • Developer Sandbox
    • Developer tools
    • Interactive tutorials
    • API catalog

    Quicklinks

    • Learning resources
    • E-books
    • Cheat sheets
    • Blog
    • Events
    • Newsletter

    Communicate

    • About us
    • Contact sales
    • Find a partner
    • Report a website issue
    • Site status dashboard
    • Report a security problem

    RED HAT DEVELOPER

    Build here. Go anywhere.

    We serve the builders. The problem solvers who create careers with code.

    Join us if you’re a developer, software engineer, web designer, front-end designer, UX designer, computer scientist, architect, tester, product manager, project manager or team lead.

    Sign me up

    Red Hat legal and privacy links

    • About Red Hat
    • Jobs
    • Events
    • Locations
    • Contact Red Hat
    • Red Hat Blog
    • Inclusion at Red Hat
    • Cool Stuff Store
    • Red Hat Summit
    © 2026 Red Hat

    Red Hat legal and privacy links

    • Privacy statement
    • Terms of use
    • All policies and guidelines
    • Digital accessibility

    Chat Support

    Please log in with your Red Hat account to access chat support.