Scale testing image-based upgrades for single node OpenShift

June 28, 2024
Alex Krzos
Related topics: Automation and management, Edge computing, GitOps, Kubernetes, Operators
Related products: Red Hat Advanced Cluster Management for Kubernetes, Red Hat OpenShift

    Image-based upgrades (IBU) are a developer preview feature in Red Hat OpenShift Container Platform 4.15 that reduce the time required to upgrade a single node OpenShift cluster. An image-based upgrade can perform both Z-stream and Y-stream upgrades, include operator upgrades in the image, and roll back to the previous version manually or automatically upon failure. An image-based upgrade can also upgrade OpenShift Container Platform directly from 4.Y to 4.Y+2, whereas a traditional OpenShift upgrade requires two separate upgrades to reach the same result (4.Y to 4.Y+1 to 4.Y+2).

    An image-based upgrade consists of several stages that the Lifecycle Agent takes a cluster through: prep, upgrade, rollback, and idle. The prep stage can be completed outside a maintenance window; it downloads the seed image and pre-caches the additional images required to perform the upgrade. The upgrade stage performs the upgrade, and the rollback stage undoes an upgrade, returning the cluster to the previous version. The idle stage is reached via a finalize policy, which can be initiated after an upgrade or rollback stage completes and brings the cluster back to a state in which a new image-based upgrade can be started.
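
    The stage flow described above can be sketched as a small state machine. This is only a simplified model of the prose, not the Lifecycle Agent's implementation: the `Stage` enum, the `ALLOWED` table, and the helper names are assumptions made for illustration, and the real agent may permit transitions that are not mentioned in this article.

```python
from enum import Enum


class Stage(Enum):
    IDLE = "Idle"
    PREP = "Prep"
    UPGRADE = "Upgrade"
    ROLLBACK = "Rollback"


# Transitions as described in the text above (assumed, simplified model).
ALLOWED = {
    Stage.IDLE: {Stage.PREP},                     # start a new image-based upgrade
    Stage.PREP: {Stage.UPGRADE},                  # seed image downloaded, images pre-cached
    Stage.UPGRADE: {Stage.ROLLBACK, Stage.IDLE},  # roll back, or finalize
    Stage.ROLLBACK: {Stage.IDLE},                 # finalize after a rollback
}


def next_stage(current: Stage, requested: Stage) -> Stage:
    """Return the requested stage if the move is allowed, else raise ValueError."""
    if requested not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {requested.value}")
    return requested


def walk(path):
    """Start from Idle and apply each requested stage in order."""
    stage = Stage.IDLE
    for requested in path:
        stage = next_stage(stage, requested)
    return stage


# A successful upgrade followed by finalize ends back at Idle.
print(walk([Stage.PREP, Stage.UPGRADE, Stage.IDLE]).value)
```

    In this model, a cluster cannot jump from idle straight to upgrade, which mirrors the article's point that prep must finish first and can happen outside the maintenance window.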

    As part of scale and performance testing, we run thousands of upgrades to find bugs, determine success rates, and generally stress test an environment. Scalability and performance testing finds bugs that appear only while a system is under immense stress, surfaces rare bugs such as race conditions, validates that a system can complete a function at a specified scale, and determines resource requirements.

    Test goal

    The main goal of this round of scale testing image-based upgrades is to ensure that Red Hat Advanced Cluster Management for Kubernetes (RHACM) in combination with Topology Aware Lifecycle Manager (TALM) can orchestrate upgrades of 3,500 single node OpenShift clusters. The time spent per cluster upgrade is then compared to traditional cluster upgrade data points from similar fleet upgrade tests completed in RHACM Zero Touch Provisioning (ZTP) scale tests.

    Test method

    To accomplish this, we set up a hybrid testbed that includes a 3-node bare metal OpenShift hub cluster with RHACM, TALM, and GitOps installed. Another 136 machines serve as hypervisors that host the single node OpenShift (SNO) virtual machines. The VMs are prepared with blank disks so that the Infrastructure Operator, which is included with RHACM, can install OpenShift on them via virtual media. The hub cluster then uses a combination of GitOps and ZTP to deploy 3,500 virtual machine SNOs across the hypervisors.

    Once initial deployment is complete, we use a PolicyGenTemplate to create RHACM policies for image-based upgrades. TALM orchestrates these policies using a ClusterGroupUpgrade (CGU) resource to perform image-based upgrades across the fleet of SNOs. To orchestrate the stages of IBU across 3,500 clusters, the fleet is initially split into 7 groups of 500 clusters. Each group is defined by the clusters listed in its prep CGU's spec.clusters array, and the later stages use labels applied via successful policies in the CGU. When a cluster completes the prep stage, it is labeled so that the associated upgrade CGU selects only clusters that completed prep, via the CGU's spec.clusterSelector. This makes it possible to set failed clusters aside for analysis and for opening bugs.
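
    The grouping and label-based selection described above can be simulated in plain Python. This is a rough sketch, not the manifests used in the test: the `Cluster` class, the cluster names, the `ibu-prep-done-N` label, and the single pretend failure are all hypothetical. The explicit group list stands in for a prep CGU's spec.clusters, and the label filter stands in for an upgrade CGU's spec.clusterSelector.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    labels: dict = field(default_factory=dict)


def split_into_groups(clusters, group_size):
    """Partition the fleet into consecutive groups of at most group_size."""
    return [clusters[i:i + group_size] for i in range(0, len(clusters), group_size)]


def label_prepped(group, group_index, prep_succeeded):
    """Label only the clusters whose prep stage succeeded.

    Stands in for the labels the article says are applied via successful
    policies; the label key is a hypothetical name.
    """
    for cluster in group:
        if prep_succeeded(cluster):
            cluster.labels[f"ibu-prep-done-{group_index}"] = "true"


def select_by_label(fleet, key):
    """Stand-in for label selection by an upgrade CGU."""
    return [c for c in fleet if key in c.labels]


# Hypothetical cluster names for a fleet of 3,500 SNOs.
fleet = [Cluster(f"sno{i:05d}") for i in range(1, 3501)]
groups = split_into_groups(fleet, 500)

# Pretend one cluster in the first group fails prep (illustrative only).
label_prepped(groups[0], 0, lambda c: c.name != "sno00042")
upgrade_targets = select_by_label(fleet, "ibu-prep-done-0")

print(len(groups), len(upgrade_targets))  # 7 499
```

    The cluster that failed prep never receives the label, so the upgrade step skips it and it remains in its prep state for analysis.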

    Each stage is allowed to run to completion before the next stage begins, which leaves time to run diagnostic and data collection scripts that evaluate each stage. In practice, we apply 7 CGUs to move all clusters into the prep stage and wait for them to complete. After the prep stage completes for all 3,500 clusters, we begin upgrading the fleet with the upgrade CGUs.

    To upgrade the fleet, we chose a rate of 1 upgrade CGU applied every 15 minutes, each with a 30-minute timeout. This means we expected every cluster in a CGU group to have either upgraded or failed to upgrade within 30 minutes of the CGU being applied and enabled. Given the cluster count (3,500, in 7 groups of 500), the rate of 1 CGU every 15 minutes, and the timeout, the entire fleet upgrade could take at most 2 hours: the seventh CGU is applied at the 90-minute mark and times out 30 minutes later. Additionally, diagnostic data was collected from every SNO that completed an upgrade to determine its individual upgrade time.
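
    The 2-hour upper bound follows directly from the numbers above. A short sketch of the arithmetic, where the function name and parameters are illustrative and the first CGU is assumed to be applied at minute 0:

```python
import math


def fleet_upgrade_window(total_clusters, group_size, interval_min, timeout_min):
    """Return (number of CGUs, worst-case minutes from first CGU to last timeout)."""
    cgu_count = math.ceil(total_clusters / group_size)
    # The first CGU is applied at minute 0, so the last one starts after
    # (cgu_count - 1) intervals and may run until its timeout expires.
    last_start = (cgu_count - 1) * interval_min
    return cgu_count, last_start + timeout_min


cgus, window = fleet_upgrade_window(3500, 500, 15, 30)
print(cgus, window)  # 7 120
```

    Because the timeout is longer than the interval, up to two CGUs (1,000 clusters) can be upgrading at the same time in this schedule.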

    Test results

    After completing several iterations and gathering IBU duration data, we observed 76% to 87.5% less time spent upgrading each individual cluster than with traditional upgrades, across all statistics computed over thousands of upgrade durations. With IBU, at the 99th percentile of cluster upgrade times, our VM-based clusters completed an upgrade in less than 14 minutes. See Figure 1.

    Figure 1: IBU duration data test results (image-based upgrade vs. traditional upgrade).
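
    For readers who want to compute the same kind of statistics on their own upgrade data, here is a minimal sketch of a percent reduction and a nearest-rank percentile. The durations below are hypothetical placeholders chosen only to exercise the functions; they are not measurements from this test, and the article does not state which percentile method was used.

```python
import math


def percent_reduction(baseline, new):
    """Percentage of time saved relative to the baseline duration."""
    return (baseline - new) / baseline * 100.0


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100.0 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Hypothetical durations in minutes, NOT data from the test.
ibu_minutes = [10.0, 11.0, 12.0, 13.0]

print(percent_reduction(80.0, 10.0))             # 87.5
print(percentile_nearest_rank(ibu_minutes, 99))  # 13.0
```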

    Conclusion

    In conclusion, we found that RHACM and TALM can upgrade 3,500 SNOs significantly faster via image-based upgrades. IBU saves even more time if you use its ability to skip a Y-stream release and upgrade directly to 4.Y+2, and if you include Operator upgrades in the image. This testing was artificially limited to a rate of 500 clusters every 15 minutes, so we will continue to test larger groups to find the limits of IBU, RHACM, and TALM.
