Ensure a scalable and performant environment for ROSA with hosted control planes

May 30, 2024
Russell Zaleski, Murali Krishnasamy, David Sanz Moreno, Mohit Sheth
Related topics:
Kubernetes, Secure Coding
Related products:
Red Hat OpenShift, Red Hat OpenShift Service on AWS

    Red Hat OpenShift Service on AWS (ROSA) is a turn-key, performant solution for running managed Red Hat OpenShift on Amazon Web Services (AWS). It lets users focus more on their applications and less on the infrastructure or platform they run on, helping drive developer innovation.

    Ensuring that OpenShift is performant and scalable is a core tenet of the OpenShift Performance and Scale team at Red Hat. Prior to its release (and still to this day), ROSA undergoes a vast array of performance and scale testing to ensure that it delivers industry-leading performance. These tests run the gamut from control plane and data path focus to upgrades and network performance. They have been used to measure and improve the performance of “classic” ROSA, but what happens when we move to hosted control planes?

    What are hosted control planes?

    Hosted control planes (HCPs) are a new deployment model for ROSA that delivers an innovative way to run the OpenShift control plane as a hosted service, with the power of the HyperShift open source project.

    This deployment model allows the control plane pods to run separately from the rest of the cluster, packed onto a single OpenShift management cluster (MC) that is managed by Red Hat and totally transparent to the user. From a customer's point of view, you get a more application-focused experience, faster deployment times, and a better value proposition: compared to classic ROSA, you pay a flat fee instead of running and managing the control plane yourself. This is just the tip of the iceberg of what ROSA with HCP provides (for additional information, give the announcement blog post a read).

    Performance and scalability’s role

    While ROSA with HCP is a great step forward, it was not without a lot of toil from countless folks across a wide range of teams. During the development of ROSA with HCP, the Red Hat OpenShift Performance and Scale team collaborated heavily cross-functionally on the early testing and successful general availability of hosted control planes. The team developed new tools and leveraged existing ones that provided scalable and consistent testing options, crafted test cases optimized for managed service environments, performed end-to-end testing of the infrastructure for accurate capacity planning, and pushed the hosted control plane environment to its limits. All this was done to ensure that our customers have the best possible experience on ROSA with HCP from day 1.

    The right tool for the job

    There is a longstanding saying: "use the right tool for the job." If you are hanging a photo on a wall, you might need a hammer. If you are cutting a log, you might need a saw. At the core of our tooling are HCP-Burner and Kube-burner, two performance and scale test orchestration tools.

    HCP-Burner was created as a wrapper to automate the create-use-destroy lifecycle of OpenShift clusters on different managed platforms like ROSA and Azure Red Hat OpenShift, and specifically HCP environments. With this tool, we can spin up hundreds of HCPs, run a designated workload on them, and index their results to our Elasticsearch environment for storage, retrieval, and subsequent analysis. This gives us a consistent way to measure the usage of the HCP environment as well as track key performance indicators (KPIs) such as installation times, cluster utilization, and more.

    In conjunction with HCP-Burner we make heavy use of Kube-burner. Kube-burner is a Kubernetes performance and test orchestration toolset that provides an easy way to create a large number of intertwined objects, measure KPIs specific to the test, and alert on important metrics. For HCP testing, we actively used Kube-burner to generate and churn varying types and degrees of load on our HCPs. To this end, a targeted managed OpenShift platform test was crafted with actual customer usage in mind.
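
    To make this concrete, below is a minimal sketch of what a Kube-burner job definition can look like. The template file name and the specific values are illustrative, not the exact configuration used in these tests.

     jobs:
       - name: cluster-density-example    # illustrative job name
         jobIterations: 50                # how many namespaces to iterate through
         namespacedIterations: true
         namespace: cluster-density
         qps: 20                          # API request rate while creating objects
         burst: 40
         churn: true                      # periodically delete and recreate a slice of objects
         churnPercent: 10
         churnDuration: 20m
         objects:
           - objectTemplate: deployment.yml   # hypothetical object template
             replicas: 2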

    Testing

    To ensure a stable and performant environment, the team established a list of KPIs, as earlier noted. Some of these key items and their results are discussed below.

    Hosted control plane ready time

    One of the key tenets of HCPs is their faster installation times compared to classic or traditional OpenShift clusters. We wanted to devise a test to measure our performance on that promise. Before getting into the nuts and bolts of ready time, let's take a moment to understand the potential states of an HCP during installation and what ready means for the purposes of these tests.

    HCP lifecycle states:

    • Waiting: Normally seen immediately following an installation request and generally lasting only a few moments while the cluster service makes some checks and placement decisions.
    • Validating: Following the waiting state, validating occurs when access to resources is being confirmed in the customer’s AWS environment. 
    • Installing: Installing is where most of the time is spent leading up to an HCP being ready.
    • Ready: The ready state means that the Kubernetes API server is up and running and access has been established, though provisioning of HCP worker nodes may still be in progress.
    • Error: When an HCP encounters an issue during any of the regular states, it can end up in an error state.

    Knowing these potential states allows us to accurately track the time spent from when a user requests a cluster up through it being declared as ready. With this knowledge we can determine if any particular state exceeds its expected runtime as well as identify targets for potential improvement.

    One such test that shows this is the 320 hosted cluster test. During this test, 320 hosted clusters were requested in 4 batches of 80, with a 1-minute delay between each HCP install request. Once the HCPs were up, they would begin running a cluster-density-style workload via Kube-burner, based on the HCP-specific installation parameters. After this, the clusters were cleaned up in the same fashion as they were installed.

    Let's analyze the durations of the HCP preflight checks, which cover the time spent in the waiting and validating states during the pre-provisioning phase. A hosted cluster sits in the waiting state while the cluster service picks the most suitable management cluster to host it. Subsequently, it transitions to the validating state to ensure that the underlay network in the customer's AWS environment is appropriately configured to handle HCP functionality. This involves deploying test instances and conducting checks to validate connectivity back to the management cluster.

    Looking at the waiting time for all these clusters, we can quickly see a stair-step pattern (Figure 1). This is concerning, as it implies that increased cluster counts bring increased wait times. While these times are still relatively short (28 seconds at worst), they grew by approximately 2.5 times on average from the first batch to the last. This growth trend is particularly concerning, as it can lead to large delays as the cluster count increases.

    Figure 1: Growth of waiting time during the 320 hosted cluster test.

    Turning our attention to the validating time, there is a much more consistent average across the 320 installations (Figure 2). That said, we can see obvious outliers in the data where the time spent was 2 to 3 times the average. In more extreme cases in other tests, this state was seen to peak at over 20 minutes! That is longer than the entire installation should normally take. We investigated further, and after adjusting the retry interval and timeout of the network verifier script (which validates the underlay AWS networking), we achieved much more consistent and less exaggerated worst-case timings.

    Figure 2: Inconsistent validating time during the 320 hosted cluster test.

    Lastly, looking at these times from a cluster ready perspective (i.e., the cumulative time up to reaching the ready state), the installation timings are fairly consistent, as shown in Figure 3. However, we can see the impact of the spikes from the validating state early in the test; in those cases, total cluster ready time increased by multiple minutes above the average.

    Figure 3: Cluster ready time during the 320 hosted cluster test.

    Ensuring an optimal management cluster size

    As stated earlier, HyperShift utilizes the concept of management clusters, whose worker nodes host the hosted control planes. Ensuring that these worker nodes are sufficiently utilized but not overburdened can mean walking a tightrope. Breaking the problem into two parts, we have to ensure that there is capacity for current and upcoming HCPs while simultaneously being prudent not to waste resources.

    If there is not enough capacity, we potentially have two problems. First, and most obvious, existing HCPs will suffer reduced performance or potentially fail. Utilizing the managed OCP Kube-burner tests, we mapped out the expected usage of the HCP pods by measuring the resource usage of control plane pods under load. With that knowledge, it is fairly simple to determine an ideal average size of worker instances based on expected CPU and memory consumption.

    This would be perfect if it were not for the "noisy neighbor" problem. Not all clusters are used or sized similarly. Some are CI systems, some are build systems, some host elaborate websites, and so on. These application differences result in different usage patterns. The good news is that, given the mixed environment, most of the usage spikes occur at different times and on different workers. However, there are always some clusters that push the limits no matter what. It's here that we rely on two things. First, by appropriately setting the pods' requests and limits, we can curb how aggressive these clusters can be and ensure space for their workloads. Second, we need an appropriate mechanism to isolate and/or migrate the noisy HCP to a larger instance size to allow for its needed usage and growth.
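
    As a sketch of that first lever, this is the standard Kubernetes way to bound a container's consumption; the pod name, image, and values here are hypothetical, not the actual HCP sizing:

     apiVersion: v1
     kind: Pod
     metadata:
       name: kube-apiserver-example    # hypothetical HCP control plane pod
     spec:
       containers:
         - name: kube-apiserver
           image: example.registry/hypershift/kube-apiserver:latest   # placeholder image
           resources:
             requests:           # guaranteed baseline the scheduler accounts for
               cpu: "2"
               memory: 7Gi
             limits:             # hard ceiling that curbs noisy-neighbor spikes
               memory: 12Gi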

    After solving for existing clusters, we can take a look at the second problem: new installations. If there is not sufficient capacity to place an HCP when the installation request comes in, the installation will take longer than normal, as the management cluster will need to provision additional worker instances for the required capacity. Knowing the resources required to run an HCP's pods, we can utilize the Cluster Proportional Autoscaler (CPA) to ensure a buffer of space for N HCPs.

    The CPA spins up lower-PriorityClass pods as temporary placeholders to reserve compute resources. These pod replicas are controlled by CPA config patterns and function as basic pause pods that are evicted whenever higher-PriorityClass pods require resources, similar to an operating system cache. HCP pods can thus utilize available compute resources without delay to host control plane services, while the placeholder pods either migrate to different workers or prompt the CPA to bring up new workers. This approach ensures that the MC maintains sufficient reserved resources for HCP pods in anticipation of demand.
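
    As an illustration of this pattern (the names and sizes below are hypothetical), a low-value PriorityClass plus a Deployment of pause pods is the canonical way to implement such placeholders:

     apiVersion: scheduling.k8s.io/v1
     kind: PriorityClass
     metadata:
       name: hcp-placeholder           # hypothetical name
     value: -10                        # below any real workload, so these pods evict first
     description: "Placeholder pods reserving headroom for incoming HCPs."
     ---
     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: hcp-placeholder           # replica count is managed by the CPA
     spec:
       replicas: 3
       selector:
         matchLabels:
           app: hcp-placeholder
       template:
         metadata:
           labels:
             app: hcp-placeholder
         spec:
           priorityClassName: hcp-placeholder
           containers:
             - name: pause
               image: registry.k8s.io/pause:3.9   # does nothing; only holds the reservation
               resources:
                 requests:
                   cpu: "2"      # matches the per-AZ HCP sizing described below
                   memory: 7Gi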

    For example, if we wanted to size the management cluster in such a way that there is space for up to 2 new HCPs, we could do something like the example below. We pre-calculate the total CPU core and memory requirements for an HCP per availability zone (AZ) and adjust the size of the placeholder pods accordingly. In the example below, each placeholder pod is configured to request 2 CPUs and 7 GiB of memory per AZ, allowing three replicas to reserve resources for one HCP. These pod replicas scale up or down based on the CPA configuration, which supports two patterns: linear and ladder. The advantage of the ladder configuration lies in its ability to scale the placeholder pod replicas at different rates in response to the management cluster size. Moreover, it can be configured to scale down when the MC doesn't anticipate new HCP requests, which is not possible with the linear configuration.

    Note that in this example we are focusing on the number of cores to reserve capacity for new HCPs, stating that a 48-core MC will reserve a buffer for 1 HCP. Additionally, we assume that each MC worker has 16 cores. The ladder configuration ensures buffer space for 1 HCP when the MC is new and has 3 workers, reserves resources for 2 HCPs when the cluster is operating near ~50% capacity, and reduces its allocation when the MC is fully utilized.

     ladder: |-
        {
          "coresToReplicas":
          [
            [ 48, 3 ],     # MC size >= 3 workers: reserve resources for 1 HCP
            [ 192, 6 ],    # MC size >= 12 workers: reserve resources for 2 HCPs
            [ 432, 0 ]     # MC size >= 27 workers: no buffer
          ]
        }
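
    For contrast, a linear configuration (again a sketch, with assumed values) can only express a single fixed ratio of placeholder replicas to cluster size:

     linear: |-
        {
          "coresPerReplica": 16,    # one placeholder replica per 16 cores, at every scale
          "min": 1,
          "max": 10
        }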

    Using the CPA gives us a good amount of flexibility and greatly smooths out the installation times. Figure 4 shows the 80th percentile (P80) installation times with and without the CPA. Using the P80 reduces the outliers in the other stages while still giving a fairly accurate picture of what is going on. As is clearly seen, the P80 installing time goes from 14m 34s down to 8m 49s, a huge improvement of about 5m 45s.

    Figure 4: Cluster ready times with and without the use of CPA.

    Looking at the individual cluster ready times, we can see a similar picture (Figure 5).

    Figure 5: Gaps and inconsistencies when not using CPA.

    Without the CPA, we can clearly see the gaps and rising install times that occur as the normal OpenShift autoscaler attempts to catch up; with a roughly 5-minute installation time per new worker, the delay builds and builds. With the CPA, the picture is much smoother (Figure 6). There will always be some variance, but we eliminate the large spikes seen previously.

    Figure 6: Significantly more consistent results when using CPA.

    Individual HCP performance

    Based on the above test cases, and the optimizations arising from those findings, the management clusters can now sustain the required load. None of that matters, however, if the individual HCP that the customer interfaces with and runs their workloads on doesn't perform or scale on par with a classic ROSA cluster. For this, we used our existing toolset and compared the results to classic ROSA. These tests consisted of control plane and data plane workloads.

    The data path was exercised using the ingress-perf and k8s-netperf tools. ingress-perf has a wealth of options that we use to target ingress/router testing; configuration options include the type of benchmark termination, number of connections, number of samples, duration, and concurrency. We benchmarked edge, http, passthrough, and re-encrypt terminations to ensure their throughput and latency are acceptable.
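
    A rough sketch of what one such test scenario can look like follows; the exact schema and values here are assumptions mirroring the options named above, not the team's actual configuration:

     - termination: edge      # also run with http, passthrough, and reencrypt
       connections: 200
       samples: 5
       duration: 1m
       concurrency: 18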

    Similarly, k8s-netperf is used to ensure coverage of TCP and UDP throughput and latency. These tests are run with message sizes ranging from 64 to 8192 bytes using 1 or 2 connection pairs, executed in scenarios that test node-to-node, pod-to-pod, and pod-to-pod-via-a-service traffic. These networking tests help ensure appropriate performance of the data plane.
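
    A sketch of one such profile is below; the field names are assumed from the tool's sample configuration and the values are illustrative:

     TCPStream:               # assumed profile entry
       profile: "TCP_STREAM"
       parallelism: 2         # 1 or 2 connection pairs, as described above
       duration: 30
       samples: 3
       messagesize: 8192      # swept from 64 to 8192 bytes in the actual tests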

    The control plane was tested in multiple scenarios utilizing Kube-burner's different testing options. First up was a trio of node density tests that measure the 99th percentile of pod ready latency, a KPI representing the time taken for a pod to go from being scheduled to all of its containers running. Node Density Light creates a maximum number of sleep pods on each of the HCP's worker nodes. Node Density CNI spawns replicas of a webserver along with curl pods whose startup probes reflect network flow programming latency by curling the webserver via a ServiceIP. Lastly, Node Density Heavy creates replicas of a database and client application. We can see that at 120 HCP workers the latency is nearly identical to that of classic ROSA (Figure 7).

    Figure 7: 99th percentile pod ready latency during the 120-node Node Density test.
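
    The Node Density CNI mechanism deserves a closer look: a startup probe on the client pod can only succeed once the network programming for the webserver's Service is in place, so probe latency is a good proxy for it. A simplified sketch, with illustrative names and image:

     apiVersion: v1
     kind: Pod
     metadata:
       name: curl-client          # illustrative client pod
     spec:
       containers:
         - name: curl
           image: quay.io/example/curl:latest   # placeholder image with curl installed
           command: ["sleep", "infinity"]
           startupProbe:          # succeeds only once the Service is reachable, so its
             exec:                # latency reflects network flow programming time
               command: ["curl", "--fail", "http://webserver-service:8080/"]
             periodSeconds: 1
             failureThreshold: 600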

    Similarly, Cluster Density (v2) is a density-focused test that creates pods, builds, secrets, routes, and more across the entire cluster. A percentage of these resources is then churned (deleted and recreated) at set intervals to more accurately represent a live environment where changes actively occur. These resources put an appreciable strain on the control plane. We track and measure control plane component resource usage as well as API and etcd latencies to ensure components a) are not regressing in resource utilization and b) are responding to queries in an appropriate amount of time. Lastly, there is an upgrade test that is executed while the cluster density test is running. This ensures that the system can upgrade under load, as it is imperative to upgrade OpenShift to consume critical fixes and the latest and greatest features that come with each successive release.

    Unearthing some unseen challenges

    There will be challenges with any project, and while ROSA with hosted control planes did well overall, it has not all been roses. Many of the issues along the way were minor, but some had a major impact. Luckily, through close collaboration between the quality engineering, development, and project management teams, these were resolved. A few are worth discussing.

    HCP upgrade failures

    The previously mentioned upgrade test is where we hit one of these major issues. While cluster density (v2) by itself passes at the expected hosted cluster size, executing an upgrade while those objects exist produced a critical failure. As can be seen in the hosted cluster CPU and memory utilization charts in Figure 8, there is a massive spike in resource consumption after the control plane is upgraded, driven by the Kubernetes API server. The repeated gaps and new pods show that resources were being exhausted (memory was the main constraint), causing the worker to run out of memory.

    Figure 8: Visible restarting of kube-apiserver due to out-of-memory issues.

    This is obviously a critical failure, as upgrades need to function properly for a healthy HCP lifecycle. The good news is that a fix is on the way: the team implemented GOMEMLIMIT to stop the excessive heap growth of the Kubernetes API server, which was a main cause of the increased consumption.
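
    GOMEMLIMIT is the Go runtime's soft memory limit (available since Go 1.19): as the heap approaches the limit, the garbage collector runs more aggressively instead of letting the heap balloon. Conceptually, the fix amounts to something like the following fragment on the kube-apiserver container, with a hypothetical value:

     containers:
       - name: kube-apiserver
         env:
           - name: GOMEMLIMIT    # Go runtime soft memory limit
             value: "2GiB"       # hypothetical; sized below the container's memory limit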

    Management cluster upgrades

    Another upgrade-related issue was found on the management cluster, the critical back-office infrastructure essential for seamless operation of the HCP solution. The test consists of loading a management cluster to its maximum number of HCPs and then upgrading the MC. HCPs are watched for timeouts and errors that may indicate longer-than-expected outages. During this test we found that nearly half of the HCPs experienced a longer than expected kube-apiserver outage. Digging deeper, we found that while the kube-apiservers were happy as a clam on the management cluster, the HCP router pods were no longer “finding” their corresponding kube-apiservers. Through some great engineering work, this was quickly resolved via RHSA-2023:7599.

    A stable and performant environment

    Thanks to the testing done throughout the project, we were able to deliver a stable and performant ROSA with HCP environment to customers at general availability. There were bumps and bruises along the way, but today it stands tall in its capabilities and scale. Ongoing continuous performance testing through automation will alleviate the risk of regressions and ensure that HCPs can meet the most demanding customer requirements.

    If you are interested in trying OpenShift Service on AWS with hosted control planes, check out our quick start guide.

    Last updated: June 24, 2024
