This is the first in a series of five articles showing how developers can use the extensive set of metrics offered by Red Hat OpenShift to diagnose and fix performance problems. The series follows a real-life success story: the performance testing I did on the Service Binding Operator to get it accepted into the Developer Sandbox for Red Hat OpenShift. I'll describe the operator's performance challenges, how I planned my troubleshooting, and how I created and viewed the metrics.
This first article lays out the motivation for the whole effort, the Service Binding Operator's environment and testing setup, the requirements I had to meet to get the Service Binding Operator accepted into the Developer Sandbox, and the tooling made available by Developer Sandbox for performance testing. In Part 2, we will set up the test environment and I'll introduce the testing scenarios.
Read the whole series:
- Part 1: Performance requirements
- Part 2: The test environment
- Part 3: Collecting runtime metrics
- Part 4: Gathering performance metrics
- Part 5: Test rounds and results (August 5)
My team wanted to make the Service Binding Operator available on the Developer Sandbox for Red Hat OpenShift. Our goal was to use the operator in a demo and workshop titled Connecting to your Managed Kafka instance from the Developer Sandbox for Red Hat OpenShift at the April 2021 Red Hat Summit. We also think this operator is useful to developers experimenting with the sandbox.
One of the requirements to get the Service Binding Operator accepted into the Developer Sandbox was to pass a performance evaluation. The evaluation had to show that the operator wouldn't crash the Developer Sandbox when invoked by a realistic load of active users.
Environment: The Developer Sandbox
Technically, the Developer Sandbox is a couple of operators installed on an ordinary OpenShift cluster, which is an instance of the Red Hat OpenShift Dedicated managed, cloud-based service. Developer Sandbox is scaled to support many concurrent users and their activity. For each developer (active user) registered on the sandbox, two namespaces are created to help the developer try out, play with, and learn about the OpenShift environment.
From the perspective of the Service Binding Operator, the Developer Sandbox is just a regular OpenShift instance. So, making the Service Binding Operator available to Developer Sandbox users is a simple matter of installing the operator into the underlying OpenShift cluster.
There are two ways to install an operator into the sandbox's cluster. The first way takes advantage of the redhat-operators catalog source, which is available out of the box in OpenShift through the Operator Lifecycle Manager. That catalog source hosts the official Red Hat releases of the Service Binding Operator.
The second installation method is specific to OpenShift Dedicated, which offers add-ons for Red Hat tools. But because the Service Binding Operator is still a technology preview (not yet generally available), there is no OpenShift Dedicated add-on for it. So, we decided to go with the first option and install the Service Binding Operator from the official catalog source via the in-cluster OperatorHub.
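The first installation method corresponds to an Operator Lifecycle Manager Subscription that points at the redhat-operators catalog source. The following is a minimal Python sketch, not the exact manifest we used. The package name, channel, and target namespace are assumptions for illustration and should be checked against the catalog on your cluster (for example, with `oc get packagemanifests -n openshift-marketplace`).

```python
import json


def build_subscription(package, channel, namespace="openshift-operators"):
    """Return an OLM Subscription manifest as a plain dict.

    The catalog source name comes from the article; package, channel,
    and namespace are supplied by the caller and are assumptions here.
    """
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": package, "namespace": namespace},
        "spec": {
            "name": package,                 # package name in the catalog
            "channel": channel,              # update channel to follow
            "source": "redhat-operators",    # catalog source named in the text
            "sourceNamespace": "openshift-marketplace",
            "installPlanApproval": "Automatic",
        },
    }


# Hypothetical package and channel names; verify them before use.
subscription = build_subscription("rh-service-binding-operator", "preview")
print(json.dumps(subscription, indent=2))
```

Because `oc apply` accepts JSON as well as YAML, the printed manifest could be piped to `oc apply -f -` once the assumed values have been confirmed.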
Developer Sandbox requirements and limitations
The Developer Sandbox team specified several requirements that the operator had to meet to be accepted and installed on the production instance of Developer Sandbox. Some were operational, but the ones relevant to this series are related to performance.
The following requirements address the Service Binding Operator's integration into OpenShift Dedicated and the Developer Sandbox:
- The operator must not require the creation of any additional namespaces other than its own.
- It must be available on OpenShift Dedicated and able to run on OpenShift Dedicated clusters. (I'll show how to upload the Service Binding Operator in the next article in this series.)
- It must be able to operate with the Red Hat OpenShift Application Services Operator.
The remainder of this series focuses on what I did to meet the performance requirements for the Developer Sandbox. Essentially, we could have only one instance of the Service Binding Operator in the whole cluster, and that instance had to be able to handle up to 3,000 users and up to 10,000 namespaces per cluster. (Consider that the Service Binding Operator requires a namespace of its own, and that each user gets two more.)
As a result, a Developer Sandbox cluster consumes a lot of Kubernetes or OpenShift resources. The 3,000 users and up to 10,000 namespaces can lead to up to 100,000 role bindings, hundreds of thousands of secrets, and thousands of config maps, pods, deployments, build configs, and so on, per cluster. The operator must be able to handle that load without consuming too many compute resources, which would compromise cluster stability.
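To make these limits concrete, here is a short back-of-the-envelope calculation in Python. It uses only the figures quoted above. Treating the gap between the user-driven namespaces and the 10,000 limit as headroom for everything else on the cluster is my own reading, not a number from the Developer Sandbox team.

```python
MAX_USERS = 3_000            # maximum users per cluster
NAMESPACES_PER_USER = 2      # namespaces created for each user
OPERATOR_NAMESPACES = 1      # the Service Binding Operator's own namespace
MAX_NAMESPACES = 10_000      # upper limit on namespaces per cluster
MAX_ROLE_BINDINGS = 100_000  # upper estimate of role bindings per cluster

# Namespaces accounted for by users plus the operator itself.
user_driven = MAX_USERS * NAMESPACES_PER_USER + OPERATOR_NAMESPACES

# What remains under the limit for all other namespaces on the cluster.
headroom = MAX_NAMESPACES - user_driven

# Average role bindings per namespace at the upper limits.
role_bindings_per_namespace = MAX_ROLE_BINDINGS // MAX_NAMESPACES

print(user_driven)                  # 6001
print(headroom)                     # 3999
print(role_bindings_per_namespace)  # 10
```

In other words, the operator has to stay responsive while watching a cluster where each namespace carries roughly ten role bindings on average, on top of the secrets, config maps, and workloads.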
Tooling for performance evaluations
The Developer Sandbox team has its own setup tool, which was originally used to test the Developer Sandbox itself. The tool was made available to other operator teams for conducting performance evaluations. The only prerequisite is access to an OpenShift cluster, of a scale equivalent to what is needed in production, where the tested operators (the Service Binding Operator and the Red Hat OpenShift Application Services Operator in our case) are installed and running. The tool can then do the following:
- Install the sandbox on the OpenShift cluster.
- Simulate a specified number x of developers registering with the sandbox, with y of those x developers active and creating workloads in their namespaces. For example, we could simulate a cluster where 3,000 users are registered and 1,000 of them are active.
- Clean the cluster if necessary.
The tool creates a default set of workloads in one namespace of each active user. It also makes it possible to add custom workloads (typically specific to how the tested operator is used) to the active users' simulations.
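The relationship between registered users, active users, and namespaces can be modeled in a few lines of Python. This is only a sketch for reasoning about the size of a simulation. The `SimulationPlan` class and its field names are hypothetical and do not reflect the setup tool's actual interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationPlan:
    """Hypothetical model of one simulation run (not the tool's real API)."""

    registered: int               # x: developers registered in the sandbox
    active: int                   # y: registered developers creating workloads
    namespaces_per_user: int = 2  # each registered user gets two namespaces

    def __post_init__(self):
        # Active users are a subset of registered users.
        if not 0 <= self.active <= self.registered:
            raise ValueError("active must be between 0 and registered")

    @property
    def user_namespaces(self):
        """Namespaces created for all registered users."""
        return self.registered * self.namespaces_per_user

    @property
    def workload_namespaces(self):
        """Namespaces that receive workloads: one per active user."""
        return self.active

    @property
    def idle_users(self):
        """Registered users who create no workloads."""
        return self.registered - self.active


plan = SimulationPlan(registered=3000, active=1000)
print(plan.user_namespaces)      # 6000
print(plan.workload_namespaces)  # 1000
print(plan.idle_users)           # 2000
```

For the example above, 3,000 registered users with 1,000 active means 6,000 user namespaces exist, but only 1,000 of them carry the default (and any custom) workloads.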
This article has laid out the performance requirements that the Service Binding Operator had to meet to be accepted into the Developer Sandbox for Red Hat OpenShift. The rest of this series documents the performance journey through the following tasks:
- Provision the OpenShift cluster
- Install the sandbox into the OpenShift cluster
- Install Service Binding Operator and the Red Hat OpenShift Application Services Operator into the OpenShift cluster
- Simulate active users "using the Service Binding Operator"
- Collect runtime metrics from OpenShift
- Extract and compute the test results
- Compile a report