This series shows how I solved a real-life performance problem by gathering metrics from Red Hat OpenShift in the Developer Sandbox for Red Hat OpenShift. Part 1 laid out the development environment and requirements. Now, we will set up the test environment, and I will introduce two different test scenarios.
Read the whole series:
- Part 1: Performance requirements
- Part 2: The test environment
- Part 3: Collecting runtime metrics
- Part 4: Gathering performance metrics
- Part 5: Test rounds and results (August 5)
Provisioning the OpenShift cluster
The Developer Sandbox for Red Hat OpenShift provides a temporary cloud platform, which is useful for testing an application before deploying it. As I explained in Part 1, the application under test is the Service Binding Operator. I ran the tests on Red Hat OpenShift Container Platform, but you could use any OpenShift cluster to generate metrics. I used the openshift-install tool to set up a cluster in Amazon Web Services (AWS) that meets the prerequisites of the sandbox in production. The cluster was provisioned with the following nodes (see the configuration sketch after the list):
- Three control plane nodes of m5.4xlarge size (16 vCPU, 64 GiB memory)
- Three worker nodes of m5.2xlarge size (8 vCPU, 32 GiB memory)
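For reference, here is a minimal sketch of what the machine sections of the openshift-install configuration (install-config.yaml) could look like for this layout. The cluster name and region are placeholders, and the metadata, networking, and pull-secret details are omitted:
apiVersion: v1
metadata:
  name: sbo-perf-cluster   # placeholder cluster name
controlPlane:
  name: master
  replicas: 3
  platform:
    aws:
      type: m5.4xlarge     # 16 vCPU, 64 GiB memory
compute:
- name: worker
  replicas: 3
  platform:
    aws:
      type: m5.2xlarge     # 8 vCPU, 32 GiB memory
platform:
  aws:
    region: us-east-1      # placeholder region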
Installing the Developer Sandbox
My installation uses the Developer Sandbox setup tool, which I introduced in Part 1. The installation steps are as follows:
- Clone the repository containing the CodeReady Toolchain E2E tests:
  git clone git@github.com:codeready-toolchain/toolchain-e2e.git
- Use the Makefile target to install the Developer Sandbox Operators:
  make dev-deploy-e2e
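To confirm that the deployment succeeded, check that the toolchain operators are up. This is a sketch; the host and member namespaces below are the names I would expect from the dev deployment defaults, so adjust them if your setup differs:
# Namespace names are assumptions based on the dev deployment defaults
oc get deployments -n toolchain-host-operator
oc get deployments -n toolchain-member-operator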
Installing the Service Binding Operator
Use the following installation script to install the Service Binding Operator and Red Hat OpenShift Application Services Operator in OpenShift Container Platform:
export SBO_INDEX_IMAGE=${SBO_INDEX_IMAGE:-quay.io/redhat-developer/servicebinding-operator:index}
export SBO_CHANNEL=${SBO_CHANNEL:-beta}
export SBO_PACKAGE=${SBO_PACKAGE:-service-binding-operator}
export SBO_CATSRC_NAMESPACE=${SBO_CATSRC_NAMESPACE:-openshift-marketplace}
export SBO_CATSRC_NAME=${SBO_CATSRC_NAME:-sbo-operators}
export RHOAS_INDEX_IMAGE=${RHOAS_INDEX_IMAGE:-quay.io/rhoas/service-operator-registry:autolatest}
export RHOAS_CHANNEL=${RHOAS_CHANNEL:-beta}
export RHOAS_PACKAGE=${RHOAS_PACKAGE:-rhoas-operator}
export RHOAS_CATSRC_NAMESPACE=${RHOAS_CATSRC_NAMESPACE:-openshift-marketplace}
export RHOAS_CATSRC_NAME=${RHOAS_CATSRC_NAME:-rhoas-operators}
export RHOAS_NAMESPACE=${RHOAS_NAMESPACE:-openshift-operators}
DOCKER_CFG=$(mktemp)
chmod -r $DOCKER_CFG
echo "Installing Service Binding Operator"
curl -s https://raw.githubusercontent.com/redhat-developer/service-binding-operator/master/install.sh | \
  OPERATOR_INDEX_IMAGE=$SBO_INDEX_IMAGE \
  OPERATOR_CHANNEL=$SBO_CHANNEL \
  OPERATOR_PACKAGE=$SBO_PACKAGE \
  CATSRC_NAMESPACE=$SBO_CATSRC_NAMESPACE \
  CATSRC_NAME=$SBO_CATSRC_NAME \
  SKIP_REGISTRY_LOGIN=true \
  DOCKER_CFG=$DOCKER_CFG \
  /bin/bash -s
rm -f $DOCKER_CFG
echo "Installing RHOAS Operator"
oc apply -f - << EOD
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: $RHOAS_CATSRC_NAME
  namespace: $RHOAS_CATSRC_NAMESPACE
spec:
  displayName: RHOAS Operators
  icon:
    base64data: ""
    mediatype: ""
  image: $RHOAS_INDEX_IMAGE
  priority: -400
  publisher: RHOAS
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 260s
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: $RHOAS_PACKAGE
  namespace: $RHOAS_NAMESPACE
spec:
  channel: $RHOAS_CHANNEL
  installPlanApproval: Automatic
  name: $RHOAS_PACKAGE
  source: $RHOAS_CATSRC_NAME
  sourceNamespace: $RHOAS_CATSRC_NAMESPACE
EOD
# Wait for the operator to get up and running
retries=50
until [[ $retries == 0 ]]; do
  kubectl get deployment/rhoas-operator -n $RHOAS_NAMESPACE >/dev/null 2>&1 && break
  echo "Waiting for rhoas-operator to be created in $RHOAS_NAMESPACE namespace"
  sleep 5
  retries=$(($retries - 1))
done
kubectl rollout status -w deployment/rhoas-operator -n $RHOAS_NAMESPACE
Run the following command to execute the script:
SBO_INDEX_IMAGE=registry.redhat.io/redhat/redhat-operator-index:v4.7 SBO_CHANNEL=preview SBO_PACKAGE=rh-service-binding-operator ./install-operators.sh
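When the script finishes, it is worth verifying that both operators landed. A quick sketch, assuming both were installed into the default openshift-operators namespace:
# List resolved subscriptions and installed operator versions (CSVs)
oc get subscriptions -n openshift-operators
oc get csv -n openshift-operators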
Simulating active users
As mentioned in Part 1, the Developer Sandbox setup tool can extend the "activity" of simulated active users in one namespace by adding additional workloads. For the purpose of this test, I used two sets of Deployment + Service + Route resources, one for the backing service and one for the application. I included a single ServiceBinding to bind the backing service route URL to the application.
The backing service is a BusyBox container with an exposed route (the URL to be bound to the application). The application is the Service Binding Operator's generic test application, which is well suited for this test because it is simple and lightweight.
The simulation tool "provisions" users in sequence. This means that it registers each simulated user into the sandbox. Then, if the user is defined as active, it creates a workload in one of the user's namespaces to simulate their "activity."
To run the simulation tool, I used the following command:
go run setup/main.go --template=<workload-template-file> --operators-limit 0 --users 2000 --active 2000 --username zippy
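While the tool runs, you can watch the provisioning progress from another terminal. A sketch, assuming the dev deployment keeps its UserSignup resources in the toolchain-host-operator namespace:
# Count how many simulated users have been registered so far (namespace is an assumption)
oc get usersignups -n toolchain-host-operator --no-headers | wc -l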
I used two slightly different approaches (scenarios) to determine performance. Each scenario is represented by a workload template file referenced in the above command and described in the following sections.
Scenario 1: With a ServiceBinding resource
The first scenario ("with SBR") includes a ServiceBinding resource in the users' provisioning. The backing service and application are created in the active user's namespace together with a ServiceBinding, so that the Service Binding Operator needs to perform the binding only intermittently. The load on the Service Binding Operator to process ServiceBinding resources and perform the binding is distributed throughout the duration of the user provisioning, which takes a couple of hours for 2,000 users. The workload template for this scenario is perf-test.with-sbr.yaml:
kind: Template
apiVersion: v1
metadata:
  name: sbo-perf-with-sbr
objects:
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: sbo-perf-app
    labels:
      app: sbo-perf-app
  spec:
    replicas: 1
    strategy:
      type: RollingUpdate
    selector:
      matchLabels:
        app: sbo-perf-app
    template:
      metadata:
        labels:
          app: sbo-perf-app
      spec:
        containers:
        - name: sbo-generic-test-app
          image: quay.io/redhat-developer/sbo-generic-test-app:20200923
          imagePullPolicy: IfNotPresent
          ports:
          - containerPort: 8080
- apiVersion: v1
  kind: Service
  metadata:
    labels:
      app: sbo-perf-app
    name: sbo-perf-app
  spec:
    ports:
    - port: 8080
      protocol: TCP
      targetPort: 8080
    selector:
      app: sbo-perf-app
- apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    labels:
      app: sbo-perf-app
    name: sbo-perf-app
    annotations:
      service.binding/host: path={.spec.host}
  spec:
    port:
      targetPort: 8080
    to:
      kind: "Service"
      name: sbo-perf-app
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: sbo-perf-svc
    labels:
      app: sbo-perf-svc
  spec:
    replicas: 1
    strategy:
      type: RollingUpdate
    selector:
      matchLabels:
        app: sbo-perf-svc
    template:
      metadata:
        labels:
          app: sbo-perf-svc
      spec:
        containers:
        - name: busybox
          image: busybox
          imagePullPolicy: IfNotPresent
          command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
          ports:
          - containerPort: 8080
- apiVersion: v1
  kind: Service
  metadata:
    labels:
      app: sbo-perf-svc
    name: sbo-perf-svc
  spec:
    ports:
    - port: 8080
      protocol: TCP
      targetPort: 8080
    selector:
      app: sbo-perf-svc
- apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    labels:
      app: sbo-perf-svc
    name: sbo-perf-svc
    annotations:
      service.binding/host: path={.spec.host}
  spec:
    port:
      targetPort: 8080
    to:
      kind: "Service"
      name: sbo-perf-svc
- apiVersion: binding.operators.coreos.com/v1alpha1
  kind: ServiceBinding
  metadata:
    name: service-binding
  spec:
    services:
    - group: route.openshift.io
      version: v1
      kind: Route
      name: sbo-perf-svc
    application:
      name: sbo-perf-app
      group: apps
      version: v1
      resource: deployments
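The service.binding/host annotation on the Route declares it a bindable service: the operator reads the value at the path {.spec.host} and projects it into the bound application under the name host. To see what was actually injected into the application workload, you can inspect the Deployment's pod template. This is a sketch; depending on the operator version and the bindAsFiles setting, the binding appears either as environment variables or as a mounted volume:
# Inspect what the binding injected into the application (run in the user's namespace)
oc get deployment sbo-perf-app -o jsonpath='{.spec.template.spec.containers[0].env}'
oc get deployment sbo-perf-app -o jsonpath='{.spec.template.spec.volumes}'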
Scenario 2: Without the ServiceBinding resource
The second scenario ("without SBR") provisions users without the ServiceBinding resource. The backing service and the application are created in the active user's namespace without a ServiceBinding. The workload template for this scenario is perf-test.without-sbr.yaml:
kind: Template
apiVersion: v1
metadata:
  name: sbo-perf-without-sbr
objects:
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: sbo-perf-app
    labels:
      app: sbo-perf-app
  spec:
    replicas: 1
    strategy:
      type: RollingUpdate
    selector:
      matchLabels:
        app: sbo-perf-app
    template:
      metadata:
        labels:
          app: sbo-perf-app
      spec:
        containers:
        - name: sbo-generic-test-app
          image: quay.io/redhat-developer/sbo-generic-test-app:20200923
          imagePullPolicy: IfNotPresent
          ports:
          - containerPort: 8080
- apiVersion: v1
  kind: Service
  metadata:
    labels:
      app: sbo-perf-app
    name: sbo-perf-app
  spec:
    ports:
    - port: 8080
      protocol: TCP
      targetPort: 8080
    selector:
      app: sbo-perf-app
- apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    labels:
      app: sbo-perf-app
    name: sbo-perf-app
    annotations:
      service.binding/host: path={.spec.host}
  spec:
    port:
      targetPort: 8080
    to:
      kind: "Service"
      name: sbo-perf-app
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: sbo-perf-svc
    labels:
      app: sbo-perf-svc
  spec:
    replicas: 1
    strategy:
      type: RollingUpdate
    selector:
      matchLabels:
        app: sbo-perf-svc
    template:
      metadata:
        labels:
          app: sbo-perf-svc
      spec:
        containers:
        - name: busybox
          image: busybox
          imagePullPolicy: IfNotPresent
          command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
          ports:
          - containerPort: 8080
- apiVersion: v1
  kind: Service
  metadata:
    labels:
      app: sbo-perf-svc
    name: sbo-perf-svc
  spec:
    ports:
    - port: 8080
      protocol: TCP
      targetPort: 8080
    selector:
      app: sbo-perf-svc
- apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    labels:
      app: sbo-perf-svc
    name: sbo-perf-svc
    annotations:
      service.binding/host: path={.spec.host}
  spec:
    port:
      targetPort: 8080
    to:
      kind: "Service"
      name: sbo-perf-svc
Only after all of the users are provisioned and the resources have settled into place are the ServiceBinding resources created. All of this happens in a very short time, in fact almost simultaneously. The following script creates all the ServiceBinding resources, one for each active user's namespace:
# Collect the namespaces where the test application is deployed
oc get deploy --all-namespaces -o json | jq -rc '.items[] | select(.metadata.name | contains("sbo-perf-app")).metadata.namespace' > workload.namespace.list
# Split the namespace list into segments of 300, each processed by a parallel background loop
split -l 300 workload.namespace.list sbr-segment
for i in sbr-segment*; do
  for j in $(cat $i); do
    oc apply -n $j --server-side=true -f - << EOD
apiVersion: binding.operators.coreos.com/v1alpha1
kind: ServiceBinding
metadata:
  name: service-binding
spec:
  services:
  - group: route.openshift.io
    version: v1
    kind: Route
    name: sbo-perf-svc
  application:
    name: sbo-perf-app
    group: apps
    version: v1
    resource: deployments
EOD
    sleep 0.02s
  done &
done
wait
rm -rf sbr-segment*
rm -rf workload.namespace.list
Creating the ServiceBinding resources all at once simulates a situation where all of the active users do the binding in their namespace almost simultaneously. This happens, for example, when thousands of Red Hat Summit attendees see an announcement or demonstration at the same time and start playing with the Service Binding Operator in their sandboxes. That's the most extreme stress under which the Service Binding Operator will run, and the test should ensure that nothing crashes.
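After the burst, one way to confirm that the operator kept up is to count the ServiceBinding resources that reached a ready state. A sketch, assuming each resource reports a Ready condition in its status:
# Count ServiceBindings whose Ready condition is True, across all namespaces
oc get servicebindings --all-namespaces -o json | \
  jq '[.items[] | select(any(.status.conditions[]?; .type == "Ready" and .status == "True"))] | length'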
Next steps
In this article, you saw how to set up the test environment for testing the Service Binding Operator's real-life performance under stress. Next, we will look at the infrastructure to collect metrics during the tests.
Read next: Part 3: Collecting runtime metrics.