What's new in OpenShift Container Platform system management

May 29, 2026
Neeraj Krishna Gopalakrishna
Related topics: Cloud automation, Automation and management, Linux, Virtualization
Related products: Red Hat OpenShift, Red Hat OpenShift Container Platform

    I've spent years helping teams troubleshoot node stability issues in production Red Hat OpenShift clusters, and one pattern keeps appearing: nodes with insufficient system reserves running out of memory or experiencing CPU starvation for critical system daemons. The problem has become more pronounced as nodes have grown larger. I've seen clusters running 256 GB worker nodes where system daemons were competing with hundreds of pods for just 1 GB of reserved memory.

    Starting with Red Hat OpenShift Container Platform 4.21, that's changing. OpenShift Container Platform will now automatically calculate and allocate system-reserved resources for newly created clusters, along with enforcing CPU limits on system daemons. Designed to improve node stability across the board, these changes have important implications for capacity planning that every OpenShift administrator needs to understand.

    This article describes everything you need to know about the default enablement of AutoSizingReserved (OpenShift Container Platform 4.21+) and system-reserved-compressible (OpenShift Container Platform 4.22+).

    What's changing

    These changes ensure system stability on large nodes while maintaining backward compatibility for existing deployments. OpenShift Container Platform 4.21 enables AutoSizingReserved by default for new clusters. This feature automatically calculates system resource reservations based on node size, so larger nodes get proportionally more reserves for system daemons. Existing clusters preserve their current behavior until you explicitly opt in.

    Starting with OpenShift Container Platform 4.22, system-reserved-compressible enforcement on worker nodes uses cgroup controls to enforce CPU limits on system daemons, providing predictable CPU allocation during contention. Nodes with performance profiles are automatically excluded from enforcement, and control plane nodes continue using their existing system resource management.

    The problem these changes solve

    To appreciate why these defaults are changing, it helps to understand how resource reservation worked previously. The kubelet has always supported reserving system resources (e.g., CPU, memory, and ephemeral storage) for node-level daemons like kubelet, CRI-O, and other system services. This ensures that critical system processes have the resources they need even when the node is running at capacity.

    However, earlier versions of OpenShift Container Platform left the autoSizingReserved parameter disabled by default. Administrators had to configure system reserves manually, and many clusters ran with little or no reserve beyond the basic defaults. A typical default might reserve 1 GB of memory regardless of whether the node had 8 GB or 512 GB of total memory. This worked on smaller nodes but created real problems on large nodes running dense workloads.

    I've debugged incidents where massive nodes with hundreds of pods experienced memory pressure, not because workloads exceeded their limits but because system daemons were fighting for scraps. The kubelet might try to manage 300 pods while competing for memory with those same pods. Under memory pressure, the kernel's out-of-memory killer would sometimes terminate system processes instead of workload containers, causing cascading failures.

    The new defaults address this by automatically scaling system reserves based on node capacity. Larger nodes get proportionally more reserves, ensuring system stability while still maximizing allocatable capacity for workloads. Additionally, CPU enforcement ensures system daemons stay within their allocated share even on high-CPU-count nodes.

    When these changes take effect

    Let me break down exactly what's different and when these changes take effect.

    AutoSizingReserved becomes default in 4.21

    Starting with OpenShift Container Platform 4.21, any new cluster you create will have autoSizingReserved enabled by default. This means the kubelet will automatically calculate how much CPU, memory, and ephemeral storage to reserve for system daemons based on the node's total capacity. The calculation uses carefully tuned formulas that balance protecting system stability with maximizing allocatable resources for workloads.

    If you're upgrading an existing cluster from 4.20 to 4.21, this feature remains disabled. This is intentional. We don't want to surprise you with different capacity allocations in the middle of an upgrade. Your existing nodes will continue operating exactly as they did before, and you can enable auto-sizing on your own timeline after verifying the capacity impact.

    System-reserved-compressible enforcement in 4.22

    OpenShift Container Platform 4.22 introduces CPU enforcement for system reserves through a feature called system-reserved-compressible. While 4.21 calculates appropriate CPU reserves, it doesn't strictly enforce them at the cgroup level. System daemons could still burst beyond their allocation if CPU was available, which is fine, but they could also consume more than intended during contention.

    It's worth clarifying what this feature actually does. The cgroup v2 compressible CPU capability already exists in the Linux kernel. This feature simply applies that capability to system reservations. When enabled, the kubelet configures the systemd system.slice cgroup (where kubelet, CRI-O, and other system services run) with CPU limits and enforces them through cgroup controls.

    OpenShift Container Platform 4.22+ enables this enforcement by default on worker nodes. It is not enabled by default on control plane nodes, which continue to use their existing resource management approach so that cluster management operations aren't impacted.

    The compressible term refers to how CPU differs from memory as a resource type. CPU is compressible, meaning when a process exceeds its CPU limit, the kernel throttles it, reducing its CPU time without killing the process. This contrasts with memory, which is incompressible. When a process exceeds its memory limit, the out-of-memory (OOM) killer terminates it. This distinction matters because CPU enforcement gracefully degrades performance under pressure, while memory enforcement can cause abrupt failures.

    How to calculate the reservation

    When AutoSizingReserved is enabled, the kubelet uses specific formulas to determine how much CPU and memory to reserve for system services. Resource reservation for system daemons is an industry-wide practice across managed Kubernetes platforms.

    Various cloud providers have developed similar approaches to ensure node stability:

    • Google Kubernetes Engine (GKE) uses a tiered memory reservation formula based on total node memory, reserving 25% of the first 4 GB, 20% of the next 4 GB, and smaller percentages for larger memory allocations. Review the GKE documentation on node allocatable resources.

    • Azure Kubernetes Service (AKS) implements graduated CPU and memory reservations, with the first core and first 4 GB receiving higher reservation percentages similar to OpenShift's approach. Details are available in the AKS resource reservations documentation.

    • Amazon Elastic Kubernetes Service (EKS) calculates reserved resources based on instance size and maximum pod count per node, accounting for both system processes and networking overhead. Refer to the EKS best practices guide.

    • The upstream Kubernetes documentation provides the foundational concepts for system resource reservations that these implementations build upon.

    OpenShift Container Platform's AutoSizingReserved feature follows similar principles, specifically tuned for OpenShift Container Platform architecture and operational requirements. The following formulas reflect extensive testing across diverse production workloads.

    Memory calculation formula

    The memory reservation is tiered so that it is less aggressive on smaller nodes while still providing adequate protection on larger nodes:

    • First 8 GiB of memory: 1 GiB (flat reservation, matching the old non-dynamic default)
    • Next 120 GiB of memory (up to 128 GiB total): 6% of the memory in this range
    • Above 128 GiB: 2% of any memory above 128 GiB

    This table shows memory reservation at common node sizes:

    Total memory    Reserved memory    Allocatable memory
    8 GiB           1 GiB              ~7 GiB
    16 GiB          1.48 GiB           ~14.52 GiB
    32 GiB          2.44 GiB           ~29.56 GiB
    64 GiB          4.36 GiB           ~59.64 GiB
    128 GiB         8.2 GiB            ~119.8 GiB
    256 GiB         10.76 GiB          ~245.24 GiB
    512 GiB         15.88 GiB          ~496.12 GiB

    Let's walk through a specific example for a 16 GiB node:

    • First 8 GiB: 1.0 GiB (flat)
    • Next 8 GiB: 8 × 6% = 0.48 GiB
    • Total reserved: 1.48 GiB (leaving ~14.52 GiB for workloads)
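
    The tiers above can be sketched in a few lines of Python. This is a minimal illustration of the formula as this article states it, not the actual implementation in the kubelet or the Machine Config Operator. The function name reserved_memory_gib and the handling of nodes smaller than 8 GiB (treated as the flat 1 GiB) are assumptions made for this example.

```python
def reserved_memory_gib(total_gib: float) -> float:
    """Estimate reserved memory (GiB) from the tiers described in this article."""
    flat_tier_end = 8.0    # first 8 GiB: flat 1 GiB reservation
    mid_tier_end = 128.0   # 8 GiB to 128 GiB: 6% of the memory in this range

    reserved = 1.0
    if total_gib > flat_tier_end:
        reserved += 0.06 * (min(total_gib, mid_tier_end) - flat_tier_end)
    if total_gib > mid_tier_end:
        # above 128 GiB: 2% of the memory beyond 128 GiB
        reserved += 0.02 * (total_gib - mid_tier_end)
    return round(reserved, 2)


if __name__ == "__main__":
    total = 16
    reserved = reserved_memory_gib(total)
    print(f"{total} GiB node: reserve {reserved} GiB, allocatable ~{total - reserved:.2f} GiB")
    # prints: 16 GiB node: reserve 1.48 GiB, allocatable ~14.52 GiB
```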

    CPU calculation formula

    The CPU reservation uses a base-plus-increment model but enforces a strict minimum floor.

    The logic:

    • Base (1st Core): 60 millicores (0.06 CPU)
    • Increment: 12 millicores (0.012 CPU) for every additional core beyond the first
    • Minimum threshold: Compare this result against a floor of 0.5 CPU. If the calculated value is less than 0.5, the system enforces a reservation of 0.5 CPU.

    Let's see how this applies to a smaller worker node with 4 vCPUs:

    1. Calculate raw requirement:
      • Base (1st Core): 0.06
      • Additional (3 Cores): 3 × 0.012 = 0.036
      • Raw Total: 0.06 + 0.036 = 0.096 CPU
    2. Apply threshold: Since 0.096 is less than the minimum of 0.5, the reservation is raised to the floor value.
    3. Final reserved: 0.5 vCPU (500 millicores)

    Thus, on a 4-core machine, the kubelet reserves 0.5 vCPU for system daemons, leaving 3.5 vCPUs allocatable for pods.
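
    The same logic can be written as a short Python sketch. It works in millicores to avoid floating-point rounding and uses only the numbers given above; the constant and function names are hypothetical, and the real calculation in OpenShift Container Platform may differ in detail.

```python
BASE_MILLICORES = 60            # reservation for the first core
PER_EXTRA_CORE_MILLICORES = 12  # added for every core beyond the first
FLOOR_MILLICORES = 500          # minimum reservation (0.5 CPU)


def reserved_cpu_millicores(cores: int) -> int:
    """Base-plus-increment CPU reservation with a minimum floor."""
    if cores < 1:
        raise ValueError("a node needs at least one core")
    raw = BASE_MILLICORES + PER_EXTRA_CORE_MILLICORES * (cores - 1)
    return max(raw, FLOOR_MILLICORES)


if __name__ == "__main__":
    for cores in (4, 37, 38, 64):
        print(cores, reserved_cpu_millicores(cores))
    # prints:
    # 4 500
    # 37 500
    # 38 504
    # 64 816
```

    With these numbers, the floor determines the reservation on nodes with up to 37 cores. The per-core increment only has an effect from 38 cores upward.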

    CPU enforcement for system daemons

    Previously, OpenShift Container Platform calculated CPU reserves as an accounting measure. The kubelet knew that it reserved 0.5 CPU for system processes and factored that into the node's allocatable capacity. But nothing prevented system processes from using more CPU if it was available. This flexibility is generally good—you want system daemons to use idle CPU for housekeeping tasks. The problem arises under contention.

    In addition to calculating system-reserved resources, OpenShift Container Platform now enforces CPU limits on system daemons through cgroup-based enforcement.

    What is system-reserved-compressible

    As noted above, earlier releases calculated how much CPU to reserve for system processes but did not enforce that reservation. On nodes with high CPU counts, system daemons could therefore consume more CPU than intended, potentially impacting workload performance.

    With system-reserved-compressible enabled:

    • The kubelet enforces CPU limits on system daemons via systemReservedCgroup: /system.slice.
    • It constrains system processes to their allocated CPU share through cgroup controls.
    • This improves CPU allocation predictability, especially on nodes with high CPU counts.

    How it works

    The kubelet configuration now includes the following YAML:

    systemReservedCgroup: /system.slice
    enforceNodeAllocatable:
      - pods
      - system-reserved-compressible

    This tells the kubelet to enforce node allocatable limits on both pods (as before) and system-reserved resources. The enforcement happens at the cgroup level, where Linux kernel mechanisms ensure that CPU is distributed according to the configured weights and limits.

    It's important to understand that this isn't a hard cap in the traditional sense. CPU shares in Linux cgroups work proportionally during contention. When the node isn't under CPU pressure—your workload pods are idle and system daemons need CPU to pull images or perform garbage collection—system processes can use more than their reserved share. The unused CPU is available, so the kernel allows it.

    How enforcement behaves in practice

    The enforcement becomes meaningful when CPU contention occurs. Imagine your node is running at high utilization: workload pods are consuming their CPU requests, system daemons want CPU for ongoing operations, and there aren't enough cycles to satisfy everyone. In this scenario, the cgroup controller constrains system processes to their configured share. If 0.5 CPU is reserved for system daemons on a 4-core node, system.slice will receive approximately that much CPU during contention, and workload pods will receive their expected share of the remaining 3.5 CPUs.
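
    The proportional behavior described above can be illustrated with a simplified model. The sketch assumes that only two cgroups compete for CPU, that both are fully busy, and that their weights are proportional to the reserved and allocatable CPU. The function name and the weight values are hypothetical; the kernel's actual cgroup weights use a different scale, and other slices on the node are ignored.

```python
def contended_cpu_split(node_cores: float, weights: dict) -> dict:
    """Split CPU among fully busy cgroups in proportion to their weights."""
    total_weight = sum(weights.values())
    return {name: node_cores * weight / total_weight
            for name, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical weights: 500 millicores reserved, 3500 millicores allocatable.
    split = contended_cpu_split(4, {"system.slice": 500, "kubepods.slice": 3500})
    print(split)
    # prints: {'system.slice': 0.5, 'kubepods.slice': 3.5}
```

    When one cgroup is idle, its unused share is available to the others, so this split only describes the node under contention.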

    What actually happens when system daemons hit their CPU limit? They get throttled. The kernel reduces the CPU time allocated to processes in system.slice, spreading their work over a longer period. This means operations might take longer—an image pull might be slower, or garbage collection might lag—but the processes continue running. This is fundamentally different from memory limits, where exceeding the limit triggers the OOM killer and terminates processes. CPU throttling gracefully degrades performance rather than causing failures, which is exactly what you want for system daemons under pressure.

    This predictability is crucial for clusters running latency-sensitive workloads. Without enforcement, I've seen production nodes where system daemons consumed significant CPU during busy periods, indirectly throttling workload containers that expected consistent CPU access. With enforcement, workload pods get more predictable CPU performance even when the node is under heavy load.

    Compatibility with performance profiles

    The kubelet cannot simultaneously enforce systemReservedCgroup and --reserved-cpus (used by performance profiles in the Node Tuning Operator). This case is handled automatically: when a performance profile with reservedSystemCPUs is detected, systemReservedCgroup is cleared and enforceNodeAllocatable is set to ["pods"] only. Existing performance profile behavior is preserved without any manual changes.

    Note: The control plane nodes are not impacted. They are excluded from this change, and their resource reservation behavior remains untouched.
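
    The exclusion rule for worker nodes can be summarized as a small decision function. This is a sketch of the behavior described above, not the operator's code: the function name, the dictionary layout, and the example value "0-1" are hypothetical, and control plane nodes are not modeled.

```python
from typing import Optional


def worker_kubelet_enforcement(reserved_system_cpus: Optional[str]) -> dict:
    """Return the enforcement-related kubelet settings for a worker node.

    reserved_system_cpus is the performance profile's reservedSystemCPUs
    value (for example "0-1"), or None when no performance profile applies.
    """
    if reserved_system_cpus:
        # Performance profile present: systemReservedCgroup is cleared
        # and only pods are enforced.
        return {"enforceNodeAllocatable": ["pods"]}
    return {
        "systemReservedCgroup": "/system.slice",
        "enforceNodeAllocatable": ["pods", "system-reserved-compressible"],
    }


if __name__ == "__main__":
    print(worker_kubelet_enforcement("0-1"))
    print(worker_kubelet_enforcement(None))
```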

    How to enable auto-sizing after upgrading

    The upgrade process includes a specific mechanism to preserve your current configuration.

    1. The pre-upgrade patch (4.20.6): Before upgrading to 4.21, your cluster must first be updated to version 4.20.6. During this update, a MachineConfig named 50-worker-auto-sizing-disabled is automatically applied to your cluster. This config explicitly forces autoSizingReserved to remain disabled.

    2. Enable the feature in 4.21: After you upgrade to OpenShift Container Platform 4.21, the 50-worker-auto-sizing-disabled config persists and keeps the feature off. To enable auto-sizing and let OpenShift Container Platform manage system reserves dynamically, you need to remove this restriction.

    3. Delete the blocking MachineConfig to enable autoSizingReserved:

      oc delete machineconfig 50-worker-auto-sizing-disabled

    4. Monitor the rollout: Deleting the MachineConfig triggers the Machine Config Operator (MCO) to revert the nodes to the default 4.21 behavior (enabled). The MCO drains, reconfigures, and reboots the nodes in the pool one by one. Ensure your cluster has enough spare capacity to handle the rolling reboot before you run the delete command.
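
    Before deleting the MachineConfig, it can help to estimate how much allocatable memory the worker pool gives up. The following sketch applies the memory tiers from this article to a hypothetical inventory of node sizes and assumes that each node previously reserved a flat 1 GiB; adjust previous_reserved_gib if your cluster was tuned manually. All names are illustrative.

```python
from typing import Sequence


def auto_sized_reserve_gib(total_gib: float) -> float:
    """Memory tiers from this article: 1 GiB flat, 6% up to 128 GiB, then 2%."""
    mid = max(0.0, min(total_gib, 128.0) - 8.0)
    high = max(0.0, total_gib - 128.0)
    return 1.0 + 0.06 * mid + 0.02 * high


def allocatable_reduction_gib(node_sizes_gib: Sequence[float],
                              previous_reserved_gib: float = 1.0) -> float:
    """Total allocatable memory lost across nodes after enabling auto-sizing."""
    reduction = sum(auto_sized_reserve_gib(size) - previous_reserved_gib
                    for size in node_sizes_gib)
    return round(reduction, 2)


if __name__ == "__main__":
    workers = [64.0] * 10  # hypothetical pool: ten 64 GiB workers
    print(allocatable_reduction_gib(workers))
    # prints: 33.6
```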

    Verify the configuration

    After enabling these features (either on a new cluster or after removing the blocking MachineConfig), you can verify the configuration.

    Verify AutoSizingReserved:

    # Open a debug shell on the node and check whether node sizing is enabled
    oc debug node/<node-name>
    chroot /host
    cat /etc/node-sizing-enabled.env

    Verify system-reserved-compressible:

    # Check kubelet configuration for system-reserved-compressible
    oc debug node/<node-name>
    chroot /host
    grep -A2 systemReservedCgroup /etc/kubernetes/kubelet.conf
    grep -A3 enforceNodeAllocatable /etc/kubernetes/kubelet.conf

    Expected output:

    systemReservedCgroup: /system.slice
    enforceNodeAllocatable:
      - pods
      - system-reserved-compressible

    On nodes with a performance profile that configures reservedSystemCPUs, check for the exclusion instead:

    # Verify systemReservedCgroup is NOT present
    grep systemReservedCgroup /etc/kubernetes/kubelet.conf
    # Verify enforceNodeAllocatable only contains pods
    grep -A2 enforceNodeAllocatable /etc/kubernetes/kubelet.conf

    Final thoughts

    These changes represent an important maturation of OpenShift Container Platform node management capabilities. By automatically scaling system reserves based on node capacity and enforcing CPU limits on system daemons, OpenShift Container Platform 4.21 and 4.22 provide better out-of-the-box stability while still giving administrators the flexibility to customize when needed.

    For new clusters, the defaults should work well for most use cases. You can deploy your workloads knowing that nodes have adequate system reserves without manual tuning. For existing clusters, the upgrade path gives you control over when to adopt the new behavior, allowing you to plan for capacity changes on your timeline.

    I've seen too many production incidents caused by insufficient system reserves—nodes crashing under memory pressure, system daemons competing with workloads for CPU, or kubelet becoming unresponsive due to resource starvation. These changes address the root cause of many of those issues, and I expect they'll meaningfully improve cluster stability across the Red Hat OpenShift ecosystem.

    If you're planning an upgrade to 4.21, take time to understand how these changes will affect your specific clusters. Test them in non-production environments first, verify the capacity impact, and plan your rollout accordingly. Once enabled, you'll have more predictable, stable nodes that can reliably run the workloads you're deploying.

    In rare cases where other slices are running CPU-intensive workloads, contention from slices other than system.slice and kubepods.slice may still impact overall CPU allocation. These changes primarily address the issue of massive nodes only getting 1 GiB of reserved memory despite running hundreds of pods.

    For more details about configuring node resources in OpenShift Container Platform, refer to the official documentation for managing node resources.
