This article explains how to calculate the system-reserved values for OpenShift nodes and configure them for optimal platform performance. Doing so helps prevent node resource overcommitment and improves the overall efficiency of your Red Hat OpenShift environment.
Implementation steps
This section details the steps to optimize cluster stability. We’ll demonstrate how to reserve CPU and memory resources for underlying node components and other system components to ensure reliable scheduling and prevent node resource overcommitment.
Step 1:
Reserve a portion of the CPU and memory resources for use by the underlying node components, such as kubelet and kube-proxy, and the remaining system components, such as sshd and NetworkManager. By specifying the resources to reserve, you provide the scheduler with more information about the remaining CPU and memory resources that a node has available for use by pods.
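To make the effect on the scheduler concrete, the following shell sketch computes what a node would advertise as allocatable once the reservation is subtracted. It assumes that allocatable memory is capacity minus the system-reserved amount minus the kubelet hard eviction threshold, and that the threshold is at its 100Mi default. The function names are hypothetical, and the sample figures (16 CPUs and 98875428Ki of memory) are taken from the worker node shown under Option 1 in Step 2.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of OpenShift).

# Allocatable CPU in millicores: capacity in whole cores minus reserved millicores.
allocatable_cpu_m() {
    capacity_cores=$1
    reserved_m=$2
    echo $(( capacity_cores * 1000 - reserved_m ))
}

# Allocatable memory in Ki: capacity minus system-reserved minus eviction threshold.
# All arguments are in Ki. The third argument defaults to 102400Ki (100Mi), which
# is an assumption about the kubelet configuration on the node.
allocatable_mem_ki() {
    capacity_ki=$1
    reserved_ki=$2
    eviction_ki=${3:-102400}
    echo $(( capacity_ki - reserved_ki - eviction_ki ))
}

# 16 CPUs with 500m reserved; 98875428Ki with 1Gi (1048576Ki) reserved.
echo "cpu: $(allocatable_cpu_m 16 500)m"
echo "memory: $(allocatable_mem_ki 98875428 1048576)Ki"
```

With the default reservation of 500m CPU and 1Gi memory, this prints cpu: 15500m and memory: 97724452Ki, the same Allocatable values that the node description reports for worker-2 in Step 2.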
Example output:
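The following example shows the values currently set in /etc/node-sizing.env on a worker node and then runs the dynamic-system-reserved-calc.sh script to calculate the recommended system-reserved values. The script reports CPU in cores, so 0.08 corresponds to 80m. Once you have confirmed that the recommended values are appropriate for your environment, update the configuration accordingly.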
shemadhr@shemadhr-mac ~ % oc debug node/worker-2.testocp.lab.psi.pnq2.redhat.com
Temporary namespace openshift-debug-vk22l is created for debugging node...
Starting pod/worker-2testocplabpsipnq2redhatcom-debug-9zjq9 ...
To use host binaries, run `chroot /host`
Pod IP: 10.74.215.42
If you don't see a command prompt, try pressing enter.
sh-5.1#
sh-5.1# chroot /host bash
[root@worker-2 /]#
[root@worker-2 /]# cat /etc/node-sizing
node-sizing-enabled.env node-sizing-version.json node-sizing.env
[root@worker-2 /]# cat /etc/node-sizing.env
SYSTEM_RESERVED_MEMORY=1Gi
SYSTEM_RESERVED_CPU=500m
SYSTEM_RESERVED_ES=1Gi
[root@worker-2 /]# NODE_SIZES_ENV=/tmp/node-sizing.txt /usr/local/sbin/dynamic-system-reserved-calc.sh true
[root@worker-2 /]#
[root@worker-2 /]# cat /tmp/node-sizing.txt
SYSTEM_RESERVED_MEMORY=2Gi
SYSTEM_RESERVED_CPU=0.08
SYSTEM_RESERVED_ES=1Gi
Step 2:
You can also calculate recommended system-reserved values by following the guidelines provided in this article. Before changing anything, review the current system-reserved configuration using either of the following options.
Option 1:
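For reference, the following worker node uses the default system-reserved configuration: 500m CPU and 1Gi memory. You can verify this by comparing the CPU and memory values in the Capacity and Allocatable sections of the node description. The CPU values differ by exactly 500m. The memory values differ by 1150976Ki, which is the 1Gi reservation plus a further 100Mi; that extra amount matches the kubelet's default hard eviction threshold for memory.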
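The comparison can also be scripted. The sketch below reads a node description on standard input and prints Capacity minus Allocatable for CPU and memory. It assumes the cpu: and memory: keys appear under Capacity: and Allocatable: as they do in this article, and that memory is expressed in Ki; the function name is hypothetical.

```shell
#!/bin/sh
# Print Capacity minus Allocatable for cpu (millicores) and memory (Ki).
# Reads node description text on stdin, for example:
#   oc describe node <node_name> | reserved_from_description
reserved_from_description() {
    awk '
        # Convert a CPU quantity such as "16" or "15500m" to millicores.
        function cpu_m(v) { if (v ~ /m$/) { sub(/m$/, "", v); return v + 0 } return v * 1000 }
        # Strip the Ki suffix from a memory quantity.
        function mem_ki(v) { sub(/Ki$/, "", v); return v + 0 }
        # Remember which section the following keys belong to.
        $1 == "Capacity:"    { section = "cap"; next }
        $1 == "Allocatable:" { section = "alloc"; next }
        section != "" && $1 == "cpu:"    { cpu[section] = cpu_m($2) }
        section != "" && $1 == "memory:" { mem[section] = mem_ki($2) }
        END {
            printf "cpu_reserved=%dm\n", cpu["cap"] - cpu["alloc"]
            printf "memory_reserved=%dKi\n", mem["cap"] - mem["alloc"]
        }
    '
}

# Sample input using the cpu and memory values of the worker node in this option.
reserved_from_description <<'EOF'
Capacity:
cpu: 16
memory: 98875428Ki
Allocatable:
cpu: 15500m
memory: 97724452Ki
EOF
```

For this worker node the function prints cpu_reserved=500m and memory_reserved=1150976Ki. The relevant excerpt of the node description is: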
Hostname: worker-2.testocp.lab.psi.pnq2.redhat.com
Capacity:
cpu: 16
ephemeral-storage: 313981932Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 98875428Ki
pods: 250
Allocatable:
cpu: 15500m
ephemeral-storage: 288292006229
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 97724452Ki
pods: 250
Option 2:
To verify the system-reserved resource allocations on a node, connect to the node over SSH or start a debug pod with oc debug, then inspect the file /etc/node-sizing.env, which contains the configured values.
[root@registry ~]# oc get no
NAME STATUS ROLES AGE VERSION
master-0.testocp.lab.psi.pnq2.redhat.com Ready control-plane,master 6d v1.29.11+ef2a55c
master-1.testocp.lab.psi.pnq2.redhat.com Ready control-plane,master 6d v1.29.11+ef2a55c
master-2.testocp.lab.psi.pnq2.redhat.com Ready control-plane,master 6d v1.29.11+ef2a55c
worker-0.testocp.lab.psi.pnq2.redhat.com Ready worker 6d v1.29.11+ef2a55c
worker-1.testocp.lab.psi.pnq2.redhat.com Ready worker 6d v1.29.11+ef2a55c
worker-2.testocp.lab.psi.pnq2.redhat.com Ready worker 6d v1.29.11+ef2a55c
[root@registry ~]# oc debug node/worker-1.testocp.lab.psi.pnq2.redhat.com
Starting pod/worker-1testocplabpsipnq2redhatcom-debug-47g6t ...
To use host binaries, run `chroot /host`
sh-5.1# chroot /host bash
[root@worker-1 /]#
[root@worker-1 /]# cat /etc/node-sizing.env
SYSTEM_RESERVED_MEMORY=1Gi
SYSTEM_RESERVED_CPU=500m
SYSTEM_RESERVED_ES=1Gi
[root@worker-1 /]#
Step 3:
You can let Red Hat OpenShift Container Platform automatically determine the optimal system-reserved CPU and memory resources for your nodes, or you can determine and set these values manually.
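As a sketch of the automatic option, the helper below prints a KubeletConfig manifest that sets autoSizingReserved: true for one machine config pool. The object name dynamic-node, the helper function name, and the choice of the worker pool are assumptions made for illustration; check the pool label against your own MachineConfigPool before applying anything.

```shell
#!/bin/sh
# Print a KubeletConfig manifest that enables automatic system-reserved sizing
# for the given machine config pool (defaults to "worker").
# The object name "dynamic-node" is a placeholder chosen for this example.
render_auto_sizing_config() {
    pool=${1:-worker}
    cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: dynamic-node
spec:
  autoSizingReserved: true
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/${pool}: ""
EOF
}

# Review the manifest first; to apply it, pipe the output to: oc apply -f -
render_auto_sizing_config worker
```

For the manual option, the same kind of object can carry a kubeletConfig section with systemReserved values for cpu and memory in place of autoSizingReserved. In either case, expect the Machine Config Operator to roll the change out to the nodes in the selected pool, so plan the update for a suitable maintenance window.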
If automatic allocation is not enabled, you can still check the values it would generate for a specific node by running the script that the MachineConfigs starting with 00 already include in current OpenShift releases:
$ oc debug node/[node_name] [...]
sh-4.4# chroot /host bash
# NODE_SIZES_ENV=/tmp/node-sizing.txt /usr/local/sbin/dynamic-system-reserved-calc.sh true
# cat /tmp/node-sizing.txt
SYSTEM_RESERVED_MEMORY=2Gi
SYSTEM_RESERVED_CPU=0.10
SYSTEM_RESERVED_ES=1Gi
Summary
This article provided a guide to improving cluster stability by tuning the system-reserved parameters on OpenShift nodes. We demonstrated how to calculate the recommended values so that the scheduler has accurate information about each node, which helps prevent node resource overcommitment. We also covered methods for reviewing the current system-reserved configuration and the choice between automatic and manual resource allocation. By following these steps, you can improve the efficiency and stability of your OpenShift clusters.