Image mode is a powerful capability in Red Hat OpenShift that allows customization of CoreOS-based nodes. While this feature provides unprecedented flexibility, it can also introduce new layers of complexity when things go wrong. This guide demonstrates common debugging scenarios in OpenShift 4.20 and beyond, and provides practical troubleshooting steps to get clusters back on track. There are many tools and techniques that make debugging image mode less mysterious and more manageable.
Understanding the image mode process
When image mode for OpenShift is enabled, the typical workflow involves three stages. Each stage has distinct failure points. The stages are: MachineOSConfig (MOSC) creation, MachineOSBuild (MOSB) creation and execution, and application of the new image to nodes.
Stage 1: MachineOSConfig creation
The process begins when a MachineOSConfig resource is created targeting a specific MachineConfigPool. This resource acts as the blueprint, defining how the custom OS image is built and where it gets stored.
What to watch for:
- Validation errors: Resource creation may fail when required fields are missing or incorrectly configured.
- Secret references: Ensure all referenced pull and push secrets exist in the openshift-machine-config-operator namespace.
- Registry specifications: Verify that renderedImagePushSpec points to a valid, accessible registry location.
At this stage, issues are typically configuration errors that prevent a resource from being created or accepted by the cluster.
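Because a missing secret does not necessarily block creation of the MachineOSConfig, it can help to confirm that the referenced secrets exist before creating the resource. The following is a minimal sketch, not part of the product: the function name check_mosc_secrets is hypothetical, and it relies only on oc get secret returning a non-zero status when a secret is absent.

```shell
# Hypothetical helper: report which of the given secrets are missing from
# the openshift-machine-config-operator namespace.
# Returns 0 when every secret exists and 1 when at least one is missing.
check_mosc_secrets() {
    ns=openshift-machine-config-operator
    missing=0
    for name in "$@"; do
        if oc get secret -n "$ns" "$name" >/dev/null 2>&1; then
            echo "ok: $name"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

With the secret names used in the sample MachineOSConfig in this guide, the call would be check_mosc_secrets current-image-pull base-image-pull rendered-image.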
If the MOSC resource was successfully created, then the machine-os-builder pod should be healthy and running in the openshift-machine-config-operator namespace.
$ oc get pods -n openshift-machine-config-operator \
-l k8s-app=machine-os-builder
NAME READY STATUS RESTARTS
machine-os-builder-b8f..h94 1/1 Running 0
If this pod is visible and running, your debugging can proceed to the next step, which builds the image.
If this pod is not visible, then errors have occurred.
If a forbidden value is used for any of the fields in MachineOSConfig, it's printed in the create command output. For example, given this YAML file:
$ cat ./mosc.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineOSConfig
metadata:
  name: worker
spec:
  machineConfigPool:
    name: infra
  currentImagePullSecret:
    name: current-image-pull
  imageBuilder:
    imageBuilderType: job
  baseImagePullSecret:
    name: base-image-pull
  renderedImagePushSecret:
    name: rendered-image
  renderedImagePushSpec: "quay.io/sregidor/sregidor-os:mco_layering"
The output of oc create is:
$ oc create -f ./mosc.yaml
The MachineOSConfig "worker" is invalid:
* spec.imageBuilder.imageBuilderType: Unsupported value: "job": supported values: "Job"
The example shows that spec.imageBuilder.imageBuilderType is set to job instead of the required Job (with a capital "J").
Another example:
$ oc create -f ./mosc.yaml
The MachineOSConfig "worker" is invalid: <nil>: Invalid value: "object": MachineOSConfig name must match the referenced MachineConfigPool name; can only have one MachineOSConfig per MachineConfigPool
If the configured values are not forbidden but nevertheless are causing problems, the information to detect those problems is in the openshift-machine-config-operator pod, in the triggered events, and in the machine-config ClusterOperator. The most detailed information is in the openshift-machine-config-operator pod.
For example, suppose the secrets haven't been created for this sample YAML configuration:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineOSConfig
metadata:
  name: infra
spec:
  machineConfigPool:
    name: infra
  currentImagePullSecret:
    name: current-image-pull
  imageBuilder:
    imageBuilderType: Job
  baseImagePullSecret:
    name: base-image-pull
  renderedImagePushSecret:
    name: rendered-image
  renderedImagePushSpec: "quay.io/sregidor/sregidor-os:mco_layering"
Nevertheless, the resource can be created:
$ oc create -f ./mosc.yaml
machineosconfig.machineconfiguration.openshift.io/infra created
However, the builder pod is not created:
$ oc get pods -n openshift-machine-config-operator |grep build
The error can be found in the openshift-machine-config-operator pod:
$ oc logs -n openshift-machine-config-operator machine-config-operator-7498f4576b-h5vzj
...
E1017 08:56:53.431756 1 operator.go:467] "Unhandled Error" err="could not update Machine OS Builder deployment: could not validate renderedImagePushSecret \"rendered-image\" for MachineOSConfig infra: secret rendered-image from infra is not found. Did you use the right secret name?"
...
There are also events reporting the error:
$ oc get events -n openshift-machine-config-operator --sort-by metadata.creationTimestamp |tail -3
34s Warning OperatorDegraded: MachineOSBuilderFailed /machine-config Failed to resync 4.20.0-0-2025-10-16-080835-test-ci-ln-bfn63jk-latest because: could not update Machine OS Builder deployment: could not validate renderedImagePushSecret "rendered-image" for MachineOSConfig infra: secret rendered-image from infra is not found. Did you use the right secret name?
11s Warning OperatorDegraded: MachineOSBuilderFailed /machine-config Failed to resync 4.20.0-0-2025-10-16-080835-test-ci-ln-bfn63jk-latest because: could not update Machine OS Builder deployment: could not validate baseImagePullSecret "base-image-pull" for MachineOSConfig infra: secret base-image-pull from infra is not found. Did you use the right secret name?
96s Normal ConfigMapUpdated deployment/openshift-machine-config-operator Updated ConfigMap/kube-rbac-proxy -n openshift-machine-config-operator:...
You can get information from the machine-config ClusterOperator, too:
$ oc get co machine-config
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
machine-config 4.20.0-0-2025-10-16-080835-test-ci-ln-bfn63jk-latest True False True 76m Failed to resync 4.20.0-0-2025-10-16-080835-test-ci-ln-bfn63jk-latest because: could not update Machine OS Builder deployment: could not validate renderedImagePushSecret "rendered-image" for MachineOSConfig infra: secret rendered-image from infra is not found. Did you use the right secret name?
Stage 2: MachineOSBuild (MOSB) creation and image build process
After the MachineOSConfig has been successfully created, and the machine-os-builder pod is running, openshift-machine-config-operator automatically generates a MachineOSBuild resource. The MachineOSBuild resource controls an actual image build job that pulls the base CoreOS image, applies the customizations (in a Containerfile), and pushes the result to the specified registry.
To execute this process, several auxiliary secrets and configmaps are created in the openshift-machine-config-operator namespace.
What to watch for:
- Build status: Monitor the MachineOSBuild resource for conditions showing Succeeded=True or Failed=True.
- Job failures: Verify that the build job in the openshift-machine-config-operator namespace completes successfully.
- Image pull errors: Authentication failures when pulling the base image indicate problems with baseImagePullSecret.
- Build errors: Containerfile syntax issues, missing packages, or failed RUN commands cause build failures.
- Image push errors: A problem pushing to the registry suggests an issue with renderedImagePushSecret or registry permissions.
This is where most failures occur, because it involves pulling images, executing build steps, and pushing results, all of which depend on external resources and credentials.
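When a build is monitored from a script, the individual conditions can be reduced to a single state. The following is a minimal sketch under assumptions: the function name mosb_state is hypothetical, and the condition types are assumed to be named Building, Succeeded, Interrupted, and Failed. It reads one Type=Status pair per line on standard input and prints the first matching state, checking failure states first.

```shell
# Hypothetical helper: read "Type=Status" lines on stdin and print one state.
# Failure states are checked first so that a failed build is never
# reported as merely building.
mosb_state() {
    conditions=$(cat)
    for type in Failed Interrupted Succeeded Building; do
        if printf '%s\n' "$conditions" | grep -qx "${type}=True"; then
            echo "$type" | tr '[:upper:]' '[:lower:]'
            return 0
        fi
    done
    # No condition is True yet.
    echo pending
}
```

On a live cluster, the input could come from oc -n openshift-machine-config-operator get machineosbuild <mosb-name> -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}', where <mosb-name> is a placeholder for the actual resource name.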
Useful output is available while the image is being built. Start with the MachineOSBuild resource:
$ oc -n openshift-machine-config-operator get machineosbuild
NAME PREPARED BUILDING SUCCEEDED INTERRUPTED FAILED AGE
infra-b1b93a87b88b18b3ad70e9fb2596b2cd False True False False False 108s
Then the job created for the MachineOSBuild execution:
$ oc -n openshift-machine-config-operator get job
NAME STATUS COMPLETIONS DURATION AGE
build-infra-b1b93a87b..b2cd Running 0/1 105s 105s
The pod controlled by the job, which executes the actual build process:
$ oc -n openshift-machine-config-operator get pods
NAME READY STATUS RESTARTS AGE
build-infra-b1b93a87b..b2cd-q7tsb 0/1 Init:0/1 0 2m49s
...
Note that only changes to kernel arguments, kernel type, OSImageURL, or extension bundles create a new job and trigger a new image build process. All other MachineConfig changes reuse the existing MOSB and do not trigger a new build.
This stage can be considered successful if:
- The MachineOSBuild was created and is reporting Succeeded=True and Failed=False
- The job is automatically removed by the machine-os-builder pod
$ oc -n openshift-machine-config-operator get machineosbuild
NAME PREPARED BUILDING SUCCEEDED INTERRUPTED FAILED
infra-f509ba5..e99 False False True False False
When the MachineOSBuild resource is not created
When MachineOSBuild is not created, or is not successful, it indicates that an error has occurred. The process in charge of creating the MachineOSBuild resource is the machine-os-builder pod. This error is not very common, but if it happens, you must read the logs in this pod to find the causes:
$ oc -n openshift-machine-config-operator logs machine-os-builder-b8f48488f-nsdbk
....
I1017 09:42:42.524084 1 reconciler.go:634] New MachineOSBuild created: infra-f509ba5b2d76bcc5a113fd81de75ee99
When the MachineOSBuild was created, but failed
The most common cause of a failed MachineOSBuild is a failure in the job that builds the image. If the MachineOSBuild resource fails, the first step is to locate the associated job.
When the job is not created
The machine-os-builder pod is in charge of creating or deleting a job. If the job cannot be found, read the logs in this pod for further information:
$ oc -n openshift-machine-config-operator logs machine-os-builder-b8f48488f-nsdbk
Debugging a failed job
Debugging a failed job can take many forms, depending on the problem. Focusing on one problem at a time helps you confirm your theory about the cause of the problem.
For example, suppose the following MOSC is created:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineOSConfig
metadata:
  name: infra
spec:
  machineConfigPool:
    name: infra
  currentImagePullSecret:
    name: current-image-pull
  imageBuilder:
    imageBuilderType: Job
  baseImagePullSecret:
    name: base-image-pull
  renderedImagePushSecret:
    name: rendered-image
  renderedImagePushSpec: "quay.io/sregidor/sregidor-os:mco_layering"
  containerFile:
  - content: |-
      RUN curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq
You run the oc create command:
$ oc create -f mosc.yaml
machineosconfig.machineconfiguration.openshift.io/infra created
But the MachineConfigPool shows as degraded:
$ oc get mcp infra
NAME CONFIG UPDATED UPDATING DEGRADED M..COUNT READYMACHINECOUNT
infra rendered-infra-620..a43 False False True 1 0
UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
0 0 158m
$ oc get mcp infra -oyaml
...
- lastTransitionTime: "2025-10-17T10:55:00Z"
message: 'Failed to build OS image for pool infra (MachineOSBuild: infra-32ef35dea3e553071277954842edb33a):
Failed: Build Failed'
reason: BuildFailed
status: "True"
type: ImageBuildDegraded
...
The MOSB resource shows as failing:
$ oc -n openshift-machine-config-operator get machineosbuild
NAME PREPARED BUILDING SUCCEEDED INTERRUPTED FAILED AGE
infra-32e..33a False False False False True 31m
So you locate the job:
$ oc get job -l machineconfiguration.openshift.io/machine-os-config=infra
NAME STATUS COMPLETIONS DURATION AGE
build-infra-32e..33a Failed 0/1 31m 31m
These are the pods launched by the failed job:
$ oc -n openshift-machine-config-operator get pods
NAME READY STATUS RESTARTS AGE
build-infra-32e..33a-2jg2t 0/1 Init:Error 0 25m
build-infra-32e..33a-bzfcp 0/1 Init:Error 0 29m
build-infra-32e..33a-cndjm 0/1 Init:Error 0 32m
build-infra-32e..33a-lqlk9 0/1 Init:Error 0 22m
Examine the logs of the failed pod to determine the cause. The build pods have two containers: image-build and create-digest-configmap.
- The container image-build builds the image and pushes it
- The container create-digest-configmap creates an auxiliary configmap with the right digest so that it can be read and openshift-machine-config-operator can update the MOSB and MOSC resources
To identify errors in the build process, examine the image-build container in the build pod:
$ oc -n openshift-machine-config-operator logs \
build-infra-32ef35dea3e553071277954842edb33a-2jg2t \
-c image-build
...
time="2025-10-17T10:51:32Z" level=debug msg="Running &exec.Cmd{Path:\"/bin/sh\", Args:[]string{\"/bin/sh\", \"-c\", \"curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq\"}, Env:[]string{\"HTTP_PROXY=\", \"HTTPS_PROXY=\", \"NO_PROXY=\", \"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\", \"HOSTNAME=0430829320a1\", \"HOME=/root\"}, Dir:\"/\", Stdin:(*os.File)(0xc0001280a0), Stdout:(*os.File)(0xc0001280a8), Stderr:(*os.File)(0xc0001280b0), ExtraFiles:[]*os.File(nil), SysProcAttr:(*syscall.SysProcAttr)(0xc00017c0c0), Process:(*os.Process)(nil), ProcessState:(*os.ProcessState)(nil), ctx:context.Context(nil), Err:error(nil), Cancel:(func() error)(nil), WaitDelay:0, childIOFiles:[]io.Closer(nil), parentIOPipes:[]io.Closer(nil), goroutine:[]func() error(nil), goroutineErr:(<-chan error)(nil), ctxResult:(<-chan exec.ctxResult)(nil), createdByStack:[]uint8(nil), lookPathErr:error(nil), cachedLookExtensions:struct { in string; out string }{in:\"\", out:\"\"}} (PATH = \"\")"
%Total %Rec %Xfer Avg Speed Time Time Time Current Dload Upload... Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 9 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (22) The requested URL returned error: 404
subprocess exited with status 22
subprocess exited with status 22
time="2025-10-17T10:51:32Z" level=debug msg="Error building at step {Env:[HTTP_PROXY= HTTPS_PROXY= NO_PROXY= PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] Command:run Args:[curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq] Flags:[] Attrs:map[] Message:RUN curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq Heredocs:[] Original:RUN curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq}: exit status 22"
Error: building at STEP "RUN curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq": exit status 22
The logs show that curl returned curl: (22) The requested URL returned error: 404 when attempting to reach https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong. This happens because there is a typo in the URL and the actual URL should be https://github.com/example/yq/releases/latest/download/yq_linux_amd64.
After you find the error, you can edit the MOSC resource. In this example, using the correct URL in the Containerfile section triggers a new MOSB resource that successfully builds the image and applies the config.
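The image-build log is verbose, so filtering it for the lines that usually carry the failure can save time. The following is a minimal sketch: the function name build_errors is hypothetical, and the patterns are assumptions chosen to match the error lines in the log excerpt above, so they will not catch every kind of failure.

```shell
# Hypothetical helper: keep only lines that look like build failures.
# Debug lines that begin with time="..." are intentionally dropped.
build_errors() {
    grep -E '^(Error:|curl: \([0-9]+\)|subprocess exited with status)'
}
```

The function reads from standard input, so the output of oc logs for the image-build container of a failed build pod can be piped into it.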
Other kinds of errors are possible. For example, the lack of permissions to pull or push an image is relatively common in some environments. In this case, a pod reports that a configured secret doesn't have permission to push an image:
$ oc logs build-infra-5e0c7aaf3cf26e8fab9dd111bb336342-czzjb -c image-build
....
Copying blob sha256:29f46dbdbc11454d191cd70ebbd18aec36bc2afc72757d38f2ad473b6dba1c75
Copying blob sha256:d0a1fe72e3dceadb214f96787144ef31672f2b2a429a3798717d739a55a9b574
Error: pushing image "quay.io/sregidor/sregidor-os:infra-5e0c7aaf3cf26e8fab9dd111bb336342" to "docker://quay.io/sregidor/sregidor-os:infra-5e0c7aaf3cf26e8fab9dd111bb336342": writing blob: initiating layer upload to /v2/sregidor/sregidor-os/blobs/uploads/ in quay.io: unauthorized: access to the requested resource is not authorized
In this example, there was a problem in the build. If the build process is not failing but the build pod fails, then you can examine the create-digest-configmap container to see whether there was a problem creating the configmap with the digest info.
Auxiliary resources
To build the image, openshift-machine-config-operator uses several auxiliary resources temporarily stored in the openshift-machine-config-operator namespace. These resources are only present during the build process. However, if the build fails, they remain available for debugging purposes.
Those auxiliary resources are mounted in the build pod, so it can use them. Locate them using the oc get command:
$ oc get cm -n openshift-machine-config-operator \
--sort-by metadata.creationTimestamp
...
additionaltrustbundle-infra-32e..33a 1 47m
etc-policy-infra-32ef35dea3e553..33a 1 47m
mc-infra-32ef35dea3e55307127795..33a 1 47m
containerfile-infra-32ef35dea3e..33a 1 47m
etc-registries-infra-32ef35dea3e..33a 1 47m
$ oc get secret -n openshift-machine-config-operator \
--sort-by metadata.creationTimestamp
NAME TYPE DATA AGE
...
global-pull-secret-copy kubernetes.io/dockerconfigjson 1 48m
final-infra-32e..33a kubernetes.io/dockerconfigjson 1 48m
base-infra-32e..33a kubernetes.io/dockerconfigjson 1 48m
The additional trust bundle configmap (in this example, additionaltrustbundle-infra-32e…33a) stores the necessary bundles to use Red Hat Enterprise Linux (RHEL) packages in the Containerfile. It must be taken from a copy of the etc-pki-entitlement secret in the openshift-config-managed namespace. If the build is having problems using RHEL packages, then ensure the resource is storing the correct bundles.
The current machine config configmap (mc-infra-32e…33a in this example) stores the MachineConfig resource that must be applied to the nodes in this MachineConfigPool. To see its content:
$ oc get cm -n openshift-machine-config-operator \
mc-infra-32ef35dea3e553071277954842edb33a \
-o jsonpath='{.data.machineconfig\.json\.gz}' | \
base64 -d | gunzip | jq | less
The container file configmap (containerfile-infra-32e…33a in this example) stores the full container file used to build the image. To see its contents:
$ oc get cm -n openshift-machine-config-operator \
containerfile-infra-32ef35dea3e553071277954842edb33a \
-o jsonpath='{.data.Containerfile}'
The etc registries and policies configmaps (etc-registries-infra-32e…33a and etc-policy-infra-32e…33a in this example) contain the registry configuration (registries.conf) and the policies (policy.json) used in the cluster so that they can be used in the build process as well. Look at those resources when there are problems with the container registries:
$ oc -n openshift-machine-config-operator get cm \
-o yaml etc-registries-infra-32ef35dea3e553071277954842edb33a
apiVersion: v1
data:
registries.conf: |
unqualified-search-registries = ['registry.access.r.com', 'docker.io']
...
The secrets are the ones configured in the MOSC resource. They contain the credentials to pull and push the necessary images.
If the MOSB fails, these auxiliary resources are not removed so that they can be used for further debugging.
Stage 3: Image applied to nodes
After a successful build, the openshift-machine-config-operator rolls out the new image by updating the machineconfiguration.openshift.io/desiredImage annotation on the nodes, and the MachineConfigDaemon pods then apply the image.
What to watch for:
- Pool update status: The MachineConfigPool shows Updating=True as nodes begin updating
- Image pull failures: Nodes may fail to download the image if currentImagePullSecret is incorrect
- Network connectivity: Nodes must be able to reach the registry where the image is stored
- Node degradation: Nodes stuck in degraded state due to failed updates should be checked
- Reboot issues: Nodes should successfully reboot into the new OS image
- Stalled updates: If the pool remains in the Updating state too long, investigate individual node statuses
In this final stage, issues typically relate to a node's ability to access and apply a layered image.
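One way to see whether a given node has picked up the new image is to compare its image annotations. The following is a minimal sketch under assumptions: the function name node_image_state is hypothetical, and the machineconfiguration.openshift.io/currentImage annotation is assumed to exist alongside the desiredImage annotation described above.

```shell
# Hypothetical helper: compare the desired and current image annotations
# on a node. The currentImage annotation name is an assumption.
node_image_state() {
    node=$1
    # Dots inside the annotation key must be escaped for jsonpath.
    prefix='{.metadata.annotations.machineconfiguration\.openshift\.io/'
    desired=$(oc get node "$node" -o jsonpath="${prefix}desiredImage}")
    current=$(oc get node "$node" -o jsonpath="${prefix}currentImage}")
    if [ -z "$desired" ]; then
        echo "$node: no desired image set"
    elif [ "$desired" = "$current" ]; then
        echo "$node: in sync"
    else
        echo "$node: pending (desired=$desired current=$current)"
    fi
}
```

A node that stays in the pending state for a long time is a good candidate for the machine-config-daemon log inspection described later in this stage.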
Success
The MCP should report an updated status:
$ oc get mcp infra
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT
infra rendered-infra-f47..e74 True False False 3
READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
3 3 0 61m
Verify the proper application of the image on the nodes:
$ oc debug -q node/ip-10-0-10-154.compute.example -- chroot \
/host rpm-ostree status
State: idle
Deployments:
ostree-unverified-registry:quay.io/sregidor/sregidor-os@sha256:876..ef13
Digest: sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
Version: 9.6.20251013-1 (2025-10-17T12:09:08Z)
$ oc debug -q node/ip-10-0-10-154.compute.example -- chroot \
/host which yq
/usr/bin/yq
$ oc debug -q node/ip-10-0-10-154.compute.example -- chroot /host yq -h
yq is a portable command-line data file processor (https://github.com/mikefarah/yq/)
See https://mikefarah.gitbook.io/yq/ for detailed documentation and examples.
Usage:
yq [flags]
yq [command]
...
Error
At this point the debugging process is very similar to the one followed when applying a new MachineConfig. Focus on checking the MachineConfigPool status, the information in the MachineConfigNodes resources and the logs of the machine-config-daemon pods.
In case of error, the MCP shows as degraded:
$ oc get mcp infra
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT
infra rendered-infra-620..a43 False False True 3
READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
0 0 1 3h51m
$ oc get mcp infra -oyaml
...
- lastTransitionTime: "2025-10-17T12:23:48Z"
message: 'Node ip-10-0-75-69.compute.example is reporting: "Node ip-10-0-75-69.compute.example
upgrade failure. Failed to update OS to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
after retries: timed out waiting for the condition", Node ip-10-0-75-69.compute.example
is reporting: "Failed to update OS to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
after retries: timed out waiting for the condition"'
reason: 1 nodes are reporting degraded status on sync
status: "True"
type: NodeDegraded
And the detailed information can be found in the machine-config-daemon pod logs:
$ oc logs -n openshift-machine-config-operator $(oc get pods \
-n openshift-machine-config-operator -l "k8s-app=machine-config-daemon" \
--field-selector "spec.nodeName=ip-10-0-75-69.compute.example" \
-o jsonpath="{.items[0].metadata.name}") -c machine-config-daemon
...
I1017 12:26:52.042570 2750 update.go:2546] Updating OS to layered image "quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13"
I1017 12:26:52.042590 2750 image_manager_helper.go:92] Running captured: rpm-ostree --version
I1017 12:26:52.055729 2750 image_manager_helper.go:194] Linking rpm-ostree authfile to /etc/mco/internal-registry-pull-secret.json
I1017 12:26:52.055759 2750 rpm-ostree.go:183] Executing rebase to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
I1017 12:26:52.055764 2750 update.go:2630] Running: rpm-ostree rebase --experimental ostree-unverified-registry:quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
Pulling manifest: ostree-unverified-registry:quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
W1017 12:26:52.427068 2750 update.go:2591] Failed to update OS to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13 (will retry): error running rpm-ostree rebase --experimental ostree-unverified-registry:quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13: error: Creating importer: failed to invoke method OpenImage: failed to invoke method OpenImage: reading manifest sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13 in quay.io/sregidor/sregidor-os: manifest unknown
Also verify information reported by the MachineConfigNode resources. This is especially important because in future versions of OpenShift, more information regarding the image mode process will be added to those resources in order to make the debugging process easier.
$ oc get machineconfignode -o wide
NAME POOLNAME DESIREDCONFIG CURRENTCONFIG UPDATED AGE UPDATEPREPARED UPDATEEXECUTED UPDATEPOSTACTIONCOMPLETE UPDATECOMPLETE RESUMED UPDATEDFILESANDOS CORDONEDNODE DRAINEDNODE REBOOTEDNODE UNCORDONEDNODE
ip-10-0-10-154.compute.example infra rendered-infra-620..a43 rendered-infra-620..a43 True 4h34m False False False False False False False False False False
ip-10-0-22-152.compute.example master rendered-master-93a022e91aa2bf815e4efed220ac97ea rendered-master-93a022e91aa2bf815e4efed220ac97ea True 4h44m False False False False False False False False False False
ip-10-0-41-78.compute.example infra rendered-infra-620..a43
...
$ oc get machineconfignode ip-10-0-75-69.compute.example -o yaml
...
- lastTransitionTime: "2025-10-17T12:22:13Z"
message: 'Node ip-10-0-75-69.compute.example upgrade failure. Failed
to update OS to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
after retries: timed out waiting for the condition'
reason: NodeDegraded
status: "True"
Successful debugging
Debugging image mode doesn't have to be a black box operation. When you understand the three distinct stages (MachineOSConfig validation, MachineOSBuild execution, and image deployment to nodes) of the process, failures can be systematically narrowed down to identify where they occur and what the root cause is. The key is knowing where to look:
- openshift-machine-config-operator pod logs for MOSC issues
- Build job pod logs for image build failures
- machine-config-daemon pod logs for node-level problems
Image mode failures usually happen during the build stage, often caused by pull secret authentication issues, Containerfile errors, or registry permission problems. The debugging techniques in this guide empower you to perform effective troubleshooting and, ultimately, successful deployment of customized node images.