Debugging image mode with Red Hat OpenShift 4.20: A practical guide

Tips for troubleshooting common image mode scenarios in OpenShift 4.20

May 19, 2026
Sergio Regidor de la Rosa
Related topics:
Application development and delivery, Containers, Linux, Kubernetes
Related products:
Red Hat OpenShift

    Image mode is a powerful capability in Red Hat OpenShift that allows customization of CoreOS-based nodes. While this feature provides unprecedented flexibility, it can also introduce new layers of complexity when things go wrong. This guide demonstrates common debugging scenarios in OpenShift 4.20 and beyond, and provides practical troubleshooting steps to get clusters back on track. There are many tools and techniques that make debugging image mode less mysterious and more manageable.

    Understanding the image mode process

    When image mode for OpenShift is enabled, the typical workflow involves three stages. Each stage has distinct failure points. The stages are: MachineOSConfig (MOSC) creation, MachineOSBuild (MOSB) creation and execution, and application of the new image to nodes.

    Stage 1: MachineOSConfig creation

    The process begins when a MachineOSConfig resource is created targeting a specific MachineConfigPool. This resource acts as the blueprint, defining how the custom OS image is built and where it gets stored.

    What to watch for:

    • Validation errors: Resource creation may fail when required fields are missing or incorrectly configured.
    • Secret references: Ensure all referenced pull and push secrets exist in the openshift-machine-config-operator namespace.
    • Registry specifications: Verify that renderedImagePushSpec points to a valid, accessible registry location.

    At this stage, issues are typically configuration errors that prevent a resource from being created or accepted by the cluster.
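
    The secret check in the list above can be scripted before the resource is created. The following is a minimal sketch rather than part of the documented workflow: the check_secrets function and the OC override variable are hypothetical helpers, and the secret names in the commented call are the ones used in this article's examples.

```shell
#!/usr/bin/env bash
# Pre-flight sketch: verify that each secret referenced by a MachineOSConfig exists.
NAMESPACE="openshift-machine-config-operator"

# OC is a hypothetical override (for example, a stub for dry runs); it defaults to the real client.
OC="${OC:-oc}"

check_secrets() {
  local missing=0 name
  for name in "$@"; do
    if "$OC" get secret "$name" -n "$NAMESPACE" >/dev/null 2>&1; then
      echo "ok: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  # Non-zero exit status when at least one secret is missing.
  return "$missing"
}

# Example call with the secret names used in this article:
# check_secrets current-image-pull base-image-pull rendered-image
```

    The function only checks that the secrets exist. It does not verify that their credentials are valid for the registry.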

    If the MOSC resource was successfully created, then the machine-os-builder pod should be healthy and running in the openshift-machine-config-operator namespace.

    $ oc get pods -n openshift-machine-config-operator \
    -l k8s-app=machine-os-builder
    NAME                         READY  STATUS  RESTARTS
    machine-os-builder-b8f..h94   1/1   Running   0

    If this pod is visible and running, your debugging can proceed to the next step, which builds the image.

    If this pod is not visible, then errors have occurred.

    If a forbidden value is used for any of the fields in MachineOSConfig, the validation error is printed in the output of the create command. For example, given this YAML file:

    $ cat ./mosc.yaml 
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineOSConfig
    metadata:
      name: worker
    spec:
      machineConfigPool:
        name: infra
      currentImagePullSecret:
        name: current-image-pull
      imageBuilder:
        imageBuilderType: job
      baseImagePullSecret:
        name: base-image-pull
      renderedImagePushSecret:
        name: rendered-image
      renderedImagePushSpec: "quay.io/sregidor/sregidor-os:mco_layering"

    The output of oc create is:

    $ oc create -f ./mosc.yaml
    The MachineOSConfig "worker" is invalid: 
    * spec.imageBuilder.imageBuilderType: Unsupported value: "job": supported values: "Job"

    The example shows that spec.imageBuilder.imageBuilderType is set to job instead of the required Job (with a capital "J").

    Another example, where metadata.name (worker) does not match the name of the referenced MachineConfigPool (infra):

    $ oc create -f ./mosc.yaml
    The MachineOSConfig "worker" is invalid: <nil>: Invalid value: "object": MachineOSConfig name must match the referenced MachineConfigPool name; can only have one MachineOSConfig per MachineConfigPool

    If the configured values are not forbidden but nevertheless are causing problems, the information to detect those problems is in the openshift-machine-config-operator pod, in the triggered events, and in the machine-config ClusterOperator. The most detailed information is in the openshift-machine-config-operator pod.

    For example, suppose the secrets haven't been created for this sample YAML configuration:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineOSConfig
    metadata:
      name: infra
    spec:
      machineConfigPool:
        name: infra
      currentImagePullSecret:
        name: current-image-pull
      imageBuilder:
        imageBuilderType: Job
      baseImagePullSecret:
        name: base-image-pull
      renderedImagePushSecret:
        name: rendered-image
      renderedImagePushSpec: "quay.io/sregidor/sregidor-os:mco_layering"

    Nevertheless, the resource can be created:

    $ oc create -f ./mosc.yaml
    machineosconfig.machineconfiguration.openshift.io/infra created

    However, the builder pod is not created:

    $ oc get pods -n openshift-machine-config-operator |grep build

    The error can be found in the openshift-machine-config-operator pod:

    $ oc logs -n openshift-machine-config-operator machine-config-operator-7498f4576b-h5vzj
    ...
    E1017 08:56:53.431756       1 operator.go:467] "Unhandled Error" err="could not update Machine OS Builder deployment: could not validate renderedImagePushSecret \"rendered-image\" for MachineOSConfig infra: secret rendered-image from infra is not found. Did you use the right secret name?"
    ...

    There are also events reporting the error:

    $ oc get events  -n openshift-machine-config-operator --sort-by metadata.creationTimestamp  |tail -3
    34s         Warning   OperatorDegraded: MachineOSBuilderFailed   /machine-config                                                      Failed to resync 4.20.0-0-2025-10-16-080835-test-ci-ln-bfn63jk-latest because: could not update Machine OS Builder deployment: could not validate renderedImagePushSecret "rendered-image" for MachineOSConfig infra: secret rendered-image from infra is not found. Did you use the right secret name?
    11s         Warning   OperatorDegraded: MachineOSBuilderFailed   /machine-config                                                      Failed to resync 4.20.0-0-2025-10-16-080835-test-ci-ln-bfn63jk-latest because: could not update Machine OS Builder deployment: could not validate baseImagePullSecret "base-image-pull" for MachineOSConfig infra: secret base-image-pull from infra is not found. Did you use the right secret name?
    96s         Normal    ConfigMapUpdated                           deployment/openshift-machine-config-operator                                   Updated ConfigMap/kube-rbac-proxy -n openshift-machine-config-operator:...

    You can get information from the machine-config ClusterOperator, too:

    $ oc get co machine-config
    NAME             VERSION                                                AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
    machine-config   4.20.0-0-2025-10-16-080835-test-ci-ln-bfn63jk-latest   True        False         True       76m     Failed to resync 4.20.0-0-2025-10-16-080835-test-ci-ln-bfn63jk-latest because: could not update Machine OS Builder deployment: could not validate renderedImagePushSecret "rendered-image" for MachineOSConfig infra: secret rendered-image from infra is not found. Did you use the right secret name?
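
    When several secrets are missing, the same message repeats across the operator log, the events, and the ClusterOperator status. As a small sketch, the hypothetical missing_secrets helper below reads any of those outputs on standard input and prints each missing secret name once. It assumes the message wording shown above ("secret NAME from POOL is not found"), which may differ in other versions.

```shell
#!/usr/bin/env bash
# Sketch: extract the names of missing secrets from operator logs or events.
# Assumes the "secret NAME from POOL is not found" wording shown in this article.
missing_secrets() {
  sed -n 's/.*secret \([^ ]*\) from [^ ]* is not found.*/\1/p' | sort -u
}

# Example usage against a live cluster:
# oc get events -n openshift-machine-config-operator | missing_secrets
```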

    Stage 2: MachineOSBuild (MOSB) creation and image build process

    After the MachineOSConfig has been successfully created, and the machine-os-builder pod is running, openshift-machine-config-operator automatically generates a MachineOSBuild resource. The MachineOSBuild resource controls an actual image build job that pulls the base CoreOS image, applies the customizations (in a Containerfile), and pushes the result to the specified registry.

    To execute this process, several auxiliary secrets and configmaps are created in the openshift-machine-config-operator namespace.

    What to watch for:

    • Build status: Monitor the MachineOSBuild resource for conditions showing Succeeded=True or Failed=True.
    • Job failures: Verify that the build job in the openshift-machine-config-operator namespace completes successfully.
    • Image pull errors: Authentication failures when pulling the base image indicate problems with baseImagePullSecret.
    • Build errors: Containerfile syntax issues, missing packages, or failed RUN commands cause build failures.
    • Image push errors: A problem pushing to the registry suggests an issue with renderedImagePushSecret or registry permissions.

    This is where most failures occur, because it involves pulling images, executing build steps, and pushing results, all of which depend on external resources and credentials.

    Several resources report progress while the image is being built. First, the MachineOSBuild resource:

    $ oc -n openshift-machine-config-operator get machineosbuild
    NAME                                     PREPARED   BUILDING   SUCCEEDED   INTERRUPTED   FAILED   AGE
    infra-b1b93a87b88b18b3ad70e9fb2596b2cd   False      True       False       False         False    108s

    Next, the job created for the MachineOSBuild:

    $ oc -n openshift-machine-config-operator get job
    NAME                         STATUS   COMPLETIONS  DURATION  AGE
    build-infra-b1b93a87b..b2cd  Running  0/1          105s      105s
    

    The pod controlled by the job, which executes the actual build process:

    $ oc -n openshift-machine-config-operator get pods
    NAME                               READY STATUS    RESTARTS  AGE
    build-infra-b1b93a87b..b2cd-q7tsb  0/1   Init:0/1  0         2m49s
    ...

    Note that only changes to kernel arguments, kernel type, OSImageURL, or extension bundles create a new job and trigger a new image build process. All other MachineConfig changes reuse the existing MOSB and do not trigger a new build.

    This stage can be considered successful if:

    • The MachineOSBuild was created and is reporting Succeeded=True and Failed=False.
    • The job is automatically removed by the machine-os-builder pod.

    $ oc -n openshift-machine-config-operator get machineosbuild
    NAME                 PREPARED  BUILDING SUCCEEDED INTERRUPTED FAILED
    infra-f509ba5..e99   False     False    True      False       False
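
    With several pools, reading five boolean columns per build gets tedious. The sketch below condenses the table into one state per build. The summarize_builds name is hypothetical, and the function assumes the column headers shown in this article and one row per build, which is how oc prints the table when the terminal is wide enough.

```shell
#!/usr/bin/env bash
# Sketch: reduce the "oc get machineosbuild" table to "NAME STATE" pairs.
# Columns are located by header name, so their order does not matter.
summarize_builds() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    {
      state = "Pending"
      if ($(col["BUILDING"]) == "True")    state = "Building"
      if ($(col["SUCCEEDED"]) == "True")   state = "Succeeded"
      if ($(col["INTERRUPTED"]) == "True") state = "Interrupted"
      if ($(col["FAILED"]) == "True")      state = "Failed"
      print $1, state
    }'
}

# Example usage against a live cluster:
# oc get machineosbuild | summarize_builds
```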

    When the MachineOSBuild resource is not created

    When the MachineOSBuild is not created, an error has occurred. The process in charge of creating the MachineOSBuild resource is the machine-os-builder pod. This error is not very common, but if it happens, read the logs of this pod to find the cause. For reference, a successful creation is logged like this:

      $ oc -n openshift-machine-config-operator logs machine-os-builder-b8f48488f-nsdbk
      ....
      I1017 09:42:42.524084       1 reconciler.go:634] New MachineOSBuild created: infra-f509ba5b2d76bcc5a113fd81de75ee99

    When the MachineOSBuild was created, but failed

    The most common cause of a failed MachineOSBuild is a failure in the job that builds the image. If the MachineOSBuild resource fails, the first step is to locate the associated job.

    When the job is not created

    The machine-os-builder pod is in charge of creating or deleting a job. If the job cannot be found, read the logs in this pod for further information:

      $ oc -n openshift-machine-config-operator logs machine-os-builder-b8f48488f-nsdbk

    Debugging a failed job

    Debugging a failed job can take many forms, depending on the problem. Focusing on one problem at a time helps you confirm your theory about the cause of the problem.

    For example, suppose the following MOSC is created:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineOSConfig
    metadata:
      name: infra
    spec:
      machineConfigPool:
        name: infra
      currentImagePullSecret:
        name: current-image-pull
      imageBuilder:
        imageBuilderType: Job
      baseImagePullSecret:
        name: base-image-pull
      renderedImagePushSecret:
        name: rendered-image
      renderedImagePushSpec: "quay.io/sregidor/sregidor-os:mco_layering"
      containerFile:
          - content: |-
              RUN curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq

    You run the oc create command:

    $ oc create -f mosc.yaml
    machineosconfig.machineconfiguration.openshift.io/infra created

    But the MachineConfigPool shows as degraded:

    $ oc get mcp infra
    NAME   CONFIG                  UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
    infra  rendered-infra-620..a43 False   False    True     1            0                 0                   0                    158m

    $ oc get mcp infra -oyaml
    ...
      - lastTransitionTime: "2025-10-17T10:55:00Z"
        message: 'Failed to build OS image for pool infra (MachineOSBuild: infra-32ef35dea3e553071277954842edb33a):
          Failed: Build Failed'
        reason: BuildFailed
        status: "True"
        type: ImageBuildDegraded
    ...

    The MOSB resource shows as failing:

    $ oc -n openshift-machine-config-operator get machineosbuild
    NAME           PREPARED BUILDING SUCCEEDED INTERRUPTED FAILED AGE
    infra-32e..33a False    False    False     False       True   31m

    So you locate the job:

    $ oc get job -n openshift-machine-config-operator \
    -l machineconfiguration.openshift.io/machine-os-config=infra
    NAME                 STATUS COMPLETIONS DURATION AGE
    build-infra-32e..33a Failed 0/1        31m      31m

    These are the pods launched by the failed job:

    $ oc -n openshift-machine-config-operator get pods
    NAME                       READY STATUS     RESTARTS AGE
    build-infra-32e..33a-2jg2t  0/1  Init:Error 0        25m
    build-infra-32e..33a-bzfcp  0/1  Init:Error 0        29m
    build-infra-32e..33a-cndjm  0/1  Init:Error 0        32m
    build-infra-32e..33a-lqlk9  0/1  Init:Error 0        22m

    Examine the logs of the failed pod to determine the cause. The build pods have two containers: image-build and create-digest-configmap.

    • The container image-build builds the image and pushes it
    • The container create-digest-configmap creates an auxiliary configmap containing the digest of the built image, which openshift-machine-config-operator reads to update the MOSB and MOSC resources
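
    Because a failed job can leave several pods behind, it can be convenient to print the logs of both containers for a given pod in one pass. This is a minimal sketch: dump_build_pod_logs and the OC override are hypothetical names, and the container names are the two listed above.

```shell
#!/usr/bin/env bash
# Sketch: print the logs of both build containers for one build pod.
NAMESPACE="openshift-machine-config-operator"

# OC is a hypothetical override (for example, a stub for dry runs); it defaults to the real client.
OC="${OC:-oc}"

dump_build_pod_logs() {
  local pod="$1" container
  for container in image-build create-digest-configmap; do
    echo "== $pod / $container =="
    # A container that never started has no logs, so keep going on error.
    "$OC" logs -n "$NAMESPACE" "$pod" -c "$container" 2>&1 || echo "(no logs available)"
  done
}

# Example usage with a pod name from this article:
# dump_build_pod_logs build-infra-32ef35dea3e553071277954842edb33a-2jg2t
```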

    To identify errors in the build process, examine the image-build container in the build pod:

    $ oc -n openshift-machine-config-operator logs \
    build-infra-32ef35dea3e553071277954842edb33a-2jg2t \
    -c image-build
    ...
    time="2025-10-17T10:51:32Z" level=debug msg="Running &exec.Cmd{Path:\"/bin/sh\", Args:[]string{\"/bin/sh\", \"-c\", \"curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq\"}, Env:[]string{\"HTTP_PROXY=\", \"HTTPS_PROXY=\", \"NO_PROXY=\", \"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\", \"HOSTNAME=0430829320a1\", \"HOME=/root\"}, Dir:\"/\", Stdin:(*os.File)(0xc0001280a0), Stdout:(*os.File)(0xc0001280a8), Stderr:(*os.File)(0xc0001280b0), ExtraFiles:[]*os.File(nil), SysProcAttr:(*syscall.SysProcAttr)(0xc00017c0c0), Process:(*os.Process)(nil), ProcessState:(*os.ProcessState)(nil), ctx:context.Context(nil), Err:error(nil), Cancel:(func() error)(nil), WaitDelay:0, childIOFiles:[]io.Closer(nil), parentIOPipes:[]io.Closer(nil), goroutine:[]func() error(nil), goroutineErr:(<-chan error)(nil), ctxResult:(<-chan exec.ctxResult)(nil), createdByStack:[]uint8(nil), lookPathErr:error(nil), cachedLookExtensions:struct { in string; out string }{in:\"\", out:\"\"}} (PATH = \"\")"
    %Total %Rec %Xfer Avg Speed Time Time Time Current Dload  Upload...    Speed
    0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    0     9    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    curl: (22) The requested URL returned error: 404
    subprocess exited with status 22
    subprocess exited with status 22
    time="2025-10-17T10:51:32Z" level=debug msg="Error building at step {Env:[HTTP_PROXY= HTTPS_PROXY= NO_PROXY= PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] Command:run Args:[curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq] Flags:[] Attrs:map[] Message:RUN curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq Heredocs:[] Original:RUN curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq}: exit status 22"
    Error: building at STEP "RUN curl --fail -L https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong -o /usr/bin/yq && chmod +x /usr/bin/yq": exit status 22

    The logs show that curl returned curl: (22) The requested URL returned error: 404 when attempting to reach https://github.com/example/yq/releases/latest/download/yq_linux_amd64_wrong. This happens because there is a typo in the URL and the actual URL should be https://github.com/example/yq/releases/latest/download/yq_linux_amd64.
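
    The image-build log is verbose, and the relevant lines are easy to miss among the debug output. As a rough sketch, the hypothetical build_errors filter below keeps only lines that begin with "Error:" or with a curl error code, which covers the failures shown in this article. Other tools invoked from a Containerfile report errors differently, so treat the pattern as a starting point.

```shell
#!/usr/bin/env bash
# Sketch: keep only the top-level error lines from an image-build log.
# The pattern matches the two line shapes seen in this article's examples.
build_errors() {
  grep -E '^(Error:|curl: \([0-9]+\))' | awk '!seen[$0]++'
}

# Example usage against a live cluster:
# oc -n openshift-machine-config-operator logs \
#   build-infra-32ef35dea3e553071277954842edb33a-2jg2t -c image-build | build_errors
```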

    After you find the error, you can edit the MOSC resource. In this example, using the correct URL in the Containerfile section triggers a new MOSB resource that successfully builds the image and applies the config.

    Other kinds of errors are possible. For example, the lack of permissions to pull or push an image is relatively common in some environments. In this case, a pod reports that a configured secret doesn't have permission to push an image:

    $ oc -n openshift-machine-config-operator logs \
    build-infra-5e0c7aaf3cf26e8fab9dd111bb336342-czzjb -c image-build
    ....
    Copying blob sha256:29f46dbdbc11454d191cd70ebbd18aec36bc2afc72757d38f2ad473b6dba1c75
    Copying blob sha256:d0a1fe72e3dceadb214f96787144ef31672f2b2a429a3798717d739a55a9b574
    Error: pushing image "quay.io/sregidor/sregidor-os:infra-5e0c7aaf3cf26e8fab9dd111bb336342" to "docker://quay.io/sregidor/sregidor-os:infra-5e0c7aaf3cf26e8fab9dd111bb336342": writing blob: initiating layer upload to /v2/sregidor/sregidor-os/blobs/uploads/ in quay.io: unauthorized: access to the requested resource is not authorized

    In these examples, the problem was in the build itself. If the build process is not failing but the build pod still fails, examine the create-digest-configmap container to see whether there was a problem creating the configmap with the digest information.

    Auxiliary resources

    To build the image, openshift-machine-config-operator uses several auxiliary resources temporarily stored in the openshift-machine-config-operator namespace. These resources are only present during the build process. However, if the build fails, they remain available for debugging purposes.

    Those auxiliary resources are mounted in the build pod, so it can use them. Locate them using the oc get command:

    $ oc get cm -n openshift-machine-config-operator \
    --sort-by metadata.creationTimestamp
    ...
    additionaltrustbundle-infra-32e..33a   1  47m
    etc-policy-infra-32ef35dea3e553..33a   1  47m
    mc-infra-32ef35dea3e55307127795..33a   1  47m
    containerfile-infra-32ef35dea3e..33a   1  47m
    etc-registries-infra-32ef35dea3e..33a  1  47m
    
    $ oc get secret -n openshift-machine-config-operator \
    --sort-by metadata.creationTimestamp
    NAME                  TYPE                             DATA  AGE
    ...
    global-pull-secret-copy kubernetes.io/dockerconfigjson  1  48m
    final-infra-32e..33a    kubernetes.io/dockerconfigjson  1  48m
    base-infra-32e..33a     kubernetes.io/dockerconfigjson  1  48m

    The additional trust bundle configmap (in this example, additionaltrustbundle-infra-32e…33a) stores the necessary bundles to use Red Hat Enterprise Linux (RHEL) packages in the Containerfile. It must be taken from a copy of the etc-pki-entitlement secret in the openshift-config-managed namespace. If the build is having problems using RHEL packages, then ensure the resource is storing the correct bundles.

    The current machine config configmap (mc-infra-32e…33a in this example) stores the MachineConfig resource that must be applied to the nodes in this MachineConfigPool. To see its content:

    $ oc get cm -n openshift-machine-config-operator \
    mc-infra-32ef35dea3e553071277954842edb33a \
    -o jsonpath='{.data.machineconfig\.json\.gz}' | \
    base64 -d | gunzip | jq | less

    The container file configmap (containerfile-infra-32e…33a in this example) stores the full container file used to build the image. To see its contents:

    $ oc get cm -n openshift-machine-config-operator \
    containerfile-infra-32ef35dea3e553071277954842edb33a \
    -o jsonpath='{.data.Containerfile}'

    The etc registries and policies configmaps (etc-registries-infra-32e…33a and etc-policy-infra-32e…33a in this example) contain the registry configuration (registries.conf) and the policies (policy.json) used in the cluster so that they can be used in the build process as well. Look at those resources when there are problems with the container registries:

    $ oc -n openshift-machine-config-operator get cm \
    -o yaml etc-registries-infra-32ef35dea3e553071277954842edb33a
    apiVersion: v1
    data:
      registries.conf: |
        unqualified-search-registries = ['registry.access.r.com', 'docker.io']
    ...

    The secrets are the ones configured in the MOSC resource. They contain the credentials to pull and push the necessary images.

    If the MOSB fails, these auxiliary resources are not removed so that they can be used for further debugging.
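
    Since the auxiliary configmaps and secrets shown above carry the MachineOSBuild name as a suffix, they can be listed together for a given build. The following sketch assumes that naming pattern holds. list_build_resources and the OC override are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: list the auxiliary configmaps and secrets left behind by one build.
NAMESPACE="openshift-machine-config-operator"

# OC is a hypothetical override (for example, a stub for dry runs); it defaults to the real client.
OC="${OC:-oc}"

list_build_resources() {
  local build="$1"
  # "-o name" prints one "kind/name" entry per line; keep those containing the build name.
  "$OC" get configmap,secret -n "$NAMESPACE" -o name | grep -F -- "$build"
}

# Example usage with the failed build from this article:
# list_build_resources infra-32ef35dea3e553071277954842edb33a
```

    Resources that do not include the build name, such as global-pull-secret-copy, are not matched by this filter.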

    Stage 3: Image applied to nodes

    After a successful build, the openshift-machine-config-operator rolls out the new image by updating the machineconfiguration.openshift.io/desiredImage annotation on the nodes. The MachineConfigDaemon pods then apply the image.

    What to watch for:

    • Pool update status: The MachineConfigPool shows Updating=True as nodes begin updating.
    • Image pull failures: Nodes may fail to download the image if currentImagePullSecret is incorrect.
    • Network connectivity: Nodes must be able to reach the registry where the image is stored.
    • Node degradation: Nodes stuck in a degraded state due to failed updates should be checked.
    • Reboot issues: Nodes should successfully reboot into the new OS image.
    • Stalled updates: If the pool remains in the Updating state too long, investigate individual node statuses.

    In this final stage, issues typically relate to a node's ability to access and apply a layered image.
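
    One way to see whether a node has picked up the new image is to compare its image annotations. The sketch below reads the desiredImage annotation mentioned above and compares it with a currentImage annotation. The currentImage annotation name is an assumption not stated in this article, and node_image_annotation, node_image_in_sync, and the OC override are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: compare the desired and current image annotations of a node.

# OC is a hypothetical override (for example, a stub for dry runs); it defaults to the real client.
OC="${OC:-oc}"

node_image_annotation() {
  # $1 = node name, $2 = annotation suffix (desiredImage or currentImage).
  # Dots inside the annotation key are escaped for the JSONPath expression.
  "$OC" get node "$1" -o "jsonpath={.metadata.annotations.machineconfiguration\.openshift\.io/$2}"
}

node_image_in_sync() {
  local desired current
  desired="$(node_image_annotation "$1" desiredImage)"
  # Assumption: the node also carries a matching currentImage annotation.
  current="$(node_image_annotation "$1" currentImage)"
  echo "desired: $desired"
  echo "current: $current"
  # Succeeds only when a desired image is set and the node already runs it.
  [ -n "$desired" ] && [ "$desired" = "$current" ]
}

# Example usage with a node name from this article:
# node_image_in_sync ip-10-0-10-154.compute.example
```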

    Success

    The MCP should report an updated status:

    $ oc get mcp infra
    NAME   CONFIG                  UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
    infra  rendered-infra-f47..e74 True    False    False    3            3                 3                   0                    61m

    Verify the proper application of the image on the nodes:

    $ oc debug -q node/ip-10-0-10-154.compute.example -- chroot \
    /host rpm-ostree status
    State: idle
    Deployments:
    ostree-unverified-registry:quay.io/sregidor/sregidor-os@sha256:876..ef13 
           Digest: sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
           Version: 9.6.20251013-1 (2025-10-17T12:09:08Z)
    $ oc debug -q node/ip-10-0-10-154.compute.example -- chroot \
    /host which yq
    /usr/bin/yq
    
    $ oc debug -q node/ip-10-0-10-154.compute.example -- chroot /host yq -h
    yq is a portable command-line data file processor (https://github.com/mikefarah/yq/) 
    See https://mikefarah.gitbook.io/yq/ for detailed documentation and examples.
    Usage:
      yq [flags]
      yq [command]
    ...

    Error

    At this point, the debugging process is very similar to the one followed when applying a new MachineConfig. Focus on checking the MachineConfigPool status, the information in the MachineConfigNode resources, and the logs of the machine-config-daemon pods.

    In case of error, the MCP shows as degraded:

    $ oc get mcp infra
    NAME   CONFIG                  UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
    infra  rendered-infra-620..a43 False   False    True     3            0                 0                   1                    3h51m
    
    $ oc get mcp infra -oyaml
    ...
      - lastTransitionTime: "2025-10-17T12:23:48Z"
        message: 'Node ip-10-0-75-69.compute.example is reporting: "Node ip-10-0-75-69.compute.example
          upgrade failure. Failed to update OS to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
          after retries: timed out waiting for the condition", Node ip-10-0-75-69.compute.example
          is reporting: "Failed to update OS to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
          after retries: timed out waiting for the condition"'
        reason: 1 nodes are reporting degraded status on sync
        status: "True"
        type: NodeDegraded
    

    And the detailed information can be found in the machine-config-daemon pod logs:

    $ oc logs -n openshift-machine-config-operator $(oc get pods \
    -n openshift-machine-config-operator -l "k8s-app=machine-config-daemon" \
    --field-selector "spec.nodeName=ip-10-0-75-69.compute.example" \
    -o jsonpath="{.items[0].metadata.name}") -c machine-config-daemon
    ...
    I1017 12:26:52.042570    2750 update.go:2546] Updating OS to layered image "quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13"
    I1017 12:26:52.042590    2750 image_manager_helper.go:92] Running captured: rpm-ostree --version
    I1017 12:26:52.055729    2750 image_manager_helper.go:194] Linking rpm-ostree authfile to /etc/mco/internal-registry-pull-secret.json
    I1017 12:26:52.055759    2750 rpm-ostree.go:183] Executing rebase to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
    I1017 12:26:52.055764    2750 update.go:2630] Running: rpm-ostree rebase --experimental ostree-unverified-registry:quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
    Pulling manifest: ostree-unverified-registry:quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
    W1017 12:26:52.427068    2750 update.go:2591] Failed to update OS to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13 (will retry): error running rpm-ostree rebase --experimental ostree-unverified-registry:quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13: error: Creating importer: failed to invoke method OpenImage: failed to invoke method OpenImage: reading manifest sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13 in quay.io/sregidor/sregidor-os: manifest unknown
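
    The daemon retries the rebase several times, so the same long failure line tends to repeat in the log. As a small sketch, the hypothetical rebase_failure_reasons filter below prints only the final segment of each "Failed to update OS" line, which in the example above is the registry's answer. It assumes the root cause is the last colon-separated part of the message, as it is here.

```shell
#!/usr/bin/env bash
# Sketch: print the distinct root causes of failed OS updates from a
# machine-config-daemon log read on standard input.
rebase_failure_reasons() {
  # Split on ": " and keep the last field of each matching line.
  awk -F': ' '/Failed to update OS/ { print $NF }' | sort -u
}

# Example usage: pipe the output of the "oc logs" command above into the filter.
# oc logs -n openshift-machine-config-operator POD -c machine-config-daemon | rebase_failure_reasons
```

    In the usage comment, POD is a placeholder for the machine-config-daemon pod name.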

    Also verify information reported by the MachineConfigNode resources. This is especially important because in future versions of OpenShift, more information regarding the image mode process will be added to those resources in order to make the debugging process easier.

    $ oc get machineconfignode -o wide
    NAME                           POOLNAME  DESIREDCONFIG                                      CURRENTCONFIG                                      UPDATED   AGE     UPDATEPREPARED   UPDATEEXECUTED   UPDATEPOSTACTIONCOMPLETE   UPDATECOMPLETE   RESUMED   UPDATEDFILESANDOS   CORDONEDNODE   DRAINEDNODE   REBOOTEDNODE   UNCORDONEDNODE
    ip-10-0-10-154.compute.example infra  rendered-infra-620..a43  rendered-infra-620..a43    True      4h34m   False            False           False                      False            False     False               False          False         False          False
    ip-10-0-22-152.compute.example   master     rendered-master-93a022e91aa2bf815e4efed220ac97ea   rendered-master-93a022e91aa2bf815e4efed220ac97ea   True      4h44m   False            False            False                      False            False     False               False          False         False          False
    ip-10-0-41-78.compute.example    infra      rendered-infra-620..a43
    ...
    $ oc get machineconfignode ip-10-0-75-69.compute.example -o yaml
    ...
      - lastTransitionTime: "2025-10-17T12:22:13Z"
        message: 'Node ip-10-0-75-69.compute.example upgrade failure. Failed
          to update OS to quay.io/sregidor/sregidor-os@sha256:8761d4273f3213f2f9c9b4aa9dbe33aa758f17d691f0f53d2b20f55702c9ef13
          after retries: timed out waiting for the condition'
        reason: NodeDegraded
        status: "True"

    Successful debugging

    Debugging image mode doesn't have to be a black box operation. When you understand the three distinct stages (MachineOSConfig validation, MachineOSBuild execution, and image deployment to nodes) of the process, failures can be systematically narrowed down to identify where they occur and what the root cause is. The key is knowing where to look:

    • openshift-machine-config-operator pod logs for MOSC issues
    • Build job pod logs for image build failures
    • machine-config-daemon pod logs for node-level problems

    Image mode failures usually happen during the build stage, often caused by pull secret authentication issues, Containerfile errors, or registry permission problems. The debugging techniques in this guide empower you to perform effective troubleshooting and, ultimately, successful deployment of customized node images.
