Configure a split disk on OpenShift Container Platform

The first time I watched a node run out of disk space while pulling a 6 GB GPU PyTorch image, I knew we needed a better way to handle container image storage. In my work with AI/ML teams running workloads on Red Hat OpenShift Container Platform, disk space management has become one of the most common pain points, especially as model sizes continue to grow.

Split disk configuration solves this problem by directing newly pulled container images to a dedicated filesystem while keeping container runtime data on the boot disk. This approach gives you better control over disk space allocation and separates the concerns of image storage from container operations. In this article, I'll walk you through setting up a split disk on Red Hat OpenShift 4.22 and later for AWS and Google Cloud (GCP) platforms. Split disk is currently a developer preview feature.

How container storage works

Before we dive into the configuration, it's worth understanding how CRI-O, the container runtime that powers OpenShift, manages storage. This context will help you appreciate why split disk works the way it does.

CRI-O handles two fundamentally different types of data. First, there are container images (the read-only layers that form the base of your containers). Think of these as the immutable foundation: your application code, dependencies, and runtime environment all packaged together. Second, there's container runtime data (the read-write state that includes active container processes, writable layers, and runtime metadata). By default, CRI-O stores everything under /var/lib/containers/storage on the node's boot disk.

This works fine for typical workloads, but it breaks down when you start pulling large images. I've seen production clusters where data scientists were deploying TensorFlow containers that consumed 5 GB each, and suddenly the boot disk was full. The traditional solution has been to mount an entirely separate filesystem at /var/lib/containers, but that moves everything (images and runtime data) to the secondary disk, and OpenShift remains unaware of the underlying storage.

How split disk works

Split disk takes a more surgical approach. Instead of moving everything, we configure CRI-O to store only newly pulled images on a separate disk. The key phrase here is "newly pulled": the pre-baked images that ship with the Red Hat Enterprise Linux CoreOS Amazon Machine Image (AMI) stay exactly where they are on the boot disk. This distinction is important because it means you're not disrupting the foundational system images; you're simply redirecting where workload images land going forward.

Here's what remains on the boot disk: all the pre-baked images included in the Red Hat Enterprise Linux CoreOS AMI, system and base images that OpenShift depends on, container runtime state, and the writable layers where your containers make changes. Meanwhile, the split disk (which you might mount at /var/lib/images or another location) receives newly pulled images and their associated overlay directories.

CRI-O achieves this through its imagestore configuration option, which tells the runtime to use an alternate location for image storage. It's an elegant solution because it doesn't require migrating existing data: you simply point future image pulls to the new location, and everything continues to work.

The following remain in /var/lib/containers/storage/ on the boot disk:

Pre-baked images included in the Red Hat Enterprise Linux CoreOS AMI
System and base images
Container runtime state
Container writable layers

The following are stored on the secondary disk (for example, under /var/lib/images/):

New images pulled for workloads running as pods
Images pulled after the node boots
Image overlay directories: overlay/, overlay-images/, overlay-layers/

Split disk does not migrate or move existing images. It only directs where CRI-O stores newly pulled images going forward. Pre-baked AMI images remain on the original disk and are still accessible.

Use cases

Let me give you a concrete example from my own experience with large AI/ML container images. A team running PyTorch model training on OpenShift was using the official pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime image, which weighs in at around 5.2 GB. They had a cluster with 10 worker nodes, and each node might run multiple training jobs throughout the day. Without split disk, every image pull consumed precious boot disk space. With split disk configured, they provisioned a 100 GB secondary disk for images, and the boot disk remained stable even under heavy workload churn.

The benefits extend beyond just avoiding disk pressure. By dedicating storage to images, you can size that disk independently based on your workload patterns. If you're running image-intensive workloads, you can provision larger secondary disks without over-provisioning the boot disk. If you need higher IOPS for image pulls, you can configure that specifically for the image disk. The separation gives you flexibility.
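
To make the sizing idea concrete, here is a minimal sketch that sums the images you expect a node to hold and adds headroom. The function name estimate_image_disk_gb, the 50% headroom, and the image sizes are my own illustrative assumptions, not values from OpenShift documentation.

```shell
# Hypothetical helper: estimate a secondary disk size in whole GB.
# Usage: estimate_image_disk_gb HEADROOM_PERCENT IMAGE_SIZE_MB...
estimate_image_disk_gb() {
  local headroom_pct=$1
  shift
  local total_mb=0 size_mb
  for size_mb in "$@"; do
    total_mb=$(( total_mb + size_mb ))
  done
  # Apply headroom, then round up to the next whole GB (1 GB = 1024 MB here).
  local padded_mb=$(( total_mb * (100 + headroom_pct) / 100 ))
  echo $(( (padded_mb + 1023) / 1024 ))
}

# Example: four images of roughly 5.2 GB (5325 MB) each, with 50% headroom.
estimate_image_disk_gb 50 5325 5325 5325 5325
```

With these illustrative numbers the function prints 32, so a 100 GB secondary disk would leave plenty of room for additional images.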

Split disk provides:

Dedicated storage for large images: Image files are stored on a separate, larger disk.
Optimized boot disk usage: Container runtime data stays on the fast boot disk.
Better resource allocation: Manage disk space effectively for image-heavy workloads.

This configuration is particularly beneficial for:

AI/ML workloads with large base images
Environments pulling multiple large container images
Clusters with limited boot disk space

Note: The split disk approach described in this article is a developer preview feature.

Prerequisites

Before you begin, make sure you have the OpenShift installer binary (openshift-install) for version 4.22 or later, your pull secret from the Red Hat Customer Portal, and cloud provider credentials configured. For AWS, that means your access key and secret are set up via aws configure or environment variables. For GCP, you'll need a service account key configured. You'll also need the MachineConfig file provided later in this article.

Creating a cluster with split disk

Split disk configuration needs to happen at cluster installation time. I'm going to walk you through the full process, explaining not just what commands to run, but why each step matters.

Step 1: Create the install configuration

The installer will prompt you for your platform (choose AWS or GCP), cluster name, pull secret, region, base domain, and other parameters. This interactive process creates install-config.yaml in your working directory. This file defines the high-level characteristics of your cluster, but we need to go deeper to inject our split disk configuration.

./openshift-install create install-config --dir ./ocp-split-disk-cluster

This creates an install-config.yaml in ./ocp-split-disk-cluster/.

Step 2: Generate manifests

This is where things get interesting. The installer takes your install-config.yaml and generates a full set of manifests, not just generic Kubernetes resources, but also OpenShift-specific configurations. You'll find manifest files in ./ocp-split-disk-cluster/manifests/ and OpenShift resources in ./ocp-split-disk-cluster/openshift/. Most importantly for our purposes, this process creates worker machineset files with names like 99_openshift-cluster-api_worker-machineset-0.yaml. These machinesets define how worker nodes are provisioned, and we need to modify them to attach secondary volumes.

./openshift-install create manifests --dir ./ocp-split-disk-cluster

Step 3: Add split disk MachineConfig

Navigate to the openshift directory and copy the MachineConfig file into it. The complete MachineConfig YAML is provided in the MachineConfig details section later in this article. By placing it in the openshift directory, we ensure the installer reads it and applies it to worker nodes during cluster creation.

cd ocp-split-disk-cluster/openshift
cp ~/path/to/98-config-split-disk.yaml .

Step 4: Edit worker machinesets

The installer creates multiple worker machineset files, typically one per availability zone.

ls 99_openshift-cluster-api_worker-machineset-*
Expected output:
99_openshift-cluster-api_worker-machineset-0.yaml
99_openshift-cluster-api_worker-machineset-1.yaml
99_openshift-cluster-api_worker-machineset-2.yaml
99_openshift-cluster-api_worker-machineset-3.yaml
99_openshift-cluster-api_worker-machineset-4.yaml

You must edit each file individually to ensure consistent configuration across all zones. Let me show you what to add for each platform.

For AWS, add the following under spec.template.spec.providerSpec.value.blockDevices:

- ebs:
    encrypted: true
    volumeSize: 100
    volumeType: gp3
  deviceName: /dev/xvdb

This is the complete example:

spec:
  template:
    spec:
      providerSpec:
        value:
          blockDevices:
          - ebs:
              encrypted: true
              iops: 0
              volumeSize: 120
              volumeType: gp3
            deviceName: /dev/sda1  # Boot disk (already present)
          - ebs:
              encrypted: true
              volumeSize: 100
              volumeType: gp3
            deviceName: /dev/xvdb  # Secondary disk for split disk

For GCP, add the following under spec.template.spec.providerSpec.value.disks:

- autoDelete: true
  boot: false
  sizeGb: 100
  type: pd-balanced

The following is the complete example:

spec:
  template:
    spec:
      providerSpec:
        value:
          disks:
          - autoDelete: true
            boot: true
            sizeGb: 128
            type: pd-standard  # Boot disk (already present)
          - autoDelete: true
            boot: false
            sizeGb: 100
            type: pd-balanced  # Secondary disk for split disk

Remember to edit all machineset files. Missing even one means nodes in that availability zone won't have secondary disks, and the split disk configuration will fail on those nodes.
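
Because a single missed file is easy to overlook, a quick check before creating the cluster can help. The following is a minimal sketch, assuming the file naming shown above; the helper name list_unedited_machinesets is hypothetical, and the marker strings are taken from the snippets in this article.

```shell
# Hypothetical helper: print worker machineset files that do NOT contain MARKER.
# Usage: list_unedited_machinesets DIR MARKER
list_unedited_machinesets() {
  local dir=$1 marker=$2
  # grep -L prints the names of files with no matching line; -F treats the
  # marker as a fixed string. The exit status of -L differs between grep
  # versions, so it is ignored here.
  grep -L -F -e "$marker" "$dir"/99_openshift-cluster-api_worker-machineset-*.yaml || true
}

# Example (AWS), run from the directory that contains ocp-split-disk-cluster:
# list_unedited_machinesets ./ocp-split-disk-cluster/openshift 'deviceName: /dev/xvdb'
# Example (GCP), where the secondary disk is the only entry with boot: false:
# list_unedited_machinesets ./ocp-split-disk-cluster/openshift 'boot: false'
```

Empty output means every machineset file contains the marker. Any file name that is printed still needs the secondary disk added.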

Step 5: Create the cluster

With manifests prepared and machinesets edited, you're ready to create the cluster.

cd ../..
./openshift-install create cluster --dir ./ocp-split-disk-cluster

The installer now orchestrates the entire deployment. It reads all your manifests, including the split disk MachineConfig, creates the cloud infrastructure (VPC, subnets, load balancers, and DNS), provisions bootstrap, control plane, and worker nodes, attaches the secondary volumes you defined, and applies the MachineConfig to all workers.

On each worker node, the split disk setup runs automatically through a carefully orchestrated sequence. The find-secondary-device.service systemd unit executes first, running a script that detects your cloud platform, locates the secondary device (NVMe on AWS, SCSI on GCP), and formats it with an XFS filesystem labeled "splitdisk." Next, the var-lib-images.mount unit mounts this device at /var/lib/images. SELinux contexts are then configured to treat this directory equivalently to /var/lib/containers, ensuring container operations work correctly. Finally, CRI-O starts with the imagestore parameter pointing to /var/lib/images.

This installation typically takes 30 to 45 minutes. When it completes, you'll have a fully functional cluster with split disk configured and ready to handle large image pulls.

MachineConfig details

Understanding what's happening under the hood helps you troubleshoot issues and adapt the configuration to your needs. Let me walk through each component of the MachineConfig.

Device detection script

The heart of the split disk setup is a bash script embedded in the MachineConfig. The find-secondary-device script:

Detects the cloud platform (AWS or GCP)
Searches for the secondary device:
- AWS: NVMe devices (/dev/nvme*)
- GCP: SCSI devices (/dev/sd[b-z])
Creates an XFS filesystem labeled splitdisk
Retries up to 30 times with 2-second delays (60 seconds total)
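
The retry behavior in the last item can be distilled into a small, generic helper. This is a sketch of the pattern rather than the script's actual code; the function name retry is my own, and the return code 77 simply mirrors the exit code the article's script uses when no device is found.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the final one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$(( attempt + 1 ))
  done
  return 77
}

# Example: wait for a block device to appear (the device path is illustrative).
# retry 30 2 test -b /dev/sdb
```

Note that 30 attempts with a 2-second delay means at most 29 sleeps, so the wait is a little under 60 seconds plus the time each attempt takes.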

CRI-O configuration

The MachineConfig creates /etc/crio/crio.conf.d/99-imagestore.conf drop-in as follows:

[crio]
imagestore = "/var/lib/images"

This tells CRI-O to store newly pulled images on the split disk instead of the default location.
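
In the MachineConfig, this drop-in is embedded as a base64 data URL. If you want to see how such a value can be produced, here is a minimal sketch; to_data_url is a hypothetical helper name, and the MIME prefix is copied from the MachineConfig in this article.

```shell
# Hypothetical helper: turn literal file content into a base64 data URL.
to_data_url() {
  # tr removes the line wrapping that some base64 implementations add.
  printf 'data:text/plain;charset=utf-8;base64,%s\n' \
    "$(printf '%s' "$1" | base64 | tr -d '\n')"
}

# The drop-in content, without a trailing newline.
crio_dropin='[crio]
imagestore = "/var/lib/images"'

to_data_url "$crio_dropin"
```

The output should match the source value of the 99-imagestore.conf entry in the complete MachineConfig. If you experiment with a different path, keep in mind that the mount unit and the SELinux commands in this article reference /var/lib/images as well and would need matching changes.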

Systemd units

The find-secondary-device.service executes the device detection script and has a condition that prevents it from running if the marker file /etc/var-lib-split-disk-mount already exists. This makes it a one-time initialization service. It's ordered before local-fs-pre.target, ensuring the device is ready before the system attempts to mount filesystems.
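
The marker-file pattern is worth a closer look. In the article's setup, systemd checks the condition and the script itself creates the marker after formatting the disk. The sketch below folds both halves into one hypothetical function, run_once, purely to illustrate the idea.

```shell
# Hypothetical helper mirroring ConditionPathExists=!MARKER: skip the command
# when the marker exists, and create the marker only after the command succeeds.
# Usage: run_once MARKER COMMAND [ARGS...]
run_once() {
  local marker=$1
  shift
  if [ -e "$marker" ]; then
    return 0
  fi
  "$@" && touch "$marker"
}

# Example (paths are illustrative):
# run_once /tmp/demo-marker echo "initializing"
```

Because the marker is written only on success, a failed first attempt is retried on the next boot instead of being silently skipped.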

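The find-secondary-device script that this service runs is base64-encoded in the MachineConfig. Decoded, it reads as follows:

#!/bin/bash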

# Detect cloud platform - check sys_vendor first (more reliable than product_name)
PLATFORM=""
if [ -f /sys/class/dmi/id/sys_vendor ]; then
  VENDOR=$(cat /sys/class/dmi/id/sys_vendor 2>/dev/null || echo "")
  if [[ "$VENDOR" == *"Google"* ]]; then
    PLATFORM="gcp"
  elif [[ "$VENDOR" == *"Amazon"* ]] || [[ "$VENDOR" == *"EC2"* ]]; then
    PLATFORM="aws"
  fi
fi

# Fallback to product_name if sys_vendor didn't work
if [ -z "$PLATFORM" ] && [ -f /sys/class/dmi/id/product_name ]; then
  PRODUCT=$(cat /sys/class/dmi/id/product_name 2>/dev/null || echo "")
  if [[ "$PRODUCT" == *"Google"* ]]; then
    PLATFORM="gcp"
  elif [[ "$PRODUCT" == *"Amazon"* ]]; then
    PLATFORM="aws"
  fi
fi

# If platform detection failed, try metadata services
if [ -z "$PLATFORM" ]; then
  if curl -s -f -m 1 http://169.254.169.254/computeMetadata/v1/ -H "Metadata-Flavor: Google" &>/dev/null; then
    PLATFORM="gcp"
  elif curl -s -f -m 1 http://169.254.169.254/latest/meta-data/ &>/dev/null; then
    PLATFORM="aws"
  fi
fi

echo "Detected platform: ${PLATFORM}"

# Retry up to 30 times with 2 second delay (60 seconds total)
MAX_RETRIES=30
RETRY_DELAY=2

for attempt in $(seq 1 $MAX_RETRIES); do
  echo "Attempt $attempt of $MAX_RETRIES to find secondary device"
  
  # Search for secondary device based on platform
  if [ "$PLATFORM" == "aws" ]; then
    # AWS uses NVMe devices
    for device in /dev/nvme*[0-9]*n*; do
      /usr/sbin/blkid "${device}" &> /dev/null
      if [ $? == 2 ]; then
        echo "secondary device found ${device}"
        echo "creating filesystem for containers mount"
        if ! mkfs.xfs -L splitdisk -f "${device}" &> /dev/null; then
          echo "Failed to create filesystem on ${device}" >&2
          exit 1
        fi
        udevadm settle
        touch /etc/var-lib-split-disk-mount
        exit 0
      fi
    done
  elif [ "$PLATFORM" == "gcp" ]; then
    # GCP uses SCSI devices (skip sda which is boot disk)
    for device in /dev/sd[b-z]; do
      if [ -b "${device}" ]; then
        /usr/sbin/blkid "${device}" &> /dev/null
        if [ $? == 2 ]; then
          echo "secondary device found ${device}"
          echo "creating filesystem for containers mount"
          if ! mkfs.xfs -L splitdisk -f "${device}" &> /dev/null; then
            echo "Failed to create filesystem on ${device}" >&2
            exit 1
          fi
          udevadm settle
          touch /etc/var-lib-split-disk-mount
          exit 0
        fi
      fi
    done
  fi
  
  # Device not found yet, wait before retry
  if [ $attempt -lt $MAX_RETRIES ]; then
    echo "Device not found, waiting ${RETRY_DELAY}s before retry..."
    sleep $RETRY_DELAY
  fi
done

echo "Couldn't find secondary block device after ${MAX_RETRIES} attempts!" >&2
exit 77

var-lib-images.mount:

Mounts /dev/disk/by-label/splitdisk to /var/lib/images
Uses XFS filesystem
Waits for device detection to complete

selinux-splitdisk-policy.service:

Sets SELinux file context equivalence to /var/lib/containers
Command: semanage fcontext -a -e /var/lib/containers /var/lib/images

```bash
/usr/sbin/semanage fcontext -a -e /var/lib/containers /var/lib/images
```

restorecon-var-lib-splitdisk.service:

Restores SELinux contexts recursively
Command: restorecon -R /var/lib/images
Runs before CRI-O starts

```bash
/sbin/restorecon -R /var/lib/images
```

CRI-O service dependency:

CRI-O waits for split disk mount and SELinux configuration
Ensures split disk is ready before container operations begin

```ini
[Unit]
After=restorecon-var-lib-splitdisk.service
Requires=restorecon-var-lib-splitdisk.service var-lib-images.mount
```

Complete MachineConfig YAML:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 98-config-split-disk
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,IyEvYmluL2Jhc2gKCiMgRGV0ZWN0IGNsb3VkIHBsYXRmb3JtIC0gY2hlY2sgc3lzX3ZlbmRvciBmaXJzdCAobW9yZSByZWxpYWJsZSB0aGFuIHByb2R1Y3RfbmFtZSkKUExBVEZPUk09IiIKaWYgWyAtZiAvc3lzL2NsYXNzL2RtaS9pZC9zeXNfdmVuZG9yIF07IHRoZW4KICBWRU5ET1I9JChjYXQgL3N5cy9jbGFzcy9kbWkvaWQvc3lzX3ZlbmRvciAyPi9kZXYvbnVsbCB8fCBlY2hvICIiKQogIGlmIFtbICIkVkVORE9SIiA9PSAqIkdvb2dsZSIqIF1dOyB0aGVuCiAgICBQTEFURk9STT0iZ2NwIgogIGVsaWYgW1sgIiRWRU5ET1IiID09ICoiQW1hem9uIiogXV0gfHwgW1sgIiRWRU5ET1IiID09ICoiRUMyIiogXV07IHRoZW4KICAgIFBMQVRGT1JNPSJhd3MiCiAgZmkKZmkKCiMgRmFsbGJhY2sgdG8gcHJvZHVjdF9uYW1lIGlmIHN5c192ZW5kb3IgZGlkbid0IHdvcmsKaWYgWyAteiAiJFBMQVRGT1JNIiBdICYmIFsgLWYgL3N5cy9jbGFzcy9kbWkvaWQvcHJvZHVjdF9uYW1lIF07IHRoZW4KICBQUk9EVUNUPSQoY2F0IC9zeXMvY2xhc3MvZG1pL2lkL3Byb2R1Y3RfbmFtZSAyPi9kZXYvbnVsbCB8fCBlY2hvICIiKQogIGlmIFtbICIkUFJPRFVDVCIgPT0gKiJHb29nbGUiKiBdXTsgdGhlbgogICAgUExBVEZPUk09ImdjcCIKICBlbGlmIFtbICIkUFJPRFVDVCIgPT0gKiJBbWF6b24iKiBdXTsgdGhlbgogICAgUExBVEZPUk09ImF3cyIKICBmaQpmaQoKIyBJZiBwbGF0Zm9ybSBkZXRlY3Rpb24gZmFpbGVkLCB0cnkgbWV0YWRhdGEgc2VydmljZXMKaWYgWyAteiAiJFBMQVRGT1JNIiBdOyB0aGVuCiAgaWYgY3VybCAtcyAtZiAtbSAxIGh0dHA6Ly8xNjkuMjU0LjE2OS4yNTQvY29tcHV0ZU1ldGFkYXRhL3YxLyAtSCAiTWV0YWRhdGEtRmxhdm9yOiBHb29nbGUiICY+L2Rldi9udWxsOyB0aGVuCiAgICBQTEFURk9STT0iZ2NwIgogIGVsaWYgY3VybCAtcyAtZiAtbSAxIGh0dHA6Ly8xNjkuMjU0LjE2OS4yNTQvbGF0ZXN0L21ldGEtZGF0YS8gJj4vZGV2L251bGw7IHRoZW4KICAgIFBMQVRGT1JNPSJhd3MiCiAgZmkKZmkKCmVjaG8gIkRldGVjdGVkIHBsYXRmb3JtOiAke1BMQVRGT1JNfSIKCiMgUmV0cnkgdXAgdG8gMzAgdGltZXMgd2l0aCAyIHNlY29uZCBkZWxheSAoNjAgc2Vjb25kcyB0b3RhbCkKTUFYX1JFVFJJRVM9MzAKUkVUUllfREVMQVk9MgoKZm9yIGF0dGVtcHQgaW4gJChzZXEgMSAkTUFYX1JFVFJJRVMpOyBkbwogIGVjaG8gIkF0dGVtcHQgJGF0dGVtcHQgb2YgJE1BWF9SRVRSSUVTIHRvIGZpbmQgc2Vjb25kYXJ5IGRldmljZSIKICAKICAjIFNlYXJjaCBmb3Igc2Vjb25kYXJ5IGRldmljZSBiYXNlZCBvbiBwbGF0Zm9ybQogIGlmIFsgIiRQTEFURk9STSIgPT0gImF3cyIgXTsgdGhlbgogICAgIyBBV1MgdXNlcyBOVk1lIGRldmljZXMKICAgIGZvciBkZXZpY2UgaW4gL2Rldi9udm1lKlswLTldKm4qOyBkbwogICAgICAvdXNyL3NiaW4vYmxraWQgIiR7ZGV2aWNlfSIgJj4gL2Rldi9udWxsCiAgICAgIGlmIFsgJD8gPT0gMiBdOyB0aGVuCiAgICAgICAgZWNobyAic2Vjb25kYXJ5IGRldmljZSBmb3VuZCAke2RldmljZX0iCiAgICAgICAgZWNobyAiY3JlYXRpbmcgZmlsZXN5c3RlbSBmb3IgY29udGFpbmVycyBtb3VudCIKICAgICAgICBpZiAhIG1rZnMueGZzIC1MIHNwbGl0ZGlzayAtZiAiJHtkZXZpY2V9IiAmPiAvZGV2L251bGw7IHRoZW4KICAgICAgICAgIGVjaG8gIkZhaWxlZCB0byBjcmVhdGUgZmlsZXN5c3RlbSBvbiAke2RldmljZX0iID4mMgogICAgICAgICAgZXhpdCAxCiAgICAgICAgZmkKICAgICAgICB1ZGV2YWRtIHNldHRsZQogICAgICAgIHRvdWNoIC9ldGMvdmFyLWxpYi1zcGxpdC1kaXNrLW1vdW50CiAgICAgICAgZXhpdCAwCiAgICAgIGZpCiAgICBkb25lCiAgZWxpZiBbICIkUExBVEZPUk0iID09ICJnY3AiIF07IHRoZW4KICAgICMgR0NQIHVzZXMgU0NTSSBkZXZpY2VzIChza2lwIHNkYSB3aGljaCBpcyBib290IGRpc2spCiAgICBmb3IgZGV2aWNlIGluIC9kZXYvc2RbYi16XTsgZG8KICAgICAgaWYgWyAtYiAiJHtkZXZpY2V9IiBdOyB0aGVuCiAgICAgICAgL3Vzci9zYmluL2Jsa2lkICIke2RldmljZX0iICY+IC9kZXYvbnVsbAogICAgICAgIGlmIFsgJD8gPT0gMiBdOyB0aGVuCiAgICAgICAgICBlY2hvICJzZWNvbmRhcnkgZGV2aWNlIGZvdW5kICR7ZGV2aWNlfSIKICAgICAgICAgIGVjaG8gImNyZWF0aW5nIGZpbGVzeXN0ZW0gZm9yIGNvbnRhaW5lcnMgbW91bnQiCiAgICAgICAgICBpZiAhIG1rZnMueGZzIC1MIHNwbGl0ZGlzayAtZiAiJHtkZXZpY2V9IiAmPiAvZGV2L251bGw7IHRoZW4KICAgICAgICAgICAgZWNobyAiRmFpbGVkIHRvIGNyZWF0ZSBmaWxlc3lzdGVtIG9uICR7ZGV2aWNlfSIgPiYyCiAgICAgICAgICAgIGV4aXQgMQogICAgICAgICAgZmkKICAgICAgICAgIHVkZXZhZG0gc2V0dGxlCiAgICAgICAgICB0b3VjaCAvZXRjL3Zhci1saWItc3BsaXQtZGlzay1tb3VudAogICAgICAgICAgZXhpdCAwCiAgICAgICAgZmkKICAgICAgZmkKICAgIGRvbmUKICBmaQogIAogICMgRGV2aWNlIG5vdCBmb3VuZCB5ZXQsIHdhaXQgYmVmb3JlIHJldHJ5CiAgaWYgWyAkYXR0ZW1wdCAtbHQgJE1BWF9SRVRSSUVTIF07IHRoZW4KICAgIGVjaG8gIkRldmljZSBub3QgZm91bmQsIHdhaXRpbmcgJHtSRVRSWV9ERUxBWX1zIGJlZm9yZSByZXRyeS4uLiIKICAgIHNsZWVwICRSRVRSWV9ERUxBWQogIGZpCmRvbmUKCmVjaG8gIkNvdWxkbid0IGZpbmQgc2Vjb25kYXJ5IGJsb2NrIGRldmljZSBhZnRlciAke01BWF9SRVRSSUVTfSBhdHRlbXB0cyEiID4mMgpleGl0IDc3Cg==
        mode: 0755
        path: /etc/find-secondary-device
        overwrite: true
      - contents:
          source: data:text/plain;charset=utf-8;base64,W2NyaW9dCmltYWdlc3RvcmUgPSAiL3Zhci9saWIvaW1hZ2VzIg==
        mode: 0644
        path: /etc/crio/crio.conf.d/99-imagestore.conf
        overwrite: true
      - contents:
          source: data:text/plain;charset=utf-8;base64,W1VuaXRdCkFmdGVyPXJlc3RvcmVjb24tdmFyLWxpYi1zcGxpdGRpc2suc2VydmljZQpSZXF1aXJlcz1yZXN0b3JlY29uLXZhci1saWItc3BsaXRkaXNrLnNlcnZpY2UgdmFyLWxpYi1pbWFnZXMubW91bnQK
        mode: 0644
        path: /etc/systemd/system/crio.service.d/99-wait-for-splitdisk.conf
        overwrite: true
    systemd:
      units:
      - name: find-secondary-device.service
        enabled: true
        contents: |
          [Unit]
          Description=Find secondary device
          DefaultDependencies=false
          After=systemd-udev-settle.service
          Before=local-fs-pre.target
          ConditionPathExists=!/etc/var-lib-split-disk-mount

          [Service]
          Type=oneshot
          RemainAfterExit=yes
          ExecStart=/etc/find-secondary-device

          [Install]
          WantedBy=multi-user.target
      - name: var-lib-images.mount
        enabled: true
        contents: |
          [Unit]
          Description=Mount /var/lib/images
          Requires=find-secondary-device.service
          After=find-secondary-device.service
          Before=local-fs.target

          [Mount]
          What=/dev/disk/by-label/splitdisk
          Where=/var/lib/images
          Type=xfs
          Options=defaults
          TimeoutSec=120s

          [Install]
          WantedBy=local-fs.target
      - name: selinux-splitdisk-policy.service
        enabled: true
        contents: |
          [Unit]
          Description=Set SELinux file context rules for splitdisk
          DefaultDependencies=no
          After=var-lib-images.mount
          Before=restorecon-var-lib-splitdisk.service
          ConditionPathExists=!/var/lib/splitdisk-selinux-configured

          [Service]
          Type=oneshot
          RemainAfterExit=yes
          ExecStart=/bin/bash -c '/usr/sbin/semanage fcontext -a -e /var/lib/containers /var/lib/images && touch /var/lib/splitdisk-selinux-configured'
          TimeoutSec=0

          [Install]
          WantedBy=multi-user.target
      - name: restorecon-var-lib-splitdisk.service
        enabled: true
        contents: |
          [Unit]
          Description=Restore recursive SELinux security contexts
          DefaultDependencies=no
          After=selinux-splitdisk-policy.service var-lib-images.mount
          Requires=selinux-splitdisk-policy.service
          Before=crio.service

          [Service]
          Type=oneshot
          RemainAfterExit=yes
          ExecStart=/sbin/restorecon -R /var/lib/images
          TimeoutSec=0

          [Install]
          WantedBy=multi-user.target graphical.target
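
Before applying a MachineConfig like this, I find it reassuring to decode the embedded files and read what they contain. The following is a minimal sketch; decode_data_url is a hypothetical helper name, and the example value is the crio.service.d drop-in entry copied from the YAML above.

```shell
# Hypothetical helper: decode a base64 data URL as used in the MachineConfig.
decode_data_url() {
  # Strip everything up to and including "base64," and decode the rest.
  printf '%s' "${1#*base64,}" | base64 -d
}

# The 99-wait-for-splitdisk.conf entry from the MachineConfig.
decode_data_url 'data:text/plain;charset=utf-8;base64,W1VuaXRdCkFmdGVyPXJlc3RvcmVjb24tdmFyLWxpYi1zcGxpdGRpc2suc2VydmljZQpSZXF1aXJlcz1yZXN0b3JlY29uLXZhci1saWItc3BsaXRkaXNrLnNlcnZpY2UgdmFyLWxpYi1pbWFnZXMubW91bnQK'
```

This prints the [Unit] section with the After= and Requires= lines described in the CRI-O service dependency. The same helper can be pointed at the other source values to inspect the device detection script and the imagestore drop-in.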

Verifying split disk configuration

After cluster creation completes, verify that the split disk is configured correctly.

Verify all worker nodes are ready:

oc get nodes

Check that the MachineConfig was applied:

oc get machineconfig 98-config-split-disk
oc get machineconfigpool worker

The worker pool should show the updated configuration.

Then verify the split disk with a test workload. Deploy a pod with a large image to confirm split disk is working:

cat <<EOF | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pytorch-interactive
  labels:
    app: pytorch
spec:
  containers:
  - name: pytorch-container
    image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
    command: ["sleep", "infinity"]
    resources:
      requests:
        cpu: "2"
        memory: "4Gi"
      limits:
        cpu: "4"
        memory: "8Gi"
EOF

# Wait for the image to pull (may take several minutes)
oc get pod pytorch-interactive -w

When the pod status shows Running, verify the image location:

# Get the worker node where the pod is running
NODE=$(oc get pod pytorch-interactive -o jsonpath='{.spec.nodeName}')

# Start a debug session on the node
oc debug node/${NODE}

# Inside the debug shell, switch to the host filesystem
chroot /host

# Verify split disk directories exist
ls -la /var/lib/images/
# Expected: overlay/, overlay-images/, overlay-layers/

# Check for the PyTorch image on the split disk
ls -lh /var/lib/images/overlay-images/

# Find the image ID of the running container
crictl ps
# Note the image ID shown for pytorch-container, then check that it is present in the images folder
ls /var/lib/images/overlay-images

# Check disk usage on the split disk
du -sh /var/lib/images/
# Expected: Several GB (e.g., 5.2G)

du -sh /var/lib/containers/storage/
# Expected: Smaller size, unchanged (e.g., 1.1G)

# View filesystem mount
df -h /var/lib/images
# Expected: Shows split disk mounted at /var/lib/images

# Exit the debug shell
exit
exit

# Clean up the test pod
oc delete pod pytorch-interactive

The PyTorch image (several GB) should appear in /var/lib/images/overlay-images/. The split disk shows increased usage after the image pull, and the pre-baked AMI images remain unchanged in /var/lib/containers/storage/.

This confirms that newly pulled images are stored on the split disk while existing AMI images remain on the original disk.
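
If you would rather script the mount check than read df output by eye, a small helper can test /proc/mounts directly. This is a minimal sketch; is_mounted_as is a hypothetical name, and the mount point and filesystem type come from the var-lib-images.mount unit in this article.

```shell
# Hypothetical helper: succeed if MOUNT_POINT appears with FS_TYPE in
# /proc/mounts-style input read from stdin.
# Usage: is_mounted_as MOUNT_POINT FS_TYPE < /proc/mounts
is_mounted_as() {
  awk -v mp="$1" -v fs="$2" \
    '$2 == mp && $3 == fs { found = 1 } END { exit !found }'
}

# On a node, inside the debug shell after chroot /host:
# is_mounted_as /var/lib/images xfs < /proc/mounts && echo "split disk mounted"
```

The function reads from stdin, so it can also be tried against saved output from a node without needing a live debug session.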

Troubleshooting

If the secondary device is not found, check the device detection service:

oc debug node/<worker-node-name>
chroot /host
journalctl -u find-secondary-device.service

Verify the secondary volume is attached.

AWS:

lsblk
# Look for the secondary NVMe device

GCP:

lsblk
# Look for /dev/sdb or other SCSI device

If the split disk is not mounted, check the mount status:

systemctl status var-lib-images.mount
journalctl -u var-lib-images.mount

Verify the device has the correct label:

blkid | grep splitdisk

For SELinux issues, verify SELinux contexts:

ls -laZ /var/lib/images/
semanage fcontext -l | grep /var/lib/images

Check the SELinux services:

systemctl status selinux-splitdisk-policy.service
systemctl status restorecon-var-lib-splitdisk.service

If CRI-O is not using imagestore, check the CRI-O configuration:

cat /etc/crio/crio.conf.d/99-imagestore.conf
# Should contain: imagestore = "/var/lib/images"

Check the CRI-O service:

systemctl status crio
journalctl -u crio | grep imagestore

Final thoughts

Split disk gives you a practical way to handle large container images on OpenShift without overrunning your boot disk. I've seen it make the difference between clusters that struggle with disk pressure and ones that scale smoothly with image-intensive workloads. This configuration process is straightforward when you understand the function of each piece. You modify your machinesets to provision secondary disks, apply a MachineConfig that detects and mounts those disks, configure SELinux appropriately, and tell CRI-O where to store images. The orchestration happens automatically through systemd, and once it's in place, it just works.

If you're running AI/ML workloads or any scenario where image sizes are measured in gigabytes rather than megabytes, split disk is worth considering. Start with a test cluster, verify the configuration works for your specific images and workflows, then roll it out to production with confidence.
