Checkpoint and restore in Kubernetes

October 7, 2021
Adrian Reber
Related topics:
Containers, Linux, Kubernetes, Security
Related products:
Red Hat OpenShift

    In 2015, an issue was opened against Kubernetes about supporting container migration. The problem description mentioned Checkpoint/Restore In Userspace (CRIU) on Linux as a possible basis for a solution. Around the same time, I started to look into how to integrate CRIU into the container stack.

    Note: This article is a preview of my upcoming session at KubeCon + CloudNativeCon North America 2021, happening October 11 to 15. See the end of this article for more about my session.

    Checkpoint and restore in the container stack

    The basic steps to migrate running containers from one node to another—which could also be called stateful migration—are to checkpoint the container on the source node, transfer the checkpoint image to the destination node, and restore the container on the destination node. This way, the container is migrated without losing its state.
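
    The three steps above can be sketched as a short command plan. This is only an illustration, assuming Podman as the container engine and scp for the transfer; the container name, archive path, and destination host are hypothetical, and the function only builds the commands rather than running them.

```python
from typing import List, Tuple

# (where the command runs, argv)
Command = Tuple[str, List[str]]


def migration_commands(container: str, archive: str, dest_host: str) -> List[Command]:
    """Return the three migration steps as (location, argv) pairs.

    Nothing is executed here; the caller decides how to run each command,
    for example with subprocess.run on the source node and over SSH on the
    destination node. Checkpointing typically requires root privileges.
    """
    return [
        # 1. Checkpoint the container on the source node and export it to an archive.
        ("source", ["podman", "container", "checkpoint", container, f"--export={archive}"]),
        # 2. Transfer the checkpoint archive to the destination node.
        ("source", ["scp", archive, f"{dest_host}:{archive}"]),
        # 3. Restore the container from the archive on the destination node.
        ("destination", ["podman", "container", "restore", f"--import={archive}"]),
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    plan = migration_commands("web", "/tmp/web-checkpoint.tar.gz", "node2.example.com")
    for where, argv in plan:
        print(f"[{where}] {' '.join(argv)}")
```

    Keeping the plan as data makes the order of the steps explicit: the restore can only succeed once the complete archive has arrived on the destination node.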

    In 2015, however, the container stack was not ready to support checkpoint and restore in the orchestration layer (Kubernetes). The container runtime layer, runc, offered limited support for checkpointing and restoring containers, but that support was not yet available in the higher layers of the container stack.

    Over the years, I was involved in bringing checkpoint and restore support to these upper layers of the container stack. Around 2018, I implemented checkpoint and restore support in Podman. Bringing checkpoint and restore support, and thus migration support, to Podman required many changes in runc and CRIU. It was necessary to support different Linux security techniques used in containers, including SELinux, AppArmor, and seccomp, before Podman could migrate a container from one node to another without losing any of its state.

    Checkpointing a container out of a pod

    Eventually, it became possible to migrate containers from one node to another with a few simple commands. But at that point, it was still not possible to integrate checkpoint and restore into Kubernetes. One big remaining barrier to adding support for container checkpoint and restore in Kubernetes was that no one had yet looked into how to combine the concept of pods in Kubernetes with CRIU and the whole container stack.

    A container in Linux is usually one or more processes using Linux namespaces to create boundaries between processes in different containers. (See Demystifying namespaces and containers in Linux for an introduction to Linux namespaces.) In Kubernetes, containers run in pods, and the containers in a pod share some, but not all, of their namespaces. Before being able to checkpoint a container out of a pod and restore it into another pod, it was first necessary to enable pod support in CRIU and the container stack layers below Kubernetes; specifically, to enable checkpointing a container out of a pod and restoring the container into an existing pod. In addition to enabling the sharing of namespaces, we also needed to join existing SELinux contexts upon restore.
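
    To see which namespaces two processes share, one can compare the links under /proc/<pid>/ns on Linux. The sketch below is an illustration rather than part of CRIU or Kubernetes. The namespace identifiers in the example are made up, and they assume the typical pod setup in which the network, IPC, and UTS namespaces are shared while each container keeps its own mount and PID namespace.

```python
import os
from typing import Dict, Set


def read_namespaces(pid: int) -> Dict[str, str]:
    """Map each namespace type (e.g. 'net') to its identifier (e.g. 'net:[4026531992]').

    Linux only. Reading the namespaces of another user's process usually
    requires elevated privileges.
    """
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name)) for name in os.listdir(ns_dir)}


def shared_namespaces(a: Dict[str, str], b: Dict[str, str]) -> Set[str]:
    """Return the namespace types for which both processes are in the same namespace."""
    return {name for name in a.keys() & b.keys() if a[name] == b[name]}


if __name__ == "__main__":
    # Hypothetical identifiers for one process in each of two containers of a pod.
    container_a = {
        "net": "net:[4026532001]", "ipc": "ipc:[4026532002]", "uts": "uts:[4026532003]",
        "mnt": "mnt:[4026532010]", "pid": "pid:[4026532011]",
    }
    container_b = {
        "net": "net:[4026532001]", "ipc": "ipc:[4026532002]", "uts": "uts:[4026532003]",
        "mnt": "mnt:[4026532020]", "pid": "pid:[4026532021]",
    }
    print(sorted(shared_namespaces(container_a, container_b)))  # ['ipc', 'net', 'uts']
```

    On a real node, read_namespaces(pid) can supply the two dictionaries. A container restored into an existing pod has to end up in the pod's shared namespaces instead of getting fresh ones.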

    Use cases for checkpoint and restore in Kubernetes

    Before integrating checkpoint and restore into Kubernetes, we thought about possible use cases and came up with the following:

    • Reboot without losing state: Sometimes, it is necessary to reboot a node for important security updates. With the help of checkpoint and restore, a slow-starting container can be checkpointed before the reboot. Then, after the reboot, the container can be restored from the checkpoint without losing any state and without long service downtimes.
    • Quick startup: Similar to the first use case, one might want a slow-starting container to start faster. For containers that require a long time to initialize, checkpoint and restore can be used to create checkpoints of a container after the long initialization phase. Then the system can quickly spin up additional copies based on the checkpoint, which is already initialized.
    • Container migration: Checkpointing a container on one node and restoring it on another node constitutes container migration and would provide what was requested in the ticket from 2015.
    • Forensic container checkpointing: This use case checkpoints a container without stopping it and without the container knowing that it was checkpointed. The checkpointed container can be restored in a sandboxed environment for further threat analysis.
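
    Outside of Kubernetes, two of these use cases can be approximated with Podman. The following sketch only builds the command lines and assumes Podman's --leave-running option for checkpointing without stopping the container and its --name option for restoring several copies from one archive. The container names and paths are hypothetical.

```python
from typing import List


def forensic_checkpoint_command(container: str, archive: str) -> List[str]:
    """Checkpoint a container into an archive while leaving it running."""
    return [
        "podman", "container", "checkpoint",
        "--leave-running",        # do not stop the container after checkpointing
        f"--export={archive}",    # write the checkpoint to an archive file
        container,
    ]


def quick_start_commands(archive: str, copies: int, prefix: str = "copy") -> List[List[str]]:
    """Restore several copies of an already initialized container from one archive.

    Each copy gets its own name so that the restored containers do not clash.
    """
    return [
        ["podman", "container", "restore", f"--import={archive}", f"--name={prefix}-{i}"]
        for i in range(1, copies + 1)
    ]


if __name__ == "__main__":
    print(" ".join(forensic_checkpoint_command("suspicious", "/tmp/forensic.tar.gz")))
    for argv in quick_start_commands("/tmp/initialized.tar.gz", copies=3):
        print(" ".join(argv))
```

    For forensic analysis, the archive would be restored on a separate, sandboxed machine, not on the node where the original container keeps running.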

    One of the challenges we faced when we thought about introducing checkpoint and restore into Kubernetes was how to do it in a minimal way, with as little impact as possible on anything else. The forensic container checkpointing use case was a useful but simple way to test that requirement. After we implemented this use case, it became possible to see how checkpointing can be used in Kubernetes without breaking anything else.

    Learn more at KubeCon + CloudNativeCon North America 2021

    At KubeCon + CloudNativeCon North America 2021, I will present more details about checkpoint and restore in Kubernetes, along with additional use cases for checkpoint and restore in combination with containers. There will also be a live demo of all the use cases I present. I will give technical details about how CRIU enables checkpointing and restoring of containers, and an overview of how CRIU enables container migration in different container engines. Join my session on October 14 and I will be happy to answer any related questions.

    Last updated: September 20, 2023
