
Persistent storage in action: Understanding Red Hat OpenShift's persistent volume framework

October 22, 2020
Bryant Son
Related topics: Containers, DevOps, Kubernetes

    Red Hat OpenShift is an enterprise-ready Kubernetes platform that provides a number of different models you can use to deploy an application. OpenShift 4.x uses Operators to deploy Kubernetes-native applications, and it also supports Helm and traditional template-based deployments. Whatever deployment method you choose, the application is deployed as a wrapper around one or more existing OpenShift resources, such as BuildConfig, DeploymentConfig, and ImageStream.

    In this article, I introduce you to OpenShift's Kubernetes-based persistent volume framework for persistent cluster storage. You will learn how to use OpenShift's PersistentVolume (PV) and PersistentVolumeClaim (PVC) objects to provision and request storage resources.

    Defining persistence

    In computer programming, a variable is a storage location (identified by a memory address) paired with an associated symbolic name. Each variable holds a known or unknown quantity of information that represents a value. A variable can temporarily hold a data value, but variables typically live only in memory while the application runs, so the data is lost when the application terminates.

    Persistence occurs when a data value "continues steadfastly or firmly in some state" (definition from Dictionary.com). In contrast, many applications work with temporary variable values that prioritize speed over longevity. These values are considered ephemeral because the data lasts for only a short time.

    For example, imagine your favorite game, such as Angry Birds. You would need to use persistent storage if you wanted to save certain game data to reload every time that you played. You might use ephemeral memory to store the programmatic values representing the scores for your current mission.

    Understanding storage architecture

    Now, let's discuss the concepts of persistent storage and ephemeral storage in an OpenShift cluster. The first thing to understand is how persistent storage differs from ephemeral storage. OpenShift treats PersistentVolume and PersistentVolumeClaim objects as resources: Like other OpenShift API resources, you can manage them with YAML or JSON configuration files.
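
    To make the resource model concrete, here is a minimal sketch of what an administrator-provisioned PersistentVolume might look like. The NFS server address, export path, volume name, and capacity are assumptions for illustration, not values used later in this article:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv0001                          # administrator-chosen name (assumption)
    spec:
      capacity:
        storage: 1Gi                        # assumed size
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      nfs:                                  # assumed NFS backend
        server: nfs.example.com
        path: /exports/pv0001

    A developer never edits this object directly; instead, the developer submits a PersistentVolumeClaim (shown later in the demo), and the cluster binds the claim to a matching volume.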

    In Figure 1, the left side of the diagram illustrates an application that is deployed to an OpenShift project namespace without defining persistent storage. In this example, the data is temporarily stored in Pod 1 and Pod 2 using ephemeral storage. The stored data will be lost when the pods are deleted. The right side illustrates an application deployed to an OpenShift namespace with persistent data storage. In this case, an administrator has provisioned persistent storage in the cluster, and a developer has issued a PersistentVolumeClaim to request that storage.

    The PVC gives Pod 1 and Pod 2 each a volume reference to the persistent storage. The data storage will be referenced when you deploy the application using a DeploymentConfig or Deployment object, and the data will persist even after one or both of the pods is destroyed.

    Figure 1: Comparing persistent storage and ephemeral storage.

    Persistent storage types

    As an OpenShift architect, you decide what type of persistent storage to provide to an application's developers. The three popular storage types are file storage, block storage, and object storage.

    Network File System

    The most frequently used type of storage in the persistent volume framework is file storage, commonly provided by the Network File System (NFS) protocol. NFS is a traditional network storage type, providing a path to storage connected over a standard Ethernet network. It is relatively inexpensive and can store a wide range of data: media files such as JPEGs and MP3s, as well as database data for systems like MySQL. As noted in Figure 2, NFS works best for small and simple file storage or databases.

    However, NFS is relatively slow and not the best option for complex applications requiring fast operations. Additionally, NFS typically expands only by scaling up; it does not scale out well, which further restricts its performance. GlusterFS, Ceph, and Amazon Web Services Elastic File System (AWS EFS) are examples of file storage solutions.

    Figure 2: Network File System's strengths and weaknesses.

    Block storage

    Block storage uses a storage area network (SAN) to deploy and operate on a block of storage spread across a network. SAN-enabled block storage offers faster retrieval and operation times and more efficient operation as compared to NFS. It is ideal for production-quality, structured data for databases such as MySQL and PostgreSQL.

    On the downside, block storage is slightly more expensive than NFS, and it has a limited capability for handling metadata. Examples of block storage are Ceph and AWS Elastic Block Store (AWS EBS), as shown in Figure 3.

    Figure 3: Block storage's strengths and weaknesses.

    Object storage

    Object storage's popularity has grown exponentially due to the demand for efficient transactions on unstructured files such as photos and videos. It is designed with a flat structure, where the files are broken into pieces and spread across the hardware. Object storage uses HTTP to manipulate and retrieve data. It is highly efficient for unstructured files such as media files and static files, and it uses a pay-as-you-go cost model, which makes it affordable and cost-effective.

    On the downside, objects cannot be modified in place (they are written and read as whole units), so object storage is not suitable for a database. As shown in Figure 4, Ceph and Amazon Simple Storage Service (AWS S3) are examples of object storage.

    Figure 4: Object storage's strengths and weaknesses.

    Note: See "File storage, block storage, or object storage?" for a complete introduction to these storage types.
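
    In OpenShift, these trade-offs surface as the access modes you request on a PersistentVolumeClaim. The fragment below is only an illustrative sketch of that field, not part of this article's demo:

    accessModes:
      - ReadWriteOnce    # mounted read-write by a single node; typical for block storage
    # - ReadWriteMany    # mounted read-write by many nodes; typical for file storage such as NFS

    Block storage typically supports ReadWriteOnce, file storage such as NFS supports ReadWriteMany, and object storage is consumed through an HTTP API rather than mounted as a volume at all.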

    Demo: Persistent volume storage in a MySQL database

    With those concepts out of the way, it's time for a demonstration. In the next sections, I will demonstrate the usefulness of OpenShift's persistent volume framework by deploying a MySQL database, first without and then with persistent volume storage.

    For this demonstration, I assume the following about your development environment:

    • You have at least developer access to a standard OpenShift 3 or higher cluster. Ideally, you should have admin access. If you don't have admin access, then you can skip Step 1 and use your own namespace instead.
    • Your ImageStream is correctly configured with the standard MySQL container image.
    • You have the OpenShift command-line interface (oc) installed, and you know how to use it (a quick sanity check follows this list).
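
    If you want to confirm your CLI setup before starting, something like the following works. The API URL is a placeholder, not a real cluster:

    oc login https://api.your-cluster.example.com:6443    # placeholder URL; use your cluster's API endpoint
    oc whoami                                             # shows the user you are logged in as
    oc version                                            # confirms the client can reach the server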

    Step 1: Set up the demo

    The first thing we'll do is create a new OpenShift project, which gets its own namespace. If you only have access to an assigned namespace, you can skip this step and use that namespace instead; you just won't have a dedicated project for this demo.

    Create the OpenShift project

    Log in to your OpenShift cluster using the OpenShift CLI (oc) and use the oc new-project command to create a new project. As shown in Figure 5, I recommend using the name "pvc-demo."

    Figure 5: Create a new OpenShift project named pvc-demo.
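
    For reference, the command behind Figure 5 is simply:

    oc new-project pvc-demo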

    Verify your project defaults

    By default, your OpenShift project uses the mysql ImageStream in the openshift namespace. Run the following command to verify the default (also shown in Figure 6):

    oc get is -n openshift | grep mysql

    Then run the following:

    oc get -o yaml is/mysql -n openshift

    Figure 6: Verify your OpenShift project defaults.

    Deploy a new MySQL application with ImageStream

    Use the OpenShift oc new-app command to deploy a new MySQL application with ImageStream. Note that for this phase of the demo, we are configuring MySQL to use ephemeral storage.

    At a minimum, you will need to configure the MySQL username, password, and database name using these three environment variables:

    oc new-app -i mysql -e MYSQL_USER=tester -e MYSQL_PASSWORD=Pass1234 -e MYSQL_DATABASE=testdb

    You can change these values for your own project, but note that MySQL creates the database named by MYSQL_DATABASE when it initializes and treats it as the default database. The MySQL application deployment will fail if any of these environment variables are missing.

    When the MySQL application deployment completes, the environment variables are available inside the pod. Figure 7 shows the MySQL application deployment with ImageStream and the MYSQL_USER, MYSQL_PASSWORD, and MYSQL_DATABASE variables.

    Figure 7: Deploy a new MySQL application with ImageStream.

    Verify that the MySQL pod is running

    Before we move on, let's check that our MySQL application is running successfully. First, run oc get all to verify that the pod's status is Running, as shown in Figure 8.

    Figure 8: Verify that the MySQL application pod is successfully running.

    Another option is to use the OpenShift web console to check the pod's status. The pod shown in Figure 9 looks healthy.

    Figure 9: Use the OpenShift web console to check the MySQL pod's health status.

    In the web console, you can click the name of the pod to inspect its status, as shown in Figure 10.

    Figure 10: Use the web console to inspect the status of the pod.

    From the pod's detail page, you can click Terminal to interact directly with the pod, as shown in Figure 11:

    Figure 11: Use the OpenShift terminal to interact directly with the MySQL pod.

    Note: For this demo, we will use the oc CLI to verify and inspect the pod. Once you are inside the pod (see the next step), you can run printenv | grep MYSQL to verify that the environment variables we specified when creating the application are set.

    Step 2: Create the MySQL data with ephemeral storage

    Use oc get pods to find the pod's name, then open a remote shell inside the pod with oc rsh, replacing POD_NAME with the name of the pod that you want to interact with:

    oc rsh pod/POD_NAME
    

    Once you are inside the pod, use printenv | grep MYSQL to verify that the environment variable values are correctly populated, as shown in Figure 12.

    Figure 12: Run oc rsh to interact with the pod, and verify the environment variables.

    Create sample MySQL queries

    Next, we'll create a file with sample MySQL queries, which will populate the MySQL data for the demo. As shown in Figure 13, I created a sample SQL query file named sample_query.sql using the vi text editor.

    Figure 13: Create a file with sample MySQL queries to populate the data.

    Figure 14 shows the sample SQL queries that I will use for this demo. If you are already familiar with MySQL, feel free to change the queries; just note that your database name must match what you defined with the MYSQL_DATABASE variable.

    Figure 14: Using sample SQL queries saved in /var/lib/mysql.
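
    The exact queries from Figure 14 aren't reproduced here, but a file along these lines works for the demo. The friends table and its rows are assumptions; the only hard requirement is that the database name matches the MYSQL_DATABASE value (testdb) that you set earlier:

    use testdb;

    CREATE TABLE IF NOT EXISTS friends (
      id INT AUTO_INCREMENT PRIMARY KEY,   -- hypothetical table for the demo
      name VARCHAR(50)
    );

    INSERT INTO friends (name) VALUES ('Alice'), ('Bob'), ('Carol');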

    Be sure to save a copy of this content outside the pod; you will need it later in the article when you repeat the steps of deleting a pod and re-creating your MySQL data. Also, note that by default, MySQL saves its data under the path /var/lib/mysql. We'll use this path as the mount point for our PersistentVolumeClaim when we redeploy the application later.

    Run the MySQL queries

    Use the following command to log in to MySQL:

    mysql -u$MYSQL_USER -p$MYSQL_PASSWORD -h$HOSTNAME $MYSQL_DATABASE
    

    With the exception of $HOSTNAME, the properties starting with the $ symbol refer to the environment variables that you specified when you deployed the sample application using the OpenShift CLI oc new-app command. The $HOSTNAME environment variable resolves to the pod's hostname, which is where the MySQL server is running.

    As shown in Figure 15, execute the MySQL source command to run your MySQL queries from the sample SQL file that you've just created:

    source sample_query.sql

    Figure 15: Execute the MySQL source command to run MySQL queries from the sample file.

    Finally, run the MySQL show databases command to verify that the database was successfully created. Then run use testdb and query the table to confirm that the data exists, as shown in Figure 16.

    Figure 16: Check that your MySQL databases were successfully created.
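
    If you used the sketch above, the verification commands inside the mysql shell look something like this (friends is the hypothetical table created by the sample file):

    show databases;
    use testdb;
    show tables;
    SELECT * FROM friends;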

    Delete the pod

    Now, we'll delete the pod and see what happens to the data stored in ephemeral storage. Type exit twice to leave the MySQL shell and then the pod. Then, from the OpenShift CLI, run the oc delete pod POD_NAME command, as shown in Figure 17.

    Figure 17: Use the oc delete pod POD_NAME command to delete the pod.

    Because the application was created with a DeploymentConfig backed by a replication controller, a new pod spins up to replace the one that we've just deleted. Wait a short time, then execute the oc get pods command. You should see that a new pod is running. Connect to the new pod with the oc rsh command; for my example, the command was oc rsh pod/mysql-pnpq4. When you log in to MySQL and check for the table we created, you'll find the database is empty, as shown in Figure 18.
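
    For reference, that sequence looks something like the following; POD_NAME stands for whatever name oc get pods reports for the replacement pod:

    oc get pods                        # a replacement mysql pod should be in the Running state
    oc rsh pod/POD_NAME                # open a shell in the new pod
    mysql -u$MYSQL_USER -p$MYSQL_PASSWORD -h$HOSTNAME $MYSQL_DATABASE
    show tables;                       # inside the mysql shell: the table created earlier is gone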

    Figure 18: Check for the created database table, which is empty.

    Because we configured the pod to use ephemeral storage, we've lost all of our data.

    Next, we'll deploy the application with a persistent volume claim (PVC).

    Step 3: Create a MySQL application with persistent volume storage

    As I explained at the beginning of this article, we can use a YAML file to define a PersistentVolumeClaim (PVC) for our MySQL application. To start, create a new file named sample-pvc.yaml, as shown in Figure 19. (You can name the file whatever you want, as long as the file extension is the same.)

    Figure 19: Create a YAML file to define the PVC for the MySQL application.
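
    The YAML in Figure 19 isn't reproduced here, but a minimal claim along these lines is enough for the demo. The claim name (sample-pvc), access mode, and 1Gi size are assumptions, and your cluster may also require a storageClassName:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: sample-pvc        # assumed claim name
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi        # assumed size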

    Then, run the following to create the PVC:

    oc apply -f sample-pvc.yaml
    

    Enter the oc get pvc command to verify that you've just created a PVC definition, as shown in Figure 20.

    Figure 20: Execute the oc get pvc command to verify that you created a PVC definition.

    We will use this new persistent volume claim soon to modify the MySQL application's DeploymentConfig. But first, let's make a slight improvement to the application.

    Create an OpenShift Secret

    You might remember that when we created the MySQL application, we passed the core environment variables (MYSQL_USER, MYSQL_PASSWORD, and MYSQL_DATABASE) as plain-text values. Storing secret or sensitive data as plain text is a security risk, so let's fix that now.

    As an alternative to plain text, we can use OpenShift secrets to store our MySQL environment values. Secrets store their values Base64-encoded; note that Base64 is an encoding, not encryption, but it keeps sensitive values out of plain-text configuration and lets OpenShift control access to them.

    Note: Another option would be to store the MySQL values in a dedicated secrets-management solution such as HashiCorp Vault. I won't describe that option in this article.

    The first step to storing the MySQL environment values in a secret is to create a new file with key-value pairs. Then, use the following command to create an OpenShift secret from the file containing the key-value pairs:

    oc create secret generic mysql-sec --from-env-file=mysql-cred.env
    

    Note that in my example, mysql-sec is the OpenShift secret name and mysql-cred.env is the file that contains the MySQL key-value pairs.
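
    The article doesn't show the contents of mysql-cred.env; reusing the values passed to oc new-app earlier, it would look like this:

    MYSQL_USER=tester
    MYSQL_PASSWORD=Pass1234
    MYSQL_DATABASE=testdb

    With --from-env-file, each key in the file becomes a key in the secret, which is what the secretKeyRef references in the next step rely on.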

    Run the oc get secret and oc get -o yaml secret/mysql-sec commands to verify that the OpenShift secret was successfully created, as shown in Figure 21.

    Figure 21: Store the encoded environment values in an OpenShift secret and ensure that it is created.

    Modify the DeploymentConfig

    Now that we've added a secret, we are ready to redeploy the MySQL application with persistent volume storage. One way to do this is to edit the application's DeploymentConfig directly; be aware, however, that the DeploymentConfig requires a specific format. (The OpenShift CLI's oc patch command is another option for scripted changes, but here we'll edit the configuration interactively.)

    Start by running the oc get dc and oc edit dc/mysql commands, as shown in Figure 22.

    Figure 22: Use the oc get dc and oc edit dc/mysql commands to edit the MySQL DeploymentConfig.

    Next, replace the direct references to MySQL's environment variable values (under spec.template.spec.containers.env) with references to the new secret: change each value field to valueFrom with a secretKeyRef, as shown in Figure 23. (See the OpenShift secrets documentation for further reference.)

    Figure 23: Replace the reference to MySQL environment variables with the new OpenShift secret.
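
    The edited env section ends up looking roughly like this, assuming the keys in mysql-cred.env match the variable names (as in the sketch above):

    env:
      - name: MYSQL_USER
        valueFrom:
          secretKeyRef:
            name: mysql-sec
            key: MYSQL_USER
      - name: MYSQL_PASSWORD
        valueFrom:
          secretKeyRef:
            name: mysql-sec
            key: MYSQL_PASSWORD
      - name: MYSQL_DATABASE
        valueFrom:
          secretKeyRef:
            name: mysql-sec
            key: MYSQL_DATABASE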

    Implement the persistent volume claim

    Finally, we've come to our glorious moment: We will make two changes to implement our persistent volume claim.

    First, under spec.template.spec (after terminationGracePeriodSeconds), add a volumes: entry that references the persistent volume claim, and give the volume a name, for instance mysql-volume.

    Next, under the container definition (after terminationMessagePolicy), add a volumeMounts: entry with mountPath set to /var/lib/mysql and name set to mysql-volume, the volume that you defined in the previous step.

    Figure 24 shows these updates. Note that /var/lib/mysql is the default path that MySQL uses to store its SQL data.

    Figure 24: Implement the PVC by adding volumes and volumeMounts entries.
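
    Put together, the relevant parts of the DeploymentConfig look roughly like this. The claim name sample-pvc is the assumed name from the PVC sketch earlier, so substitute whatever your claim is actually called:

    spec:
      template:
        spec:
          containers:
            - name: mysql
              # ...existing container fields...
              volumeMounts:
                - name: mysql-volume
                  mountPath: /var/lib/mysql    # MySQL's default data directory
          volumes:
            - name: mysql-volume
              persistentVolumeClaim:
                claimName: sample-pvc          # assumed claim name

    If you prefer not to edit YAML by hand, the oc set volume command can add the same volume and mount; that is an alternative to the oc edit approach used in this article.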

    Test the pod

    Once the configuration change triggers a new deployment, run oc get pods to see the new pod in its running state.

    Use the oc rsh pod/POD_NAME command again to open a shell in the pod, then enter printenv | grep MYSQL. As shown in Figure 25, you should see that the environment variables were successfully picked up from the secret.

    Figure 25: On the new pod, check that the environment values from the OpenShift secret are successfully stored.

    Create and populate a sample MySQL file

    Now, repeat the sequence from Step 2 to create the sample MySQL file again, as shown in Figure 26.

    Figure 26: Create and populate a new sample MySQL file.

    Log in to the MySQL application and repeat the process for recreating the MySQL data with the source command. This time, because we have persistent data storage, we won't lose the data.

    Figure 27: On the MySQL application, recreate the MySQL data with persistent data storage.

    Verify the MySQL queries

    In the MySQL shell, use the show databases, show tables, and SELECT commands to verify that the MySQL queries work. As shown in Figure 28, you should see that MySQL data was successfully recreated.

    Figure 28: Verify that the MySQL queries work and the data exists.

    Step 4: Test your application's data persistence

    With persistent volume storage in place, our data should now be safe. To test the application's data persistence, exit from the pod and delete it, as shown in Figure 29.

    Figure 29: Exit the pod and delete it.

    When you reconnect to the new pod and check the MySQL database, verify that the data is still there. As shown in Figure 30, you should see that the MySQL data persists, even after we've deleted the pod.

    Figure 30: Verify that the MySQL data persists, even after the pod was deleted.

    Conclusion

    In this article, you learned the basics of persistent storage and the different storage types that you can use. You also saw a demonstration of OpenShift's persistent volume storage in action.

    I hope that this article has helped you to understand OpenShift's persistent volume storage framework and how to use PersistentVolumeClaim objects for persistent storage in your OpenShift clusters. Please leave a comment if you have questions about the demonstration in this article.

    Last updated: November 4, 2020
