
OpenShift APIs for Data Protection (OADP) is an operator that lets you back up and restore workloads in Red Hat OpenShift clusters. It is based on the upstream open source project Velero. You can use OADP to back up and restore all Kubernetes resources for a given project, including persistent volumes.

It is a best practice to be able to recreate your workloads via Infrastructure as Code (IaC) pipelines or automation. Most Kubernetes projects in production already have a way to be recreated; restoring the data held in persistent volumes, however, requires a separate solution. OADP can fill that gap. This article focuses primarily on restoring persistent volumes.

How does OADP back up persistent volumes?

OADP allows backing up and restoring persistent volumes via either Restic or CSI snapshots. In both cases, incremental backups are supported.

What are some limitations of backing up data with OADP?

These are some key limitations that are worth highlighting:

  • Pods need to be running for the corresponding persistent volumes to be backed up.
  • emptyDir volumes cannot be backed up. Your workloads should not store important data in emptyDir volumes, as these volumes are ephemeral.
  • The persistent volumes being restored must not already exist. This means that the corresponding persistent volume claims need to be deleted explicitly before doing a restore.
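
The first limitation is easy to check before a backup is triggered. The following is a minimal sketch and not part of OADP itself: the function name non_running_pods is hypothetical, and it assumes the default column layout of oc get pods --no-headers, where the third column is STATUS.

```shell
# List pods in a namespace whose STATUS column is not "Running".
# Assumes the default `oc get pods --no-headers` columns:
# NAME READY STATUS RESTARTS AGE
non_running_pods() (
    namespace="$1"
    oc get pods -n "$namespace" --no-headers | awk '$3 != "Running" {print $1}'
)
```

Calling non_running_pods wordpress prints nothing when every pod is running. Pods that have finished normally (for example, pods created by jobs) are also listed, so the output is a prompt to investigate rather than a definitive error.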

Install and configure OADP

Follow these steps to install and configure the OpenShift APIs for Data Protection Operator.

Prerequisites

Install the OADP Operator

For this demo, we used Red Hat OpenShift 4.12 with OADP Operator version 1.1.3, which uses Velero version 1.9.5. To install the Operator from the OpenShift web console, follow these steps as a user with the cluster-admin role:

  1. In the OpenShift Container Platform web console, click Operators → OperatorHub.

  2. Use the Filter by keyword field to find the OADP Operator.

  3. Select the OADP Operator (select the one from Red Hat source, instead of the Community source) and click Install.

  4. Accept default values and click Install to install the Operator in the openshift-adp project.

  5. Click Operators → Installed Operators to verify the installation.

Create an S3 bucket for backups

For this demo, we used an AWS S3 bucket; however, any S3-compatible object storage can be used, including OpenShift Data Foundation (ODF) object bucket claims.

The following instructions are based on the OpenShift documentation:

  1. Log into AWS using aws configure and provide your credentials.

  2. Set the BUCKET and REGION variables:

    BUCKET=<your_bucket>
    REGION=<your_region>
  3. Create an AWS S3 bucket (us-east-1 does not support a LocationConstraint. If your region is us-east-1, omit --create-bucket-configuration LocationConstraint=$REGION):

    aws s3api create-bucket --bucket $BUCKET --region $REGION \
     --create-bucket-configuration LocationConstraint=$REGION 
  4. Create an IAM user (If you want to use Velero to back up multiple clusters with multiple S3 buckets, create a unique user name for each cluster):

    aws iam create-user --user-name velero 
  5. Create a velero-policy.json file:

    cat > velero-policy.json <<EOF
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeVolumes",
                    "ec2:DescribeSnapshots",
                    "ec2:CreateTags",
                    "ec2:CreateVolume",
                    "ec2:CreateSnapshot",
                    "ec2:DeleteSnapshot"
                ],
                "Resource": "*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts"
                ],
                "Resource": [
                    "arn:aws:s3:::${BUCKET}/*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:ListBucketMultipartUploads"
                ],
                "Resource": [
                    "arn:aws:s3:::${BUCKET}"
                ]
            }
        ]
    }
    EOF
  6. Attach the policies to give the velero user the minimum necessary permissions:

    aws iam put-user-policy --user-name velero --policy-name velero \
      --policy-document file://velero-policy.json
  7. Create an access key for the velero user:

    aws iam create-access-key --user-name velero
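  8. Create a credentials-velero file, replacing $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY with the AccessKeyId and SecretAccessKey values from the previous command output (or set them as shell variables first, because the heredoc expands them):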

    cat << EOF > ./credentials-velero
    [default]
    aws_access_key_id=$AWS_ACCESS_KEY_ID
    aws_secret_access_key=$AWS_SECRET_ACCESS_KEY
    EOF
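
Steps 3 and 8 above both involve a manual adjustment: dropping the LocationConstraint for us-east-1, and copying the key values into the credentials file. The sketch below shows one way to script them. The function names are hypothetical, and the --query expression assumes the AccessKey.AccessKeyId and AccessKey.SecretAccessKey fields that aws iam create-access-key returns.

```shell
# Print the extra arguments that `aws s3api create-bucket` needs for a region.
# us-east-1 rejects a LocationConstraint, so nothing is printed for it.
location_constraint_args() {
    if [ "$1" != "us-east-1" ]; then
        printf '%s\n' "--create-bucket-configuration LocationConstraint=$1"
    fi
}

# Create the bucket with or without the LocationConstraint, as appropriate.
create_backup_bucket() (
    bucket="$1"
    region="$2"
    # The command substitution is intentionally unquoted so that its two
    # words are passed as separate arguments (or as none for us-east-1).
    aws s3api create-bucket --bucket "$bucket" --region "$region" \
        $(location_constraint_args "$region")
)

# Create an access key and write it in the credentials-velero format.
# AWS shows the secret only once, so it is captured straight into the file.
write_velero_credentials() (
    user="$1"
    out="$2"
    keys=$(aws iam create-access-key --user-name "$user" \
        --query 'AccessKey.[AccessKeyId,SecretAccessKey]' --output text) || exit 1
    set -f          # no filename expansion while splitting the two values
    set -- $keys    # --output text separates the values with a tab
    umask 077       # keep the credentials file private to the current user
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
        "$1" "$2" > "$out"
)
```

With these defined, create_backup_bucket "$BUCKET" "$REGION" replaces step 3 and write_velero_credentials velero ./credentials-velero replaces steps 7 and 8.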

Configure OADP to use S3 for backups

This section creates the DataProtectionApplication that will be used to schedule backups.

  1. Log in to OpenShift using the oc login command.

  2. Create a Secret object with the credentials-velero file:

    oc create secret generic cloud-credentials -n openshift-adp \
    --from-file cloud=credentials-velero 
  3. Create the DataProtectionApplication file:
    cat << EOF > ./dpa.yaml 
    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
      name: dpa
      namespace: openshift-adp
    spec:
      configuration:
        restic:
          enable: true
        velero:
          defaultPlugins:
            - aws
      backupLocations:
        - velero:
            config:
              region: ${REGION}
              profile: default    
            credential:
              key: cloud
              name: cloud-credentials
            objectStorage:
              bucket: ${BUCKET}
              prefix: demo
            default: true
            provider: aws
    EOF
  4. Create the DataProtectionApplication by running this command:
    oc apply -f dpa.yaml -n openshift-adp
  5. Confirm that the application is ready for backups by checking that the BackupStorageLocation phase is Available:
    oc get BackupStorageLocation -n openshift-adp
    NAME    PHASE       LAST VALIDATED   AGE   DEFAULT
    dpa-1   Available   58s              81m   true
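
The BackupStorageLocation may take a short while to be validated after the DataProtectionApplication is created. Rather than re-running the command by hand, a small polling loop can wait for it. This is a sketch under assumptions: the function name wait_for_bsl and its attempt and delay defaults are hypothetical, and the location name (dpa-1 in the output above) must be passed in.

```shell
# Poll until the BackupStorageLocation reports phase "Available",
# or give up after a fixed number of checks.
# Usage: wait_for_bsl <name> [attempts] [delay-seconds]
wait_for_bsl() (
    name="$1"
    attempts="${2:-30}"
    delay="${3:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        phase=$(oc get backupstoragelocation "$name" -n openshift-adp \
            -o jsonpath='{.status.phase}')
        if [ "$phase" = "Available" ]; then
            echo "BackupStorageLocation $name is Available"
            exit 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "BackupStorageLocation $name not Available after $attempts checks (last phase: ${phase:-unknown})" >&2
    exit 1
)
```

For example, wait_for_bsl dpa-1 checks every 10 seconds for up to 5 minutes and returns a non-zero status if the location never becomes available, which usually points at the bucket name, region, or credentials.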

Deploy sample application: WordPress

Next, we need to deploy a Kubernetes application that uses persistent volumes. We will use WordPress, which uses the MySQL database. The instructions were based on this link.

  1. Download the MySQL deployment configuration file:

    curl -LO \
    https://raw.githubusercontent.com/yortch/oadp/main/mysql-deployment.yaml
  2. Download the WordPress configuration file:

    curl -LO \
    https://raw.githubusercontent.com/yortch/oadp/main/wordpress-deployment.yaml
  3. Set the PASSWORD shell variable (it is expanded when the kustomization.yaml file is generated in the next step):

    PASSWORD=<YOUR_PASSWORD>
  4. Generate the following kustomization.yaml file:

    cat <<EOF >./kustomization.yaml
    secretGenerator:
    - name: mysql-pass
      literals:
      - password=${PASSWORD}
    resources:
      - mysql-deployment.yaml
      - wordpress-deployment.yaml
    EOF
    
  5. Create a new application project:
    oc new-project wordpress
  6. Apply the kustomization.yaml file:
    oc apply -n wordpress -k ./
  7. Next, expose the WordPress service:
    oc expose service wordpress -n wordpress
  8. Print the service URL and navigate to it from a browser.
    echo http://$(oc get route -n wordpress -o jsonpath='{.items[0].spec.host}')
  9. Proceed with the initial WordPress setup so that it is included in the backup.
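
Because only running pods have their volumes backed up, it is worth confirming that the deployments are available and the claims are bound before moving on. The sketch below uses oc wait for both checks. The function name wait_for_app is hypothetical, and the jsonpath form of --for assumes a reasonably recent oc client (it was introduced in kubectl 1.23).

```shell
# Block until every Deployment in the namespace is Available and
# every PersistentVolumeClaim is Bound, or until the timeout expires.
# Usage: wait_for_app <namespace> [timeout]
wait_for_app() (
    namespace="$1"
    timeout="${2:-300s}"
    oc wait --for=condition=Available deployment --all \
        -n "$namespace" --timeout="$timeout" &&
    oc wait --for=jsonpath='{.status.phase}'=Bound pvc --all \
        -n "$namespace" --timeout="$timeout"
)
```

Running wait_for_app wordpress after step 6 returns once the application is ready, or fails with the error reported by oc wait.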

Back up the WordPress application

  1. Create a Backup resource file. The ttl field sets how long the backup is retained; the default is 30 days (720h):
    cat <<EOF >./backup.yaml
    apiVersion: velero.io/v1
    kind: Backup
    metadata:
      name: backup  
      namespace: openshift-adp
    spec:  
      includedNamespaces:    
        - wordpress  
      defaultVolumesToRestic: true
      ttl: 720h
    EOF
  2. Apply the backup.yaml file:
    oc apply -n openshift-adp -f backup.yaml
  3. Confirm that the status of each pod volume backup is Completed:
    oc get PodVolumeBackup -n openshift-adp
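
When several backups exist, the PodVolumeBackup list mixes them together. The sketch below narrows the list to one backup and prints only the entries that have not completed. The function name is hypothetical, and the filter assumes the velero.io/backup-name label that Velero sets on PodVolumeBackup objects.

```shell
# Print "<name> <phase>" for each PodVolumeBackup of the given backup
# that is not yet Completed. No output means all volume backups finished.
# Usage: incomplete_pod_volume_backups <backup-name>
incomplete_pod_volume_backups() (
    backup="$1"
    oc get podvolumebackup -n openshift-adp \
        -l "velero.io/backup-name=$backup" \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' |
        awk -F '\t' '$2 != "Completed" {print $1, $2}'
)
```

For the backup created above, incomplete_pod_volume_backups backup prints nothing once both the WordPress and MySQL volumes have been backed up.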

Simulate disaster event

Before we can restore from the backup, we need to simulate a disaster event by deleting the deployments and the corresponding persistent volume claims and persistent volumes. The goal is to simulate the loss of the persistent volumes so that we can verify they are restored successfully.

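# Record the bound volume names before the claims are deleted.
mapfile -t pvs < <(oc get PersistentVolumeClaim -n wordpress --no-headers | awk '{print $3}')
oc delete Deployment,PersistentVolumeClaim --all -n wordpress
# Volumes with a Delete reclaim policy may already be gone, so ignore missing ones.
for pv in "${pvs[@]}"; do oc delete pv "$pv" --ignore-not-found; done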

Navigate to the WordPress application URL to confirm that the application is no longer functional.
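
Since a restore requires that the persistent volume claims no longer exist, it can also help to confirm this from the command line. A minimal sketch follows; the function name no_pvcs_left is hypothetical.

```shell
# Succeed only when no PersistentVolumeClaims remain in the namespace.
# Usage: no_pvcs_left <namespace>
no_pvcs_left() (
    namespace="$1"
    remaining=$(oc get pvc -n "$namespace" -o name)
    if [ -n "$remaining" ]; then
        echo "claims still present in $namespace:" >&2
        echo "$remaining" >&2
        exit 1
    fi
)
```

If no_pvcs_left wordpress fails, the listed claims are probably still terminating; wait for them to disappear before starting the restore.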

Restore WordPress application from backup

Next, create a Restore resource file that specifies the backup name used earlier:

cat <<EOF >./restore.yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: restore  
  namespace: openshift-adp
spec:  
  backupName: backup
EOF

Apply the restore file so that the restore is triggered:

oc apply -n openshift-adp -f restore.yaml

Monitor progress of the pod volume restore until the status changes to Completed:

oc get PodVolumeRestore -n openshift-adp
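
The overall outcome is recorded in the phase of the Restore resource itself, which can end in a failure state as well as in Completed. The sketch below polls that phase and distinguishes the outcomes by exit status. The function name and defaults are hypothetical; the phase names are the ones Velero reports for restores.

```shell
# Poll a Restore until it reaches a final phase.
# Exit status: 0 = Completed, 1 = failed, 2 = still running after all checks.
# Usage: wait_for_restore <name> [attempts] [delay-seconds]
wait_for_restore() (
    name="$1"
    attempts="${2:-60}"
    delay="${3:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        phase=$(oc get restore "$name" -n openshift-adp \
            -o jsonpath='{.status.phase}')
        case "$phase" in
            Completed)
                echo "restore $name: Completed"
                exit 0 ;;
            PartiallyFailed|Failed|FailedValidation)
                echo "restore $name: $phase" >&2
                exit 1 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    echo "restore $name: still ${phase:-pending} after $attempts checks" >&2
    exit 2
)
```

For the restore created above, wait_for_restore restore returns 0 only when the whole restore has completed, which makes it suitable for use in a script that should stop on failure.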

Navigate back to the WordPress application URL and confirm that it shows the demo site instead of the initial WordPress setup screen. This proves that the persistent volumes were restored to the version that was backed up.

Troubleshooting and other tips

This section covers some additional tips to keep in mind.

Viewing backup/restore logs

Backup and restore logs can be viewed in the logs of the pods in the openshift-adp namespace. However, because multiple pods are running and several backups may be in progress, it is often hard to find the logs for a particular backup. Alternatively, the files can be downloaded from the S3 bucket using AWS CLI commands. As an example, this command downloads the results file for a restore named restore from a bucket named oadp, which uses the demo prefix (cp is used rather than mv so that the object stays in the bucket):

aws s3 cp s3://oadp/demo/restores/restore/restore-restore-results.gz .

Another helpful command is to list files recursively. For instance, the command below will list all files for a bucket named oadp, which uses the demo prefix:

aws s3 ls --recursive s3://oadp/demo/
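
The object names follow a predictable pattern, so the path to a log file can be built from the bucket, prefix, and resource name. The sketch below assumes Velero's object layout, in which backup logs are stored as backups/NAME/NAME-logs.gz and restore logs as restores/NAME/restore-NAME-logs.gz; the function names are hypothetical. Verify the paths with the recursive listing above if they do not match your bucket.

```shell
# Build the S3 URI of the gzipped log file for a Velero backup or restore.
# Usage: velero_log_uri <backup|restore> <bucket> <prefix> <name>
velero_log_uri() (
    kind="$1"; bucket="$2"; prefix="$3"; name="$4"
    case "$kind" in
        backup)
            printf 's3://%s/%s/backups/%s/%s-logs.gz\n' \
                "$bucket" "$prefix" "$name" "$name" ;;
        restore)
            printf 's3://%s/%s/restores/%s/restore-%s-logs.gz\n' \
                "$bucket" "$prefix" "$name" "$name" ;;
        *)
            echo "unknown kind: $kind (expected backup or restore)" >&2
            exit 1 ;;
    esac
)

# Stream the decompressed log to the terminal without keeping a local copy.
show_velero_log() (
    uri=$(velero_log_uri "$@") || exit 1
    aws s3 cp "$uri" - | gunzip -c
)
```

For example, show_velero_log restore oadp demo restore prints the log of the restore performed earlier, and show_velero_log backup oadp demo backup prints the log of the backup.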

Increasing backup/restore timeout

For backups with hundreds of gigabytes of data, the initial backup can take longer than the default timeout of one hour (subsequent incremental backups take less time). To change the timeout for Restic volume backups, set spec.configuration.restic.timeout in the DataProtectionApplication instance:

spec:
  configuration:
    restic:
      enable: true
      timeout: 2h
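
The same change can be applied to an existing instance without editing the file. This is a sketch that assumes the DataProtectionApplication is named dpa, as created earlier in this article; the function name set_restic_timeout is hypothetical.

```shell
# Set spec.configuration.restic.timeout on a DataProtectionApplication
# using a JSON merge patch, leaving all other fields untouched.
# Usage: set_restic_timeout <dpa-name> <timeout>
set_restic_timeout() (
    dpa="$1"
    timeout="$2"
    oc patch dataprotectionapplication "$dpa" -n openshift-adp --type merge \
        -p "{\"spec\":{\"configuration\":{\"restic\":{\"timeout\":\"$timeout\"}}}}"
)
```

For example, set_restic_timeout dpa 2h has the same effect as adding the timeout field shown above and re-applying dpa.yaml.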

Scheduling backups

Velero provides a convenient resource for scheduling backups: the Schedule resource. It has essentially the same definition as a Backup resource, with an additional schedule property that takes a cron expression specifying when and how frequently to take backups. Here is a sample Schedule definition, which creates a daily backup at 7 a.m.:

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: <schedule>
  namespace: openshift-adp
spec:
  schedule: 0 7 * * * 
  template:
    includedNamespaces:
    - <namespace>
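
To tie this back to the WordPress example, the sketch below generates a Schedule whose template carries the same defaultVolumesToRestic and ttl settings used in the Backup resource earlier. The function name schedule_manifest and the schedule name in the usage example are hypothetical.

```shell
# Print a Schedule manifest for one namespace.
# Usage: schedule_manifest <schedule-name> <namespace> <cron-expression>
schedule_manifest() {
    cat <<EOF
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: $1
  namespace: openshift-adp
spec:
  schedule: "$3"
  template:
    includedNamespaces:
      - $2
    defaultVolumesToRestic: true
    ttl: 720h
EOF
}
```

The manifest can be applied directly with schedule_manifest wordpress-daily wordpress "0 7 * * *" | oc apply -f -. Backups created by a schedule should carry the velero.io/schedule-name label, so oc get backup -n openshift-adp -l velero.io/schedule-name=wordpress-daily lists them.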

Summary

In this article, we learned how you can use OADP to easily back up OpenShift applications, including persistent volumes.

Last updated: March 25, 2024