Transitioning to Red Hat OpenShift Service on AWS (ROSA) HCP from classic with MTC

Learn how the Migration Toolkit for Containers (MTC) migrates application workloads from ROSA classic to ROSA HCP clusters at the granularity of individual namespaces.

Let’s explore the basics of data migration in simple terms. In this lesson, you will gain an understanding of different migration types, compare direct vs indirect migration, and learn about various data copy methods.

To get the full benefit from this lesson, you need:

  • Familiarity with basic concepts of data management and transfer, such as understanding the importance of migration in moving data between environments.
  • Some knowledge of OpenShift or similar container orchestration platforms to grasp the context of migration strategies within such environments.

In this lesson, you will learn:

  • The fundamentals of different migration types, including full migration, state migration, and storage class conversion.
  • The distinctions between direct and indirect migration paths, which will help you understand when to opt for each method and how they impact the efficiency and success of your migration projects.
  • Various data copy methods such as file system copy and snapshot copy.

Migration types

MTC supports three migration types:

  • Full migration
  • State migration
  • Storage class conversion

Each of these migration types can be run in one of three ways:

  • Cutover migration
  • Stage migration
  • Rollback migration

Full migration

A full migration migrates namespaces, persistent volumes (PVs), and Kubernetes resources from the source cluster to the target cluster.

Cutover migration

  • In a cutover migration, all the resources in the source cluster are migrated to the destination cluster and the application is started in the target cluster. You can optionally halt the application on the source cluster as part of the cutover.

  • If you don't stop the source application before the cutover migration, you end up with two instances of the same application, one in each cluster. If new transactions then occur independently in each cluster, the states of the two instances diverge and cannot be merged afterward.

Stage migration

  • A stage migration copies only data, such as PV data and image streams, to the destination cluster without stopping the application in the source cluster. The application is not started in the target cluster.

  • If you plan to run a cutover migration, run a stage migration first. This shortens the cutover because most of the data is already on the target, so less data remains to be copied.
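
As a rough illustration of why an earlier stage migration shortens the cutover, the sketch below estimates the copy time for one volume with and without a prior stage run. The volume size, change rate, and throughput are made-up figures, and the model assumes that a cutover after staging only has to transfer the data that changed since the stage run.

```python
def copy_seconds(size_gb, throughput_mb_s):
    """Time in seconds to copy size_gb at a sustained throughput in MB/s."""
    return size_gb * 1024 / throughput_mb_s


def cutover_seconds(volume_gb, throughput_mb_s, staged, changed_fraction=0.0):
    """Estimate the copy time of a cutover migration.

    Without a prior stage migration the whole volume is copied.
    With one, only the fraction changed since staging is assumed to be copied.
    """
    to_copy_gb = volume_gb * changed_fraction if staged else volume_gb
    return copy_seconds(to_copy_gb, throughput_mb_s)


# Hypothetical numbers: 500 GB volume, 100 MB/s, 2% changed since the stage run.
full = cutover_seconds(500, 100, staged=False)
delta = cutover_seconds(500, 100, staged=True, changed_fraction=0.02)
print(f"cutover without stage: {full / 60:.0f} min")
print(f"cutover after stage:   {delta / 60:.1f} min")
```

With these assumed numbers, the cutover copy drops from roughly 85 minutes to under 2 minutes. Real durations depend on the storage, the network, and how much data actually changes between the two runs.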

Rollback migration

  • A rollback migration rolls back a completed migration. It reverts the migrated resources and volumes to their original states and original locations; any new changes made in the source are ignored.

  • It restores the original replica counts on Deployments, DeploymentConfigs, StatefulSets, ReplicaSets, DaemonSets, CronJobs, and Jobs in the source cluster.

  • It deletes the migrated resources in the target cluster.

  • It deletes the Velero backups and restores created during the migration in both the source and target clusters.

  • It removes migration annotations and labels from PVs, PVCs, Pods, ImageStreams, and namespaces in both the source and target clusters.

  • A rollback migration makes sense only if your application is no longer running in the source cluster. If you didn't halt the application in the source cluster before the cutover migration, the rollback just deletes the application in the target cluster and makes no changes to the application still running in the source cluster.
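
To make the difference between stage, cutover, and rollback concrete, the sketch below builds a MigMigration manifest for each of them as a plain Python dictionary. The field names (`stage`, `rollback`, `quiescePods`, `migPlanRef`) and the `openshift-migration` namespace reflect MTC's custom resource as best understood here and should be checked against the MTC version you run; the plan and migration names are hypothetical.

```python
import json

PHASES = ("stage", "cutover", "rollback")


def mig_migration(name, plan, phase, quiesce_source=True,
                  namespace="openshift-migration"):
    """Build a MigMigration manifest (as a dict) for one run of a plan."""
    if phase not in PHASES:
        raise ValueError(f"phase must be one of {PHASES}, got {phase!r}")
    return {
        "apiVersion": "migration.openshift.io/v1alpha1",
        "kind": "MigMigration",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Assumed field names; verify against your MTC version.
            "migPlanRef": {"name": plan, "namespace": namespace},
            # Stage: copy data only, the source application keeps running.
            "stage": phase == "stage",
            # Rollback: revert a completed migration.
            "rollback": phase == "rollback",
            # Only a cutover may halt the source application, and it is optional.
            "quiescePods": phase == "cutover" and quiesce_source,
        },
    }


for phase in PHASES:
    manifest = mig_migration(f"demo-{phase}", "demo-plan", phase)
    print(phase, json.dumps(manifest["spec"]))
```

Passing `quiesce_source=False` for a cutover models the case described above in which the source application keeps running and two instances exist after the migration.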

State migration

A state migration migrates only PVs, either between namespaces in the same cluster or between different clusters.

  • When you migrate within the same cluster, MTC creates a new project that is named <current project>-new by default.
  • When you migrate to a different cluster, MTC creates a new project that has the same name as the source project by default.

In both cases you can change the name of the new project. The PVs are copied into the new project.
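
The default naming rule above can be expressed as a small helper. This is only an illustration of the rule as described in this lesson, not MTC code; the function and parameter names are invented for the example.

```python
def default_target_project(source_project, same_cluster, override=None):
    """Return the project name a state migration would target.

    override: a name chosen by the user, which replaces the default.
    """
    if override:
        return override
    # Same cluster: the name must differ from the source, so "-new" is appended.
    # Different cluster: the source project name is reused.
    return f"{source_project}-new" if same_cluster else source_project


print(default_target_project("shop", same_cluster=True))
print(default_target_project("shop", same_cluster=False))
print(default_target_project("shop", same_cluster=True, override="shop-v2"))
```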

Cutover migration

  • In a cutover migration, PV data is copied to the target. All transactions on the source applications are halted by stopping all pods for the duration of the migration. Afterward, you can restart the applications manually, or automatically by running a rollback migration.

Stage migration

  • In this migration type, PV data is copied to the target. The source applications continue running.

Rollback migration

  • A rollback migration is possible only after a cutover migration in which all pods were stopped. It rolls back the completed migration, reverting migrated resources and volumes to their original states and original locations.

Storage class conversion

A storage class conversion converts PVs to a different storage class within the same namespace in the same cluster. By default, the converted PV is named <current-pv-name>-new.

Cutover migration

  • In this migration type, all transactions on source applications are halted for the migration. PV data is copied to the converted PVs. PVC references in the applications are updated to new PVCs and the applications are restarted. Both the old and new PVCs are available in the namespace.

Stage migration

  • In a stage migration, PV data is copied to the converted PVs. PVC references in the applications are not updated to new PVCs. The source applications continue running as is. Both the old and new PVCs are available in the namespace.

Rollback migration

  • Rolling back the migration plan will revert migrated resources and volumes to their original states and locations. Rollback migration for storage class conversion functions the same way it does for full and state migrations.
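
The difference between a stage and a cutover storage class conversion comes down to whether the applications are re-pointed at the new PVCs. The toy model below shows this with plain Python data structures. It is not MTC code, and it assumes that converted PVCs receive a "-new" suffix in the same way as the default PV name mentioned above.

```python
def convert_storage_class(pvcs, app_refs, cutover, suffix="-new"):
    """Model a storage class conversion inside one namespace.

    pvcs:     set of PVC names present before the conversion
    app_refs: dict mapping application name -> PVC name it mounts
    cutover:  True for a cutover migration, False for a stage migration
    Returns (pvcs_after, app_refs_after).
    """
    converted = {name: name + suffix for name in pvcs}
    # Old and new PVCs coexist in the namespace after either kind of run.
    pvcs_after = set(pvcs) | set(converted.values())
    if cutover:
        # Cutover: applications are updated to reference the new PVCs.
        refs_after = {app: converted.get(pvc, pvc) for app, pvc in app_refs.items()}
    else:
        # Stage: data is copied, but references stay as they are.
        refs_after = dict(app_refs)
    return pvcs_after, refs_after


pvcs, refs = {"db-data"}, {"postgres": "db-data"}
print(convert_storage_class(pvcs, refs, cutover=False))
print(convert_storage_class(pvcs, refs, cutover=True))
```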

Direct vs indirect migration

When data is copied directly from the source cluster to the target cluster, it's called direct migration. There are two types of direct migration: Direct Volume Migration (DVM) and Direct Image Migration (DIM).

Direct Volume Migration (DVM) 

  • Persistent volumes are copied directly from the source cluster to the destination cluster, bypassing the replication repository. The data is transferred with Rsync. DVM has significant performance benefits because it skips the intermediate steps of backing up files from the source cluster to the replication repository and restoring files from the replication repository to the target cluster.

  • DVM has additional prerequisites. For example, if you run DVM with nodes that are in different availability zones, the migration might fail because the migrated pods cannot access the persistent volume claim.

Direct Image Migration (DIM)

  • In DIM, images are copied directly from the source cluster to the destination cluster, bypassing the replication repository, which gives DIM significant performance benefits as well. Unlike volume data in DVM, images are not transferred with Rsync; they are copied between the clusters' image registries, so DIM requires remote clusters with an exposed, secure registry.

Indirect migration

  • When you use an indirect migration, your images, volumes, and Kubernetes objects are copied from the source cluster to the replication repository, then from the replication repository to the destination cluster.
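
The two kinds of path can be summarized as a list of hops per artifact type. The sketch below is a simplified model for illustration only. It assumes that direct migration applies to volumes (DVM) and images (DIM), while Kubernetes objects always pass through the replication repository.

```python
REPOSITORY = "replication repository"
ARTIFACTS = ("volumes", "images", "kubernetes objects")


def transfer_path(artifact, direct):
    """Return the hops an artifact takes from the source to the target cluster."""
    if artifact not in ARTIFACTS:
        raise ValueError(f"unknown artifact type: {artifact!r}")
    # Assumption: only volumes and images can bypass the repository.
    if direct and artifact in ("volumes", "images"):
        return ["source cluster", "target cluster"]
    return ["source cluster", REPOSITORY, "target cluster"]


for artifact in ARTIFACTS:
    print(f"{artifact}: {' -> '.join(transfer_path(artifact, direct=True))}")
```

Each extra hop in the indirect path is a full backup or restore step, which is where the performance difference between direct and indirect migration comes from.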

Data copy methods

MTC supports the file system and snapshot data copy methods for data migration.

File system copy

  • MTC copies data files from the source cluster to the target cluster. The file system copy method uses Restic for indirect migration or Rsync for direct volume migration.

  • The file system copy method offers several benefits, such as allowing clusters to use different storage classes, supporting all S3 storage providers, providing optional data verification with checksum, and enabling direct volume migration for improved performance. However, it's worth noting that file system copy tends to be slower than the snapshot copy method, and optional data verification can reduce performance.

Snapshot copy

  • In this method, MTC copies a snapshot of the source cluster data to the replication repository of a cloud provider. The data is then restored on the target cluster. This method can be used with Amazon Web Services, Google Cloud Platform, and Microsoft Azure.

  • Snapshot copy offers the advantage of being faster than the file system copy method. However, it comes with some limitations. Firstly, cloud providers must support snapshots for this method to work. Additionally, clusters using snapshot copy must be on the same cloud provider and located in the same region. Furthermore, clusters must share the same storage class, which must also be compatible with snapshots. Lastly, snapshot copy does not support direct volume migration.
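
The limitations listed above amount to a checklist, which the sketch below encodes as a small decision helper. The `Cluster` class and its fields are hypothetical and exist only for this example; the helper falls back to file system copy whenever any snapshot requirement is not met.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    provider: str              # for example "aws"
    region: str                # for example "us-east-1"
    storage_class: str         # for example "gp3-csi"
    snapshots_supported: bool  # provider and storage class support snapshots


def snapshot_copy_blockers(source, target, want_direct_volume_migration=False):
    """Return the reasons why snapshot copy cannot be used (empty if it can)."""
    blockers = []
    if not (source.snapshots_supported and target.snapshots_supported):
        blockers.append("snapshots are not supported on both clusters")
    if source.provider != target.provider:
        blockers.append("clusters are on different cloud providers")
    if source.region != target.region:
        blockers.append("clusters are in different regions")
    if source.storage_class != target.storage_class:
        blockers.append("clusters do not share the same storage class")
    if want_direct_volume_migration:
        blockers.append("snapshot copy does not support direct volume migration")
    return blockers


def choose_copy_method(source, target, want_direct_volume_migration=False):
    """Pick 'snapshot' when every requirement holds, otherwise 'filesystem'."""
    blockers = snapshot_copy_blockers(source, target, want_direct_volume_migration)
    return "filesystem" if blockers else "snapshot"


classic = Cluster("aws", "us-east-1", "gp3-csi", True)
hcp = Cluster("aws", "us-east-1", "gp3-csi", True)
print(choose_copy_method(classic, hcp))
print(choose_copy_method(classic, hcp, want_direct_volume_migration=True))
```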
