
Red Hat OpenShift Connectors: Configuring change data capture

September 7, 2022
Bernard Tison
Related topics: Kafka, Open source
Related products:
Red Hat OpenShift


    Red Hat OpenShift Connectors is a new cloud service offering from Red Hat. The service provides prebuilt connectors to enable quick and reliable connectivity across data, services, and systems. Each connector is a fully managed service, tightly integrated with Red Hat OpenShift Streams for Apache Kafka, Red Hat's managed cloud service for Apache Kafka.

    Change data capture (CDC) is a software process that detects changes (inserts, updates, and deletes) to data in a database and transforms those changes into event streams. Other systems or applications can consume these streams in order to react to the changes.

    Typical use cases for change data capture include: 

    • Data replication
    • Streaming analytics
    • Event-driven distributed applications

    Red Hat OpenShift Connectors offers several source connectors for change data capture, based on the popular Debezium open source project. These connectors support the following databases:

    • PostgreSQL
    • MySQL
    • SQL Server
    • MongoDB

    This article demonstrates how to configure a source connector to capture data change events from MongoDB Atlas, a fully managed cloud database provided by MongoDB.

    Prerequisites

    This article assumes that you have created an instance of OpenShift Streams for Apache Kafka and that the instance is in the Ready state. Please refer to Getting started with Red Hat OpenShift Streams for Apache Kafka for step-by-step instructions on creating your Kafka instance.

    Create a service account and configure the access rules for it. The service account requires privileges to read and write topics and to create new topics. The Get started with Red Hat OpenShift Connectors article has detailed instructions on creating and configuring the service account and the access rules.
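
    If you prefer the command line, the rhoas CLI can also create the service account and grant it the required access. The following is a rough sketch, assuming a rhoas release current at the time of writing; flag names have changed between releases, so verify them with rhoas service-account create --help and rhoas kafka acl grant-access --help:

        # Log in to Red Hat OpenShift Application Services (opens a browser).
        rhoas login

        # Create a service account and save its credentials to a local file.
        rhoas service-account create --short-description "connectors" --file-format json

        # Allow the service account to consume from and produce to topics.
        rhoas kafka acl grant-access --producer --consumer \
          --service-account <client-id> --topic all --group all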

    Set up a MongoDB Atlas instance

    The following instructions will guide you through the process of setting up a MongoDB Atlas instance:

    • Sign up to provision a MongoDB Atlas instance. You can create an Atlas account or sign up with your Google account.
    • After registration, you are taken to a page where you can choose the type of cloud database you want to provision. MongoDB Atlas offers different tiers, including a free tier for an easy getting-started experience. The free tier is sufficient to follow this demonstration.
    • When creating your MongoDB Atlas instance, you must specify security settings. Choose a username and password to create a database user for connecting to your MongoDB instance.
    • Add an IP address to the IP Access List. Enter 0.0.0.0/0, because you don't know the IP address from which the managed OpenShift Connector will connect. This effectively allows connections to your MongoDB instance from anywhere; you can make the IP Access List more restrictive later on.
    • Add databases and collections once the MongoDB Atlas instance is up and running.

    MongoDB Atlas provides a sample dataset, described in the documentation. We will use the sample_mflix database from this dataset, which contains data about movies.
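
    To confirm that the sample dataset loaded correctly, you can connect to the cluster with the mongosh shell and count the documents in the movies collection. A minimal sketch; the connection string and user name are placeholders for your own Atlas values:

        # Connect to the Atlas cluster (you will be prompted for the password).
        mongosh "mongodb+srv://<your-cluster>.mongodb.net/sample_mflix" --username <your-db-user>

        // Inside the shell: count the documents in the movies collection.
        db.movies.countDocuments()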

    10 steps to create an instance for change data capture

    Now that you have provisioned and configured a MongoDB Atlas instance and loaded the sample dataset, you can create an OpenShift Connectors source connector to capture change events from the sample_mflix database. From your Red Hat console, complete the following 10 steps:

    1. Select Application and Data Services.

    2. On the Application and Data Services page, select Connectors and click Create Connectors instance.

    3. To find the Debezium MongoDB connector, enter mongo in the search field. Click the Debezium MongoDB Connector card, then click Next.

    4. Select the Streams for Apache Kafka instance for the connector. (This is the instance you created in the prerequisites step.) Then click Next.

    5. On the Namespace page, click Create preview namespace to provision a namespace for hosting the connector instances that you create (or select your existing namespace if you created one earlier). This evaluation namespace will remain available for 48 hours. You can create up to four connector instances per namespace. Once the namespace is available, select it and click Next.

    6. Provide the core configuration for your connector by entering the following values, then click Next:

      • A unique name for the connector.

      • The Client ID and Client Secret of the service account that you created for your connectors.

    7. Provide the connection configuration for your connector. For the Debezium MongoDB connector, enter the following information:

      • Hosts: The public addresses of the replica set members of your MongoDB instance. You can find these on the web console for your MongoDB instance, on the Overview tab shown in Figure 1.

        Figure 1: The Overview tab on the web console of the MongoDB Atlas instance shows the public addresses of the MongoDB replica set.
      • Click the first link to get to the status page of the replica. The address of the replica is shown at the top of the page in host:port format. Copy the address to a text editor.

      • Repeat the procedure for the other replicas.

      • Go back to the OpenShift Connectors configuration page and enter the addresses of the three replicas separated by commas in the Hosts field. The entry should look like the following list (your values will be different):

        ac-whrxxxx-shard-00-00.ooulprt.mongodb.net:27017, ac-whrxxxx-shard-00-01.ooulprt.mongodb.net:27017, ac-whrxxxx-shard-00-02.ooulprt.mongodb.net:27017

      • Namespace: A unique name that identifies this MongoDB instance. For example, enter mongo-mflix.

      • Password: The password of the database user you created previously. Note that this is not the same user with which you logged into MongoDB Atlas.

      • User: The user name of the database user you created.

      • Make sure the Enable SSL connection to MongoDB option is checked.

      Figure 2 shows what a filled-out form looks like.

      Figure 2: Enter configuration properties for the MongoDB Debezium connector.
    8. On the next page of the wizard, set the Database filter to sample_mflix and the Collection filter to sample_mflix.movies. This will ensure you capture change events only from the movies collection.

    9. Make sure that the Include box is selected for both entries. Click Apply to apply the filter. Do not change the values on the Data & runtime page of the wizard; these are advanced settings that you rarely need to change.

    10. Review the summary of the configuration properties. Pay particular attention to the MongoDB Hosts field. Click Create Connector to create the connector.
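
    For reference, the values you enter in the wizard map onto the underlying Debezium MongoDB connector configuration. The following sketch shows what a roughly equivalent Kafka Connect configuration could look like; the property names come from the Debezium 1.x MongoDB connector documentation, and the host and credential values are the placeholders used earlier:

        {
          "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
          "mongodb.hosts": "ac-whrxxxx-shard-00-00.ooulprt.mongodb.net:27017,ac-whrxxxx-shard-00-01.ooulprt.mongodb.net:27017,ac-whrxxxx-shard-00-02.ooulprt.mongodb.net:27017",
          "mongodb.name": "mongo-mflix",
          "mongodb.user": "<db-user>",
          "mongodb.password": "<db-password>",
          "mongodb.ssl.enabled": "true",
          "database.include.list": "sample_mflix",
          "collection.include.list": "sample_mflix.movies"
        }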

    Your connector instance will be added to the table of connectors. After a couple of seconds, the status of your connector instance should change to Ready. If your connector ends up in an Error state, you can click the options icon (the three vertical dots) next to the connector. Then edit the configuration and restart the connector.

    Capture data change events from MongoDB

    Once the Debezium MongoDB connector is ready, it connects to the MongoDB database, creates a snapshot of the collections it monitors, and creates data change events for every record present in the collection.

    To verify, use the message viewer in the OpenShift Streams for Apache Kafka web console.

    Head over to the Application and Data Services page of the Red Hat console and select Streams for Apache Kafka→Kafka Instances. Click the name of the Streams for Apache Kafka instance that you created for connectors. Select the Topics tab.

    You should see four new topics. Debezium connectors run on top of Kafka Connect (in contrast to the other OpenShift Connectors instances, which are based on Camel K). Kafka Connect creates three topics to maintain its internal state: the topics ending with -config, -offset, and -status shown in Figure 3.

    Figure 3: The Kafka Connect connector creates four topics.

    The other topic, named mongo-mflix.sample_mflix.movies, holds the data change events from the movies collection. Click the topic name and select the Messages tab. You should see the most recent ten messages in the topic, as shown in Figure 4.

    Figure 4: A topic's Messages tab displays recent messages in the topic.

    The offset of the last message in Figure 4 is 23529. Because offsets start at 0, this indicates that there are 23530 events in the topic, which corresponds to the number of records in the movies collection.
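
    If you prefer to inspect the topic programmatically rather than through the message viewer, a small Node.js consumer can print the change events. A minimal sketch, assuming the kafkajs library and SASL/PLAIN authentication with your service account credentials; copy the real bootstrap server address from your Kafka instance's connection information:

        // consumer.js: print change events from the movies topic (kafkajs assumed).
        const { Kafka } = require("kafkajs");

        const kafka = new Kafka({
          brokers: ["<your-bootstrap-server>:443"],
          ssl: true,
          sasl: {
            mechanism: "plain",        // service account Client ID and Secret
            username: "<client-id>",
            password: "<client-secret>",
          },
        });

        async function run() {
          const consumer = kafka.consumer({ groupId: "cdc-viewer" });
          await consumer.connect();
          await consumer.subscribe({ topics: ["mongo-mflix.sample_mflix.movies"], fromBeginning: true });
          await consumer.run({
            eachMessage: async ({ message }) => {
              console.log(message.offset, message.value.toString());
            },
          });
        }

        run().catch(console.error);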

    Each data change message has a JSON payload with the following structure:

    • after: Contains the latest state of the document.
    • source: Contains metadata about the connector and the MongoDB instance.
    • op: Specifies the operation that created this change. In this case, the operation is r, which stands for read.
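
    As an illustration, an abbreviated read event for one movie document could look like the following. The field values here are made up for this sketch; note that the Debezium MongoDB connector serializes the after state as a JSON string:

        {
          "after": "{\"_id\": {\"$oid\": \"573a1390f29313caabcd42e8\"}, \"title\": \"The Great Train Robbery\", \"year\": 1903}",
          "source": {
            "connector": "mongodb",
            "name": "mongo-mflix",
            "db": "sample_mflix",
            "collection": "movies",
            "snapshot": "true",
            "ts_ms": 1662537600000
          },
          "op": "r"
        }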

    Please refer to the Debezium documentation for more information on Debezium and the structure of the data change events it produces.

    At this point, you can add new records to the MongoDB collection or modify existing records. The easiest way to do so is through the MongoDB Atlas web console (for a scripted alternative, see the mongosh sketch after this list):

    • On the Database page, select the Collections tab.
    • Select the sample_mflix.movies collection.
    • From here, you can add new records to the collection or modify existing ones. Every time you make a change, a data change event is produced with the latest state of the document and an operation equal to c for creates or u for updates.
    • You can verify that change events are generated by checking the message viewer of the data change event topic in the Red Hat console.
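
    As a scripted alternative to the web console, you can make the same changes from the mongosh shell while connected to the sample_mflix database. The document fields in this sketch are made up:

        // Insert a new document; the connector emits an event with op = "c".
        db.movies.insertOne({ title: "My CDC Test Movie", year: 2022, genres: ["Documentary"] })

        // Update the document; the connector emits an event with op = "u".
        db.movies.updateOne({ title: "My CDC Test Movie" }, { $set: { year: 2023 } })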

    An end-to-end data pipeline example

    Capturing data change events from a database with an OpenShift Connector instance is just the first step in using the data. Typically, the data change events are consumed by other services or applications to, for instance, replicate the data or build a local view of the data.

    The following video contains a demo of what an end-to-end data pipeline could look like. The demo uses OpenShift Connectors to stream the data change events to AWS Kinesis. The events trigger an AWS Lambda function that extracts the state of the document from the event and updates an AWS OpenSearch index. Figure 5 shows the pipeline.
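
    To give an idea of the consumer side, the following is a rough sketch of what such a Lambda function could look like in Node.js. Everything here is illustrative rather than taken from the demo: the movies index name, the endpoint environment variable, and the use of the @opensearch-project/opensearch client are all assumptions:

        // handler.js: hypothetical Lambda that indexes change events into OpenSearch.
        const { Client } = require("@opensearch-project/opensearch");

        const client = new Client({ node: process.env.OPENSEARCH_ENDPOINT }); // assumed env var

        exports.handler = async (event) => {
          for (const record of event.Records) {
            // Kinesis delivers the payload base64-encoded.
            const payload = JSON.parse(Buffer.from(record.kinesis.data, "base64").toString("utf8"));

            // The Debezium MongoDB connector serializes the document state as a JSON string.
            const doc = JSON.parse(payload.after);

            // Index (or re-index) the document under its MongoDB _id.
            await client.index({
              index: "movies",        // hypothetical index name
              id: doc._id.$oid,
              body: doc,
            });
          }
        };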

    Figure 5: Red Hat OpenShift Connectors feeds change events from the database to consumers.

    Watch this quick CDC pipeline demo:

    OpenShift Connectors speed up data collection applications

    We hope you found this demonstration informative and easy to follow. Try Red Hat OpenShift Connectors for yourself and see how they speed up data collection applications. Feel free to reach out to us. We welcome your feedback.

    Last updated: October 13, 2023
