Debezium serialization with Apache Avro and Apicurio Registry

December 11, 2020
Hugo Guerrero

    In this article, you will learn how to use Debezium with Apache Avro and Apicurio Registry to efficiently monitor change events in a MySQL database. We will set up and run a demonstration using Apache Avro rather than the default JSON converter for Debezium serialization. We will use Apache Avro with the Apicurio service registry to externalize Debezium’s event data schema and reduce the payload of captured events.

    What is Debezium?

    Debezium is a set of distributed services that captures row-level database changes so that applications can view and respond to them. Debezium connectors record all events to a Red Hat AMQ Streams Kafka cluster. Applications use AMQ Streams to consume change events.

    Debezium is based on the Apache Kafka Connect framework, which runs Debezium's connectors as Kafka Connect source connectors. They can be deployed and managed using the Kafka Connect custom Kubernetes resources provided by AMQ Streams.
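As an illustration, a connector managed this way might be declared with a KafkaConnector custom resource along the following lines. This is a hedged sketch based on the Strimzi project that AMQ Streams builds on; the resource names and connection details are hypothetical, not taken from this article's demo:

```yaml
# Hypothetical KafkaConnector custom resource (Strimzi / AMQ Streams style).
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: inventory-connector
  labels:
    # Must match the name of the KafkaConnect cluster resource
    strimzi.io/cluster: my-connect-cluster
spec:
  class: io.debezium.connector.mysql.MySqlConnector
  tasksMax: 1
  config:
    database.hostname: mysql
    database.port: 3306
    database.user: debezium
    database.password: dbz
    database.server.id: 184054
    database.server.name: dbserver1
    table.include.list: inventory.customers
    database.history.kafka.bootstrap.servers: kafka:9092
    database.history.kafka.topic: schema-changes.inventory
```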

    Debezium supports the following database connectors:

    • MySQL connector
    • PostgreSQL connector
    • MongoDB connector
    • SQL Server connector

    We’ll use the MySQL connector for our example.

    Debezium serialization with Apache Avro

    Debezium uses a JSON converter to serialize record keys and values into JSON documents. By default, the JSON converter includes a record’s message schema, so each record is quite verbose. Another option is to use Apache Avro to serialize and deserialize each record’s keys and values. If you want to use Apache Avro for serialization, you must also deploy a schema registry, which manages Avro’s message schemas and their versions.
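To get a feel for the difference, here is a small stdlib-only Python sketch (not Debezium code) that compares a record wrapped in a schema-bearing JSON envelope, roughly the shape the JSON converter produces, with the same record hand-encoded in Avro's compact binary form:

```python
import json

# A single change-event value, as a plain Python dict.
record = {"id": 1001, "first_name": "Sally", "last_name": "Thomas"}

# The JSON converter embeds the message schema in every record, so the
# envelope looks roughly like this (simplified):
json_with_schema = json.dumps({
    "schema": {
        "type": "struct",
        "fields": [
            {"field": "id", "type": "int64"},
            {"field": "first_name", "type": "string"},
            {"field": "last_name", "type": "string"},
        ],
    },
    "payload": record,
}).encode("utf-8")

def zigzag_varint(n: int) -> bytes:
    """Avro's variable-length zig-zag encoding for integers."""
    n = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def avro_binary(rec: dict) -> bytes:
    """Avro binary encoding of the record above: field values are written
    back to back, with no field names and no schema in the payload."""
    out = zigzag_varint(rec["id"])
    for key in ("first_name", "last_name"):
        s = rec[key].encode("utf-8")
        out += zigzag_varint(len(s)) + s
    return out

print(len(json_with_schema), "bytes with the schema embedded")
print(len(avro_binary(record)), "bytes as Avro binary")
```

The binary form carries only the field values; the schema needed to decode them has to live somewhere else, which is exactly the job of the registry.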

    Apicurio Registry is an open source project that works with Avro. It provides an Avro converter along with an API and schema registry. You can specify the Avro converter in your Debezium connector configuration. The converter then maps Kafka Connect schemas to Avro schemas. It uses the Avro schemas to serialize record keys and values into Avro’s compact binary form.

    The Apicurio API and schema registry track the Avro schemas used in Kafka topics. Apicurio also tracks where the Avro converter sends the generated Avro schemas.
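Conceptually, the registry's core job can be reduced to a mapping between schemas and global IDs: a producer registers a schema once and ships only the ID with each record, and a consumer resolves the ID back to the schema. This toy Python class (not the Apicurio API) sketches that idea:

```python
# Conceptual sketch only -- not the Apicurio Registry API.
class ToyRegistry:
    def __init__(self):
        self._by_id = {}
        self._by_schema = {}
        self._next_id = 1

    def register(self, schema: str) -> int:
        """Return the existing global ID for a schema, or assign a new one."""
        if schema not in self._by_schema:
            self._by_schema[schema] = self._next_id
            self._by_id[self._next_id] = schema
            self._next_id += 1
        return self._by_schema[schema]

    def lookup(self, global_id: int) -> str:
        """Resolve a global ID back to its schema, as a consumer would."""
        return self._by_id[global_id]

registry = ToyRegistry()
customers_schema = '{"type": "record", "name": "Customer", "fields": [...]}'
gid = registry.register(customers_schema)          # first call registers
assert gid == registry.register(customers_schema)  # later calls reuse the ID
assert registry.lookup(gid) == customers_schema
```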

    Note: The Apicurio service registry is fully supported and generally available as part of Red Hat Integration.

    Demonstration: Debezium serialization with Apache Avro and Apicurio

    For this demonstration, we will run an example application that uses Avro for serialization and the Apicurio service registry to track Debezium events. To successfully run the example application, you will need the following tools installed and running in your development environment:

    • The most recent version of Docker, installed with support for Linux container images.
    • kafkacat: A generic non-JVM producer and consumer for Kafka.
    • jq: A command-line utility for JSON processing.

    Step 1: Start the services

    Once you have the required tools installed, begin the demonstration by cloning the debezium-examples repository and starting the required service components:

    1. Clone the repository:
      $ git clone https://github.com/hguerrero/debezium-examples.git
    2. Change to the following directory:
      $ cd debezium-examples/debezium-registry-avro
    3. Start the environment:
      $ docker-compose up -d

    The last command starts the following components:

    • A single-node Zookeeper and Kafka cluster
    • A single-node Kafka Connect cluster
    • An Apicurio service registry instance
    • A MySQL database that is ready for change data capture

    Step 2: Configure the Debezium connector

    Next, we configure the Debezium connector. The configuration file instructs the Debezium connector to use Avro for serialization and deserialization. It also specifies the location of the Apicurio registry.

    The container image used in this environment includes all of the required libraries to access the connectors and converters. The following lines set the key and value converters and their respective registry configurations:

            "key.converter": "io.apicurio.registry.utils.converter.AvroConverter",
            "key.converter.apicurio.registry.url": "http://registry:8080/api",
            "key.converter.apicurio.registry.global-id": "io.apicurio.registry.utils.serde.strategy.AutoRegisterIdStrategy",
        "key.converter.apicurio.registry.as-confluent": "true",
            "value.converter": "io.apicurio.registry.utils.converter.AvroConverter",
            "value.converter.apicurio.registry.url": "http://registry:8080/api",
            "value.converter.apicurio.registry.global-id": "io.apicurio.registry.utils.serde.strategy.AutoRegisterIdStrategy",
            "value.converter.apicurio.registry.as-confluent": "true"
    

    Note that Apicurio’s compatibility mode lets us use tooling from another provider to deserialize and reuse the Apicurio service registry’s schemas.
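For context, the converter settings shown above sit alongside the usual Debezium MySQL connector properties in the connector payload. The following is a hedged sketch of what such a payload can look like; the field values are illustrative, not the actual contents of dbz-mysql-connector-avro.json:

```json
{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "mysql",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "184054",
    "database.server.name": "avro",
    "table.include.list": "inventory.customers",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory",
    "key.converter": "io.apicurio.registry.utils.converter.AvroConverter",
    "key.converter.apicurio.registry.url": "http://registry:8080/api",
    "value.converter": "io.apicurio.registry.utils.converter.AvroConverter",
    "value.converter.apicurio.registry.url": "http://registry:8080/api"
  }
}
```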

    Step 3: Create the connector

    Next, we create the Debezium connector to start capturing changes in the database. We’ll use the Kafka Connect cluster REST API to create the Debezium connector:

    $ curl -X POST http://localhost:8083/connectors -H 'content-type:application/json' -d @dbz-mysql-connector-avro.json
    

    Step 4: Check the data

    We have created and started the connector. Debezium has now captured the initial data in the database and sent it to Kafka as events. Use the following kafkacat command to review the data:

    $ kafkacat -b localhost:9092 -t avro.inventory.customers -e
    

    Step 5: Deserialize the record

    When you review the data, you will notice that the returned records are not human-readable, which confirms that Avro serialized them in its compact binary form. To get a readable version of the data, we can instruct kafkacat to query the schema from the Apicurio service registry and use it to deserialize the records. Run the following command with the registry configuration:

    $ kafkacat -b localhost:9092 -t avro.inventory.customers -s avro -r http://localhost:8081/api/ccompat -e
    

    If you have the jq JSON utility installed, you can use the following command instead:

    $ kafkacat -b localhost:9092 -t avro.inventory.customers -s avro -r http://localhost:8081/api/ccompat -e | jq
    

    You can see that the Kafka record information contains only the payload without the Debezium schema’s overhead. That overhead is now externalized in the Apicurio registry.
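The as-confluent compatibility setting used in the connector configuration means each record is framed the way Confluent-compatible tooling expects: a one-byte magic marker and a 4-byte big-endian schema ID precede the Avro payload, while the schema itself stays in the registry. A stdlib-only Python sketch of that framing (an illustration, not registry client code):

```python
import struct

MAGIC_BYTE = 0  # marks a Confluent-style framed record

def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an Avro payload with the magic byte and 4-byte schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload

def unframe(record: bytes) -> tuple[int, bytes]:
    """Split a framed record back into (schema ID, Avro payload)."""
    magic, schema_id = struct.unpack(">bI", record[:5])
    assert magic == MAGIC_BYTE
    return schema_id, record[5:]

# Only 5 bytes of framing per record, however large the schema is.
msg = frame(42, b"\xd2\x0f")
schema_id, payload = unframe(msg)
```

A consumer uses the recovered schema ID to fetch the schema from the registry once, then decodes every matching payload with it.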

    Step 6: Check the Apicurio schema registry

    Finally, if you want to view a list of all of the schema artifacts for this example, you can check the Apicurio schema registry at http://localhost:8081/, shown in Figure 1.

    Figure 1: The Apicurio registry lists six schema artifacts for this example.

    Get started with Debezium and Kafka Connect

    You can download the Red Hat Integration Debezium connectors from the Red Hat Developer portal. You can also check out Gunnar Morling’s webinar on Debezium and Kafka (February 2019) from the DevNation Tech Talks series. Also, see his more recent Kafka and Debezium presentation at QCon (January 2020).

    Conclusion

    Although Debezium makes it easy to capture database changes and record them in Kafka, developers still have to decide how to serialize the change events in Kafka. Debezium lets you specify the key and value converters from various options. Using an Avro connector with the Apicurio service registry, you can store externalized versions of the schema and minimize the payload you have to propagate.

    Debezium Apache Kafka connectors are available from Red Hat Integration. Red Hat Integration is a comprehensive set of integration and messaging technologies that connect applications and data across hybrid infrastructures.

    Last updated: March 18, 2024
