At Red Hat, open source is at the heart of everything we do. We are the world's largest open source software company, and staying true to our roots is incredibly important to us. From the first versions of Red Hat Enterprise Linux to the latest release of Red Hat OpenShift, all our code is open and available to our customers. We work with people across a wide spectrum of open source projects, some overseen by foundations such as the Apache Software Foundation (ASF), the Cloud Native Computing Foundation (CNCF), and the Eclipse Foundation, and some run independently by their communities.
This article explores how Red Hat engineers contribute to open source projects in the data streaming landscape.
The Strimzi project: Apache Kafka on Kubernetes
In 2018, Red Hat identified an opportunity to develop a new team, and community, to improve the experience of deploying and operating Apache Kafka on Kubernetes. The Strimzi project was created, and in 2019 it was donated to the CNCF, entering at the sandbox level. The project continues to grow with a diverse set of users and contributors, and was awarded Incubating status in 2024 (indicating a stable project used in production by a number of organizations).
Strimzi has been at the heart of innovation at the intersection of Kubernetes and Apache Kafka, and significantly reduces the operational burden of deploying and managing Kafka on Kubernetes. Custom resources enable declarative management of components such as Kafka clusters, topics, users, and Kafka Connect connectors. These allow administrators to manage Kafka clusters in a manner consistent with other workloads on Kubernetes. Strimzi also helps with certificate management and security and, on the path to Kafka 4.0, will automate the migration from ZooKeeper to KRaft mode.
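To illustrate what declarative management means in practice, here is a minimal sketch that builds a topic description in the shape of a Strimzi KafkaTopic custom resource. The helper name kafka_topic_manifest, the topic name, the cluster name, and the default values are hypothetical, and the API version shown is an assumption that depends on the Strimzi release in use. The sketch only constructs and prints the document; it does not talk to a Kubernetes cluster.

```python
import json


def kafka_topic_manifest(name, cluster, partitions=3, replicas=3, config=None):
    """Build a KafkaTopic-style custom resource as a plain dict.

    Hypothetical helper for illustration only. The apiVersion is an
    assumption and may differ between Strimzi releases.
    """
    return {
        "apiVersion": "kafka.strimzi.io/v1beta2",
        "kind": "KafkaTopic",
        "metadata": {
            "name": name,
            # This label tells the operator which Kafka cluster owns the topic.
            "labels": {"strimzi.io/cluster": cluster},
        },
        "spec": {
            "partitions": partitions,
            "replicas": replicas,
            # Topic-level Kafka settings, copied so the caller's dict is not shared.
            "config": dict(config or {}),
        },
    }


if __name__ == "__main__":
    # "orders" and "my-cluster" are made-up names.
    manifest = kafka_topic_manifest(
        "orders", "my-cluster", config={"retention.ms": 604800000}
    )
    print(json.dumps(manifest, indent=2))
```

The administrator describes the desired state (a topic with a given number of partitions and replicas) and applies it like any other Kubernetes resource. The operator is then responsible for making the Kafka cluster match that description.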
Apache Kafka
Engineers from Red Hat contribute heavily not only to Strimzi but also to Apache Kafka. Over the past year, Red Hat has had 5 active contributors in the project, who have improved the code, documentation, and tests, or contributed in other ways. Three committers (Tom Bentley, Mickael Maison, and Luke Chen) are Red Hat associates. Committers can merge changes to the software and cast binding votes on Kafka Improvement Proposals (KIPs), the process by which API-affecting changes are introduced to the project. All three are also members of the Project Management Committee (PMC), which oversees the governance of the project. In 2023, Mickael Maison was voted Chair of the PMC following the resignation of Jun Rao.
Apache Kafka uses a process known as Kafka Improvement Proposals (KIPs) to discuss and agree on architectural and API-level changes in the project. These design documents require review and support from three committers in order to be approved, a process in which Red Hat engineers are heavily involved. Following approval, the code implementing the feature can be written, and it becomes available in a future release of Kafka. Engineers at Red Hat have proposed many KIPs across a wide scope in the Kafka project.
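The approval rule described above can be sketched as a small vote tally. This is a simplified illustration, not the project's tooling: the Vote type, the function name, and the voter names are hypothetical. The threshold of three binding approvals comes from the text; the extra condition that binding approvals must outnumber binding objections is an assumption about how such votes are usually counted.

```python
from dataclasses import dataclass

# From the text: a KIP needs support from three committers.
REQUIRED_BINDING_APPROVALS = 3


@dataclass(frozen=True)
class Vote:
    voter: str      # hypothetical voter name
    value: int      # +1 (approve), 0 (abstain) or -1 (object)
    binding: bool   # True when the vote is cast by a committer


def kip_is_approved(votes):
    """Return True when the binding votes are enough to approve a KIP.

    Non-binding votes are ignored by the tally. Requiring more binding
    approvals than binding objections is an assumption of this sketch.
    """
    binding = [v for v in votes if v.binding]
    approvals = sum(1 for v in binding if v.value > 0)
    objections = sum(1 for v in binding if v.value < 0)
    return approvals >= REQUIRED_BINDING_APPROVALS and approvals > objections


if __name__ == "__main__":
    votes = [
        Vote("committer-a", +1, binding=True),
        Vote("committer-b", +1, binding=True),
        Vote("contributor-c", +1, binding=False),
        Vote("committer-d", +1, binding=True),
    ]
    print(kip_is_approved(votes))
```

Votes from contributors who are not committers still signal community interest, but in this sketch only binding votes count toward the threshold.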
To the core Kafka broker:
- KIP-827: Expose log dirs total and usable space via Kafka API
- KIP-707: The future of KafkaFuture
- KIP-788: Allow configuring num.network.threads per listener
- KIP-978: Allow dynamic reloading of certificates with different DN / SANs
To the clients:
- KIP-830: Allow disabling JMX Reporter
- KIP-894: Use incrementalAlterConfigs API for syncing topic configurations
To the Kafka Connect runtime:
- KIP-769: Connect APIs to list all connector plugins and retrieve their configuration definitions
- KIP-581: Value of optional null field which has default value
Contributions to Apache Kafka from Red Hat engineers are not limited to feature implementations; they also include maintaining the health of the project and community. For instance, engineers from Red Hat contribute to the migration from Scala to Java for a number of components, and provide numerous bug fixes. In 2022, engineers from Red Hat were responsible for discovering, reporting, and responsibly disclosing CVE-2022-34917. This involved liaising with the Apache Software Foundation security group, helping to fix the issue, and coordinating embargoed releases prior to the vulnerability being made public.
Code must also be reviewed prior to being merged into the project. Engineers from Red Hat are among the most active code reviewers in the project. Releasing any piece of software is a complex operation, and in Apache Kafka it is handled by the committers. To date, engineers from Red Hat have been the Release Manager for over 40% of releases in the Kafka 3.x series, the most of any company.
Other data streaming projects
In the wider data streaming ecosystem, engineers from Red Hat actively contribute to many projects. These include:
- Cruise Control: A tool for balancing workloads (such as disk utilization and network bandwidth) across a set of Apache Kafka brokers.
- Kroxylicious: A Kafka protocol proxy which enables use cases such as topic encryption.
- Debezium: Change data capture for a variety of databases. This allows an application to consume a stream of change events from database tables.
- SmallRye Reactive Messaging with Kafka: Bridges the gap between Reactive Messaging and Apache Kafka.
- Apicurio Registry: A schema registry for storing schemas in formats such as Apache Avro and JSON Schema.
- kcctl: A command-line client for Kafka Connect.
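To make the Debezium entry above more concrete, the following sketch shows how an application might replay change events to keep an in-memory copy of a table. It assumes a simplified event carrying only the "op", "before", and "after" fields; real Debezium events carry more metadata, and the exact envelope depends on the connector and its configuration. The function name, the "id" key field, and the sample rows are hypothetical.

```python
def apply_change_event(table, event, key_field="id"):
    """Apply one simplified change event to an in-memory table (a dict).

    Assumed operation codes: "c" create, "r" snapshot read, "u" update,
    "d" delete. Rows are keyed by key_field, a hypothetical primary key.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        # Creates, snapshot reads and updates carry the new row state in "after".
        row = event["after"]
        table[row[key_field]] = row
    elif op == "d":
        # Deletes carry the last known row state in "before".
        row = event["before"]
        table.pop(row[key_field], None)
    else:
        raise ValueError(f"unsupported operation: {op!r}")
    return table


if __name__ == "__main__":
    events = [
        {"op": "c", "before": None, "after": {"id": 1, "name": "Ada"}},
        {"op": "c", "before": None, "after": {"id": 2, "name": "Bob"}},
        {"op": "u", "before": {"id": 1, "name": "Ada"}, "after": {"id": 1, "name": "Ada L."}},
        {"op": "d", "before": {"id": 2, "name": "Bob"}, "after": None},
    ]
    table = {}
    for event in events:
        apply_change_event(table, event)
    print(table)
```

In a real deployment the events would be read from Kafka topics rather than from a list, but the replay logic follows the same pattern.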
Engineers from Red Hat are also actively engaged in the Apache Kafka and data streaming communities. They have shared their expertise on Kafka and data streaming at conferences such as Kafka Summit, Devoxx UK, and Jfokus. Mickael Maison publishes a monthly blog post highlighting notable changes to Kafka, and other engineers contribute to blogs such as Strimzi and Grafana and provide regular support on mailing lists and Stack Overflow.
Kate Stanley and Mickael Maison recently wrote a book on Kafka Connect (free download available from Red Hat Developer). While writing it, they identified and addressed bugs in Kafka Connect and fixed numerous gaps in the documentation.
Summary
This article details the contributions of engineers from Red Hat to open source projects in the data streaming space. In some of those communities, the contributions are bug fixes and incremental improvements. However, in others such as Apache Kafka and Strimzi, engineers from Red Hat play leading roles.
Note: Red Hat Streams for Apache Kafka includes supported versions of many of these projects. If you are interested in finding out more, please get in touch.