This edition of the Kafka Monthly Digest covers what happened in the Apache Kafka community in December 2021. We will also look back at some of the milestones that the Kafka project and community reached over the past year, and we'll look ahead to two of the most anticipated Kafka features coming in 2022.
For last month’s digest, see Kafka Monthly Digest: November 2021.
Releases: Apache Kafka 3.1.0
The only release in progress at this moment is Apache Kafka 3.1.0.
David Jacot published the 3.1.0 RC0 (Release Candidate 0) on December 23. The vote on this first release candidate is currently ongoing. This release is a bit behind schedule but should be available in early January.
Kafka Improvement Proposals
Last month, the community submitted five KIPs (KIP-806 to KIP-811). Two particularly caught my eye.
KIP-808: Add support for Unix epoch precision in TimestampConverter SMT: TimestampConverter is one of the built-in Single Message Transformations (SMTs) for Kafka Connect. It supports the conversion of dates and timestamps in messages flowing through Kafka Connect. At the moment, this SMT only supports timestamps in milliseconds. The KIP's goal is to allow developers to specify the unit of timestamps in order to support seconds, microseconds, and nanoseconds.
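To make the unit problem concrete, here is a minimal Java sketch of normalizing the same instant, expressed in different units, to epoch milliseconds. The `EpochPrecisionSketch` class and its `toEpochMillis` helper are hypothetical names used for illustration; this is not the SMT's implementation, only the kind of conversion a configurable precision would involve.

```java
import java.util.concurrent.TimeUnit;

class EpochPrecisionSketch {

    // Hypothetical helper (not part of Kafka Connect): converts an epoch
    // value expressed in the given unit to epoch milliseconds.
    // Sub-millisecond precision is truncated.
    static long toEpochMillis(long value, TimeUnit unit) {
        return unit.toMillis(value);
    }

    public static void main(String[] args) {
        // The same instant (2022-01-01T00:00:00Z) in three different units.
        long seconds = 1_640_995_200L;
        long micros = 1_640_995_200_000_000L;
        long nanos = 1_640_995_200_000_000_000L;

        System.out.println(toEpochMillis(seconds, TimeUnit.SECONDS));
        System.out.println(toEpochMillis(micros, TimeUnit.MICROSECONDS));
        System.out.println(toEpochMillis(nanos, TimeUnit.NANOSECONDS));
        // Each line prints 1640995200000
    }
}
```

Without knowing the unit, a value such as 1640995200 read as milliseconds would land in January 1970 rather than January 2022, which is why the unit needs to be configurable.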
KIP-810: Allow producing records with null values in Kafka Console Producer: In order to clear a key in a compacted topic, users currently need to produce a message with that key and the value set to null. Such a record is called a tombstone. As this is a relatively common operation when using compacted topics, this KIP proposes allowing the production of such records using the Kafka console producer.
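The effect of a tombstone can be illustrated with a small, self-contained Java simulation. It does not use the Kafka client library, and the `TombstoneSketch` class, its `Rec` type, and the sample keys are hypothetical. It only models the view of a compacted topic that a consumer eventually sees: the latest value per key, with keys removed once a null value has been written for them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TombstoneSketch {

    // Hypothetical stand-in for a key/value record; a null value marks a tombstone.
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Replays the log in order and keeps only the latest value for each key.
    // A tombstone removes the key entirely.
    static Map<String, String> compact(List<Rec> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Rec r : log) {
            if (r.value == null) {
                latest.remove(r.key);
            } else {
                latest.put(r.key, r.value);
            }
        }
        return latest;
    }

    // Made-up sample data: user-1 is written and later cleared by a tombstone.
    static List<Rec> sampleLog() {
        List<Rec> log = new ArrayList<>();
        log.add(new Rec("user-1", "alice"));
        log.add(new Rec("user-2", "bob"));
        log.add(new Rec("user-1", null));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(compact(sampleLog())); // prints {user-2=bob}
    }
}
```

In a real cluster the broker performs compaction asynchronously, and tombstones themselves are retained for a configurable period (`delete.retention.ms`) before being removed, so this sketch shows only the end state.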
December brought us a couple of notable open source community project updates:
Debezium 1.8: This newest release of the change data capture platform brings numerous improvements to the MongoDB connector. The Debezium UI also now supports configuring SMTs and topic creation options.
strimzi-kafka-operator 0.27: Strimzi is a Kubernetes Operator for running Kafka. Strimzi 0.27 brings multi-arch container images with support for AArch64. The ControlPlaneListener feature gate is now enabled by default to have a separate listener for controller-to-broker communications, and a few dependencies such as Cruise Control, Log4j2, and OPA Authorizer have been updated to the latest versions.
Check out these interesting blog articles that were published last month:
- Deep dive into Apache Kafka storage internals: Segments, rolling, and retention
- Scheduling millions of messages with Kafka & Debezium
- Kafka and .NET, Part 3: Finally at .NET
- Visualize your Apache Kafka Streams using the Quarkus Dev UI
- Announcing the First Release of kcctl
Kafka project milestones in 2021
As a new year begins, let's look back at some of the milestones Apache Kafka achieved in 2021.
The Kafka project followed its time-based release plan in 2021. Consequently, it released two versions—2.8.0 and 3.0.0—as well as six bugfix releases: 2.6.1, 2.6.2, 2.6.3, 2.7.1, 2.7.2 and 2.8.1. Figure 1 shows the timeline for these releases.
In the past 12 months, the community raised 106 KIPs. Figure 2 shows how this compares to previous years.
Code and contributors
Over 190 unique contributors made more than 1,200 Kafka commits in 2021. Figure 3 shows the size of the codebase, in lines of code, for a few releases.
New committers and PMC members
In 2021, three contributors were invited to become Kafka committers:
- Tom Bentley
- Bruno Cadonna
- José Armando García Sancio
The following six committers also joined the Apache Kafka project management committee (PMC):
- Chia-Ping Tsai
- Bill Bejeck
- Randall Hauch
- Konstantine Karantasis
- Tom Bentley
- David Jacot
The current roster of committers and PMC members is available on the Kafka website.
What's coming for Kafka in 2022
The Kafka project is still evolving and improving at breakneck speed. Out of the dozens of new features being worked on, the two most awaited features are:
- KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum (KRaft): This KIP removes Kafka's dependency on ZooKeeper. Kafka 2.8.0 introduced KRaft in early access. It is expected to stay in early access for a few more releases, both to leave time to implement features that are still missing in this mode, such as zero-downtime upgrades (KIP-778) and authorization (KIP-801), and to ensure it is tested at scale before being ready for production.
- KIP-405: Kafka Tiered Storage: This enables moving older log segments to remote storage such as Amazon S3 or Hadoop Distributed File System (HDFS). This KIP was first proposed in January 2019 and, after many discussions, the vote finally passed in early 2021! The implementation is currently in progress, so expect to hear about it again this year.
To learn more about Kafka, visit Red Hat Developer's Apache Kafka topic page.