This 59th edition of the Kafka Monthly Digest covers what happened in the Apache Kafka community in December 2022. We will also look back at some of the milestones that the Kafka project and community reached over the past year.
For last month’s digest, see Kafka Monthly Digest: November 2022.
There are currently two releases in progress: 3.4.0 and 3.3.2.
The release process for 3.4.0 continued, and the release entered code freeze in December. A few late issues (KAFKA-14392, KAFKA-14457 and KAFKA-14496) delayed the release. The first release candidate is now expected in January.
For 3.3.2, Chris Egerton published RC0 on December 15, but a few blockers (the same as for 3.4.0) were found. The next release candidate, RC1, was then published on December 21. The vote is currently ongoing.
Last month, the community submitted 3 KIPs (KIP-893 to KIP-895). I'll highlight a couple of them:
- KIP-894: Use incrementalAlterConfigs API for syncing topic configurations. To sync topic configurations between clusters, MirrorMaker uses the alterConfigs() Admin API. One issue is that alterConfigs() overwrites all topic configurations, which prevents running systems that rely on setting topic configurations, such as Cruise Control, on the target cluster. This KIP aims to address this limitation by using the newer incrementalAlterConfigs() API to sync only the desired topic configurations.
- KIP-895: Dynamically refresh partition count of __consumer_offsets. Kafka uses the __consumer_offsets internal topic to store committed offsets. Due to its inner workings, if a user adds partitions to this topic, brokers only take the change into account after restarting. If only a subset of brokers is restarted, or while a cluster rolls, some brokers work with the new number of partitions while others still use the previous value. This causes consumer groups to enter error states and be unusable until all brokers have restarted. This KIP proposes changing how brokers use the __consumer_offsets topic so they can detect at runtime if the partition count changes.
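The behavior behind both KIPs can be sketched with a small, self-contained Python simulation. It is a simplified model and does not call any Kafka API: the function names are hypothetical stand-ins, only the SET and DELETE operations are modeled, and the coordinator lookup assumes the group ID is hashed Java-style and taken modulo the partition count of __consumer_offsets.

```python
from typing import Dict, Optional, Tuple

# --- KIP-894: full replacement vs. incremental update (simplified model) ---

def alter_configs(current: Dict[str, str], new: Dict[str, str]) -> Dict[str, str]:
    """Model of the legacy semantics: the supplied set replaces all overrides,
    so anything not listed falls back to its default."""
    return dict(new)

def incremental_alter_configs(
    current: Dict[str, str],
    ops: Dict[str, Tuple[str, Optional[str]]],
) -> Dict[str, str]:
    """Model of the incremental semantics: only the listed keys change."""
    updated = dict(current)
    for key, (op, value) in ops.items():
        if op == "SET":
            updated[key] = value
        elif op == "DELETE":
            updated.pop(key, None)
        else:
            raise ValueError(f"operation not modeled in this sketch: {op}")
    return updated

# --- KIP-895: group-to-partition mapping depends on the partition count ---

def java_string_hash(s: str) -> int:
    """Java-style String.hashCode() for ASCII input, as a signed 32-bit int."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h

def coordinator_partition(group_id: str, partition_count: int) -> int:
    """Assumed mapping: non-negative hash of the group ID modulo the count."""
    h = java_string_hash(group_id)
    non_negative = 0 if h == -(1 << 31) else abs(h)
    return non_negative % partition_count

if __name__ == "__main__":
    # The target topic carries an override set by another tool.
    target = {"retention.ms": "86400000", "min.insync.replicas": "2"}
    print(alter_configs(target, {"retention.ms": "604800000"}))
    print(incremental_alter_configs(target, {"retention.ms": ("SET", "604800000")}))

    # A broker that still believes there are 50 partitions and one that sees
    # 100 can disagree on where the group "a" lives.
    print(coordinator_partition("a", 50), coordinator_partition("a", 100))
```

In this model, the full replacement drops the min.insync.replicas override while the incremental update keeps it, and the group "a" maps to partition 47 with 50 partitions but to partition 97 with 100, which is the kind of disagreement that appears while brokers hold different partition counts.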
- Debezium 2.1: Debezium is a Change Data Capture platform. This release contains a lot of new features and improvements. This includes a new connector for Cloud Spanner, support for snapshotting in the Vitess connector, and support for predicates and the Cassandra connector in Debezium Server.
I selected some interesting blog articles that were published last month:
Project Milestones in 2022
As the year concludes, let's look at some of the milestones Apache Kafka achieved in 2022.
New Features in 2022
One of the most significant milestones is KRaft reaching production-ready status (via KIP-833 in 3.3). Note, however, that upgrading from ZooKeeper mode is still being implemented and should be available later this year.
Other notable new features include exactly-once delivery guarantees for source connectors in Kafka Connect (via KIP-618 in 3.3) and the new interactive query interface IQv2 in Kafka Streams (via KIP-796 in 3.2).
Releases in 2022
The project followed its time-based release plan. Consequently, it released three minor versions: 3.1, 3.2 and 3.3 as well as 8 bugfix releases (2.8.2, 3.0.1, 3.0.2, 3.1.1, 3.1.2, 3.2.1, 3.2.3 and 3.3.1). Figure 1 shows the timeline for these releases.
KIPs in 2022
In the past 12 months, the community raised over 80 KIPs. This is fewer than in previous years, but it still amounts to a new KIP roughly every 4 days. This figure counts KIPs created and does not take into account whether they were accepted or merged. Figure 2 shows how this compares to previous years.
Code and contributors in 2022
This year, 54 unique contributors made more than 770 commits. This is similar to previous years. The size of the codebase (computed with git ls-files -z | xargs -0 cat | wc -l) is still growing and crossed one million lines with the 3.2 release. Figure 3 shows the size of the codebase, in lines of code, for a few releases.
Committers and PMC in 2022
In 2022, five contributors were invited to become Committers:
- Luke Chen
- Chris Egerton
- Ziming Deng
- Viktor Somogyi-Vass
- Josep Prat
Likewise, three Committers joined the Apache Kafka PMC:
- Bruno Cadonna
- Sophie Blee-Goldman
- Luke Chen
The current roster of Committers and PMC members is available on the Kafka website.
What's coming for Kafka in 2023
The Kafka project is still rapidly evolving. Here is some of the work currently in progress:
- KIP-405: Kafka Tiered Storage: I already covered this feature last year but unfortunately it is not complete yet. Some significant progress has been made and hopefully it should get released this year.
- KIP-866: ZooKeeper to KRaft Migration: This KIP has been voted on and is currently being implemented. Once it's ready, it will finally allow users to migrate to KRaft and run their existing clusters without ZooKeeper.
- KIP-848: The Next Generation of the Consumer Rebalance Protocol: This is a significant overhaul of the consumer group protocol and aims at making it truly incremental and cooperative by moving the assignment logic to the coordinator (currently assignments are computed on the client side).
To learn more about Kafka, visit Red Hat Developer's Apache Kafka topic page.