This 78th edition of the Kafka Monthly Digest covers what happened in the Apache Kafka community in July 2024.
For last month’s digest, see Kafka Monthly Digest: June 2024.
Releases
There is a new release, 3.8.0, and 3.9.0 is in progress:
3.8.0
Josep Prat published 3.8.0 RC0 on July 12, but an issue in the Admin client was quickly identified. 3.8.0 RC1 followed on July 15, but a regression in Connect was found. A third release candidate was published on July 18, but another issue was found, this time in the log layer. Finally, RC3 was published on July 26. It passed the vote and Kafka 3.8.0 was released on July 29. You can find the announcement on the Apache Kafka blog. You can also check the release notes and the release plan in the wiki.
This new minor release brings several new features and many bug fixes. Tiered Storage is still in early access.
Kafka brokers and clients
Updates to the Kafka broker and clients include the following:
- JBOD support in KRaft mode is now production ready. (KIP-858)
- Tiered storage now supports JBOD but it remains in early access.
- The new consumer protocol introduced in Kafka 3.7 in early access has received a lot of improvements. See the release notes for instructions on how to get started. Note that this feature is still not production ready. (KIP-848)
- The compression level when using gzip, lz4 and zstd can now be configured. This allows you to tune compression for your use cases, potentially improving latency and throughput. (KIP-390)
- A new Docker image kafka-native is available. It uses GraalVM to run Kafka as a native application, so it starts significantly faster and uses less memory than the regular Docker image. (KIP-974)
- The File and Directory configuration providers now allow restricting the paths that can be accessed. (KIP-993)
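As an illustration of the compression level setting from KIP-390, here is a minimal sketch of producer properties that select zstd with an explicit level. The keys compression.type and compression.zstd.level are the producer settings defined by the KIP; the class name, bootstrap address and chosen level are assumptions made for this example, and the right level depends on your workload.

```java
import java.util.Properties;

public class CompressionLevelExample {

    // Builds producer properties that enable zstd with an explicit level.
    // The bootstrap address and the level passed in are illustrative values.
    public static Properties producerProps(String bootstrapServers, int zstdLevel) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("compression.type", "zstd");
        // Level setting added by KIP-390; gzip and lz4 have equivalent
        // settings (compression.gzip.level and compression.lz4.level).
        props.setProperty("compression.zstd.level", Integer.toString(zstdLevel));
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProps("localhost:9092", 10);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These properties, together with the usual serializer settings, can be passed to a producer. Higher levels generally spend more CPU to produce smaller batches, so it is worth measuring before settling on a value.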
Kafka Connect
Updates to Kafka Connect include the following:
- The configuration of connectors can be updated partially using the new PATCH /connectors/<id>/config REST API endpoint. (KIP-477)
- The maximum number of tasks created by a connector can now be enforced using the tasks.max.enforce configuration. (KIP-1004)
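To make the partial update concrete, the following sketch builds a PATCH request against the /connectors/<id>/config endpoint using the JDK's built-in HTTP client. The worker URL http://localhost:8083, the connector name my-connector and the JSON body are assumptions for this example; only the keys present in the body are meant to be changed.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConnectorPatchExample {

    // Builds a PATCH request carrying only the configuration keys to change.
    // The connector name is used as-is, so it must not need URL encoding.
    public static HttpRequest patchRequest(String connectUrl, String connector, String jsonBody) {
        URI uri = URI.create(connectUrl + "/connectors/" + connector + "/config");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .method("PATCH", HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical worker address and connector name.
        HttpRequest request = patchRequest(
                "http://localhost:8083", "my-connector", "{\"tasks.max\": \"2\"}");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Running main requires a Connect worker listening at the assumed address; building the request itself does not contact the network.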
Kafka Streams
Updates to Kafka Streams include the following:
- State stores can now be shared across applications. There are two new methods on Topology named addReadOnlyStateStore() that allow specifying a changelog topic to reuse. (KIP-813)
- The task assignor can be customized by implementing the new TaskAssignor interface. (KIP-924)
3.9.0
The release process of 3.9.0 continued. Colin McCabe cut the 3.9 branch on July 30. You can find the release plan in the wiki.
Kafka Improvement Proposals
Last month, the community submitted 8 KIPs (KIP-1066 to KIP-1074; KIP-1069 was skipped). I'll highlight a few of them:
- KIP-1066: Mechanism to cordon brokers and log directories: When new topics or partitions are created, Kafka always tries to distribute them across all brokers in the cluster. This works very well in most cases, but it is not optimal when the cluster is being scaled in or out. In these scenarios it's better to assign the new partitions to a subset of the brokers (the new brokers when scaling out, the remaining brokers when scaling in). This KIP proposes a mechanism to prevent assigning new partitions to specific log directories and brokers. It reuses the "cordon" terminology from Kubernetes to make a broker or log directory unable to host any new partitions.
- KIP-1068: New metrics for the new KafkaConsumer: Kafka 3.7 introduced a new consumer rebalance protocol via KIP-848 (still in early access). This KIP adds new metrics to the Consumer when it's using the new rebalance protocol.
- KIP-1071: Streams Rebalance Protocol: This KIP aims to update the rebalance protocol used by Streams to follow the new consumer rebalance protocol from KIP-848. With this new protocol, the group coordinator will be responsible for computing the task assignments and creating internal topics. The goal is also to make stand-by and warm-up tasks first-class citizens in the protocol.
- KIP-1073: Return inactive observer nodes in DescribeQuorum response: In KRaft mode, when a broker is decommissioned, it needs to be explicitly unregistered from the cluster. This is done using the Admin.unregisterBroker() API and providing the ID of the broker to unregister. However, at the moment there isn't a mechanism to retrieve the IDs of all registered brokers, because the Admin.describeMetadataQuorum() API only returns brokers that are currently online. This KIP proposes enhancing the DescribeQuorum API to also include offline brokers.
Community Releases
- strimzi-kafka-operator 0.42: Strimzi is a Kubernetes Operator for running Kafka. This new release adds support for Kafka 3.7.1. The UseKRaft feature gate is now permanently enabled, so to use KRaft when deploying a new cluster you just need to set the strimzi.io/kraft: enabled annotation on your Kafka custom resource.
- Librdkafka 2.5: Librdkafka is a Kafka client in C/C++. This release adds support for KIP-951 so new leaders can be discovered more efficiently when there are partition leadership changes, and it also supports sending client metrics to brokers for better observability (KIP-714).
Blogs
I selected some interesting blog articles that were published last month:
To learn more about Kafka, visit Red Hat Developer's Apache Kafka topic page.