This 49th edition of the Kafka Monthly Digest covers what happened in the Apache Kafka community in February 2022. A new minor release and a bugfix release are in the works that may be of interest. I'll also discuss new KIPs and open source releases from February 2022.
For last month’s digest, see Kafka Monthly Digest: January 2022.
Releases
There are currently two releases in progress: 3.2.0 and 3.0.1.
Apache Kafka 3.2.0
On February 5, Bruno Cadonna volunteered to run the next minor release, Kafka 3.2.0. This release is currently targeted for April. Check out the release plan on the Kafka wiki for more details.
Apache Kafka 3.0.1
On February 7, I volunteered to run the Kafka 3.0.1 bugfix release. This release is expected to arrive in March, and you can find the release plan on the Kafka wiki.
Kafka Improvement Proposals
Last month, the community submitted five KIPs (KIP-819 to KIP-823). I'll highlight a few that caught my eye.
- KIP-820: Extend KStream process with new Processor API: Kafka Streams exposes two APIs: the Processor API, which gives full control to the developer, and the Streams DSL, which is a layer on top that allows you to express most data processing operations in just a few lines of code. Building on recent improvements to the Processor API, this KIP aims to improve the process() methods in the Streams DSL so that they return output values that can be chained across the topology.
- KIP-821: Connect Transforms support for nested structures: Kafka Connect comes with a number of built-in transformations that enable you to update records as they flow through Connect. However, these transformations can currently only operate on top-level fields of records, which greatly limits their usefulness. This KIP proposes to update them so they support nested fields and expose utilities that can easily be reused by developers writing their own transformations.
- KIP-822: Optimize the semantics of KafkaConsumer#pause to be consistent between the two RebalanceProtocols: Using pause(), consumers can at any time suspend fetching data for a specific partition. This is useful to control the flow of data and, for example, prioritize some partitions over others. Pausing happens at the consumer level and does not cause a rebalance. But when a rebalance does happen, depending on the rebalance protocol used, partitions may not stay paused. This KIP's goal is to make the behavior consistent across the different rebalance protocols and maintain paused partitions across rebalances.
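To make the top-level limitation described under KIP-821 more concrete, here is a minimal sketch of what resolving a nested field involves. It is only an illustration: the NestedFieldLookup class, its find method, and the dotted path notation are hypothetical and are not taken from the KIP or from Kafka Connect, and the record value is modeled as plain nested maps rather than Connect's own data types.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Kafka Connect: resolves a dotted path
// such as "address.city" against a record value modeled as nested maps.
class NestedFieldLookup {

    static Optional<Object> find(Map<String, Object> value, String path) {
        Object current = value;
        for (String segment : path.split("\\.")) {
            if (!(current instanceof Map)) {
                // The path tries to descend into a field that has no children.
                return Optional.empty();
            }
            Map<?, ?> level = (Map<?, ?>) current;
            if (!level.containsKey(segment)) {
                return Optional.empty();
            }
            current = level.get(segment);
        }
        return Optional.ofNullable(current);
    }

    public static void main(String[] args) {
        Map<String, Object> record = Map.of(
            "id", 42,
            "address", Map.of("city", "Lisbon", "zip", "1000"));

        System.out.println(find(record, "id"));              // Optional[42]
        System.out.println(find(record, "address.city"));    // Optional[Lisbon]
        System.out.println(find(record, "address.country")); // Optional.empty
    }
}
```

A transformation limited to top-level fields can only reach "id" and "address" in this record. Reaching "city" requires walking into the "address" value, which is the kind of traversal the KIP proposes to support and to expose as reusable utilities.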
Community releases for Apache Kafka
This section covers a few notable open source community project releases:
- strimzi-kafka-operator 0.28: Strimzi is a Kubernetes Operator for running Kafka. Version 0.28 delivers support for Kafka 3.1 and also contains a number of improvements to security components, such as support for custom authentication on listeners, better configuration for OAuth, and fixes for renewing CA certificates.
- kafkajs 1.16: KafkaJS is a pure JavaScript Kafka client for Node.js. This release brings a few improvements to the admin client and a number of fixes, especially in the consumer.
Kafka blogs and articles
Here are a few of the most noteworthy Kafka-related blogs and articles published in February 2022:
- A fresher data lake on AWS S3
- The four innovation phases of Netflix's trillions-scale real-time data infrastructure
- Building self-driving Kafka clusters using open source components
To learn more about Kafka, visit Red Hat Developer's Apache Kafka topic page.
Last updated: February 5, 2024