This is the 51st edition of the Kafka Monthly Digest, and covers what happened in the Apache Kafka community in April 2022.
For last month’s digest, see Kafka Monthly Digest: March 2022.
There are currently two releases in progress, 3.2.0 and 3.1.1.
The release process for 3.2.0 continued this month. Bruno Cadonna published the first release candidate, RC0, on April 15. However, a few issues were quickly identified: KAFKA-12841, PR-12102, and KAFKA-13794. A new release candidate should be available in the next few days. For more details, you can find the release plan on the Kafka wiki.
Kafka Improvement Proposals
Last month, the community submitted five KIPs (KIP-827 to KIP-831). I'll highlight a few of them:
KIP-827: Expose logdirs total and usable space via Kafka API: Kafka has a DescribeLogDirs API for retrieving details about all log directories from the brokers. This includes the list of replicas and their sizes. This KIP proposes adding the total size and the available size of the log directories. Administrators typically already monitor these values using metrics, but exposing them via the API would enable tools to easily validate the state of disk or rebalance operations.
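To illustrate the kind of check a tool could run once total and usable space are available, here is a minimal Python sketch. It does not call Kafka at all: the `LogDirSpace` record, its field names, the sample paths, and the threshold values are all assumptions made for illustration, not the response format defined by the KIP.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LogDirSpace:
    """Hypothetical per-directory record; field names are illustrative only."""
    broker_id: int
    path: str
    total_bytes: int   # total size of the volume holding the log directory
    usable_bytes: int  # space still available on that volume


def used_fraction(d: LogDirSpace) -> float:
    """Return the fraction of the volume that is already in use."""
    if d.total_bytes <= 0:
        raise ValueError(f"total_bytes must be positive for {d.path}")
    return 1.0 - d.usable_bytes / d.total_bytes


def dirs_over_threshold(dirs: List[LogDirSpace], threshold: float = 0.85) -> List[LogDirSpace]:
    """List the log directories whose usage is at or above the threshold."""
    return [d for d in dirs if used_fraction(d) >= threshold]


def can_accept_replica(d: LogDirSpace, replica_size_bytes: int, headroom: float = 0.10) -> bool:
    """Check that placing a replica here still leaves `headroom` of the volume free."""
    remaining = d.usable_bytes - replica_size_bytes
    return remaining >= headroom * d.total_bytes


if __name__ == "__main__":
    # Sample values are made up for the demonstration.
    dirs = [
        LogDirSpace(broker_id=1, path="/var/lib/kafka/a", total_bytes=1000, usable_bytes=100),
        LogDirSpace(broker_id=2, path="/var/lib/kafka/b", total_bytes=1000, usable_bytes=600),
    ]
    for d in dirs_over_threshold(dirs):
        print(f"broker {d.broker_id} {d.path}: {used_fraction(d):.0%} used")
```

A rebalancing tool could use a check like `can_accept_replica` before moving a partition, rather than relying only on separately collected disk metrics.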
KIP-829: (console-consumer) add print.topic property: This KIP proposes updating the kafka-console-consumer tool to display the topic of each record. This would be useful when using the --include flag, which enables providing a regex to specify topics to consume from.
KIP-831: Add metric for log recovery progress: If a broker shuts down unexpectedly, upon restarting it will first perform log recovery before joining the cluster. Depending on the amount of log data, this recovery can take some time to complete, but it is important to ensure the data is not corrupted. This KIP aims to provide new metrics so administrators can monitor the recovery process.
Community releases
- Debezium 1.9. Debezium is a change data capture platform. Version 1.9 introduces support for Cassandra 4. It now also works with SQL Server multi-database environments. Finally, this release includes several enhancements, especially around Redis and MySQL, and many bug fixes.
I selected some interesting blog posts and articles that were published last month:
- Process Formula 1 telemetry with Quarkus and OpenShift Streams for Apache Kafka
- Presto on Apache Kafka at Uber scale
- Optimizing Pinterest's data ingestion stack: Findings and learnings
- Kafka Summit London 2022: The full recap
To learn more about Kafka, visit Red Hat Developer's Apache Kafka topic page.
Last updated: May 11, 2022