In this article, you will learn how to set up the Knative Broker implementation for Apache Kafka in a production environment. Recently, the Kafka implementation has been announced as General Availablitly.
One common question for production systems is always the need for an optimal configuration based on the given environment. This article gives recommendations on how to pick a good default configuration for the Knative Broker implementation for Apache Kafka.
The article is based on a small yet production-ready setup of Apache Kafka and Apache Zookeeper, each system consisting of three nodes. You can find a reference Kafka configuration based on Strimzi.io on the Strimzi GitHub page.
Note: This article is not giving recommendations for the configuration of the Kubernetes cluster.
Setting up the Knative Broker
Each broker object uses a Kafka topic for the storage of incoming CloudEvents. The recommendation to start with is as follows:
- Partitions: 10
- Replication Factor: 3
Taking this configuration gives you a fault-tolerant, highly-available, and yet scalable basis for your project. Topics are partitioned, meaning they are spread over a number of buckets located on different Kafka brokers. To make your data fault tolerant and highly available, every topic can be replicated, even across geo-regions or datacenters. You can find more detailed information in the Apache Kafka documentation.
Note: Configuration aspects such as topic partitions or replication factor directly relate to the actual sizing of the cluster. For instance, if your cluster consists of three nodes, you cannot set the replication factor to a higher number, as it directly relates to the available nodes of the Apache Kafkacluster. You can find details about implications in the Strimzi documentation.
An example Knative Broker configuration
Let us take a look at a possible configuration of a Knative Broker for Apache Kafka with this defined cluster in mind:
We see two manifests defined:
ConfigMap defines the URL to the Apache Kafka cluster and references a secret for TLS/SASL security support. The manifest also sets the partitions and replication factor for the Kafka topic, internally used for the
Broker object, to store incoming CloudEvents as Kafka records. The
Broker uses the
eventing.knative.dev/broker.class indicating the Kafka-based Knative Broker implementation should be used. On the
spec, it references the
ConfigMap, as well configuration on the broker's event delivery.
Broker event delivery guarantees and retries
Knative provides various configuration parameters to control the delivery of events in case of failure. This configuration is able to define a number of retry attempts, a backoff delay, and a backoff policy (linear or exponential). If the event delivery is not successful, the event is delivered to the dead letter sink, if present. The
delivery spec object is global for the given broker object and can be overridden on a per
Trigger basis, if needed.
The following resources offer more details on the
The Knative Broker as a dead letter sink approach
deadLetterSink object can be any
Addressable object that conforms to the Knative Eventing sink contract, such as a Knative Service, a Kubernetes Service, or a URI. However, this example is configuring a
deadLetterSink, with the different broker object. The benefit of this highly recommended approach is that all undelivered messages are sent to a Knative broker object and consumed from there, using the standard Knative Broker and Trigger APIs.
An example trigger configuration
The triggers are used for the egress out of the broker to consuming services. Triggers are executed on the referenced broker object. The following is an example of a trigger configuration:
We see a trigger configuration for a specific
broker object, which filters the available CloudEvents, their attributes, and custom CloudEvent extension attributes. Matching events are routed to the
subscriber, which can be any
Addressable object that conforms to the Knative Eventing sink contract, as defined above.
Note: We recommend applying filters on triggers for CloudEvent attributes and extensions. If no
filter is provided, all occurring CloudEvents are routed to the referenced
Message delivery order
When dispatching events to the
subscriber, the Knative Kafka Broker can be configured to support different delivery ordering guarantees by using the
kafka.eventing.knative.dev/delivery.order annotation on every
The supported consumer delivery guarantees are:
unordered: An unordered consumer is a non-blocking consumer that delivers messages unordered while preserving proper offset management. This is useful when there is a high demand for parallel consumption and no need for explicit ordering. One example could be the processing of click analytics.
ordered: An ordered consumer is a per-partition blocking consumer that waits for a successful response from the CloudEvent subscriber before it delivers the next message of the partition. This is useful when there is a need for more strict ordering or if there is a relationship or grouping between events. One example could be the processing of customer orders.
Note: The default ordering guarantee is
Trigger delivery guarantees and retries
The global broker delivery configuration can be overridden on a per-trigger basis using the
delivery on the
Trigger spec. The behavior is the same as defined for the Broker event delivery guarantees and retries.
Our advice for Knative Broker configuration
In this article, we demonstrated how to set up the Knative Broker implementation for Apache Kafka in a production environment. We provided recommendations for picking a good default configuration for the Knative Broker implementation for Apache Kafka. Feel free to comment below if you have questions. We welcome your feedback.