Building resilient event-driven architectures with Apache Kafka

In this article, you will learn how to set up the Knative Broker implementation for Apache Kafka in a production environment. The Kafka implementation was recently announced as Generally Available.

A common question for production systems is how to find an optimal configuration for the given environment. This article gives recommendations on how to pick a good default configuration for the Knative Broker implementation for Apache Kafka.

The article is based on a small yet production-ready setup of Apache Kafka and Apache ZooKeeper, with three nodes each. You can find a reference Kafka configuration based on Strimzi.io on the Strimzi GitHub page.
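For orientation, the following is a minimal sketch of what such a Strimzi-managed cluster definition could look like. It is not the reference configuration itself: it assumes the Strimzi operator is installed, and the resource name, listener, and storage settings are placeholders to adapt to your environment.

# Sketch of a three-node Kafka and ZooKeeper cluster managed by Strimzi.
# Names, storage sizes, and listener settings are illustrative assumptions.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3                       # three Kafka nodes
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    config:
      default.replication.factor: 3   # matches the three-node cluster
      min.insync.replicas: 2
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  zookeeper:
    replicas: 3                       # three ZooKeeper nodes
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}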

Note: This article does not give recommendations for the configuration of the Kubernetes cluster itself.

Setting up the Knative Broker

Each Broker object uses a Kafka topic to store incoming CloudEvents. A good configuration to start with is as follows:

  • Partitions: 10
  • Replication Factor: 3

This configuration gives you a fault-tolerant, highly available, and scalable basis for your project. Topics are partitioned, meaning they are spread over a number of buckets located on different Kafka brokers. To make your data fault tolerant and highly available, every topic can be replicated, even across geo-regions or datacenters. You can find more detailed information in the Apache Kafka documentation.

Note: Configuration aspects such as topic partitions or replication factor directly relate to the actual sizing of the cluster. For instance, if your cluster consists of three nodes, you cannot set the replication factor to a higher number, as it is bounded by the number of available nodes in the Apache Kafka cluster. You can find details about the implications in the Strimzi documentation.

An example Knative Broker configuration

Let us take a look at a possible configuration of a Knative Broker for Apache Kafka with this defined cluster in mind:

apiVersion: v1
kind: ConfigMap
metadata:
  name: <broker-name>-config
data:
  bootstrap.servers: <url>
  auth.secret.ref.name: <optional-secret-name>
  default.topic.partitions: "10"
  default.topic.replication.factor: "3"

---

apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  annotations:
    eventing.knative.dev/broker.class: Kafka
  name: <broker-name>
spec:
  config:
    apiVersion: v1
    kind: ConfigMap
    name: <broker-name>-config
  delivery:
    retry: 12
    backoffPolicy: exponential
    backoffDelay: PT0.5S
    deadLetterSink:
      ref:
        apiVersion: eventing.knative.dev/v1
        kind: Broker
        name: <dead-letter-sink-broker-name>

We see two manifests defined:

  • A ConfigMap resource
  • A Broker resource

The ConfigMap defines the URL of the Apache Kafka cluster and optionally references a secret for TLS/SASL security support. The manifest also sets the partitions and replication factor for the Kafka topic that is used internally by the Broker object to store incoming CloudEvents as Kafka records. The Broker uses the eventing.knative.dev/broker.class annotation to indicate that the Kafka-based Knative Broker implementation should be used. In its spec, it references the ConfigMap as well as configuration for the broker's event delivery.

Broker event delivery guarantees and retries

Knative provides various configuration parameters to control the delivery of events in case of failure. This configuration defines the number of retry attempts, a backoff delay, and a backoff policy (linear or exponential). If the event delivery is still not successful after all retries, the event is delivered to the dead letter sink, if one is present. The delivery spec is global for the given Broker object and can be overridden on a per-Trigger basis, if needed.

The Knative Eventing documentation offers more details on the delivery spec.

Using a Knative Broker as the dead letter sink

The deadLetterSink object can be any Addressable object that conforms to the Knative Eventing sink contract, such as a Knative Service, a Kubernetes Service, or a URI. However, this example configures the deadLetterSink to point at a different Broker object. The benefit of this highly recommended approach is that all undelivered messages are sent to a Knative Broker object and can be consumed from there, using the standard Knative Broker and Trigger APIs.
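As a sketch of this approach, the following manifests define the dead letter Broker and a Trigger (covered in more detail in the next section) that consumes the failed events from it. The ConfigMap and the handler Service names are placeholders assumed to exist in your cluster.

# Dead letter Broker, analogous to the Broker defined above.
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  annotations:
    eventing.knative.dev/broker.class: Kafka
  name: <dead-letter-sink-broker-name>
spec:
  config:
    apiVersion: v1
    kind: ConfigMap
    name: <dead-letter-sink-broker-name>-config   # assumed ConfigMap, like the one shown earlier
---
# Trigger that consumes the undelivered events from the dead letter Broker.
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: <dead-letter-trigger-name>
spec:
  broker: <dead-letter-sink-broker-name>
  subscriber:
    ref:
      apiVersion: v1
      kind: Service
      name: <failed-event-handler-name>           # a service that inspects or replays failed events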

An example trigger configuration

Triggers are used for the egress out of the broker to consuming services. Each Trigger is executed against the referenced Broker object. The following is an example of a Trigger configuration:

apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: <trigger-name>
  annotations:
    kafka.eventing.knative.dev/delivery.order: <delivery-order>
spec:
  broker: <broker-name>
  filter:
    attributes:
      type: <cloud-event-type>
      <ce-extension>: <ce-extension-value>
  subscriber:
    ref:
      apiVersion: v1
      kind: Service
      name: <receiver-name>

We see a Trigger configuration for a specific Broker object, which filters the available CloudEvents on their attributes and on custom CloudEvent extension attributes. Matching events are routed to the subscriber, which can be any Addressable object that conforms to the Knative Eventing sink contract, as defined above.

Note: We recommend applying filters on Triggers for CloudEvent attributes and extensions. If no filter is provided, all CloudEvents are routed to the referenced subscriber.

Message delivery order

When dispatching events to the subscriber, the Knative Kafka Broker can be configured to support different delivery ordering guarantees by using the kafka.eventing.knative.dev/delivery.order annotation on each Trigger object.

The supported consumer delivery guarantees are:

  • unordered: An unordered consumer is a non-blocking consumer that delivers messages unordered while preserving proper offset management. This is useful when there is a high demand for parallel consumption and no need for explicit ordering. One example could be the processing of click analytics.
  • ordered: An ordered consumer is a per-partition blocking consumer that waits for a successful response from the CloudEvent subscriber before it delivers the next message of the partition. This is useful when there is a need for more strict ordering or if there is a relationship or grouping between events. One example could be the processing of customer orders.

Note: The default ordering guarantee is unordered.
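For example, a Trigger that processes customer orders and needs per-partition ordering could set the annotation as follows; the resource and service names are placeholders.

apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: <order-processing-trigger-name>
  annotations:
    kafka.eventing.knative.dev/delivery.order: ordered   # block per partition until the subscriber responds
spec:
  broker: <broker-name>
  filter:
    attributes:
      type: <cloud-event-type>
  subscriber:
    ref:
      apiVersion: v1
      kind: Service
      name: <order-service-name>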

Trigger delivery guarantees and retries

The global Broker delivery configuration can be overridden on a per-Trigger basis using the delivery object on the Trigger spec. The behavior is the same as defined above for the Broker event delivery guarantees and retries.
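As an illustration, a Trigger could override the Broker defaults like this; the retry values and the dead letter Service are illustrative placeholders.

apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: <trigger-name>
spec:
  broker: <broker-name>
  delivery:
    retry: 3                  # fewer retries than the Broker default defined above
    backoffPolicy: linear
    backoffDelay: PT1S
    deadLetterSink:
      ref:
        apiVersion: v1
        kind: Service
        name: <trigger-dead-letter-service-name>
  subscriber:
    ref:
      apiVersion: v1
      kind: Service
      name: <receiver-name>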

Our advice for Knative Broker configuration

In this article, we demonstrated how to set up the Knative Broker implementation for Apache Kafka in a production environment and provided recommendations for picking a good default configuration. Feel free to comment below if you have questions. We welcome your feedback.