Kafka 101

Learn about the fundamentals of Apache Kafka. This tutorial covers basic concepts of Kafka and its components.

What are partitions?

A partition contains a subset of the messages written to a topic. New messages are appended to the end of the partition, which guarantees that messages keep their order within that partition. Ordering is not guaranteed across different partitions of the same topic. Splitting a topic into multiple partitions improves performance under heavy load and allows the data to be spread and replicated across brokers.
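The append-only behavior can be sketched with a toy in-memory model. This is not the Kafka client API: the `Topic` class, its method names, and the use of CRC32 to pick a partition from a message key are all assumptions made for illustration. The sketch only shows that messages written to the same partition are read back in the order they were appended.

```python
import zlib


class Topic:
    """Toy in-memory model of a topic (hypothetical, not a Kafka API)."""

    def __init__(self, name, num_partitions):
        self.name = name
        # Each partition is an append-only list; the list index acts as the offset.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: CRC32 stands in for whatever hash a real partitioner uses.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        # New messages always go to the end of the chosen partition.
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


topic = Topic("orders", num_partitions=3)
for i in range(4):
    topic.append("customer-a", f"a-{i}")
    topic.append("customer-b", f"b-{i}")

# Messages with the same key share one partition and keep their write order.
p_a = topic.partition_for("customer-a")
values_a = [v for k, v in topic.partitions[p_a] if k == "customer-a"]
print(values_a)  # ['a-0', 'a-1', 'a-2', 'a-3']
```

Routing by key is one common way to keep related messages in order, because order is only preserved inside a single partition.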

The partitions of a topic are distributed across Apache Kafka brokers to increase parallelism when producing to and consuming from the topic. A topic is ultimately the sum of the events in all of its partitions (Figure 3).

Figure 3: Each topic can be divided into multiple partitions, which are distributed across Kafka brokers and can be replicated across multiple brokers.
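As a rough illustration of spreading partitions over brokers, the sketch below places partitions round-robin. The round-robin rule, the `assign_partitions` helper, and the broker names are assumptions for this example. It is not a description of how Kafka itself decides placement.

```python
def assign_partitions(num_partitions, brokers):
    """Place partition ids on brokers round-robin (a simplification)."""
    placement = {broker: [] for broker in brokers}
    for p in range(num_partitions):
        placement[brokers[p % len(brokers)]].append(p)
    return placement


placement = assign_partitions(6, ["broker-1", "broker-2", "broker-3"])
print(placement)

# The topic is the union of the events held in all of its partitions.
partition_logs = {p: [f"event-{p}-{i}" for i in range(2)] for p in range(6)}
topic_events = [
    event
    for broker in placement
    for p in placement[broker]
    for event in partition_logs[p]
]
print(len(topic_events))  # 12
```

With the partitions spread this way, producers and consumers working on different partitions can talk to different brokers at the same time.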

Partitions can be configured to be replicated across the brokers in an Apache Kafka cluster. For each replicated partition, one of the brokers in the cluster is designated as the partition leader. By default, all messages are produced and consumed via the leader, and the partition replicas on other brokers simply stay in sync with the leader. If the leader becomes unavailable, one of the in-sync replicas becomes the new leader (Figure 4).

Figure 4: For each partition, one broker is the leader that controls the replication of messages from that partition.
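The leader and replica roles described above can be sketched as a small simulation. The `Partition` class, its methods, and the rule "promote the first remaining in-sync replica" are hypothetical simplifications. In particular, the leader here pushes each message to its followers, which is only a stand-in for real replication.

```python
class Partition:
    """Toy model of one replicated partition (hypothetical, not a Kafka API)."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        # Assumption: the first listed broker starts as the leader.
        self.leader = self.replicas[0]
        self.in_sync = set(self.replicas)
        self.logs = {broker: [] for broker in self.replicas}

    def produce(self, message):
        # Writes go to the leader, then every in-sync follower gets a copy.
        self.logs[self.leader].append(message)
        for broker in self.in_sync:
            if broker != self.leader:
                self.logs[broker].append(message)

    def fail(self, broker):
        # A failed broker drops out of the in-sync set.
        self.in_sync.discard(broker)
        if broker == self.leader:
            candidates = [b for b in self.replicas if b in self.in_sync]
            self.leader = candidates[0] if candidates else None
            if self.leader is None:
                raise RuntimeError("no in-sync replica available")


partition = Partition(["broker-1", "broker-2", "broker-3"])
partition.produce("m1")
partition.fail("broker-1")  # the leader becomes unavailable
partition.produce("m2")
print(partition.leader, partition.logs[partition.leader])
```

After the failure, `broker-2` takes over as leader and still holds `m1`, because it was in sync before the old leader went away.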