Producers publish messages to a topic, appending them to the end of a partition. By default, if a message contains a key, the hashed value of the key determines which partition receives the message (Figure 17). If the key is null, a round-robin algorithm spreads messages evenly across all partitions. Custom partitioning logic is also supported.

Figure 17: A hash function uses the key in the message to determine which partition stores it.
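
The default behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `SketchPartitioner` and its method are hypothetical, and `zlib.crc32` is a stand-in hash chosen because it is deterministic, not the hash function any particular client library uses.

```python
import itertools
import zlib


class SketchPartitioner:
    """Hypothetical partitioner: hash keyed messages, round-robin the rest."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # Endless 0, 1, ..., n-1, 0, 1, ... sequence for messages without a key.
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key):
        if key is None:
            # Null key: take the next partition in round-robin order.
            return next(self._round_robin)
        # Keyed message: the same key always hashes to the same partition.
        # crc32 is an assumed stand-in for the real hash function.
        return zlib.crc32(key) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=3)
same_key = [partitioner.partition_for(b"order-42") for _ in range(3)]
null_key = [partitioner.partition_for(None) for _ in range(4)]
```

Because the partition is derived only from the key and the partition count, every message with the key `b"order-42"` lands in the same partition, which preserves per-key ordering. Messages without a key cycle through partitions 0, 1, 2, and back to 0.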

Each message published to a topic is delivered to a consumer that subscribes to that topic. Each consumer belongs to a consumer group, a set of consumer instances that together provide fault tolerance and scalable message processing. When a consumer group contains only one consumer, that consumer is responsible for processing all messages from all partitions. With multiple consumers in a group, each consumer receives messages from only a subset of the partitions (Figure 18).

Figure 18: Define a consumer group with multiple consumers to process messages in a scalable manner.
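
To make the split of partitions within a consumer group concrete, here is a minimal sketch. The function `assign_partitions` and the consumer names are hypothetical, and the round-robin assignment shown is an assumption made for illustration. Real systems typically let the assignment strategy be configured and handle it on the broker or client side.

```python
def assign_partitions(partitions, consumers):
    """Hypothetical assignment: deal partitions out to consumers in turn."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        # Each partition gets exactly one owner within the group.
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# A group with a single consumer owns every partition.
single = assign_partitions(range(4), ["consumer-a"])

# With two consumers, each one handles only a subset of the partitions.
group = assign_partitions(range(4), ["consumer-a", "consumer-b"])
```

In this sketch, `single` maps `consumer-a` to all four partitions, while `group` gives partitions 0 and 2 to `consumer-a` and partitions 1 and 3 to `consumer-b`. Adding consumers to the group reduces the share of partitions each one processes, which is what makes the processing scalable.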