Kafka 101

Learn about the fundamentals of Apache Kafka. This tutorial covers basic concepts of Kafka and its components.

What are messages?

A message, or record, is a key/value pair that carries data for consumer applications. Each message is stored in a partition of a topic (Figure 5) and is persisted durably for its configured retention period. The position of a message within its partition is its offset. Messages are generally consumed from each partition in the order in which they were added.

Figure 5: Messages sent by producers are stored in order in a partition, and then distributed to consumers.
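
To make the idea of an offset concrete, the following is a minimal sketch of a single partition as an append-only list. It is a toy in-memory model, not how a Kafka broker stores data; the class name `PartitionLog` and its methods are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# A record here is just (key, value); the key is optional.
KeyValue = Tuple[Optional[bytes], bytes]


@dataclass
class PartitionLog:
    """Toy stand-in for one partition: records are only ever appended."""

    records: List[KeyValue] = field(default_factory=list)

    def append(self, key: Optional[bytes], value: bytes) -> int:
        """Store a record at the end of the log and return its offset."""
        self.records.append((key, value))
        return len(self.records) - 1

    def read_from(self, offset: int) -> List[Tuple[int, KeyValue]]:
        """Return (offset, record) pairs in append order, starting at offset."""
        return [(i, self.records[i]) for i in range(offset, len(self.records))]


log = PartitionLog()
first = log.append(b"user-1", b"signed up")
second = log.append(b"user-1", b"logged in")
# A consumer that has already processed offset 0 resumes from offset 1.
pending = log.read_from(second)
```

Because the offset is simply the position in the partition, a consumer only needs to remember one number per partition to know where to resume.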

A message is typically a small chunk of data. Its optional key identifies the message and determines by default which partition stores the message (Figure 6). The message’s value is the content of the message, which can be in any format.

Figure 6: Messages are assigned to partitions based on the key within the message’s data.
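
Key-based assignment can be sketched as "hash the key, then take the result modulo the number of partitions". The example below uses CRC32 from the Python standard library as a stand-in hash; this is an assumption made to keep the sketch dependency-free (the Java client's default partitioner uses a murmur2 hash), so the partition numbers it produces will not match a real cluster. The function name `choose_partition` and the `fallback` parameter for keyless messages are hypothetical.

```python
import zlib
from typing import Optional


def choose_partition(key: Optional[bytes], num_partitions: int, fallback: int = 0) -> int:
    """Pick a partition index for a message.

    Keyed messages: hash(key) mod num_partitions, so equal keys always
    map to the same partition while the partition count is unchanged.
    Keyless messages: use the caller-supplied fallback (a real client
    would spread these across partitions by its own strategy).
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    if key is None:
        return fallback % num_partitions
    # Stand-in hash (assumption): CRC32 instead of Kafka's murmur2.
    return zlib.crc32(key) % num_partitions


p1 = choose_partition(b"order-42", 6)
p2 = choose_partition(b"order-42", 6)
same_partition = p1 == p2  # equal keys land in the same partition
```

Since all messages with the same key go to the same partition, and a partition preserves append order, per-key ordering follows. Changing the number of partitions changes the result of the modulo, so the mapping of keys to partitions is only stable for a fixed partition count.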

Each message also carries a timestamp, set either by the producer at creation time or by the broker at insertion time, and an optional set of headers, which are key/value pairs.
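
Putting the pieces together, a message can be modeled as a small immutable structure. The sketch below is illustrative only: the `Record` class and the `stamp_on_append` helper are hypothetical names, not part of any Kafka client. The two mode strings mirror the values of Kafka's `message.timestamp.type` topic setting (`CreateTime` and `LogAppendTime`), which controls whether the producer's or the broker's timestamp is kept.

```python
import time
from dataclasses import dataclass, field, replace
from typing import Optional, Tuple


def _now_ms() -> int:
    """Current wall-clock time in milliseconds since the epoch."""
    return int(time.time() * 1000)


@dataclass(frozen=True)
class Record:
    """Illustrative model of a message: value, optional key, timestamp, headers."""

    value: bytes
    key: Optional[bytes] = None
    # Set by the producer when the record is created.
    timestamp_ms: int = field(default_factory=_now_ms)
    # Headers are optional (name, value) pairs.
    headers: Tuple[Tuple[str, bytes], ...] = ()


def stamp_on_append(record: Record, timestamp_type: str, now_ms: int) -> Record:
    """Mimic the broker's choice of timestamp when the record is appended."""
    if timestamp_type == "LogAppendTime":
        # Broker overwrites the timestamp with its own insertion time.
        return replace(record, timestamp_ms=now_ms)
    # "CreateTime": keep the timestamp the producer set.
    return record


created = Record(value=b"payload", key=b"user-1", timestamp_ms=1_000,
                 headers=(("trace-id", b"abc123"),))
kept = stamp_on_append(created, "CreateTime", now_ms=2_000)
overwritten = stamp_on_append(created, "LogAppendTime", now_ms=2_000)
```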

Messages can be up to 1 MB in size by default, although you can configure Apache Kafka to accept larger messages. To handle large streams of data efficiently, the recommended message size is only a few kilobytes.
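
A producer-side guard against oversized messages might look like the sketch below. It is a rough approximation under stated assumptions: it reads "1 MB" as 1024 * 1024 bytes, counts only key and value bytes, and ignores header bytes, per-record overhead, batching, and compression, all of which affect the size the broker actually checks. The function names are hypothetical. In a real deployment the limit is governed by settings such as the broker's `message.max.bytes`, the topic's `max.message.bytes`, and the producer's `max.request.size`.

```python
from typing import Optional

# Assumption: the default limit from the text, read as 1 MiB.
DEFAULT_LIMIT_BYTES = 1024 * 1024


def payload_size(key: Optional[bytes], value: bytes) -> int:
    """Approximate message size as key bytes plus value bytes."""
    return len(key or b"") + len(value)


def fits_limit(key: Optional[bytes], value: bytes,
               limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True if the approximate size does not exceed the limit."""
    return payload_size(key, value) <= limit_bytes


small_ok = fits_limit(b"sensor-7", b"x" * 2_000)          # a few kilobytes
too_big = not fits_limit(b"sensor-7", b"x" * 2_000_000)   # about 2 MB
```

When a payload is too large, a common alternative to raising the limits is to store the payload elsewhere and send only a reference to it in the message.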
