5 - Kafka Producer and Consumer Flow

Producer Write Flow

Imagine a producer wants to publish an event to a topic called order-events.

The producer sends:

  • Topic name
  • Key
  • Value
  • Acknowledgement configuration (acks=all)

Example:

{
  "topic": "order-events",
  "key": "user-42",
  "value": {
    "orderId": 1001,
    "amount": 500
  },
  "acks": "all"
}

Producer Needs Metadata First

A Kafka cluster contains multiple brokers.

The producer cannot randomly send data to any broker. It must determine:

  • Which partition should receive the event
  • Which broker is the leader for that partition

So the producer first requests metadata.

The metadata contains information like:

  • Topic partitions
  • Partition leaders
  • Replica assignments

Example:

PartitionLeader Broker
Partition 0Broker 1
Partition 1Broker 2
Partition 2Broker 3

If the broker receiving the metadata request does not have fresh metadata, it contacts the active controller.

Partition Selection

Kafka determines the partition using:

hash(key) % number_of_partitions

If:

hash("user-42") % 3 = 1

then the event belongs to:

Partition 1

From metadata, the producer knows:

Partition 1 → Broker 2 is leader

So the producer directly sends the event to Broker 2.

Broker Writes the Event

Broker 2 appends the event to the partition log.

Kafka logs are append-only.

Example:

Partition-1 Log
 
Offset 100 -> older event
Offset 101 -> new event

Kafka never inserts in the middle.

It only appends at the end.

This is extremely important because append-only sequential writes are very fast on disk.

ISR and Acknowledgements

The producer used:

acks=all

That means the leader broker cannot immediately acknowledge success.

Instead:

  1. Leader writes the event
  2. Followers in ISR replicate the event
  3. Leader waits for ISR acknowledgement
  4. Only then does Kafka respond to the producer

Example ISR:

ISR = [Broker1, Broker2, Broker3]

If Broker 2 is the leader:

  • Broker 1 and Broker 3 must also replicate the event
  • Then the producer receives success response

This provides stronger durability guarantees.

Complete Producer Flow

Consumer Read Flow

Consumers work differently from producers.

Kafka consumers are poll-based, not push-based.

Kafka never pushes data automatically.

Consumers continuously ask Kafka for new data.

Consumer Groups and Group Coordinators

Suppose a consumer joins a group:

notification-service-group

Kafka must determine:

  • Which broker will coordinate this group
  • Which partitions this consumer should read

Kafka uses the internal topic:

__consumer_offsets

This topic itself has partitions.

Example:

50 partitions

Kafka calculates:

hash(groupId) % 50

If:

hash(notification-service-group) % 50 = 23

then:

Partition 23 of __consumer_offsets

manages this consumer group.

Whichever broker is leader for Partition 23 becomes the:

Group Coordinator

Consumer Fetches Metadata

Just like producers, consumers also need cluster metadata.

The consumer requests:

  • Topic information
  • Partition leaders
  • Coordinator information

The controller provides metadata if brokers do not already have fresh copies.

Joining the Consumer Group

The consumer contacts the group coordinator.

The coordinator:

  • Registers the consumer
  • Assigns partitions
  • Stores consumer group metadata inside __consumer_offsets

Example assignment:

Consumer-3 → Partition-2 of order-events

This information is replicated across ISR replicas of the coordinator partition.

Finding the Last Committed Offset

Suppose a previous consumer crashed. The new consumer must know where to resume reading. The consumer again contacts the group coordinator. The coordinator checks __consumer_offsets.

The lookup key looks conceptually like:

(groupId, topic, partition)

Example:

(notification-service-group, order-events, partition-2)

The stored value:

last_processed_offset = 101

So the consumer resumes from offset 101.

Fetching Messages

Now the consumer directly contacts the leader broker for the assigned partition.

Example:

order-events → partition-2 → leader = Broker 2

The consumer requests:

Fetch from offset 101
Return 200 bytes

Kafka returns records sequentially.

Example:

101
102
103
...

The consumer processes the batch.

Offset Commit

After processing completes, the consumer commits offsets back to Kafka.

This commit is stored in:

__consumer_offsets

Again, the consumer talks to the group coordinator.

Example commit:

Processed till offset 501

Now if the consumer crashes later, Kafka knows where to resume.

Complete Consumer Registration & Fetch Flow

Kafka is Poll-Based

Kafka consumers continuously poll.

Flow:

Fetch batch
Process batch
Commit offset
Poll again

Example:

Fetch from 502
Return next 200 bytes

This loop continues forever.

Log Retention and Cleanup

Kafka keeps appending events forever unless cleanup policies remove old data.

There are two major cleanup policies.

Delete Policy

Default policy:

cleanup.policy=delete

Kafka deletes old segments based on Time and Size

Example:

retention.ms=7days
retention.bytes=1GB

Note that retention works at the segment level, not per message. If a segment becomes older than retention time, delete the entire segment. If total partition size exceeds configured limit, delete oldest segments first.

Log Compaction

Compaction keeps only the latest value for a key.

Configuration:

cleanup.policy=compact

Example log:

OffsetKeyValue
100user1v1
101user2v1
102user1v2
103user3v1
104user2v2

After compaction:

KeyLatest Value
user1v2
user2v2
user3v1

Older values are eventually removed asynchronously.

Why Kafka is Fast

Kafka reads and writes from disk. Isn't disk slow? Kafka achieves very high throughput using several optimizations.

Write Optimization: Page Cache

Kafka does not synchronously write every request directly to disk. Instead, data first enters OS page cache (RAM). The OS asynchronously flushes to disk. This avoids blocking disk waits.

Sequential Writes

Kafka logs are append-only. So writes are always sequential. Sequential disk writes are dramatically faster than random writes.

Kafka never:

  • updates records in place
  • inserts in the middle
  • modifies older events

It only appends.

Read Optimization: Zero Copy

Normally data flow looks like:

Disk → OS Page Cache → JVM/User Space → Network

Kafka optimizes this using the sendfile() system call.

With zero-copy:

Disk → OS Page Cache → Network

Kafka skips copying data into JVM user space.

Benefits:

  • fewer memory copies
  • lower CPU usage
  • lower latency
  • higher throughput