Part-1
Apache Kafka is a Message Broker.
Kafka is a middleman between producers and the consumers.
Apache Kafka is an open-source distributed publish-subscribe messaging platform that has been purpose-built to handle real-time streaming data for distributed streaming, pipelining, and replay of data feeds for fast, scalable operations.
Apache Kafka is a horizontally scalable, fault-tolerant, distributed steaming platform.
Consumers read data from brokers. Consumers subscribes to one or more topics and consume published messages by pulling data from the brokers
Since Kafka brokers are stateless, which means that the consumer has to maintain how many messages have been consumed by using partition offset.
If the consumer acknowledges a particular message offset, it implies that the consumer has consumed all prior messages.
The consumers can rewind or skip to any point in a partition simply by supplying an offset value. Consumer offset value is notified by ZooKeeper.
Workflow of Pub-Sub Messaging
Producers send message to a topic at regular intervals or as per your business needs.
Kafka broker stores all messages in the partitions configured for that particular topic. It ensures the messages are equally shared between partitions. If the producer sends two messages and there are two partitions, Kafka will store one message in the first partition and the second message in the second partition.
Consumer subscribes to a specific topic.
Once the consumer subscribes to a topic, Kafka will provide the current offset of the topic to the consumer and also saves the offset in the Zookeeper ensemble.
Consumer will request the Kafka in a regular interval (like 100 Ms) for new messages.
Once Kafka receives the messages from producers, it forwards these messages to the consumers.
Consumer will receive the message and process it.
Once the messages are processed, consumer will send an acknowledgement to the Kafka broker.
Once Kafka receives an acknowledgement, it changes the offset to the new value and updates it in the Zookeeper. Since offsets are maintained in the Zookeeper, the consumer can read next message correctly even during server outrages.
This above flow will repeat until the consumer stops the request.
Consumer has the option to rewind/skip to the desired offset of a topic at any time and read all the subsequent messages.
This above flow will repeat until the consumer stops the request.
Consumer has the option to rewind/skip to the desired offset of a topic at any time and read all the subsequent messages.