
Saturday, November 11, 2017

How to read an Apache Storm 1.0.2 Kafka Spout Zookeeper offset?

As with any streaming system, the consumer, here the Kafka spout, needs to keep track of the message offset up to which it has read in each Kafka topic. This has to be persisted because when the topology is restarted and the spout comes back up, it needs to know where to resume reading; otherwise it would start from the beginning. To prevent this, the Storm Kafka spout lets you persist the offset in Zookeeper and reads it back automatically when the topology restarts.

To find this information, we need to log in to the Zookeeper command line shell.
cd /usr/local/zookeeper/zookeeper-3.4.9

bin/zkCli.sh -server zookeeper1:2181

Check the consumer's node for the topic and its partition. You need to type in your own topic name, and if the topic has more than one partition, make sure you repeat this for every partition (partition_0, partition_1, and so on).

get /consumers/yourcompany/yourtopicname/partition_0
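If the topic has many partitions, typing each path by hand gets tedious. The following is a minimal sketch that prints one `get` command per partition, assuming the node layout shown above. The helper name `partition_paths` and the partition count of 3 are made up for illustration; use your own root path, topic name, and partition count.

```python
def partition_paths(zk_root, topic_node, num_partitions):
    """Build one Zookeeper node path per partition (hypothetical helper).

    Assumes the layout <zk_root>/<topic_node>/partition_<n> used in this post.
    """
    return [
        f"{zk_root}/{topic_node}/partition_{n}"
        for n in range(num_partitions)
    ]


# Print the commands to paste into the Zookeeper shell.
for path in partition_paths("/consumers/yourcompany", "yourtopicname", 3):
    print(f"get {path}")
```

Each printed line can be pasted into the zkCli.sh prompt as is.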

Sample response

{"topology":{"id":"YourTopologyInstanceId-1-1497152721","name":"YourTopologyName"},"offset":3673,"partition":0,"broker":{"host":"81387110753b","port":9092},"topic":"yourTopicName"}

The "offset":3673 field shows the offset up to which the Kafka spout has read partition 0 of this topic.
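Since the node data is plain JSON, the offset can also be pulled out programmatically. Here is a small sketch using only the Python standard library. It assumes you pass in just the JSON line from the response, without any node metadata the shell may print after it; the function name `parse_spout_offset` is hypothetical.

```python
import json


def parse_spout_offset(raw):
    """Return (topic, partition, offset) from the spout's Zookeeper node data.

    Assumes `raw` is only the JSON line, in the shape of the sample response above.
    """
    state = json.loads(raw)
    return state["topic"], state["partition"], state["offset"]


# The sample response from this post.
sample = (
    '{"topology":{"id":"YourTopologyInstanceId-1-1497152721",'
    '"name":"YourTopologyName"},"offset":3673,"partition":0,'
    '"broker":{"host":"81387110753b","port":9092},"topic":"yourTopicName"}'
)

topic, partition, offset = parse_spout_offset(sample)
print(f"{topic} partition {partition} offset {offset}")
```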

