Home
Du Bin edited this page Oct 5, 2017 · 10 revisions
It is important to monitor the status of Kafka.
- Kafka 0.10.x
- Java 8
- InfluxDB
- Grafana (used as the front end)
- It is easy.
- Some threads collect the offset information of Kafka topics, and others read consumer information from Kafka consumer messages. A timer periodically computes the lag of every consumer.
- A timer task fetches JMX metrics from Kafka.
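
The thread-plus-timer layout described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class `LagSketch`, the key formats `topic-partition` and `group/topic-partition`, and the assumption that the consumer threads record committed offsets are all hypothetical. Collector threads would write into the two maps, and the timer derives lag from whatever values are currently there.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch: collector threads fill the maps, a timer derives the lag. */
class LagSketch {
    // Latest log-end offset per "topic-partition" (assumed key format),
    // written by the threads that watch topic offsets.
    final Map<String, Long> logEndOffsets = new ConcurrentHashMap<>();
    // Latest committed offset per "group/topic-partition" (assumed key format),
    // written by the threads that read consumer messages.
    final Map<String, Long> committedOffsets = new ConcurrentHashMap<>();

    /** lag = log-end offset - committed offset, floored at 0 for out-of-order samples. */
    static Map<String, Long> computeLag(Map<String, Long> logEnd, Map<String, Long> committed) {
        Map<String, Long> lag = new HashMap<>();
        for (Map.Entry<String, Long> e : committed.entrySet()) {
            String key = e.getKey();
            String topicPartition = key.substring(key.indexOf('/') + 1);
            Long end = logEnd.get(topicPartition);
            if (end == null) {
                continue; // no log-end offset observed yet for this partition
            }
            lag.put(key, Math.max(0L, end - e.getValue()));
        }
        return lag;
    }

    /** Starts the timer; every period the current lag snapshot is handed to the sink. */
    ScheduledExecutorService startTimer(long periodSeconds, Consumer<Map<String, Long>> sink) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(
                () -> sink.accept(computeLag(logEndOffsets, committedOffsets)),
                0, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

A caller would invoke `startTimer` with a sink that writes each snapshot to the database, and call `shutdown()` on the returned executor when the monitor stops.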
- Does the status need to be stored in a database or checkpointed? No. The aim of the code is monitoring, and all of the information is time-related. When the process goes down (for example from an OOM or a power failure), recovering with the old status does not make any sense.
- Will InfluxDB be a bottleneck? No. The number of topics and partitions is usually not that large, so a single InfluxDB instance is enough.
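
To give a feel for how little data each sample amounts to, here is a minimal sketch of one lag sample encoded in InfluxDB line protocol. The measurement name `kafka_lag`, the tag names, and the class `LagPoint` are assumptions made for illustration; the project's real schema may differ.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one lag sample as an InfluxDB line protocol record. */
class LagPoint {
    /** Line protocol requires commas, equals signs and spaces in tag values to be escaped. */
    static String escapeTag(String value) {
        return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ");
    }

    /** Builds "measurement,tags fields timestamp" with a nanosecond timestamp. */
    static String toLine(String group, String topic, int partition, long lag, long timestampMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timestampMillis);
        return "kafka_lag"
                + ",group=" + escapeTag(group)
                + ",topic=" + escapeTag(topic)
                + ",partition=" + partition
                + " lag=" + lag + "i" // the trailing "i" marks an integer field
                + " " + nanos;
    }
}
```

Each timer tick would produce one such line per consumer group and partition, so the write volume grows with the number of partitions rather than with message traffic.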
- Kafka-eagle is written in Java, but it has a problem when facing a large number of topics and partitions. I tried to optimize that part of the code to improve concurrency.
- Burrow is written in Go, which makes it hard to use functionality already implemented in Kafka, such as the JMX information.
- Timestamps. Because updating the offsets takes some time, the offsets and lags may not be the latest. Moreover, offsets are not updated during a network failure or similar fault, so a lag reported with such a timestamp will be wrong. Other similar projects have the same problem.
- How many consumers do we need? When the data center is large, the consumption speed on the internal topic may not be fast enough, but how many consumers are enough? Benchmark code is needed to answer this.
- Multiple data center support.
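
The timestamp problem in the first item can at least be made visible by keeping the observation time next to each offset. The sketch below is one possible approach, not something the 1.0 code does: the class `TimedLag` and the `maxAgeMillis` threshold are hypothetical. A lag value carries the older of its two observation times, so a reader can tell when it is based on stale offsets.

```java
/** Hypothetical sketch: a lag value tagged with the age of its oldest input. */
class TimedLag {
    final long lag;
    final long observedAtMillis; // older of the two observation times

    TimedLag(long lag, long observedAtMillis) {
        this.lag = lag;
        this.observedAtMillis = observedAtMillis;
    }

    /** Combines a log-end offset and a committed offset, each with the time it was seen. */
    static TimedLag of(long logEndOffset, long logEndSeenAtMillis,
                       long committedOffset, long committedSeenAtMillis) {
        long lag = Math.max(0L, logEndOffset - committedOffset);
        return new TimedLag(lag, Math.min(logEndSeenAtMillis, committedSeenAtMillis));
    }

    /** True when the oldest input is older than maxAgeMillis, e.g. after a network failure. */
    boolean isStale(long nowMillis, long maxAgeMillis) {
        return nowMillis - observedAtMillis > maxAgeMillis;
    }
}
```

A dashboard could then mark stale points differently instead of plotting them as if they were current.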
This is the 1.0 release of the code. I will fix all of the issues in the next release. The 2.0 release will show up in 2 months.