Wednesday 30 August 2017

Interview Questions and Answers for Kafka

1. What Is an ISR?
An ISR is an in-sync replica: a replica that is fully caught up with the partition leader. If a leader fails, an ISR is picked to be the new leader.

2. How Does Kafka Scale Consumers?
Kafka scales consumers by partition: each consumer in a consumer group gets its share of the partitions. A consumer can own more than one partition, but a partition can be consumed by only one consumer in a consumer group at a time. If a topic has only one partition, then only one consumer in each group can actively read from it.
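
The sharing rule above can be sketched in a few lines of Python. This is only a simplified model, not Kafka's actual assignor; the function name assign_partitions and the consumer names are made up for illustration.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of ONE group, round-robin.

    Hypothetical helper: every partition gets exactly one owner, and a
    consumer may own several partitions or none at all.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions, two consumers: each consumer owns two partitions.
print(assign_partitions([0, 1, 2, 3], ["c1", "c2"]))

# One partition, two consumers: the second consumer has nothing to read.
print(assign_partitions([0], ["c1", "c2"]))
```

With a single partition the second consumer ends up with an empty list, which is the idle-consumer case.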

3. What Are Leaders & Followers?
Leaders perform all reads and writes to a particular topic partition. Followers replicate leaders.

4. How Does Kafka Perform Failover for Consumers?
If a consumer in a consumer group dies, the partitions assigned to that consumer are divided up amongst the remaining consumers in that group.
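
As a rough illustration of that redistribution, the sketch below hands a dead consumer's partitions to the survivors. It assumes a "give it to whoever owns the fewest" policy, which is an assumption made for this example and not necessarily what a real Kafka rebalance does; rebalance_after_failure is a hypothetical name.

```python
def rebalance_after_failure(assignment, dead_consumer):
    """Return a new assignment with the dead consumer's partitions
    handed to the surviving consumers of the same group."""
    survivors = {c: list(ps) for c, ps in assignment.items() if c != dead_consumer}
    if not survivors:
        raise ValueError("no consumers left in the group")
    for partition in assignment.get(dead_consumer, []):
        # Pick the survivor that currently owns the fewest partitions
        # (ties broken by consumer name so the result is deterministic).
        target = min(survivors, key=lambda c: (len(survivors[c]), c))
        survivors[target].append(partition)
    return survivors


before = {"c1": [0, 1], "c2": [2, 3], "c3": [4, 5]}
print(rebalance_after_failure(before, "c2"))
```

Every partition still has exactly one owner after the failure, so no records are left unread.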

5. How Does Kafka Perform Failover for Brokers?
If a broker dies, Kafka divides up leadership of its topic partitions among the remaining brokers in the cluster.
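
A minimal sketch of this failover, assuming each partition tracks a leader broker id and a list of in-sync replicas (ISRs). The data layout and the name elect_leaders are invented for illustration; the real controller logic is more involved.

```python
def elect_leaders(partitions, failed_broker):
    """Drop the failed broker from every ISR and, where it was the
    leader, promote the first remaining in-sync replica."""
    result = {}
    for name, state in partitions.items():
        isr = [broker for broker in state["isr"] if broker != failed_broker]
        leader = state["leader"]
        if leader == failed_broker:
            # No in-sync replica left means the partition has no leader.
            leader = isr[0] if isr else None
        result[name] = {"leader": leader, "isr": isr}
    return result


cluster = {
    "orders-0": {"leader": 1, "isr": [1, 2, 3]},
    "orders-1": {"leader": 2, "isr": [2, 3]},
}
print(elect_leaders(cluster, failed_broker=1))
```

Only partitions led by the failed broker change leader; the others just lose one replica from their ISR.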

6. Can producers occasionally write faster than consumers?
Yes. A producer could have a burst of records, and a consumer does not have to keep pace with the producer. Because Kafka retains the records in the log, the consumer can catch up later.

7. What is the default partition strategy for producers without using a key?
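Round-robin across the topic's partitions. (Clients from Kafka 2.4 onward default to a sticky partitioner, which fills a batch for one partition before moving on to the next.)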

8. What is the default partition strategy for Producers using a key?
Records with the same key get sent to the same partition.
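
The two strategies from questions 7 and 8 can be sketched together. This is a toy model, not the Kafka client's partitioner: it uses zlib.crc32 as a stand-in hash (the Java client hashes keys with murmur2), and the class name SketchPartitioner is hypothetical.

```python
import itertools
import zlib


class SketchPartitioner:
    """Toy partitioner: round-robin without a key, hash of the key otherwise."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()

    def partition(self, key=None):
        if key is None:
            # No key: cycle through the partitions in order.
            return next(self._counter) % self.num_partitions
        # Key present: a stable hash keeps equal keys on the same partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=3)
print([partitioner.partition() for _ in range(4)])
print(partitioner.partition("user-42") == partitioner.partition("user-42"))
```

Because the key hash is taken modulo the partition count, changing the number of partitions changes where existing keys land.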

9. What picks which partition a record is sent to?
The Producer picks which partition a record goes to.

10. Why is Kafka so fast?
Kafka is fast because it avoids copying buffers in memory (zero copy) and appends data sequentially to immutable logs instead of using random access.

11. How is Kafka getting used?
Kafka is used to feed data lakes like Hadoop, and to feed real-time analytics systems like Flink, Storm and Spark Streaming.

12. How does Kafka relate to real-time analytics?
Kafka feeds data to real-time analytics systems like Storm, Spark Streaming, Flink, and Kafka Streams.

13. How does Kafka decouple streams of data?
It decouples streams of data by allowing multiple consumer groups, each of which controls its own position in the topic partition. The producers don’t know about the consumers. Since the Kafka broker delegates the log partition offset (where the consumer is in the record stream) to the clients (consumers), message consumption is flexible. This allows you to feed your high-latency daily or hourly data analysis in Spark and Hadoop while, at the same time, feeding microservices real-time messages, sending events to your CEP system, and feeding data to your real-time analytics systems.

14. What is a consumer group?
A consumer group is a group of related consumers that perform a task, like putting data into Hadoop or sending messages to a service. Consumer groups each have unique offsets per partition. Different consumer groups can read from different locations in a partition.

15. Does each consumer group have its own offset?
Yes. Each consumer group has its own offset for every partition in the topic, independent of the offsets of other consumer groups.
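
To make the per-group offsets concrete, here is a small in-memory model of one partition read by two groups. It is a sketch only: PartitionLog and the group names are invented, and the offsets are kept in a plain dictionary for simplicity.

```python
class PartitionLog:
    """One partition: an append-only list plus one read offset per group."""

    def __init__(self):
        self.records = []
        self.offsets = {}  # group name -> next offset that group will read

    def append(self, record):
        self.records.append(record)

    def poll(self, group, max_records=1):
        start = self.offsets.get(group, 0)
        batch = self.records[start:start + max_records]
        # Advancing one group's offset never touches another group's.
        self.offsets[group] = start + len(batch)
        return batch


log = PartitionLog()
for record in ["a", "b", "c"]:
    log.append(record)

print(log.poll("analytics", max_records=2))
print(log.poll("billing", max_records=1))
print(log.offsets)
```

The "analytics" group moves ahead while "billing" still starts from the first record, and neither read removes anything from the log.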

16. When can a consumer see a record?
A consumer can see a record after the record has been fully replicated to all in-sync replicas (ISRs) of the partition.
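
Kafka tracks this visibility point as the high watermark. The sketch below assumes the high watermark is the smallest log end offset among the in-sync replicas, so a replica that has fallen out of the ISR does not hold consumers back. The function name and the sample offsets are made up for illustration.

```python
def high_watermark(log_end_offsets, isr):
    """Smallest log end offset among the in-sync replicas.

    log_end_offsets maps broker id -> offset of the NEXT record that
    broker would write; consumers may read offsets below the result.
    """
    return min(log_end_offsets[broker] for broker in isr)


# Broker 3 lags behind and is not in the ISR, so it is ignored.
offsets = {1: 10, 2: 8, 3: 5}
hw = high_watermark(offsets, isr=[1, 2])
print(hw)                   # 8
print(list(range(hw))[-1])  # 7, the last offset a consumer can see
```

Records at offsets 8 and 9 exist on the leader but stay hidden until broker 2 has copied them.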

17. What happens if there are more consumers than partitions?
The extra consumers remain idle until another consumer dies.

18. What happens if you run multiple consumers in many threads in the same JVM?
Each thread manages a share of partitions for that consumer group.

