Alvin's Big Data Notebook : Kafka Operation Notes

Hardware Requirements:

Unlike some systems, Kafka itself doesn't require a lot of RAM. Kafka relies on the operating system's page cache to hold recently used data, so extra memory still improves performance by allowing a larger page cache.

Kafka brokers have a relatively small memory footprint. Extra RAM will be used by the operating system for disk caching.
Kafka is heavily multi-threaded, so favor more cores over faster cores.

Typical JVM options:

-Xms6g -Xmx6g -XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20
-XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M
-XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80

Avoid clusters that span datacenters.

Zookeeper is sensitive to I/O latency. Make sure that it has its own disk.
Run a Zookeeper quorum of 3 or 5 nodes.

Each broker must have its own unique ID (broker.id). Set the advertised.listeners property to the address that clients should use to reach the broker.

To Delete a Topic:
All brokers must have "delete.topic.enable" set to true. Otherwise, the delete command will be silently ignored.

What does Committed Really Mean?

Data has been received by all the replicas in the ISR (In-Sync Replicas).
This is not related to the acks setting chosen by the producer.
Committed state is checkpointed to disk.
Consumers can't see data until it is committed.

The leader maintains the latest committed offset.
A replica is added to the ISR when it is fully caught up.
Control the allowed lag between leader and replica with replica.lag.time.max.ms:
If too large, slow replicas stay in the ISR and slow down writes.
If too small, replicas will drop in and out of the ISR.
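
The relationship between the ISR, replica.lag.time.max.ms and the committed offset can be sketched in a few lines of Python. This is a simplified model, not broker code: the Replica class, its field names and the two helper functions are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    broker_id: int
    log_end_offset: int      # next offset this replica would write
    last_caught_up_ms: int   # last time it was fully caught up with the leader


def in_sync_replicas(replicas, leader_id, now_ms, replica_lag_time_max_ms):
    """Keep the leader plus every follower that caught up recently enough."""
    return [
        r for r in replicas
        if r.broker_id == leader_id
        or now_ms - r.last_caught_up_ms <= replica_lag_time_max_ms
    ]


def committed_offset(isr):
    """Offsets below the smallest log end offset exist on every ISR member."""
    return min(r.log_end_offset for r in isr)


replicas = [
    Replica(broker_id=1, log_end_offset=120, last_caught_up_ms=10_000),  # leader
    Replica(broker_id=2, log_end_offset=115, last_caught_up_ms=9_500),
    Replica(broker_id=3, log_end_offset=80, last_caught_up_ms=2_000),    # far behind
]

# Tight lag limit: broker 3 drops out of the ISR and no longer holds commits back.
tight_isr = in_sync_replicas(replicas, leader_id=1, now_ms=10_000,
                             replica_lag_time_max_ms=5_000)
# Loose lag limit: broker 3 stays in the ISR and drags the committed offset down.
loose_isr = in_sync_replicas(replicas, leader_id=1, now_ms=10_000,
                             replica_lag_time_max_ms=10_000)

print([r.broker_id for r in tight_isr], committed_offset(tight_isr))
print([r.broker_id for r in loose_isr], committed_offset(loose_isr))
```

With the tight limit the slow follower is removed and commits advance with the two healthy replicas. With the loose limit it stays in the ISR, so nothing past its log end offset counts as committed.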

Controller Broker:

Detects broker failure/restart via Zookeeper.
When the leader fails, the controller selects a new leader and updates the ISR.
Persists the new leader and ISR to Zookeeper.
Sends the new leader/ISR change to all brokers.

Check Topic Partition and Replica:
$ kafka-topics --zookeeper localhost:2181 --describe --topic mytopic

Important log4j Files:

controller.log: Logs all broker failures, and the actions taken because of them.
state-change.log: Logs every decision the broker has received from the controller.

Group Coordinator:

Each Consumer Group has a GroupCoordinator (a broker chosen for that group).
Consumers send heartbeats to the GroupCoordinator.
A lack of heartbeats causes a rebalance.
During a rebalance, consumption is paused.
The GroupCoordinator designates one consumer as the Group Leader.
Only the Group Leader gets the list of group members.
The Group Leader calls the partition assigner to assign partitions to consumers.
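
What the Group Leader's partition assigner does can be illustrated with a simplified round-robin assignment. Kafka ships its own assignors (range and round-robin among them); the function below is only a sketch of the idea, and the member and partition names are made up.

```python
def assign_round_robin(members, partitions):
    """Deal partitions out to group members one at a time, in sorted order."""
    ordered_members = sorted(members)
    assignment = {member: [] for member in ordered_members}
    for index, partition in enumerate(sorted(partitions)):
        owner = ordered_members[index % len(ordered_members)]
        assignment[owner].append(partition)
    return assignment


# Two consumers sharing the five partitions of one topic.
assignment = assign_round_robin(["consumer-b", "consumer-a"], range(5))
print(assignment)
```

Sorting the members first makes the result deterministic for a given membership list.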

Manual Offset Management:

Set enable.auto.commit = false
commitSync(): Blocks until the commit succeeds, and includes retry logic
commitAsync(): Does not block and does not retry, so commits can complete out of order
Consider combinations of sync and async
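
One common way to combine the two is to commit asynchronously inside the poll loop and synchronously once on the way out. The sketch below shows only the control flow. StubConsumer is a hypothetical stand-in written for this example, not a real client class, and its poll(), commit_async() and commit_sync() methods merely mimic the shape of the Java calls.

```python
class StubConsumer:
    """Hypothetical stand-in for a Kafka consumer; it only records commits."""

    def __init__(self, batches):
        self._batches = list(batches)
        self.commits = []

    def poll(self):
        # Return the next batch, or None when the stub has run out of data.
        return self._batches.pop(0) if self._batches else None

    def commit_async(self):
        self.commits.append("async")

    def commit_sync(self):
        self.commits.append("sync")


def consume(consumer, handle):
    try:
        while True:
            batch = consumer.poll()
            if batch is None:
                break
            for record in batch:
                handle(record)
            # Fast path: a failed async commit is superseded by the next one.
            consumer.commit_async()
    finally:
        # Last chance before leaving the group: use the blocking commit.
        consumer.commit_sync()


processed = []
consumer = StubConsumer([["a", "b"], ["c"]])
consume(consumer, processed.append)
print(processed, consumer.commits)
```

The final synchronous commit matters because no later commit follows it to cover for a failure.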

Log Compaction: older values for a key are deleted, so that at least the latest value for each key is retained.
Uses: Database change capture; Stateful stream processing; Event sourcing.
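
The effect of compaction can be sketched as follows. This is a simplified model rather than the broker's log cleaner: it compacts the whole log at once and drops tombstones (records with a null value) immediately, whereas a real broker leaves the active segment alone and keeps tombstones for a retention period. The record data is made up.

```python
def compact(log):
    """Keep only the latest record per key, preserving offset order.

    `log` is a list of (key, value) pairs in offset order. A value of None
    is treated as a tombstone, which removes the key entirely.
    """
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # later records overwrite earlier ones
    survivors = sorted(
        (offset, key, value)
        for key, (offset, value) in latest.items()
        if value is not None
    )
    return [(key, value) for _, key, value in survivors]


log = [
    ("user-1", "alice@old.example"),
    ("user-2", "bob@example"),
    ("user-1", "alice@new.example"),
    ("user-3", "carol@example"),
    ("user-2", None),  # tombstone: user-2 was deleted
]
compacted = compact(log)
print(compacted)
```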

Number of Partitions:
More partitions generally means higher throughput.
However, more partitions also require more open file handles, can increase unavailability when a broker fails, can increase end-to-end latency, and need more memory in the client.

Check Consumer Offsets:

$ kafka-consumer-groups --group my-group --describe --new-consumer --bootstrap-server=localhost:9090

Friday, 27 January 2017