Apache Kafka
workshop | WIP
Agenda
- Apache Kafka introduction
- Topics
- Producers
- Consumers
- Kafka Streams
- Configurations
- Ops & Tooling
- Monitoring


Topic
log.retention.bytes
no size limit by default (-1)
cleanup.policy
delete
compact
log.retention.ms
7d default (from log.retention.hours=168)
Consumer
auto.offset.reset
earliest
latest (default)
Memory & data size
fetch.min.bytes
Minimum number of bytes the broker should return for a fetch request. When the consumer polls and less data than this is available, the broker holds the request until enough data accumulates or `fetch.max.wait.ms` (500 ms by default) elapses, then sends what it has. Default is 1 byte.
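A minimal sketch of the broker-side decision described for fetch.min.bytes. The function name `should_respond` and its parameters are hypothetical and only model the rule; this is not Kafka's actual implementation. It assumes the wait is also capped by `fetch.max.wait.ms`, using the documented defaults of 1 byte and 500 ms.

```python
def should_respond(available_bytes: int, waited_ms: int,
                   fetch_min_bytes: int = 1,
                   fetch_max_wait_ms: int = 500) -> bool:
    """Model of when a broker answers a pending fetch request.

    Hypothetical helper: the broker replies once enough data has
    accumulated, or once the maximum wait time has elapsed.
    """
    enough_data = available_bytes >= fetch_min_bytes
    timed_out = waited_ms >= fetch_max_wait_ms
    return enough_data or timed_out


# A consumer asking for at least 1 KiB: 100 bytes after 10 ms is not enough,
# but the same 100 bytes are sent once 500 ms have passed.
print(should_respond(100, 10, fetch_min_bytes=1024))
print(should_respond(100, 500, fetch_min_bytes=1024))
```

Raising fetch.min.bytes trades latency for fewer, larger responses, which is why it sits next to the wait cap.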
CPU load many messages
fetch.max.bytes
max.poll.records
Limits the number of records retrieved in a single call to poll. Default is 500.
max.partition.fetch.bytes
Limits the number of bytes fetched per partition. This should rarely be a problem, as the default is 1 MB.
Max memory usage
min(
num_brokers * fetch.max.bytes,
num_partitions * max.partition.fetch.bytes
)
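The bound above can be worked through with concrete numbers. This is a rough sketch: the function name `max_fetch_memory_bytes` is hypothetical, the defaults are the documented ones (50 MiB for fetch.max.bytes, 1 MiB for max.partition.fetch.bytes), and the result is only an approximate ceiling because both settings are soft limits.

```python
MIB = 1024 * 1024


def max_fetch_memory_bytes(num_brokers: int, num_partitions: int,
                           fetch_max_bytes: int = 50 * MIB,
                           max_partition_fetch_bytes: int = 1 * MIB) -> int:
    """Approximate upper bound on memory used by fetched data.

    Hypothetical helper implementing the min(...) formula from the slide.
    """
    per_broker_bound = num_brokers * fetch_max_bytes
    per_partition_bound = num_partitions * max_partition_fetch_bytes
    return min(per_broker_bound, per_partition_bound)


# 3 brokers, 12 assigned partitions: the per-partition bound (12 MiB) wins.
print(max_fetch_memory_bytes(3, 12) // MIB)
# 3 brokers, 200 assigned partitions: the per-broker bound (150 MiB) wins.
print(max_fetch_memory_bytes(3, 200) // MIB)
```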
Signs of life
heartbeat.interval.ms
Specifies how often the consumer sends a heartbeat signal.
With the default of 3000 ms, the consumer sends a heartbeat to the broker every 3 seconds.
session.timeout.ms
Specifies the time within which the broker must receive at least one heartbeat from the consumer; otherwise it marks the consumer as dead. A value of 10000 ms (10 seconds, the default before Kafka 3.0; 45000 ms since) allows three missed heartbeats at the default interval before the broker marks the consumer as dead.
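The relationship between the two settings is simple arithmetic, sketched here with hypothetical helper names. The second function assumes the commonly documented guideline that heartbeat.interval.ms should be no higher than one third of session.timeout.ms.

```python
def heartbeats_within_timeout(session_timeout_ms: int,
                              heartbeat_interval_ms: int) -> int:
    """Number of heartbeat intervals that fit into one session timeout,
    i.e. how many heartbeats can be missed before the consumer is
    considered dead. Hypothetical helper for illustration."""
    if heartbeat_interval_ms <= 0 or session_timeout_ms <= 0:
        raise ValueError("both values must be positive")
    return session_timeout_ms // heartbeat_interval_ms


def follows_one_third_guideline(session_timeout_ms: int,
                                heartbeat_interval_ms: int) -> bool:
    """Check heartbeat.interval.ms <= session.timeout.ms / 3 (assumed guideline)."""
    return heartbeat_interval_ms * 3 <= session_timeout_ms


# 10 s timeout with a 3 s interval leaves room for three missed heartbeats.
print(heartbeats_within_timeout(10_000, 3_000))
print(follows_one_third_guideline(10_000, 3_000))
```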
poll loop
wakeup
shutdown hook
Errors
FETCH_SESSION_ID_NOT_FOUND
max.incremental.fetch.session.cache.slots
INVALID_FETCH_SESSION_EPOCH
Resolved in v2.3.0+
max.incremental.fetch.session.cache.slots
Producer
acks
0
1
all
enable.idempotence
Exactly once
linger.ms
Throttling
Broker
broker.id
broker.id.generation.enable=true
reserved.broker.max.id=1000
log.dir / log.dirs
default is /tmp/kafka-logs
auto.create.topics.enable
false?
default.replication.factor
2?
3?
auto.leader.rebalance.enable
Enables automatic leader balancing. A background thread checks the distribution of partition leaders at a regular interval, set by `leader.imbalance.check.interval.seconds`. If the leader imbalance on a broker exceeds `leader.imbalance.per.broker.percentage`, leadership of the affected partitions is moved back to their preferred leader.
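A rough sketch of the imbalance check, under stated assumptions: the preferred leader of a partition is taken to be the first broker in its replica list, and a broker's imbalance is the share of partitions it should lead but currently does not. The function name and the input shape are hypothetical; the 10 percent threshold matches the documented default of `leader.imbalance.per.broker.percentage`.

```python
from collections import defaultdict


def brokers_needing_rebalance(partitions, threshold_percent=10):
    """Return broker ids whose leader imbalance exceeds the threshold.

    partitions: iterable of (replicas, current_leader) pairs, where
    replicas[0] is assumed to be the preferred leader.
    """
    preferred_total = defaultdict(int)  # partitions a broker should lead
    not_led = defaultdict(int)          # of those, led by another broker
    for replicas, current_leader in partitions:
        preferred = replicas[0]
        preferred_total[preferred] += 1
        if current_leader != preferred:
            not_led[preferred] += 1
    return sorted(
        broker for broker, total in preferred_total.items()
        if not_led[broker] / total * 100 > threshold_percent
    )


# Broker 1 is preferred for two partitions but leads only one of them (50%).
state = [([1, 2, 3], 1), ([1, 3, 2], 2), ([2, 3, 1], 2), ([3, 1, 2], 3)]
print(brokers_needing_rebalance(state))
```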
Security
Security Protocols
- PLAINTEXT – unauthenticated, unencrypted channel
- SASL_PLAINTEXT – SASL-authenticated, unencrypted channel
- SASL_SSL – SASL-authenticated, SSL-encrypted channel
- SSL – SSL-encrypted channel
Ops
Kafka Manager (CMAK)
ZooNavigator
Kafkacat
Cruise Control
Burrow
Kafka WorkShop
By Šimon Podlipský