Apache Kafka

workshop | WIP

Agenda

  • Apache Kafka introduction

  • Topics

  • Producers

  • Consumers

  • Kafka Streams

  • Configurations

  • Ops & Tooling

  • Monitoring

Topic

log.retention.ms

7d default (via log.retention.hours=168)

log.retention.bytes

cleanup.policy

delete

compact

Consumer

auto.offset.reset

earliest

latest
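
A minimal sketch of how auto.offset.reset picks the starting position, assuming a toy model with no Kafka client involved. The function name and its parameters are hypothetical; only the setting name and the values earliest and latest (the default) come from Kafka.

```python
def starting_offset(committed, log_start, log_end, auto_offset_reset="latest"):
    """Toy model: where a consumer starts reading a partition.

    committed  -- last committed offset for the group, or None if there is none
    log_start  -- oldest offset still retained in the partition
    log_end    -- offset the next produced record will get
    """
    # A valid committed offset always wins; the reset policy is only a fallback.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    if auto_offset_reset == "earliest":
        return log_start  # replay everything still retained
    if auto_offset_reset == "latest":
        return log_end    # only see records produced from now on
    raise ValueError(f"unsupported auto.offset.reset: {auto_offset_reset}")


if __name__ == "__main__":
    print(starting_offset(None, 5, 20, "earliest"))
    print(starting_offset(None, 5, 20, "latest"))
```

The policy only matters for a new consumer group or when the committed offset has already been deleted by retention.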

Memory & data size

fetch.min.bytes

Defines the minimum number of bytes the broker returns for a fetch request. When the consumer polls and less data than this is available, the broker waits until enough data has accumulated or fetch.max.wait.ms (default 500 ms) expires, then sends the data.
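
A toy model of that waiting rule, as a sketch only. The function name is hypothetical; the defaults mirror fetch.min.bytes (1) and the related consumer setting fetch.max.wait.ms (500 ms), which caps how long the broker holds a fetch request.

```python
def broker_should_respond(bytes_available, waited_ms,
                          fetch_min_bytes=1, fetch_max_wait_ms=500):
    """Return True once the broker would answer the pending fetch request."""
    enough_data = bytes_available >= fetch_min_bytes
    waited_long_enough = waited_ms >= fetch_max_wait_ms
    return enough_data or waited_long_enough


if __name__ == "__main__":
    # Only 100 bytes are ready, but the consumer asked for at least 1 KiB.
    print(broker_should_respond(100, 100, fetch_min_bytes=1024))
    print(broker_should_respond(100, 500, fetch_min_bytes=1024))
```

A larger fetch.min.bytes trades latency for fewer, fuller responses.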

 

Raising it lowers CPU load when there are many small messages

fetch.max.bytes

max.poll.records

Limits the number of records returned by a single call to poll(). Default is 500.

max.partition.fetch.bytes

Limits the number of bytes fetched per partition. This is rarely a problem, as the default is 1 MB.

Max memory usage

min(

    num_brokers * fetch.max.bytes,

    max.partition.fetch.bytes * num_partitions

)
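
The bound above can be worked through with concrete numbers. This is a sketch: the function name is hypothetical, and the defaults assume the stock values of fetch.max.bytes (50 MiB) and max.partition.fetch.bytes (1 MiB).

```python
MIB = 1024 * 1024


def max_fetch_memory_bytes(num_brokers, num_partitions,
                           fetch_max_bytes=50 * MIB,
                           max_partition_fetch_bytes=1 * MIB):
    """Upper bound on fetched data a consumer may hold, per the slide's formula."""
    per_broker_limit = num_brokers * fetch_max_bytes
    per_partition_limit = max_partition_fetch_bytes * num_partitions
    return min(per_broker_limit, per_partition_limit)


if __name__ == "__main__":
    # 3 brokers, 100 assigned partitions: min(150 MiB, 100 MiB)
    print(max_fetch_memory_bytes(3, 100) // MIB, "MiB")
```

With few partitions the per-partition limit dominates; with many partitions the per-broker limit caps the total.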

Signs of life

heartbeat.interval.ms

Specifies how often the consumer sends a heartbeat signal.

If this is 3000 ms (the default), the consumer sends a heartbeat to the broker every 3 seconds.

session.timeout.ms

Specifies how long the broker waits for at least one heartbeat from the consumer before marking it as dead. With the 10000 ms (10 seconds) default used before Kafka 3.0 (45000 ms since then) and a 3000 ms heartbeat interval, a consumer can miss three heartbeats before the broker marks it as dead.
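
The arithmetic behind the "three missed heartbeats" can be sketched as follows. The function names are hypothetical; the default arguments are the 3000 ms and 10000 ms values quoted on this slide, and the one-third rule is the guidance given in the Kafka consumer configuration docs.

```python
def heartbeats_missed_before_dead(session_timeout_ms=10_000,
                                  heartbeat_interval_ms=3_000):
    """How many whole heartbeat intervals fit into one session timeout."""
    return session_timeout_ms // heartbeat_interval_ms


def follows_one_third_rule(session_timeout_ms=10_000,
                           heartbeat_interval_ms=3_000):
    """heartbeat.interval.ms is typically set no higher than 1/3 of session.timeout.ms."""
    return heartbeat_interval_ms * 3 <= session_timeout_ms


if __name__ == "__main__":
    print(heartbeats_missed_before_dead())  # 10000 // 3000
    print(follows_one_third_rule())
```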

poll loop

wakeup

shutdown hook

Errors

FETCH_SESSION_ID_NOT_FOUND

max.incremental.fetch.session.cache.slots

INVALID_FETCH_SESSION_EPOCH

Resolved in v2.3.0+

 

max.incremental.fetch.session.cache.slots


Producer

acks

0

1

all

enable.idempotence

Exactly once

linger.ms

Throttling

Broker

broker.id

broker.id.generation.enable=true

reserved.broker.max.id=1000

log.dir / log.dirs

default is /tmp/kafka-logs

auto.create.topics.enable

false?

default.replication.factor

2?

3?

auto.leader.rebalance.enable

Enables automatic leader balancing. A background thread checks the distribution of partition leaders at regular intervals, configurable by `leader.imbalance.check.interval.seconds`. If the leader imbalance of a broker exceeds `leader.imbalance.per.broker.percentage`, leadership of the affected partitions is moved back to their preferred leaders.
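
A toy version of that check, as a sketch rather than the controller's actual code. The function names and the dict-based partition maps are hypothetical; the 10 percent threshold assumes the stock default of `leader.imbalance.per.broker.percentage`.

```python
def leader_imbalance_percent(preferred, current, broker):
    """Share of partitions that prefer `broker` as leader but are led elsewhere.

    preferred / current -- dicts mapping partition name -> broker id
    """
    owned = [p for p, b in preferred.items() if b == broker]
    if not owned:
        return 0.0
    led_elsewhere = sum(1 for p in owned if current.get(p) != broker)
    return 100.0 * led_elsewhere / len(owned)


def needs_rebalance(preferred, current, broker, threshold_percent=10):
    """True when the broker's imbalance exceeds the configured percentage."""
    return leader_imbalance_percent(preferred, current, broker) > threshold_percent


if __name__ == "__main__":
    preferred = {"t-0": 1, "t-1": 1, "t-2": 2, "t-3": 1, "t-4": 1}
    current = {"t-0": 1, "t-1": 2, "t-2": 2, "t-3": 1, "t-4": 1}
    # Broker 1 is preferred for 4 partitions but leads only 3 of them.
    print(leader_imbalance_percent(preferred, current, 1))
    print(needs_rebalance(preferred, current, 1))
```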

Security

Security Protocols

  • PLAINTEXT – Unauthenticated, non-encrypted channel
  • SASL_PLAINTEXT – SASL authenticated, non-encrypted channel
  • SASL_SSL – SASL authenticated, SSL-encrypted channel
  • SSL – SSL-encrypted channel

Ops

Kafka Manager (CMAK)

ZooNavigator

Kafkacat

Cruise Control

Burrow
