Theory 1 - Exercise - Theory 2 - Gems - Shopmium
Theory - Part 1
Hands-on exercise
Theory - Part 2
ruby-kafka & Karafka gems
Shopmium ↔ Quotient
Basics
Topics & partitions
Consumers
Producers
Kafka APIs
Event streaming framework
Instead of storing stateful objects in a database, we reason in terms of logs of events (see the sketch below).
Event-driven architecture vs. state-based architecture
Kafka = (distributed) system to manage the logs
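A toy Ruby illustration of the difference (no Kafka involved yet; the account and event names are made up for the example):

  # State-based: we mutate and persist only the latest state; history is lost.
  account = { id: 42, balance: 100 }
  account[:balance] -= 10   # the previous balance is gone

  # Event-based: we append immutable facts to a log and derive state from them.
  events = []
  events << { type: "account_opened",  account_id: 42, amount: 100 }
  events << { type: "account_debited", account_id: 42, amount: 10 }

  balance = events.sum do |event|
    event[:type] == "account_debited" ? -event[:amount] : event[:amount]
  end
  # balance == 90, and the full history is still available in `events`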
Pub-sub messaging architecture
Producer - Cluster (brokers) - Consumer
Protocol used: not HTTP (Kafka has its own binary protocol over TCP)
→ persistent connection
Continuous flow of events between services
Microservices that communicate through Kafka
Message bus (Redis + Sidekiq) ≠ streaming platform
- Sidekiq, Resque, etc. → once jobs are taken from the queue, they disappear
- Kafka → consuming does not remove messages; they stay in the log and other consumers can read them too (see the sketch below)
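A rough ruby-kafka sketch of that difference: two independent consumer groups subscribe to the same topic and each one receives every message (broker address, topic and group names are placeholders):

  require "kafka"

  kafka = Kafka.new(["localhost:9092"], client_id: "demo")

  # Each consumer group keeps its own offsets, so both groups below
  # independently read every message published to the topic.
  threads = %w[billing-group analytics-group].map do |group|
    Thread.new do
      consumer = kafka.consumer(group_id: group)
      consumer.subscribe("orders", start_from_beginning: true)
      consumer.each_message do |message|
        puts "#{group} read offset #{message.offset}: #{message.value}"
      end
    end
  end
  threads.each(&:join)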
Use cases
Real-time data flow
Microservices that communicate through Kafka
Topic = log = ordered and immutable collection of events
A topic is divided into partitions (partitions ≠ nodes in the cluster)
So strictly speaking, a log = a partition: ordering is guaranteed only within a partition
Message = key-value pair
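The key is not just a payload detail: by default in ruby-kafka, messages that share a key are routed to the same partition, so they stay ordered relative to each other. A minimal sketch with placeholder broker and topic names:

  require "kafka"

  kafka = Kafka.new(["localhost:9092"], client_id: "demo")
  producer = kafka.producer

  # Both messages share the key "user-42", so they should land in the
  # same partition and be consumed in the order they were produced.
  producer.produce('{"event":"signed_up"}', topic: "user-events", key: "user-42")
  producer.produce('{"event":"confirmed"}', topic: "user-events", key: "user-42")

  producer.deliver_messages   # actually sends the buffered messages
  producer.shutdown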
Built-in tools
Admin
Kafka Streams
Processing data between topics, aggregations, joins...
Kafka Connect
Integration with databases, search indexes (Elasticsearch...), key-value stores (Redis...) - read and write
Connect + Streams make it easy to orchestrate producers, consumers and databases together...
Principle of the pub-sub pattern
Highly scalable
Brokers scale easily (horizontally)
→ practically no limit on event throughput (unlike a single database)
Kafka brokers do not track their consumers. Each consumer is in charge of telling Kafka where it is in the event stream (its offsets) and what it wants to read (see the sketch below).
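A rough ruby-kafka sketch of that responsibility: the consumer decides when a message counts as processed and when offsets get committed (handle_order is a hypothetical business-logic method; treat the whole thing as a sketch, not a reference implementation):

  require "kafka"

  kafka = Kafka.new(["localhost:9092"], client_id: "demo")
  consumer = kafka.consumer(group_id: "billing-group")
  consumer.subscribe("orders")

  # Disable auto-marking so the application decides what "processed" means.
  consumer.each_message(automatically_mark_as_processed: false) do |message|
    handle_order(message.value)                 # hypothetical business logic
    consumer.mark_message_as_processed(message) # advance our own position
    consumer.commit_offsets                     # tell the brokers where we are
  end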
Highly scalable: Netflix, LinkedIn, Microsoft, Uber... trillions of events per day in their clusters
ruby-kafka
https://github.com/zendesk/ruby-kafka
Wrapper around the Kafka Consumer and Producer APIs
Built-in tools for instrumentation
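A minimal producer/consumer sketch with ruby-kafka (broker address, topic and group id are placeholders):

  require "kafka"

  kafka = Kafka.new(["localhost:9092"], client_id: "shop-app")

  # Producer side: convenience method that writes a single message synchronously.
  kafka.deliver_message('{"coupon_id":123}', topic: "coupon-redemptions")

  # Consumer side: join a consumer group and process messages as they arrive.
  consumer = kafka.consumer(group_id: "coupon-workers")
  consumer.subscribe("coupon-redemptions", start_from_beginning: true)
  consumer.each_message do |message|
    puts "partition=#{message.partition} offset=#{message.offset} value=#{message.value}"
  end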
karafka
https://github.com/karafka/karafka
framework
higher-level abstraction
manages low-level logic (error handling, retries and backoffs; consumer groups; concurrency; broker discovery...)
cf. config files
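A minimal sketch of what the routing and a consumer look like (Karafka 1.x style, the line of Karafka built on ruby-kafka; class, topic and group names are placeholders, and the exact API varies between Karafka versions):

  # app/consumers/coupon_redemptions_consumer.rb
  class CouponRedemptionsConsumer < Karafka::BaseConsumer
    # Karafka takes care of polling, retries and offset management;
    # we only implement what to do with each batch of messages.
    def consume
      params_batch.each do |message|
        puts message.payload
      end
    end
  end

  # karafka.rb
  class KarafkaApp < Karafka::App
    setup do |config|
      config.client_id = "shop-app"
      config.kafka.seed_brokers = %w[kafka://localhost:9092]
    end

    consumer_groups.draw do
      consumer_group :coupon_workers do
        topic :coupon_redemptions do
          consumer CouponRedemptionsConsumer
        end
      end
    end
  end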