Why it takes a team of engineers to process events in realtime

Alan Braithwaite

@Caust1c

Twilio Segment

Gophercon EU 2021

Architecture

Systems

Secret Sauce

Architecture

Microservices

Microservice Availability

Availability Math

95\% \times 99\% \times 99\% \times 99\% \times 99\% \times 99\% \approx 90.3\%
99\% \times 99\% \times 99\% \times 99\% \times 99\% \times 99\% \approx 94.1\%
99.9\%^6 \approx 99.4\% \text{ uptime} \approx 52 \text{ hours of downtime per year}
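
The arithmetic above can be reproduced with a few lines of Go. This is a minimal sketch that assumes every service sits in the request path and fails independently, so the availabilities multiply; the function names are made up for illustration.

```go
package main

import "fmt"

// serialAvailability returns the availability of a request path that only
// succeeds when every service in the chain is up. It assumes failures are
// independent, so the individual availabilities simply multiply.
func serialAvailability(services ...float64) float64 {
	total := 1.0
	for _, a := range services {
		total *= a
	}
	return total
}

// downtimeHoursPerYear converts an availability fraction into the expected
// hours of downtime over a 365-day year.
func downtimeHoursPerYear(availability float64) float64 {
	return (1 - availability) * 365 * 24
}

func main() {
	chains := [][]float64{
		{0.95, 0.99, 0.99, 0.99, 0.99, 0.99},
		{0.99, 0.99, 0.99, 0.99, 0.99, 0.99},
		{0.999, 0.999, 0.999, 0.999, 0.999, 0.999},
	}
	for _, c := range chains {
		a := serialAvailability(c...)
		fmt.Printf("%.1f%% available, %.1f hours down per year\n",
			100*a, downtimeHoursPerYear(a))
	}
}
```

Adding a seventh service to any chain only ever lowers the product, which is the core of the argument against long synchronous call chains.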

Goodbye Microservices

Conceptual Architecture

Architecture In Practice

Actual footage of us brainstorming

Current Architecture

ctlstore

Written in Go!
Open Source!

Fancy Website!

ctlstore.segment.com

Key Insight

Separate your control plane from your data plane

Or, outside of data pipelines, separate your write path from your read path
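
One way to picture this split is a local, read-only snapshot that the data plane consults on every event, while the control plane installs new snapshots out of band. The sketch below is an illustration only and is not ctlstore's API or implementation; `config`, `localStore`, `Lookup`, and `Apply` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// config is the control data the data plane needs while processing events.
type config map[string]string

// localStore is a hypothetical local copy of control data. Reads never
// block on, or call out to, the control plane.
type localStore struct {
	snap atomic.Value // always holds a config
}

func newLocalStore() *localStore {
	s := &localStore{}
	s.snap.Store(config{})
	return s
}

// Lookup is the data-plane read path: a purely local read of the latest
// snapshot.
func (s *localStore) Lookup(key string) (string, bool) {
	v, ok := s.snap.Load().(config)[key]
	return v, ok
}

// Apply is the control-plane write path: it installs a fresh snapshot
// without ever mutating a map that readers may be using.
func (s *localStore) Apply(next config) {
	copied := make(config, len(next))
	for k, v := range next {
		copied[k] = v
	}
	s.snap.Store(copied)
}

func main() {
	store := newLocalStore()
	store.Apply(config{"source-1": "enabled"})
	v, ok := store.Lookup("source-1")
	fmt.Println(v, ok)
}
```

If the control plane becomes unavailable, `Lookup` keeps serving the last snapshot, so an outage on the write path does not stop event processing.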

Systems

Consistency

Deduplication

https://segment.com/blog/exactly-once-delivery

Deduplication

  1. Check if duplicate
  2. Produce to output (w/ input offset)
  3. Commit ID to RocksDB
  4. Commit offset upstream
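
The four steps above can be sketched with in-memory stand-ins. This is an assumption-laden illustration: a map replaces RocksDB, a slice replaces the output topic, and the `record`, `output`, and `deduper` types are hypothetical.

```go
package main

import "fmt"

// record is an input message with a unique ID and its offset in the input log.
type record struct {
	ID     string
	Offset int
}

// output is what gets produced downstream: the message ID together with the
// input offset it came from.
type output struct {
	ID          string
	InputOffset int
}

// deduper holds in-memory stand-ins for the real components.
type deduper struct {
	seen      map[string]bool // stand-in for the RocksDB ID store
	out       []output        // stand-in for the output topic
	committed int             // next input offset to read after a restart
}

func newDeduper() *deduper {
	return &deduper{seen: map[string]bool{}}
}

// process runs one record through the steps and reports whether the record
// was new.
func (d *deduper) process(r record) bool {
	fresh := !d.seen[r.ID] // 1. check if duplicate
	if fresh {
		// 2. produce to output, carrying the input offset
		d.out = append(d.out, output{ID: r.ID, InputOffset: r.Offset})
		// 3. commit the ID to the ID store
		d.seen[r.ID] = true
	}
	d.committed = r.Offset + 1 // 4. commit the offset upstream
	return fresh
}

func main() {
	d := newDeduper()
	for i, id := range []string{"a", "b", "a", "c"} {
		fmt.Println(id, d.process(record{ID: id, Offset: i}))
	}
	fmt.Println(len(d.out), d.committed)
}
```

Producing before committing the ID means a crash between the two leaves the output topic ahead of the ID store, which is the gap a recovery pass has to close.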

Deduplication

  1. Read upstream + downstream offset
  2. Scan output (downstream-upstream) messages
  3. Ensure those IDs are committed to RocksDB
  4. Commit upstream offset
  5. Continue as usual
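
A restart following the steps above might look like the sketch below. It assumes each output message carries the input offset it was produced from, and it reads "downstream-upstream" as the output messages whose input offset is at or beyond the committed upstream offset. The types and the `recoverState` function are hypothetical, with a slice and a map standing in for the output topic and RocksDB.

```go
package main

import "fmt"

// output is a message in the output topic, tagged with the input offset it
// was produced from.
type output struct {
	ID          string
	InputOffset int
}

// recoverState replays the tail of the output log after a crash. Any output
// whose input offset is at or beyond the committed upstream offset may not
// have reached the ID store, so its ID is re-inserted. It returns the
// upstream offset that is safe to commit.
func recoverState(out []output, seen map[string]bool, committed int) int {
	// Scan backwards over the messages produced past the committed offset.
	for i := len(out) - 1; i >= 0 && out[i].InputOffset >= committed; i-- {
		seen[out[i].ID] = true
	}
	// Everything up to the last produced message has now been handled.
	if n := len(out); n > 0 && out[n-1].InputOffset >= committed {
		committed = out[n-1].InputOffset + 1
	}
	return committed
}

func main() {
	// "b" and "c" were produced, but the process crashed before their IDs
	// and offsets were committed.
	out := []output{{"a", 0}, {"b", 1}, {"c", 3}}
	seen := map[string]bool{"a": true}
	committed := recoverState(out, seen, 1)
	fmt.Println(committed, seen["b"], seen["c"])
}
```

Because the output topic is the only thing written before anything else is committed, it is enough on its own to rebuild the other two pieces of state.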

Recovery

Key Insight

Use your destination topic or database as a write-ahead log to recover from failures.

ClickHouse

  • High-throughput writes using the log-structured MergeTree columnar data format
  • Enables high-performance, high-cardinality, near-real-time analytics through materialized views
  • Highly available multi-leader architecture

Query Gateway Pattern

  • Domain service for every database
  • Don't allow other clients to connect to the database and run their own queries
    • Enables network isolation and easy audit trails
  • Lets domain experts code-review queries and provide purpose-built product or business queries

Secret Sauce

and The Segment Team

segmentio/queues

type Source interface {
	Receive(context.Context) (Message, func(error), error)
}

type Sink interface {
	Send(context.Context, func(error), ...Message) error
}

type Message struct {
	Key        []byte
	Value      []byte
	Topic      string
	Attributes Attributes
}

segmentio/stats

type Handler interface {
	HandleMeasures(time time.Time, measures ...Measure)
}
  • datadog
  • influxdb
  • prometheus
  • veneur

segmentio/encoding

Drop-in replacement for encoding/json

Relies on performance optimizations tied to memory layout, which may break between versions of Go

60-400% performance improvement over encoding/json in Go v1.16.2

We don't believe that this code should be ported upstream to the standard encoding/json package. The standard library has to remain readable and approachable to maximize stability and maintainability, and make projects like this one possible because a high quality reference implementation already exists.

Go OSS Ecosystem

github.com/segmentio/topicctl

github.com/segmentio/events

github.com/segmentio/kafka-go

github.com/segmentio/ksuid

github.com/segmentio/cli

github.com/segmentio/chamber

github.com/segmentio/golines

github.com/segmentio/kubeapply


(all MIT Licensed)

Thank you!

The entire team at Segment

and the Go community
