Journey from Fluent Bit, Fluentd and Prometheus to OpenTelemetry Collector

Collectors Zoo

Where did it all start?

The de facto k8s telemetry data collection standards in early 2023

logs

metrics

traces

?

logs

a lot of plugins

community, documentation

written in Ruby

Exhibit 1 - kubeclient gem

used by more than 1k projects

1 (one) active maintainer

logs

1 worker = 1 thread

resources.requests.cpu = 2000

autoscaling.targetCPUUtilizationPercentage = 50
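
Why this never scales: a minimal sketch, assuming a hypothetical fluentd Deployment and HPA. With 1 worker = 1 thread, Fluentd can use at most ~1 core, which is 50% of the 2-CPU request - exactly the HPA target, so the autoscaler never kicks in.

# Illustrative manifests - names, image and replica counts are assumptions
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd            # image/tag illustrative
        resources:
          requests:
            cpu: "2000m"                 # 2 CPUs requested, only ~1 ever usable
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fluentd
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fluentd
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50           # a single thread tops out around 50%, so this never triggers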

logs

logs

metrics

traces

?

crazy fast

great memory usage

written in C

logs

logs

logs

metrics

traces

?

metrics

community, documentation

memory usage

database, not a forwarder *
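
Prometheus being a database rather than a forwarder means shipping its data elsewhere is bolted on via remote_write - roughly like this (URL and tuning values are placeholders):

# prometheus.yml (fragment)
remote_write:
  - url: https://metrics.example.com/api/v1/write   # placeholder endpoint
    queue_config:
      max_samples_per_send: 1000                    # one of the knobs you end up owning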

logs

metrics

traces

?

traces

?

traces

why?

OpenTelemetry 101

Instrumentation and collection are OpenTelemetry provided; the backend stays vendor provided - there is no OpenTelemetry backend.

OpenTelemetry

Guidelines - cross language requirements and expectations for all implementations

Semantic conventions

API, SDK

OTLP

Community-driven industry standard

OpenTelemetry

Instrumentation

Collector 101

Collection

OpenTelemetry Collector

Receivers

Processors

Exporters

Pipelines

OpenTelemetry Collector

aka otelcol

Host Metrics Receiver

Put it all together

receivers:
  hostmetrics:          # scrape metrics from the host machine
    scrapers:
      memory:           # only the memory scraper is enabled here

processors:
  resourcedetection/detect-host-name:
    detectors:
    - system            # read host information from the operating system
    system:
      hostname_sources:
      - os              # use the OS hostname for the host.name attribute

exporters:
  otlp:
    endpoint: otelcol2:4317   # OTLP/gRPC to another collector

service:
  pipelines:
    metrics:            # wire the components above into one metrics pipeline
      receivers:
      - hostmetrics
      processors:
      - resourcedetection/detect-host-name
      exporters:
      - otlp
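
To try the config above, save it as config.yaml and point a collector binary at it (the binary name depends on the distribution, e.g. otelcol or otelcol-contrib):

otelcol --config=config.yaml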

Data Pipeline

# Component IDs use the type[/name] convention: metrics/kafka is a second metrics
# pipeline, otlp/kafka a second (differently configured) otlp exporter.
service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [resourcedetection/detect-host-name]
      exporters: [otlp]

    metrics/kafka:
      receivers: [kafka]
      exporters: [otlp/kafka]

    logs:
      receivers: [filelog]
      exporters: [otlp]

    logs/kafka:
      receivers: [kafka]
      exporters: [otlp, otlp/kafka]

    traces:
      receivers: [otlp, kafka]
      processors: [resourcedetection/detect-host-name]
      exporters: [otlp, otlp/kafka]

More Data Pipelines

OTel Collector distros
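
Building your own distro goes through the OpenTelemetry Collector Builder (ocb), driven by a manifest like the sketch below - component module versions are illustrative and must match the builder release you use:

# builder-config.yaml - input for ocb
dist:
  name: otelcol-custom
  description: Collector with only the components we actually need
  output_path: ./otelcol-custom

receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.96.0
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver v0.96.0

processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.96.0
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/resourcedetectionprocessor v0.96.0

exporters:
  - gomod: go.opentelemetry.io/collector/exporter/otlpexporter v0.96.0

Running the builder with --config builder-config.yaml then produces the custom binary.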

Canonical Observability Stack

K8s observability

Logs

Metrics

Traces

Metadata

Kubernetes Observability 1.0

Traces

~March 2020

data flood

Metadata

Fluentd:
Battle tested
Single threaded
Weak performance
Ruby magic

OpenTelemetry Collector:
K8s Attributes in beta
Go-lang performance
Removed backpressure from Prometheus' remote-write
Lowered Prometheus' memory

February 2022
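
For reference, the K8s Attributes (k8sattributes) processor that took over the metadata enrichment is configured roughly like this - a minimal sketch, with the attribute list and pod association trimmed down:

processors:
  k8sattributes:
    auth_type: serviceAccount       # query the API server with the pod's service account
    extract:
      metadata:                     # resource attributes to attach to telemetry
        - k8s.namespace.name
        - k8s.pod.name
        - k8s.node.name
    pod_association:                # how records are matched back to a pod
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip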

Metadata: CPU sum
Fluentd: ~38 CPUs
OTel Collector: ~13 CPUs

Metadata: memory sum
Fluentd: ~220G of RAM
OTel Collector: ~75G of RAM
... and memory after more tweaks? ~11G - 20x less

Metadata: instances
Fluentd: ~85
OTel Collector: ~20
... and instances after more tweaks? ~11 instances - 8x less

Logs

Fluent Bit:
Great CPU usage
Great memory usage
Hard to debug issues
Didn't support metrics and traces at the time (it does now)

OpenTelemetry Collector:
Filelog Receiver in beta
Great CPU usage
Reasonable memory usage
No major feature missing

June 2022
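
The Filelog Receiver side of that move boils down to tailing the kubelet-managed log files; a minimal sketch (paths and any parsing operators depend on the container runtime):

receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log                     # pod logs written by the kubelet
    exclude:
      - /var/log/pods/*/otel-collector/*.log        # skip our own logs; container name is an assumption
    start_at: beginning                             # read existing files on first start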

Metrics

Prometheus:
Remote write
Load balancing
Memory usage

OpenTelemetry Collector:
Prometheus receiver in beta
Metric names quirks
Small resource usage

September 2023
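
The Prometheus receiver embeds a standard Prometheus scrape configuration, so existing scrape_configs carry over almost verbatim; a minimal sketch (job name and target are placeholders):

receivers:
  prometheus:
    config:                                   # regular Prometheus config goes here
      scrape_configs:
        - job_name: kube-state-metrics        # placeholder job
          scrape_interval: 30s
          static_configs:
            - targets: ["kube-state-metrics.kube-system:8080"]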

Metrics: CPU sum
Prometheus: ~0.7 CPUs
OTel Collector: ~0.15 CPUs

Metrics: memory sum
Prometheus: ~7.5G of RAM
OTel Collector: ~1.6G of RAM

K8s observability - OTel edition

Logs

Metrics

Traces

Metadata

Kubernetes Observability 2.0

OTC issues

Low hanging fruit bugs in 2022

State as of 2023

K8s telemetry collection

pull vs push

Some more #OpenTelemetry

Community

Thank you!

Marcin "Perk" Stożek

@marcinstozek / perk.pl
