Distributed logging

What are the problems with logging in distributed systems?

  •  Many log sources
  •  Large volume of data
  •  Inconsistency
  •  Crashes => data loss
  •  Hard to trace a specific event
  •  Security (X-Pack)

Logstash basically does 3 things:

  1.  Consume data
  2.  Transform data
  3.  Send data
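
The three steps above can be sketched as a chain of generators. This is only an illustration of the consume / transform / send flow in Python, not how Logstash itself is implemented; the function names, the "LEVEL text" line format, and the sink callback are all assumptions made for the example.

```python
import json
from typing import Callable, Dict, Iterable, Iterator

Event = Dict[str, str]


def consume(lines: Iterable[str]) -> Iterator[Event]:
    """Input stage: wrap each raw line in an event."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def transform(events: Iterable[Event]) -> Iterator[Event]:
    """Filter stage: split an assumed 'LEVEL text' line into two fields."""
    for event in events:
        level, _, text = event["message"].partition(" ")
        yield {**event, "level": level, "text": text}


def send(events: Iterable[Event], sink: Callable[[str], None]) -> int:
    """Output stage: serialize each event as JSON and pass it to the sink."""
    count = 0
    for event in events:
        sink(json.dumps(event, sort_keys=True))
        count += 1
    return count


def run_pipeline(lines: Iterable[str], sink: Callable[[str], None]) -> int:
    """Wire the three stages together; returns the number of events sent."""
    return send(transform(consume(lines)), sink)


if __name__ == "__main__":
    # Hypothetical input lines; print stands in for a real output plugin.
    run_pipeline(["ERROR disk full", "INFO started"], print)
```

Because each stage is lazy, events flow through one at a time, so the whole input never has to fit in memory.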

Inputs (1XX plugins):

  • Windows event logs
  • Syslogs
  • Files
  • HTTP, TCP
  • Webhooks
  • RabbitMQ, Kafka ...
  • Redis, MongoDB ...

Filters are used for transforming data:

  • anonymize:
    Replaces field values with a consistent hash
  • geoip:
    Adds location information to the event based on an IP address
  • xml:
    Parses XML into event fields
  • i18n:
    Removes special characters
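
To show what "a consistent hash" means for the anonymize filter, here is a minimal Python sketch using a keyed hash (HMAC-SHA256). The function name, the field names, and the key are assumptions made for the example; the real filter is configured in a Logstash pipeline file, not written like this.

```python
import hashlib
import hmac
from typing import Dict, Iterable


def anonymize(event: Dict[str, str], fields: Iterable[str], key: bytes) -> Dict[str, str]:
    """Return a copy of the event with the given fields replaced by a keyed hash.

    The same input value and key always give the same digest, so events
    can still be grouped by the field without exposing the original value.
    """
    result = dict(event)
    for name in fields:
        if name in result:
            value = str(result[name]).encode("utf-8")
            result[name] = hmac.new(key, value, hashlib.sha256).hexdigest()
    return result


if __name__ == "__main__":
    # Hypothetical event and key, for illustration only.
    event = {"user": "alice", "message": "login ok"}
    print(anonymize(event, ["user"], key=b"example-secret"))
```

Using a secret key rather than a plain hash makes it harder to recover short or guessable values by hashing candidates.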

Output plugins: same as the input plugins ;)

 
  • Documents and indexes
    Inverted index, Lucene
  • Analyzer:
    Tokenizing, Normalizing
  • Mapping
  • Shards and Replicas
    Master, data and client nodes
    Log rolling, hot and cold logs
  • Aggregations:
    Metrics, Buckets, Pipeline
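
The analyzer and inverted index items above can be illustrated with a small Python sketch. It is a toy model, not Lucene's actual data structures: the analyzer only splits on non-alphanumeric characters and lowercases, and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def analyze(text: str) -> List[str]:
    """Analyzer: tokenize on non-alphanumeric characters, normalize to lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each term to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return the ids of documents that contain every term of the query."""
    terms = analyze(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    # Hypothetical log messages keyed by document id.
    docs = {1: "Disk full on node-1", 2: "Node restarted", 3: "disk OK"}
    index = build_inverted_index(docs)
    print(search(index, "DISK"))
```

The query goes through the same analyzer as the documents, which is why a search for "DISK" matches text indexed as "disk".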

Elastic stack

By Corneliu Caraman