Data analysis with Graphite
Graphite
- Storage
- UI
- Passive
- Math
StatsD
- Timeless metrics
- Raw -> Aggregate
- Statistical
- UDP*
- RAM only
some.metric.name:2|c|@0.1
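The sample line above follows the `name:value|type|@rate` shape. A minimal sketch of reading it, assuming only counters (`c`), gauges (`g`) and timers (`ms`) matter here; `parse_statsd_line` and `estimated_count` are hypothetical helper names, not part of StatsD itself:

```python
from typing import NamedTuple


class StatsdSample(NamedTuple):
    name: str
    value: float
    kind: str           # "c" counter, "g" gauge, "ms" timer
    sample_rate: float  # 1.0 when no "@rate" field is present


def parse_statsd_line(line: str) -> StatsdSample:
    """Split 'name:value|type[|@rate]' into its parts."""
    name, rest = line.strip().split(":", 1)
    fields = rest.split("|")
    rate = 1.0
    if len(fields) > 2 and fields[2].startswith("@"):
        rate = float(fields[2][1:])
    return StatsdSample(name, float(fields[0]), fields[1], rate)


def estimated_count(sample: StatsdSample) -> float:
    """Scale a sampled counter increment back up to the estimated true count."""
    return sample.value / sample.sample_rate


sample = parse_statsd_line("some.metric.name:2|c|@0.1")
print(sample, estimated_count(sample))
```

With a sample rate of 0.1 the client sends roughly one packet in ten, so an increment of 2 stands for an estimated 20.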
Riemann, Hekad
- Arbitrary aggregations
- Metrics from events
Dashboards
- Grafana
- GraphExplorer
- Tessera
- Cabot, Seyren
Too many to list
Short intro to Metrics
What's a metric?
- Timestamp
- Name
- Value
- Maybe other stuff (tags?)
Raw metric
- High data rate
- Expensive to transmit and store
- Can use for any calculation
Aggregate
- Compressed
- Biased towards certain usage
- Accuracy
- Data loss
Sampling
Gauges
Counters
Timers
Graphite
Short intro
Architecture
Protocols
- Line protocol - TCP, UDP
- Pickle protocol
- AMQP
host.service.subservice.something.blah 231 1438182493
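The line protocol is `path value timestamp`, one metric per line. A minimal sketch of producing and sending such a line over TCP, assuming carbon listens on its default plaintext port 2003 on `localhost` (both the host and the helper names are assumptions for illustration):

```python
import socket


def format_line(path: str, value, timestamp: int) -> str:
    """Build one plaintext-protocol line: '<path> <value> <timestamp>' plus newline."""
    return f"{path} {value} {timestamp}\n"


def send_line(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send one line to a carbon listener over TCP (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


line = format_line("host.service.subservice.something.blah", 231, 1438182493)
print(line, end="")
# send_line(line)  # uncomment when a carbon listener is actually running
```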
What if we have multiple points in the same interval?
Graphite takes the last one
Use prefixes to protect
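A simplified model of why the last point wins: each timestamp is rounded down to the start of its interval, and a later write to the same slot overwrites the earlier one. This is a sketch of the behaviour, not Graphite's storage code; `store` is a hypothetical helper and the 15 second step is an assumption:

```python
def store(points, step):
    """Keep one value per (name, interval slot); later writes overwrite earlier ones."""
    slots = {}
    for name, timestamp, value in points:
        slot = timestamp - (timestamp % step)  # round down to the interval start
        slots[(name, slot)] = value
    return slots


# Two hosts report under the same name within one 15 s interval: one point is lost.
shared = store([("app.requests", 1438182493, 5),
                ("app.requests", 1438182494, 7)], step=15)

# A per-host prefix keeps the series apart, so both points survive.
prefixed = store([("web1.app.requests", 1438182493, 5),
                  ("web2.app.requests", 1438182494, 7)], step=15)

print(shared)
print(prefixed)
```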
Storage schema
- Multiple periods and resolutions
- Downsample with an aggregation function
- AVG, MIN, MAX, LAST
- 12 bytes per point (+ change)
- Preallocated
[all_min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min
[apache_busyWorkers]
pattern = ^servers\.www.*\.workers\.busyWorkers$
retentions = 15s:7d,1m:21d,15m:5y
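Because files are preallocated, the retention string fixes the disk cost per metric up front. A rough sketch of the arithmetic, assuming 12 bytes per point, ignoring the small file headers (the "+ change"), treating a year as 365 days, and handling only the `precision:duration` form with unit suffixes; the helper names are hypothetical:

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(text: str) -> int:
    """Convert '15s', '7d', '5y' and so on into seconds."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def parse_retentions(spec: str):
    """Return one (step_seconds, number_of_points) pair per archive."""
    archives = []
    for part in spec.split(","):
        precision, duration = part.strip().split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


def estimated_bytes(archives, bytes_per_point: int = 12) -> int:
    """Approximate preallocated size of one metric file, headers excluded."""
    return sum(points for _, points in archives) * bytes_per_point


archives = parse_retentions("15s:7d,1m:21d,15m:5y")
print(archives, estimated_bytes(archives))
```

For the `apache_busyWorkers` schema this comes to 245,760 points, a little under 3 MB per metric, allocated as soon as the first point arrives.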
Finding a needle in a haystack
Average is mean to me
Percentiles, StdDev
- The birthday paradox
- p99
- p50 - median
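A small illustration of why the average misleads on skewed data while percentiles do not. The latency numbers are made up for the example, and `percentile` uses the simple nearest-rank definition, which is only one of several in common use:

```python
import math
import statistics


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Invented data: 98 fast requests and 2 very slow ones, in milliseconds.
latencies_ms = [10] * 98 + [2000, 3000]

mean = statistics.mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
stddev = statistics.pstdev(latencies_ms)

print(mean, p50, p99, stddev)
```

The mean lands near 60 ms, a latency no request actually had; the median shows the typical request (10 ms) and p99 exposes the slow tail (2000 ms).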
Using graphite
How to get stuff out
- json
- csv
- pickle
- svg
- png
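The output format is chosen with the `format` parameter of the render API. A minimal sketch of building such a URL; the base URL and metric name are placeholders, and `render_url` is a hypothetical helper:

```python
from urllib.parse import urlencode


def render_url(base: str, target: str, fmt: str = "json", since: str = "-1h") -> str:
    """Build a /render URL for one target, time range and output format."""
    query = urlencode({"target": target, "from": since, "format": fmt})
    return f"{base.rstrip('/')}/render?{query}"


url = render_url("http://graphite.example.com",
                 "servers.www01.workers.busyWorkers", fmt="csv")
print(url)
```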
Events
- Events API
- drawAsInfinite + timeseries
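A sketch of both approaches: the JSON body for posting an event, and a render target that draws tagged events as vertical lines. The field names (`what`, `tags`, `when`, `data`) follow the events API as commonly documented; whether `tags` is a string or a list depends on the Graphite version, so treat the string form here as an assumption. The helper names are hypothetical:

```python
import json


def event_payload(what: str, tags: str, when: int, data: str = "") -> str:
    """JSON body for a POST to the /events/ endpoint (tags as a string is an assumption)."""
    return json.dumps({"what": what, "tags": tags, "when": when, "data": data})


def event_target(tag: str) -> str:
    """Render target that draws every event carrying the tag as a vertical line."""
    return f'drawAsInfinite(events("{tag}"))'


payload = event_payload("deploy v1.2", "deploy", 1438182493)
print(payload)
print(event_target("deploy"))
```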
Downsampling issues
consolidateBy()
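When a graph has fewer pixels than data points, several points are merged into one, and the merge function decides what survives. A simplified sketch of that effect (real consolidation also deals with missing values and alignment, which are ignored here; `consolidate` is a hypothetical helper):

```python
def consolidate(values, bucket_size, func):
    """Merge each run of bucket_size points into one point using func."""
    return [func(values[i:i + bucket_size])
            for i in range(0, len(values), bucket_size)]


def average(chunk):
    return sum(chunk) / len(chunk)


series = [1, 1, 1, 50, 1, 1, 1, 1]  # one spike among quiet points

by_average = consolidate(series, 4, average)
by_max = consolidate(series, 4, max)

print(by_average)
print(by_max)
```

Averaging flattens the spike of 50 down to 13.25, while consolidating by maximum keeps it visible.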
Cleanup noise
movingAverage()
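The idea behind a moving average, sketched over a plain list. This version emits `None` until the window is full, which is a simplification chosen for the example rather than a description of Graphite's exact behaviour:

```python
def moving_average(values, window):
    """Average of the last `window` points; None until enough points exist."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window:i + 1]
            out.append(sum(chunk) / window)
    return out


noisy = [10, 12, 50, 11, 9]
smoothed = moving_average(noisy, 3)
print(smoothed)
```

A wider window removes more noise but also delays and flattens real changes.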
Drawing threshold
constantLine()
Correlating
- Second Y axis
- flot
- Scaling
Working with counters
- derivative()
- nonNegativeDerivative()
- scaleToSeconds()
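What the three functions do to an ever-growing counter, as a simplified sketch on plain lists. The counter values and the 60 second step are invented, and the counter wrap handling of the real functions is left out:

```python
def derivative(values):
    """Difference between consecutive points; the first point has no predecessor."""
    return [None] + [cur - prev for prev, cur in zip(values, values[1:])]


def non_negative_derivative(values):
    """Like derivative, but a drop (such as a counter reset) becomes None."""
    out = [None]
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev
        out.append(delta if delta >= 0 else None)
    return out


def scale_to_seconds(values, step, seconds=1):
    """Convert per-interval deltas into a rate per `seconds`."""
    return [None if v is None else v * seconds / step for v in values]


counter = [100, 160, 220, 10, 70]  # the process restarts before the fourth point

deltas = derivative(counter)
safe_deltas = non_negative_derivative(counter)
per_second = scale_to_seconds(safe_deltas, step=60)

print(deltas)
print(safe_deltas)
print(per_second)
```

The plain derivative shows a misleading -210 at the restart; the non-negative variant drops that point, and scaling turns 60 per minute into 1 per second.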
Correlating multiple series
- MostDeviant
- Highest
- Lowest
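A sketch of how such filters pick a few interesting series out of many. Ranking by standard deviation mirrors the idea behind `mostDeviant()`; ranking by maximum is just one of several possible meanings of "highest", chosen here as an assumption. The data and helper names are invented:

```python
import statistics


def most_deviant(series, n):
    """Names of the n series with the largest standard deviation."""
    ranked = sorted(series, key=lambda name: statistics.pstdev(series[name]),
                    reverse=True)
    return ranked[:n]


def highest_by_max(series, n):
    """Names of the n series with the largest maximum value."""
    ranked = sorted(series, key=lambda name: max(series[name]), reverse=True)
    return ranked[:n]


servers = {
    "web1": [10, 11, 10, 12],
    "web2": [10, 90, 5, 60],
    "web3": [20, 20, 20, 20],
}

print(most_deviant(servers, 1))
print(highest_by_max(servers, 2))
```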
Data analysis with Graphite
By Avishai Ish-Shalom