Application Metrics & DevOps Awesomeness With Graphite and Grafana






Logging Analytics


Observability


Application Metrics





Who am I? 




Torkel Ödegaard




@torkelo
github.com/torkelo


Stockholm



Sweden





Coding Instinct

"we are survival machines - robot vehicles blindly programmed to preserve the selfish molecules known as genes" 



Open source metrics dashboard and graph editor for
Graphite, InfluxDB and OpenTSDB

Sponsors




Why?


Continuous delivery

  • Monitoring
  • Logging
  • Alerting
  • Analytics

Distributed systems


  • Isolated sub-systems / applications
  • Async messaging via queues
  • Many servers




Standard metrics solution (Windows)


Performance Counters








Metrics / Measurements

Metric vs Log Event



MetricKey    Value   Timestamp





Graphite


  • Open source scalable time series database
  • Composed of 3 components
    • Carbon  - receives and records metrics
    • Whisper - Storage engine
    • Graphite-web - HTTP frontend
  • Large community
  • Written in Python






Input

prod.apps.server-1.counter.login.count   10    1398969187
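
The input line above is one record in Carbon's plaintext protocol: metric path, value and Unix timestamp, separated by whitespace and terminated by a newline. A minimal sketch of producing and sending such a line, assuming a Carbon listener on `localhost` at the default plaintext port 2003 (the host and the helper names `format_metric` and `send_metric` are illustrative, not part of any library):

```python
import socket
import time


def format_metric(path, value, timestamp):
    """Build one newline-terminated line: '<path> <value> <timestamp>'."""
    return f"{path} {value} {timestamp}\n"


def send_metric(path, value, timestamp=None, host="localhost", port=2003):
    """Send one metric to Carbon over TCP (host and port are assumptions)."""
    if timestamp is None:
        timestamp = int(time.time())
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Formatting only; nothing goes over the network until send_metric() is called.
print(format_metric("prod.apps.server-1.counter.login.count", 10, 1398969187), end="")
```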

Query
prod.apps.*.counter.login.count
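
In a query like the one above, `*` stands for exactly one dot-separated node, so it selects the same counter on every server. A rough sketch of that matching rule follows; it only approximates Graphite's own matcher (which supports more syntax), and `matches` is a hypothetical helper name:

```python
from fnmatch import fnmatchcase


def matches(pattern, metric):
    """Compare pattern and metric node by node; a wildcard never crosses a dot."""
    pattern_nodes = pattern.split(".")
    metric_nodes = metric.split(".")
    if len(pattern_nodes) != len(metric_nodes):
        return False
    return all(fnmatchcase(m, p) for p, m in zip(pattern_nodes, metric_nodes))


query = "prod.apps.*.counter.login.count"
print(matches(query, "prod.apps.server-1.counter.login.count"))
print(matches(query, "prod.apps.server-1.counter.logout.count"))
```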



Functions!


sumSeries(apps.mysite.*.counter.login.count)

summarize(apps.mysite.*.counter.login.count, '1h')

movingAverage(apps.mysite.*.counter.login.count, 10)

timeShift(apps.mysite.*.counter.login.count, '7d')
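
Function expressions like these are passed as the `target` parameter of graphite-web's `/render` endpoint. A small sketch of building such a URL, assuming a hypothetical host `graphite.example.com` (the helper `render_url` is illustrative):

```python
from urllib.parse import urlencode


def render_url(base_url, target, since="-1h", fmt="json"):
    """Build a /render URL for one target expression."""
    query = urlencode({"target": target, "from": since, "format": fmt})
    return f"{base_url}/render?{query}"


url = render_url(
    "http://graphite.example.com",  # hypothetical host
    "sumSeries(apps.mysite.*.counter.login.count)",
)
print(url)
```

`urlencode` escapes the parentheses and the `*`, so the expression arrives at the server intact.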

Metric Libraries







Metric types


  • Counters
  • Timers
  • Gauges


Metric.Increment("user.login");            


Metric.Time("auction_search", 142);            


Metric.Time("auction_search", () => search());            
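
To make the three calls above concrete, here is a minimal in-memory sketch of what a metrics library might do behind them. It is written in Python for illustration and is not the library shown on the slide; the class `Metrics` and its method names are assumptions:

```python
import time
from collections import defaultdict


class Metrics:
    """Tiny in-memory registry for counters, timers and gauges."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.timers = defaultdict(list)
        self.gauges = {}

    def increment(self, key, amount=1):
        self.counters[key] += amount

    def time(self, key, millis):
        """Record an already measured duration in milliseconds."""
        self.timers[key].append(millis)

    def timed(self, key, func):
        """Run func, record how long it took, and return its result."""
        start = time.perf_counter()
        try:
            return func()
        finally:
            self.time(key, (time.perf_counter() - start) * 1000.0)

    def gauge(self, key, value):
        """Store the latest value, replacing the previous one."""
        self.gauges[key] = value


metrics = Metrics()
metrics.increment("user.login")
metrics.time("auction_search", 142)
result = metrics.timed("auction_search", lambda: sum(range(1000)))
```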
    
Graphite writer

apps.devsum.server-01.counters.auction_search.count   15    123123123131
apps.devsum.server-02.counters.auction_search.count    1    123123123131
apps.devsum.server-03.counters.auction_search.count   35    123123123131

apps.devsum.server-01.timers.auction_search.count    5    123123123131
apps.devsum.server-01.timers.auction_search.mean    10    123123123131
apps.devsum.server-01.timers.auction_search.max     50    123123123131
apps.devsum.server-01.timers.auction_search.min      2    123123123131
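
A writer that emits lines like the timer rows above could, roughly, reduce the samples collected during one flush interval to summary statistics and print one line per statistic. A sketch under that assumption (`timer_lines` is a hypothetical helper and the sample values are made up):

```python
def timer_lines(prefix, name, samples, timestamp):
    """Summarise one timer's samples into Graphite plaintext lines."""
    stats = {
        "count": len(samples),
        "mean": sum(samples) / len(samples),
        "max": max(samples),
        "min": min(samples),
    }
    return [
        f"{prefix}.timers.{name}.{stat} {value} {timestamp}"
        for stat, value in stats.items()
    ]


for line in timer_lines("apps.devsum.server-01", "auction_search",
                        [2, 4, 6, 8, 30], 1398969187):
    print(line)
```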
    

    

Demo



Graphite intro





play.grafana.org

Graphite configuration



[apps]
pattern = ^apps\.
retentions = 10s:6h,1min:7d,10min:5y

[highres]
pattern = ^highres\.
retentions = 1s:6h,1min:1d

[statsd]
pattern = ^statsd\.
retentions = 1min:1d,10min:1y
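
Each `precision:duration` pair in a retention definition fixes how many data points that archive holds, which in turn drives disk usage per metric. A back-of-the-envelope sketch, assuming a year of 365 days and 12 bytes per stored point (the size estimate is an assumption and ignores file headers; the helper names are illustrative):

```python
UNIT_SECONDS = {"s": 1, "m": 60, "min": 60, "h": 3600, "d": 86400, "y": 86400 * 365}
POINT_BYTES = 12  # assumed size of one stored data point


def to_seconds(spec):
    """Convert a value such as '10s' or '1min' to seconds."""
    digits = "".join(ch for ch in spec if ch.isdigit())
    unit = spec[len(digits):]
    return int(digits) * UNIT_SECONDS[unit]


def archive_points(retentions):
    """Return the number of data points in each archive of a retention string."""
    points = []
    for archive in retentions.split(","):
        precision, duration = archive.strip().split(":")
        points.append(to_seconds(duration) // to_seconds(precision))
    return points


points = archive_points("10s:6h,1min:7d,10min:5y")
print(points, sum(points) * POINT_BYTES, "bytes per metric (approx.)")
```

With these assumptions the first definition comes to about 3.3 MB per metric, almost all of it in the five-year archive.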



    

Time measurements



Average is not good enough!


5
7
2
7
2400
20
15
10000
4
2

Avg = 1246
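
The average above can be checked in a few lines, and comparing it with the median shows why it misleads here: two outliers pull the mean far above what most requests experienced.

```python
from statistics import median

samples = [5, 7, 2, 7, 2400, 20, 15, 10000, 4, 2]

mean_value = sum(samples) / len(samples)
median_value = median(samples)
below_mean = sum(1 for s in samples if s < mean_value)

print(f"mean={mean_value} median={median_value} below mean={below_mean}/{len(samples)}")
```

Eight of the ten measurements are well below the mean.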

Percentiles


5
7
2
7
2400
20
15
10000
4
2

Percentiles

10000
2400
20
15
7
7
5
4
2
2

upper 20 = 2
upper 50 = 7
upper 70 = 15
upper 90 = 2400
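
The "upper N" values above are the largest measurement left after discarding everything beyond the lowest N percent. A small sketch of that calculation, in the style of StatsD's `upper_N` statistic (the rounding rule is an assumption and the helper name `upper` is illustrative):

```python
def upper(samples, pct):
    """Largest value among the lowest pct percent of the samples."""
    ordered = sorted(samples)
    # Number of samples to keep, rounded half up, at least one.
    keep = max(1, (len(ordered) * pct + 50) // 100)
    return ordered[keep - 1]


samples = [5, 7, 2, 7, 2400, 20, 15, 10000, 4, 2]
for pct in (20, 50, 70, 90):
    print(f"upper {pct} = {upper(samples, pct)}")
```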





StatsD
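
StatsD accepts small text datagrams of the form `key:value|type`, where the type is `c` for counters, `ms` for timers and `g` for gauges. A minimal sketch of a client, assuming a StatsD daemon on `localhost` at the default UDP port 8125 (the helper names are illustrative):

```python
import socket


def statsd_line(key, value, kind):
    """Format one StatsD datagram payload, e.g. 'user.login:1|c'."""
    return f"{key}:{value}|{kind}"


def send_statsd(line, host="localhost", port=8125):
    """Fire-and-forget UDP send (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


print(statsd_line("user.login", 1, "c"))
print(statsd_line("auction_search", 142, "ms"))
```

Because the transport is UDP, a send does not block the application if the daemon is down.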




More demo


Functions

timeShift
asPercent
summarize
integral
derivative
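
Two of these functions are easy to picture on a plain list of values: an integral is a running total, and a derivative is the change between consecutive points. A simplified sketch of both, treating `None` as a missing data point (an approximation of how Graphite behaves, not its actual code):

```python
def integral(values):
    """Running total; missing points stay missing."""
    total = 0
    result = []
    for value in values:
        if value is None:
            result.append(None)
        else:
            total += value
            result.append(total)
    return result


def derivative(values):
    """Difference from the previous point; None when either side is missing."""
    result = []
    previous = None
    for value in values:
        if value is None or previous is None:
            result.append(None)
        else:
            result.append(value - previous)
        previous = value
    return result


per_interval = [1, 2, 3, 4]
running_total = integral(per_interval)
print(running_total, derivative(running_total))
```

Applying `derivative` to an ever-growing counter recovers the per-interval change, apart from the first point.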

Display options

templating
annotations


Future of metrics



  • Metrics 2.0
  • Alerting 
  • Resolution

Metrics 2.0






prod.eu-01.webapp-01.counters.images.upload_bytes.count

Problems


  • Finding metrics
  • Understanding metrics
  • Metric unit?
  • Rate write?
  • Meta data
  • Change Agent

Metrics 2.0


prod.eu-01.webapp-01.counters.images.upload_bytes.count

{
  server: webapp-01,
  datacenter: eu-01,
  unit: bytes,
  rate: 10s,
  metric_type: counter,
  stat: images.upload
}
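
Once a metric is described by tags rather than by position in a dotted name, finding and grouping metrics becomes a matter of filtering on those tags. A small sketch of the idea, using the tag names from the example above (the `select` helper and the second metric are hypothetical):

```python
metrics = [
    {"server": "webapp-01", "datacenter": "eu-01", "unit": "bytes",
     "rate": "10s", "metric_type": "counter", "stat": "images.upload"},
    # Hypothetical second metric, added only to make the filter visible.
    {"server": "webapp-02", "datacenter": "us-01", "unit": "ms",
     "rate": "10s", "metric_type": "timer", "stat": "images.resize"},
]


def select(all_metrics, **wanted):
    """Return the metrics whose tags match every requested key/value pair."""
    return [
        metric for metric in all_metrics
        if all(metric.get(key) == value for key, value in wanted.items())
    ]


for metric in select(metrics, unit="bytes", metric_type="counter"):
    print(metric["server"], metric["stat"])
```

The unit and the metric type are explicit here, so a dashboard would not have to guess them from the name.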

Metrics 2.0




Conceptual model vs 
wire protocol vs 
storage










Metric resolution and alerting


Thanks!



@torkelo

@grafana

grafana.org

github.com/grafana/grafana



Copy of Application Metrics & DevOps Awesomeness With Graphite and Grafana, Metrics future

By torkelo
