Monitoring Sucks

And what you can do about it

Me









Francis Kayiwa kayiwa@


I want to convince you that...

  • Monitoring running applications is interesting
  • Most monitoring tools suck






What we have


Clunky user interfaces



Verbose configuration


Long check intervals


Host centric world view






Not just me



A tweet







##monitoringsucks


an IRC room







#monitoringsucks


A Twitter hashtag


A github repository


What we want


(really really want)




Metrics and graphs


System AND Business Data


Logstreams

APIs


Alerts


Dashboards



Goings on


(Just a quick sample)


Naming things (is hard)


  • metric - a numeric or boolean data point
  • context - metadata about a metric
  • resource - the source of a metric
  • event - metric combined with context
  • action - a response to a given metric
  • collection - getting the metrics
  • event processing - taking action
  • presentation - graphs, emails, dashboards etc
  • analytics - correlation
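
One way to make the vocabulary above concrete is a small data model. This is only a sketch: the class names, field names, and the threshold rule are assumptions for illustration and are not taken from any particular monitoring tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Union


@dataclass(frozen=True)
class Metric:
    """metric: a numeric or boolean data point."""
    name: str
    value: Union[float, bool]


@dataclass(frozen=True)
class Event:
    """event: a metric combined with its resource and context."""
    metric: Metric
    resource: str                                           # the source of the metric
    context: Dict[str, str] = field(default_factory=dict)   # metadata about the metric


# action: a response to a given metric (hypothetical signature)
Action = Callable[[Event], str]


def notify(event: Event) -> str:
    """Example action: build an alert message."""
    return f"ALERT {event.resource} {event.metric.name}={event.metric.value}"


def process(event: Event, threshold: float, action: Action) -> Optional[str]:
    """event processing: take the action when the value exceeds the threshold."""
    if float(event.metric.value) > threshold:
        return action(event)
    return None
```

For example, `process(Event(Metric("disk.used_pct", 93.0), "web01", {"mount": "/"}), 90.0, notify)` returns an alert string, while a threshold of 95.0 returns `None`.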



Sharing setups


Low latency message based tools

Monitoring == Testing

Monitoring unit tests



monitors.txt


JSON example


Monitoring system agnostic




Open Source


Graphite


GDash

statsd


Ruby counter


Java counter
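
A counter can also be sketched in Python, assuming statsd's plain-text UDP format (`name:value|c`, with an optional `|@rate` sample-rate suffix) and its default port 8125. The function names and the metric name `app.logins` are made up for illustration; this is not a statsd client library.

```python
import socket


def format_counter(name, delta=1, sample_rate=1.0):
    """Build a statsd counter payload such as 'app.logins:1|c'."""
    payload = f"{name}:{delta}|c"
    if sample_rate < 1.0:
        # The sample rate tells statsd how to scale the count back up.
        payload += f"|@{sample_rate}"
    return payload


def send_counter(name, delta=1, host="127.0.0.1", port=8125):
    """Fire-and-forget: send one counter increment over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_counter(name, delta).encode("ascii"), (host, port))
```

Calling `send_counter("app.logins")` from application code increments the counter. Because the transport is UDP, the call does not wait for a reply from the statsd daemon.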


Logster


Point at log files


Get metrics in Ganglia or Graphite
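
The log-to-metrics idea can be sketched as follows. This is not Logster's parser API. It is a standalone illustration that assumes Apache-style access log lines and Graphite's plaintext format (`path value timestamp`, one metric per line). The regular expression and the `web01.apache` prefix are assumptions.

```python
import re
from collections import Counter

# Matches the status code that follows the quoted request in an access log line.
STATUS_RE = re.compile(r'" (\d{3}) ')


def count_status_classes(lines):
    """Count responses per status class (2xx, 4xx, ...) across log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[f"http_{match.group(1)[0]}xx"] += 1
    return counts


def to_graphite(counts, prefix, timestamp):
    """Render counts as Graphite plaintext lines, sorted by metric name."""
    return [f"{prefix}.{name} {value} {timestamp}"
            for name, value in sorted(counts.items())]
```

The resulting lines could then be written, newline-terminated, to a Graphite carbon listener (plaintext port 2003 by default).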


Graylog2


Logstash


Configuration management


Automate checks



Defined outside monitoring systems


Automate graylog collection


Automate logster collection


or Configuration Management




SaaS

(pay nice people for software)


New Relic


New Relic Dashboard


Librato metrics


Librato graphs



Splunk



Splunk dashboard


PagerDuty


PagerDuty scheduler



Boundary


Network traffic analysis




Takeaway


(if all you remember is)




Admit we have a problem


Lots of links

  • http://github.com/monitoringsucks
  • http://graylog2.org
  • http://logstash.net
  • https://github.com/etsy/logster
  • https://github.com/etsy/statsd
  • http://graphite.wikidot.com/
  • http://monitorstxt.org/
  • http://auxesis.github.io/cucumber-nagios/

Questions?


Jobs?

Monitoring Sucks

By Francis Kayiwa
