Monitoring Sucks
And what you can do about it
Me
Francis Kayiwa kayiwa@
I want to convince you that...
- Monitoring running applications is interesting
- Most monitoring tools suck
What we have
Clunky user interfaces
Verbose configuration

Long check intervals

Host centric world view
Not just me
A tweet
An IRC room: ##monitoringsucks
A Twitter hashtag: #monitoringsucks

A GitHub repository
What we want
(really really want)

Metrics and graphs
System AND Business Data
Logstreams

APIs
Alerts

Dashboards
Goings on
(Just a quick sample)
Naming things (is hard)
- metric - a numeric or boolean data point
- context - metadata about a metric
- resource - the source of a metric
- event - metric combined with context
- action - a response to a given metric
- collection - getting the metrics
- event processing - taking action
- presentation - graphs, emails, dashboards etc
- analytics - correlation
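One way to make the vocabulary above concrete is a minimal Python sketch. The class names, the `process` helper, the `disk.used_pct` metric, and the hosts `web01`/`web02` are all hypothetical and chosen only for illustration; collection is faked with hard-coded events.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass(frozen=True)
class Metric:
    name: str
    value: Union[float, bool]  # a numeric or boolean data point


@dataclass(frozen=True)
class Event:
    """An event is a metric combined with its context."""
    metric: Metric
    resource: str  # the source of the metric
    context: Dict[str, str] = field(default_factory=dict)  # metadata about the metric


@dataclass
class Action:
    """An action is a response to a given metric: a predicate plus a handler."""
    applies: Callable[[Event], bool]
    respond: Callable[[Event], str]


def process(events: List[Event], actions: List[Action]) -> List[str]:
    """Event processing: run every matching action and collect the responses."""
    return [a.respond(e) for e in events for a in actions if a.applies(e)]


# Collection is faked here with two hard-coded (made-up) events.
events = [
    Event(Metric("disk.used_pct", 93.0), "web01", {"mount": "/var"}),
    Event(Metric("disk.used_pct", 41.0), "web02", {"mount": "/var"}),
]
alert_on_full_disk = Action(
    applies=lambda e: e.metric.name == "disk.used_pct" and e.metric.value > 90,
    respond=lambda e: f"ALERT {e.resource} {e.context['mount']} at {e.metric.value}%",
)
alerts = process(events, [alert_on_full_disk])
print(alerts)
```

Presentation and analytics would consume the same `Event` objects; they are left out to keep the sketch short.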
Sharing setups
Low latency message based tools

Monitoring == Testing

Monitoring unit tests
monitors.txt
JSON example
Monitoring system agnostic
Open Source
Graphite
GDash

statsd
Ruby counter

Java counter
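For comparison, a counter can be sketched in Python with nothing but the standard library. statsd counters are plain-text UDP datagrams of the form `name:count|c`; the function names, the metric name `app.logins`, and the default host and port (8125 is the usual statsd port) are assumptions for illustration, not code from the original slides.

```python
import socket


def format_counter(name: str, count: int = 1) -> bytes:
    """Build a statsd counter datagram, e.g. b'app.logins:1|c'."""
    return f"{name}:{count}|c".encode("ascii")


def increment(name: str, count: int = 1,
              host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget: send one counter datagram over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_counter(name, count), (host, port))


# Usage (assumes a statsd daemon listening locally):
#   increment("app.logins")
```

Because the transport is UDP, the application does not block or fail when the statsd daemon is down, which is the main design appeal.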
Logster
Point at log files
Get metrics in Ganglia or Graphite
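The idea of turning log lines into metrics can be sketched as follows. This is not Logster's parser API; it is a standalone Python illustration that counts HTTP status classes and formats them in Graphite's plaintext protocol (`path value timestamp`, one metric per line). The log layout, the sample lines, and the `web01.http` prefix are made up for the example.

```python
import re
import time
from collections import Counter
from typing import Iterable, List, Optional

# Assumed log shape: the status code follows the quoted request, e.g. '" 200 '.
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_statuses(lines: Iterable[str]) -> Counter:
    """Count responses per status class (2xx, 4xx, ...)."""
    counts: Counter = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[f"{match.group(1)[0]}xx"] += 1
    return counts


def to_graphite(counts: Counter, prefix: str,
                timestamp: Optional[int] = None) -> List[str]:
    """Format counts as Graphite plaintext lines: 'path value timestamp'."""
    ts = int(time.time()) if timestamp is None else timestamp
    return [f"{prefix}.{key} {value} {ts}" for key, value in sorted(counts.items())]


# Made-up sample lines in a common access-log style.
sample = [
    '10.0.0.1 - - [01/Jan/2013:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2013:00:00:02 +0000] "GET /missing HTTP/1.1" 404 0',
    '10.0.0.3 - - [01/Jan/2013:00:00:03 +0000] "GET /about HTTP/1.1" 200 2048',
]
lines_out = to_graphite(count_statuses(sample), "web01.http", timestamp=1356998400)
print("\n".join(lines_out))
```

Each output line could then be written, newline-terminated, to a Graphite carbon listener (port 2003 by default).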
Graylog2
Logstash
Configuration management
Automate checks

Defined outside monitoring systems

Automate Graylog2 collection
Automate Logster collection
or Configuration Management
SaaS
(pay nice people for software)
New Relic
New Relic Dashboard
Librato metrics

Librato graphs
Splunk

Splunk dashboard
PagerDuty
PagerDuty scheduler

Boundary

Network traffic analyses
Takeaway
(if all you remember is)
Admit we have a problem
Lots of links
- http://github.com/monitoringsucks
- http://graylog2.org
- http://logstash.net
- https://github.com/etsy/logster
- https://github.com/etsy/statsd
- http://graphite.wikidot.com/
- http://monitorstxt.org/
- http://auxesis.github.io/cucumber-nagios/
Questions?
Jobs?
Monitoring Sucks
By Francis Kayiwa