Monitoring Sucks
And what you can do about it
Me
Francis Kayiwa kayiwa@
I want to convince you that...
- Monitoring running applications is interesting
- Most monitoring tools suck
What we have
Clunky user interfaces
Verbose configuration
Long check intervals
Host centric world view
Not just me
A tweet
An IRC room
##monitoringsucks
A Twitter hashtag
#monitoringsucks
A GitHub repository
What we want
(really really want)
Metrics and graphs
System AND Business Data
Logstreams
APIs
Alerts
Dashboards
Goings on
(Just a quick sample)
Naming things (is hard)
- metric - a numeric or boolean data point
- context - metadata about a metric
- resource - the source of a metric
- event - metric combined with context
- action - a response to a given metric
- collection - getting the metrics
- event processing - taking action
- presentation - graphs, emails, dashboards etc
- analytics - correlation
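The vocabulary above can be sketched as a small data model. This is an illustration only: the class names, field names, and the `process` helper are assumptions made for this sketch, not part of any particular monitoring tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Metric:
    """metric: a numeric or boolean data point, tagged with its resource."""
    name: str
    value: Union[float, bool]
    resource: str  # resource: the source of the metric, e.g. a host name


@dataclass
class Event:
    """event: a metric combined with context (metadata about the metric)."""
    metric: Metric
    context: Dict[str, str] = field(default_factory=dict)


# A rule pairs a predicate with an action (a response to a given metric).
Rule = Tuple[Callable[[Event], bool], Callable[[Event], str]]


def process(event: Event, rules: List[Rule]) -> List[str]:
    """event processing: run every action whose predicate matches the event."""
    return [action(event) for predicate, action in rules if predicate(event)]
```

Collection would build `Event` objects, event processing would call `process`, and presentation would consume whatever the actions return.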
Sharing setups
Low latency message based tools
Monitoring == Testing
Monitoring unit tests
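One way to read "Monitoring == Testing": a check is just an assertion about a running system. A minimal sketch using Python's standard `unittest`; the 10% threshold and the `disk_free_fraction` helper are hypothetical choices for illustration.

```python
import shutil
import unittest


def disk_free_fraction(path="/"):
    """Return the free space on the filesystem holding `path` as a 0..1 fraction."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


class DiskCheck(unittest.TestCase):
    MIN_FREE = 0.10  # hypothetical threshold: fail when under 10% is free

    def test_root_has_free_space(self):
        self.assertGreaterEqual(disk_free_fraction("/"), self.MIN_FREE)
```

Saved as a module, this runs with `python -m unittest`, and a failing test plays the role of an alert.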
monitors.txt
JSON example
Monitoring system agnostic
Open Source
Graphite
GDash
statsd
Ruby counter
Java counter
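A counter is cheap to send because the statsd wire format is a single short text line over UDP: `name:value|c`, with an optional `|@rate` suffix for sampling. A minimal Python sketch, assuming a statsd daemon on its default UDP port 8125; the function names are made up for this example, and a real client library would normally be used instead.

```python
import socket


def format_counter(name, value=1, sample_rate=1.0):
    """Build a statsd counter datagram, e.g. b"logins:1|c"."""
    payload = f"{name}:{value}|c"
    if sample_rate < 1.0:
        # Tell statsd the counter was sampled so it can scale the value up.
        payload += f"|@{sample_rate}"
    return payload.encode("ascii")


def send_counter(name, value=1, host="127.0.0.1", port=8125):
    """Fire-and-forget: UDP means the app never blocks on the metrics daemon."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(name, value), (host, port))
    finally:
        sock.close()
```

Calling `send_counter("app.logins")` from application code increments the counter by one.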
Logster
Point at log files
Get metrics in Ganglia or Graphite
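The idea can be sketched without Logster itself: read log lines, count something, and emit Graphite's plaintext format (`path value timestamp`, one metric per line, normally sent to TCP port 2003). The regex, metric names, and function names below are assumptions for illustration and are not Logster's parser API.

```python
import re
import time

# Matches the status code that follows the quoted request in a common-format
# access log line, e.g. ... "GET / HTTP/1.1" 200 2326
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_status_classes(lines):
    """Count HTTP responses per status class (2xx, 3xx, 4xx, 5xx)."""
    counts = {"2xx": 0, "3xx": 0, "4xx": 0, "5xx": 0}
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            key = match.group(1)[0] + "xx"
            if key in counts:
                counts[key] += 1
    return counts


def to_graphite(prefix, counts, timestamp=None):
    """Format counts as Graphite plaintext lines: '<path> <value> <timestamp>'."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return [f"{prefix}.http_{key} {value} {ts}"
            for key, value in sorted(counts.items())]
```

Joining the returned lines with newlines and writing them to a socket connected to the Graphite carbon listener would complete the pipeline.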
Graylog2
Logstash
Configuration management
Automate checks
Defined outside monitoring systems
Automate graylog collection
Automate logster collection
or Configuration Management
SaaS
(pay nice people for software)
New Relic
New Relic Dashboard
Librato metrics
Librato graphs
Splunk
Splunk dashboard
PagerDuty
PagerDuty scheduler
Boundary
Network traffic analysis
Takeaway
(if all you remember is)
Admit we have a problem
Lots of links
- http://github.com/monitoringsucks
- http://graylog2.org
- http://logstash.net
- https://github.com/etsy/logster
- https://github.com/etsy/statsd
- http://graphite.wikidot.com/
- http://monitorstxt.org/
- http://auxesis.github.io/cucumber-nagios/
Questions?
Jobs?
Monitoring Sucks
By Francis Kayiwa