Sensible monitoring with Sensu

Avishai Ish-Shalom (@nukemberg)

In the beginning

God created The Servers and The Network

Now the dashboard was formless and empty, darkness was over the surface of the deep, and the Spirit of God was hovering over the servers.

And God said:
"Let there be Monitoring", and there was Monitoring

God saw that the monitoring was good, and he separated the green from the red.

And God said

“Let the data under the system be gathered to one place, and let a single process appear.” And it was so. God called the single process “server,” and the gathered data he called “checks.” And God saw that it was good.

Let's skip ahead a bit...

Now the cloud

was more crafty than any of the wild platforms the Lord God had made. He said to the sysadmin, “Did God really say, ‘You must configure everything manually’?”

The sysadmin said

to the cloud, “We may automate the servers in the system, but God did say, ‘You must not automate the monitoring that is in the middle of the system, and you must not touch it, or it will break.’”

“It will not certainly crash,”

the Cloud said to the Sysadmin. “For God knows that when you automate it your eyes will be opened, and you will be like Google, knowing good and evil.”

OK... jokes aside, what's wrong with Nagios?

Nagios was built for a static world of Pet servers

The Nagios model

Checks executed on central server
Host oriented model
Every server needs to be configured
File based configurations

What's wrong?

Scalability
Configurations
API
Cluster aggregation

A wild Sensu appears

Architecture

Features

Auto client registration
API
Scalable
Built for automation
Aggregation
Support for time-series data

Uchiwa

Mergeable JSON config files

{
  "checks": {
    "check_disk_usage": {
      "command": "check-disk-usage.rb -w :::disk.warning|80::: -c :::disk.critical|90:::",
      "subscribers": [
        "production"
      ],
      "interval": 60,
      "environment": ":::environment|production:::"
    }
  }
}
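
The config above relies on two ideas: several JSON files are merged into one settings tree, and `:::key|default:::` tokens are filled in from the client's own attributes. The following is a minimal Python sketch of both ideas, not Sensu's actual implementation. The helper names (`deep_merge`, `lookup`, `substitute_tokens`) are hypothetical, and the rule that non-dict values are simply overwritten on merge is an assumption of this sketch.

```python
import re

# Matches :::dotted.key::: or :::dotted.key|default:::
TOKEN = re.compile(r":::([^:|]+?)(?:\|([^:]*))?:::")


def deep_merge(base, override):
    """Recursively merge two config dicts; values in `override` win.

    Assumption of this sketch: anything that is not a dict (including
    lists) is replaced rather than combined.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def lookup(client, dotted_key):
    """Walk a dotted key such as 'disk.warning' through nested dicts."""
    node = client
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def substitute_tokens(text, client):
    """Replace each token with the client attribute, or its default."""
    def replace(match):
        key, default = match.group(1), match.group(2)
        value = lookup(client, key)
        if value is None:
            if default is None:
                raise KeyError(f"unmatched token: {key}")
            return default
        return str(value)
    return TOKEN.sub(replace, text)
```

With a client that defines `{"disk": {"warning": 70}}`, the command from the slide would resolve to `check-disk-usage.rb -w 70 -c 90`: the warning threshold comes from the client and the critical threshold falls back to its default.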

Check plugins

Compatible with Nagios
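
Nagios compatibility comes down to the plugin convention: print one line of status text and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). Below is a rough sketch of a disk check that follows that convention. The function names and the default thresholds are illustrative assumptions, not taken from any existing plugin.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(used_percent, warning=80, critical=90):
    """Map a usage percentage to (exit_code, status_line)."""
    if used_percent >= critical:
        return CRITICAL, f"DISK CRITICAL - {used_percent:.0f}% used"
    if used_percent >= warning:
        return WARNING, f"DISK WARNING - {used_percent:.0f}% used"
    return OK, f"DISK OK - {used_percent:.0f}% used"


def main(path="/"):
    """Measure disk usage, print the status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    status, message = evaluate(100.0 * usage.used / usage.total)
    print(message)
    return status
```

A real plugin script would end by calling `sys.exit(main())`, so that the monitoring agent sees the status as the process exit code.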

API

$ curl -s http://127.0.0.1:4567/clients | jq .
[
  {
    "timestamp": 1458625739,
    "version": "0.25.0",
    "socket": {
      "port": 3030,
      "bind": "127.0.0.1"
    },
    "subscriptions": [
      "dev"
    ],
    "environment": "development",
    "address": "127.0.0.1",
    "name": "client-01"
  }
]

Client (de)registration

Auto register
Deregister on clean shutdown (optional)
Deregister by API call (DELETE /clients/:id)
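
As an illustration of the API-driven option, here is a minimal Python sketch that issues `DELETE /clients/:id` with the standard library. The base URL reuses the address from the `curl` example, and the function names are hypothetical. Authentication and error handling are left out.

```python
import urllib.request
from urllib.parse import quote


def build_deregister_request(client_name, api_base="http://127.0.0.1:4567"):
    """Build (but do not send) the DELETE request for one client."""
    url = f"{api_base}/clients/{quote(client_name, safe='')}"
    return urllib.request.Request(url, method="DELETE")


def deregister(client_name, api_base="http://127.0.0.1:4567", timeout=5):
    """Send the request and return the HTTP status code of the response."""
    request = build_deregister_request(client_name, api_base)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as `deregister("client-01")` could be hooked into an instance termination script, so that servers removed by autoscaling do not linger in the client list.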

Integrations

Notifications (Slack, PagerDuty)
Time-series data (Graphite, InfluxDB)
Events (ELK, Graylog, Slack)
CM (Puppet, Chef, Ansible)

Checks collect data and create events

Handlers process the events passed on by Filters, optionally after a Mutator transforms the event
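
The flow from filter to mutator to handler can be sketched in a few lines of Python. This is a conceptual model only: the event is reduced to a small dict with `client` and `check` keys, every function name is hypothetical, and a filter is assumed to return `True` when the event should be passed on.

```python
def process_event(event, filters, mutator, handler):
    """Run one event through the pipeline.

    Returns True if the handler was invoked, False if a filter dropped
    the event.
    """
    for keep in filters:
        if not keep(event):
            return False
    handler(mutator(event))
    return True


def is_problem(event):
    """Example filter: pass on only non-OK check results."""
    return event["check"]["status"] != 0


def to_message(event):
    """Example mutator: turn the event into a one-line message."""
    check = event["check"]
    return f'{event["client"]["name"]}/{check["name"]}: {check["output"]}'


# Example handler: collect messages in a list instead of paging anyone.
notifications = []

event = {
    "client": {"name": "client-01"},
    "check": {"name": "check_disk_usage", "status": 2, "output": "DISK CRITICAL"},
}
process_event(event, [is_problem], to_message, notifications.append)
```

Swapping `notifications.append` for a function that posts to a chat or paging service changes the destination without touching the filter or the mutator.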

Aggregates are pseudo checks that derive events from a group of checks
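
To make the idea concrete, the sketch below derives a single status from the results of one check across many clients. It assumes a simple percentage rule with made-up thresholds. The function names and the rule itself are illustrative assumptions, not a description of how Sensu computes aggregates.

```python
from collections import Counter

STATUS_NAMES = {0: "ok", 1: "warning", 2: "critical"}


def summarize(statuses):
    """Count check results per severity across a group of clients."""
    counts = Counter(STATUS_NAMES.get(status, "unknown") for status in statuses)
    summary = {name: counts.get(name, 0)
               for name in ("ok", "warning", "critical", "unknown")}
    summary["total"] = len(statuses)
    return summary


def aggregate_status(statuses, warning_pct=25, critical_pct=50):
    """Derive one exit status from the share of non-OK results."""
    summary = summarize(statuses)
    if summary["total"] == 0:
        return 3  # nothing reported: treat as UNKNOWN
    failing_pct = 100.0 * (summary["total"] - summary["ok"]) / summary["total"]
    if failing_pct >= critical_pct:
        return 2
    if failing_pct >= warning_pct:
        return 1
    return 0
```

With these thresholds, one failing web server out of ten stays green, while five out of ten turn the aggregate critical, which is usually closer to what matters for a cluster than per-host alerts.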

Checks are scheduled by the server and executed on the client

Stand-Alone checks are scheduled by the client

Sensu

Auto client discovery
Checks executed on clients
Arbitrary aggregations
API
Mergeable JSON configs

Nagios

Hosts must be configured
Checks executed on server
No aggregations*
No API*
Special config language

Cut the crap, let's have the bottom line

Pros

Close to the Nagios model
Compatible with Nagios plugins
Scalable
Automation friendly
Modern
Integration friendly
Simple

Cons

Close to the Nagios model
Not powerful enough for serious app-centric monitoring
Barebones: no analytics, lame dashboard, no dependency graph, etc.

Questions?
