Sensu alerts
for multi projects
[2015/10/2] Sensu Deep Talks #2
Yu Yamanaka (@yuurelx)
DevOps engineer at peroli, Inc.
No.1 curation platform for women in Japan!
By the way,
How many alerts do you receive in a week?
In the case of MERY
About 15-80 alerts per week
Yes, that is not a surprising number.
But since we use Sensu with PagerDuty,
alerts disturb our engineers' sleep...
That cannot be helped if we want to protect the user experience
(of course, we are making efforts to reduce alerts!)
By contrast, what if you got another team's alerts?
I get an urgent call,
but I cannot do
anything...
The problem must
be resolved!!
(Today's Main Topic)
Subtopics
1. Our teams & alerts flow
2. Why we cannot divide alerts
3. How to resolve the issue
1. Our teams & alerts flow
Our company's project teams
?
?
Ad Platform
MERY
Our system alerts flow
- A system failure occurs
- The Sensu client on the server detects it and reports it
- The Sensu server creates an incident on PagerDuty through the API
- PagerDuty posts the incident to Slack and calls engineers (not only ops but also dev)
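The "creates an incident through the API" step can be sketched roughly as follows. This is only an illustration, not the handler used in the talk: `build_trigger_payload` is a hypothetical helper, the field names are assumed to follow PagerDuty's Events API (v1), the sample event values are made up, and the actual HTTP POST is left out.

```ruby
require 'json'

# Build a PagerDuty "trigger" payload from a Sensu event (hypothetical helper).
# service_key is the integration key of the PagerDuty service to notify.
def build_trigger_payload(service_key, event)
  client = event['client']['name']
  check  = event['check']['name']
  output = event['check']['output'].to_s.strip
  {
    'service_key'  => service_key,
    'event_type'   => 'trigger',
    # One incident per client/check pair, so repeated failures are grouped.
    'incident_key' => "#{client}/#{check}",
    'description'  => "#{client}/#{check}: #{output}",
    'details'      => event
  }
end

# Made-up sample event for illustration only.
event = {
  'client' => { 'name' => 'i-0123abcd', 'address' => 'web01' },
  'check'  => { 'name' => 'unicorn', 'status' => 2,
                'output' => "CheckProcs CRITICAL: Found 0 matching processes\n" }
}

payload = build_trigger_payload('xxxxx', event)
puts JSON.pretty_generate(payload)
```

The key point for the rest of the talk is `service_key`: whichever key is used decides which PagerDuty service, and therefore which team, gets called.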
2. Why we cannot divide alerts
A check definition is shared by all projects
## /etc/sensu/conf.d/checks.conf
...
    "unicorn": {
      "command": "/etc/sensu/plugins/check-procs.rb -p 'unicorn master' -C 1",
      "interval": 60,
      "occurrences": 4,
      "subscribers": [
        "mery-web",
        "adpf-admin"
      ],
      "handlers": [
        "pagerduty",
        "force_restart_unicorn"
      ]
    },
...
Handlers are set per check
Subscribers are set per check
A check cannot connect each subscriber to its own team's handler
[Diagram: subscribers (Cluster A, B, C), one check, and handlers (Team A, B, C); it is unclear which handler a given cluster's alert should go to]
You can resolve that by duplicating checks
[Diagram: the check is duplicated so that Cluster A, B, and C each alert the handler for Team A, B, and C respectively]
This is not DRY...
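To make the cost of duplication concrete, here is a rough sketch that expands the unicorn check into one copy per project. The check names (`unicorn_mery`, `unicorn_adpf`) and the per-team handler names (`pagerduty_mery`, `pagerduty_adpf`) are hypothetical; they only show what would have to exist if every team needed its own copy.

```ruby
require 'json'

# Expand one base check into one copy per project (illustration only).
def duplicated_checks(base, teams)
  teams.each_with_object({}) do |(team, subscriber), checks|
    checks["unicorn_#{team}"] = base.merge(
      'subscribers' => [subscriber],
      # Hypothetical per-team handler name; each one needs its own definition.
      'handlers'    => ["pagerduty_#{team}", 'force_restart_unicorn']
    )
  end
end

base_check = {
  'command'     => "/etc/sensu/plugins/check-procs.rb -p 'unicorn master' -C 1",
  'interval'    => 60,
  'occurrences' => 4
}

# Project name => subscription, taken from the check definition above.
teams = { 'mery' => 'mery-web', 'adpf' => 'adpf-admin' }

checks = duplicated_checks(base_check, teams)
puts JSON.pretty_generate('checks' => checks)
```

Every check multiplied by every project becomes a separate definition with the same command, interval, and occurrences, and each project also needs its own handler definition.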
3. How to resolve the issue
Solution: custom definition attributes
[Diagram: Cluster A, B, and C (subscribers) each carry custom attrs; a single check and a single handler fetch the key by attrs from a config that holds Team A, B, C's API keys]
Client side config
## /etc/sensu/client/config.json
...
  "client": {
    "name": "<%= node[:ec2][:instance_id] %>",
    "address": "<%= node[:machinename] %>",
    "service": "<%= node[:service] %>", <%# e.g. mery, adpf %>
    "environment": "<%= node[:environment] %>", <%# e.g. staging, production %>
    "keepalive": {
      "thresholds": {
        "warning": 40,
...
(deployed by a tool such as Chef)
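As a rough illustration of what the template produces, the sketch below renders a cut-down version of it with Ruby's standard ERB library. The `node` hash stands in for Chef's node object and its values are made up; `render_client_config` is a hypothetical helper, and the keepalive section is omitted.

```ruby
require 'erb'
require 'json'

# Render a simplified client config from node attributes (hypothetical helper).
def render_client_config(node)
  template = <<~'TEMPLATE'
    {
      "client": {
        "name": "<%= node[:ec2][:instance_id] %>",
        "address": "<%= node[:machinename] %>",
        "service": "<%= node[:service] %>",
        "environment": "<%= node[:environment] %>"
      }
    }
  TEMPLATE
  # binding exposes the local variable `node` to the template.
  ERB.new(template).result(binding)
end

# Made-up attributes standing in for a Chef node.
node = {
  ec2: { instance_id: 'i-0123abcd' },
  machinename: 'web01.example.internal',
  service: 'mery',
  environment: 'production'
}

client = JSON.parse(render_client_config(node))['client']
puts client['service']
puts client['environment']
```

The two custom attributes travel with every event the client reports, which is what lets the handler pick a key later.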
Server side config
## /etc/sensu/conf.d/handlers.json
...
  "pagerduty": {
    "mery": {
      "production": { "api_key": "xxxxx" },
      "staging": { "api_key": "yyyyy" }
    },
    "adpf": {
      "production": { "api_key": "aaaaa" },
      "staging": { "api_key": "bbbbb" }
    }
  }
...
Diff of PagerDuty plugin
## diff of /etc/sensu/handlers/pagerduty.rb
...
   def handle
     if @event['check']['pager_team']
       api_key = settings['pagerduty'][@event['check']['pager_team']]['api_key']
+    elsif @event['client']['service'] && @event['client']['environment']
+      api_key = settings['pagerduty'][@event['client']['service']][@event['client']['environment']]['api_key']
     else
       api_key = settings['pagerduty']['api_key']
     end
...
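The lookup order in the diff can be tried outside Sensu with a small standalone sketch. `resolve_api_key` is a hypothetical helper, not part of the PagerDuty plugin, and the top-level default key `zzzzz` is made up. Unlike the diff, this version also falls back to the default key when a service/environment pair has no entry; that is an assumption about the desired behavior, not something the talk states.

```ruby
# Pick the PagerDuty API key for an event, most specific source first
# (hypothetical helper mirroring the branch order of the diff).
def resolve_api_key(settings, event)
  pagerduty   = settings.fetch('pagerduty', {})
  team        = event.fetch('check', {})['pager_team']
  service     = event.fetch('client', {})['service']
  environment = event.fetch('client', {})['environment']

  key =
    if team
      pagerduty.dig(team, 'api_key')
    elsif service && environment
      pagerduty.dig(service, environment, 'api_key')
    end

  # Assumed fallback: use the default key when nothing more specific matches.
  key || pagerduty['api_key']
end

settings = {
  'pagerduty' => {
    'api_key' => 'zzzzz', # made-up default key
    'mery' => { 'production' => { 'api_key' => 'xxxxx' },
                'staging'    => { 'api_key' => 'yyyyy' } },
    'adpf' => { 'production' => { 'api_key' => 'aaaaa' },
                'staging'    => { 'api_key' => 'bbbbb' } }
  }
}

mery_prod = { 'client' => { 'service' => 'mery', 'environment' => 'production' }, 'check' => {} }
adpf_stg  = { 'client' => { 'service' => 'adpf', 'environment' => 'staging' }, 'check' => {} }
untagged  = { 'client' => {}, 'check' => {} }

puts resolve_api_key(settings, mery_prod)
puts resolve_api_key(settings, adpf_stg)
puts resolve_api_key(settings, untagged)
```

Because the key is derived from the client's attributes, one check and one handler definition can serve every project and environment.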
Create services on PagerDuty
Finally,
we are no longer woken up by another team's alerts!!
Nor by alerts from staging environments.
Set up alerts properly.
Thank you for your attention!
Yu Yamanaka (@yuurelx)
DevOps engineer at peroli, Inc.