Gerhard Laußer, ConSol* Software GmbH
A framework for tapping arbitrary data sources in order to generate Nagios configuration files automatically.
Data sources (databases, CSV files, ...) contain hosts and applications.
Rules define how host and application objects are transformed into Nagios hosts and services.
coshsh is a generator for highly standardized environments that separates maintenance of the IT inventory from maintenance of the monitoring.
In a PaaS/SaaS environment, coshsh ensures that new services and VMs are added to the monitoring shortly after registration and without operator intervention.
def __mi_ident__(params={}):
    if compare_attr("type", params, ".*ontap.*|.*netapp.*"):
        return ONTAP

class ONTAP(Application):
    template_rules = [
        TemplateRule(needsattr=None,
            template="os_ontap_default",
        ),
        TemplateRule(needsattr="volumes",
            template="os_ontap_fs",
        ),
    ]
{{ application|service("os_ontap_default_check_hw") }}
  host_name {{ application.host_name }}
  use os_ontap_default
  check_command check_naf_v2!$HOSTADDRESS$!60!{{ application.loginsnmpv2.community }}!environment
}
{{ application|service("os_ontap_default_check_disks") }}
  host_name {{ application.host_name }}
  use os_ontap_default,srv-pnp
  check_command check_naf_v2!$HOSTADDRESS$!60!{{ application.loginsnmpv2.community }}!disk,failed
}
{% for volume in application.volumes %}
{{ application|service("os_ontap_fs_check_vol_" + volume.name) }}
  host_name {{ application.host_name }}
  use os_ontap_fs,srv-pnp
  check_interval 15
  retry_interval 15
  check_command check_naf_v2!$HOSTADDRESS$!60!{{ application.loginsnmpv2.community }}!vol_data,{{ volume.name }},{{ volume.warning }}{{ volume.units }},{{ volume.critical }}{{ volume.units }}
}
{% endfor %}
$ mkdir generator
$ cd generator
$ git clone https://github.com/lausser/coshsh.git
$ mkdir recipes
$ mkdir -p recipes/yns/recipes
$ mkdir recipes/yns/classes
$ mkdir recipes/yns/data
$ mkdir etc
$ vi etc/coshsh.cfg
$ export PATH=$PATH:$HOME/generator/coshsh/bin
[datasource_officepcs]
type = csv
dir = %HOME%/generator/recipes/yns/data
[recipe_nsmuc]
objects_dir = %HOME%/siteconfigs/office
datasources = officepcs
host_name,address,type,os,hardware,virtual,notification_period,location,department
ynspc001,192.168.10.101,pc,,Dell Optiplex GX700,ps,7x24,Empfang,Administration
ynspc002,192.168.10.102,pc,,Acer veriton M290,ps,7x24,Kasse,Administration
The recipe nsmuc is cooked as follows:
$ coshsh/bin/coshsh-cook --cookbook etc/coshsh.cfg --recipe nsmuc
2013-06-15 22:09:03,409 - INFO - recipe nsmuc init
...
2013-06-15 22:09:03,565 - INFO - load template host
2013-06-15 22:09:03,565 - INFO - load items to datarecipient_coshsh_default
2013-06-15 22:09:03,565 - INFO - recipe datarecipient_coshsh_default remove dynamic_dir coshsh/bin/../coshsh/../../siteconfigs/office/dynamic
2013-06-15 22:09:03,565 - INFO - recipient datarecipient_coshsh_default dynamic_dir coshsh/bin/../coshsh/../../siteconfigs/office/dynamic
2013-06-15 22:09:03,565 - INFO - number of files before: 0 hosts, 0 applications
2013-06-15 22:09:03,565 - INFO - number of files after: 2 hosts, 0 applications
$ find siteconfigs/
siteconfigs/
siteconfigs/office
siteconfigs/office/dynamic
siteconfigs/office/dynamic/hostgroups
siteconfigs/office/dynamic/hosts
siteconfigs/office/dynamic/hosts/ynspc001
siteconfigs/office/dynamic/hosts/ynspc001/host.cfg
siteconfigs/office/dynamic/hosts/ynspc002
siteconfigs/office/dynamic/hosts/ynspc002/host.cfg
and in detail...
$ cat siteconfigs/office/dynamic/hosts/ynspc001/host.cfg
define host {
  use generic-host
  host_name ynspc001
  address 192.168.10.101
  alias ynspc001
  notification_period 7x24
  check_command check_host_alive
  notification_options d,u,r
  _SSH_PORT 22
}
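The step from a CSV row to a host definition like the one above can be sketched in plain Python. This is only an illustration of the idea, not how coshsh does it: in coshsh the output comes from a host template, while the helper `render_host`, the shortened column list and the fixed `generic-host` value are assumptions made for this sketch.

```python
import csv
import io

# Shortened sample input: only the columns this sketch uses (assumption).
CSV_DATA = """host_name,address,notification_period
ynspc001,192.168.10.101,7x24
ynspc002,192.168.10.102,7x24
"""


def render_host(row):
    """Return a Nagios host definition for one CSV row (hypothetical helper)."""
    fields = [
        ("use", "generic-host"),
        ("host_name", row["host_name"]),
        ("address", row["address"]),
        ("alias", row["host_name"]),
        ("notification_period", row["notification_period"]),
    ]
    body = ["    %-20s %s" % field for field in fields]
    return "\n".join(["define host {"] + body + ["}"])


# One rendered definition per host, keyed by host name.
hosts = {row["host_name"]: render_host(row)
         for row in csv.DictReader(io.StringIO(CSV_DATA))}
print(hosts["ynspc001"])
```

Keeping the rendering separate from the reading mirrors the split between datasource and template that the rest of the talk builds on.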
h = Host(row)
self.add('hosts', h)
Extending the datasource
1. Copy the bundled datasource class into your own directory
cp coshsh/recipes/default/classes/datasource_csvfile.py recipes/yns/classes
2. Create host groups
h = Host(row)
h.hostgroups.append("dept_" + row["department"].lower())
h.hostgroups.append("loc_" + row["location"].lower())
self.add('hosts', h)
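What the two append calls produce can be tried out without coshsh. The sketch below uses plain dictionaries in place of the rows handed to `Host(row)`; the helper `hostgroups_for` and the `members` mapping are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Plain dicts standing in for the CSV rows (only the columns used here).
ROWS = [
    {"host_name": "ynspc001", "location": "Empfang", "department": "Administration"},
    {"host_name": "ynspc002", "location": "Kasse", "department": "Administration"},
]


def hostgroups_for(row):
    """Derive hostgroup names from the department and location columns."""
    return [
        "dept_" + row["department"].lower(),
        "loc_" + row["location"].lower(),
    ]


# Collect the members of every hostgroup, as Nagios would see them.
members = defaultdict(list)
for row in ROWS:
    for group in hostgroups_for(row):
        members[group].append(row["host_name"])

for group in sorted(members):
    print(group, ",".join(members[group]))
```

Lower-casing the values keeps the hostgroup names stable even if the inventory is inconsistent about capitalization.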
Every host gets an application named "os" with the type "Windows", "Linux" [, "IOS", "ESX", "OnTAP", ...]
$ cat recipes/yns/data/officepcs_applications.csv
host_name,name,type,component,version,check_period
ynspc001,os,Windows,,2008R2,7x24
ynspc002,os,Windows,,2008R2,7x24
$ coshsh/bin/coshsh-cook --cookbook etc/coshsh.cfg --recipe nsmuc
2013-06-16 00:43:33,767 - INFO - recipe nsmuc init
...
2013-06-16 00:43:33,959 - INFO - number of files before: 2 hosts, 0 applications
2013-06-16 00:43:33,960 - INFO - number of files after: 2 hosts, 2 applications
$ find siteconfigs
siteconfigs/
siteconfigs/office
siteconfigs/office/dynamic
siteconfigs/office/dynamic/hostgroups
siteconfigs/office/dynamic/hosts
siteconfigs/office/dynamic/hosts/ynspc001
siteconfigs/office/dynamic/hosts/ynspc001/host.cfg
siteconfigs/office/dynamic/hosts/ynspc001/os_windows_default.cfg
siteconfigs/office/dynamic/hosts/ynspc002
siteconfigs/office/dynamic/hosts/ynspc002/host.cfg
siteconfigs/office/dynamic/hosts/ynspc002/os_windows_default.cfg
$ cat siteconfigs/office/dynamic/hosts/ynspc001/os_windows_default.cfg
define service {
service_description os_windows_default_check_nsclient
host_name ynspc001
use os_windows_default
check_command check_nrpe_arg!60!checkUpTime!MinWarn=5m MinCrit=1m
}
define service {
service_description os_windows_default_check_cpu
host_name ynspc001
use os_windows_default,srv-pnp
max_check_attempts 10
check_command check_nrpe_arg!60!checkCPU!warn=80 crit=90 time=5m time=1m time=30s
}
If the datasource delivers anything whose type field matches .*windows.*, turn it into an object of class Windows.
template_rules specify which tpl files are rendered for it.
$ more coshsh/recipes/default/classes/os_windows.py
from application import Application
from templaterule import TemplateRule
from util import compare_attr
def __mi_ident__(params={}):
    if compare_attr("type", params, ".*windows.*"):
        return Windows

class Windows(Application):
    template_rules = [
        TemplateRule(needsattr=None,
            template="os_windows_default"),
        TemplateRule(needsattr="filesystems",
            template="os_windows_fs"),
    ]
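How such ident functions might be used to pick a class can be sketched as follows. Everything here is a stand-in written for illustration: `compare_attr` is assumed to be a case-insensitive regex match (the type "Windows" in the CSV matches ".*windows.*"), and `IDENT_FUNCTIONS`, `make_application` and the fallback to the generic class are hypothetical, not coshsh's actual loader.

```python
import re


def compare_attr(key, params, pattern):
    """Stand-in: case-insensitive regex match on one attribute (assumed semantics)."""
    value = params.get(key)
    return value is not None and re.match(pattern, str(value), re.IGNORECASE) is not None


class Application:
    """Generic application; keeps the row's fields as attributes."""
    def __init__(self, params):
        self.__dict__.update(params)


class Windows(Application):
    pass


class ONTAP(Application):
    pass


def ident_windows(params):
    if compare_attr("type", params, ".*windows.*"):
        return Windows


def ident_ontap(params):
    if compare_attr("type", params, ".*ontap.*|.*netapp.*"):
        return ONTAP


# Hypothetical registry; the first ident function that returns a class wins.
IDENT_FUNCTIONS = [ident_windows, ident_ontap]


def make_application(params):
    for ident in IDENT_FUNCTIONS:
        cls = ident(params)
        if cls is not None:
            return cls(params)
    return Application(params)  # assumption: fall back to the generic class


app = make_application({"host_name": "ynspc001", "name": "os", "type": "Windows"})
print(type(app).__name__, app.host_name)
```

The point of the pattern is that adding support for a new application type only means adding a new module with its own ident function; nothing central has to be edited.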
A tpl file contains placeholders that are replaced by attributes of the object.
$ more coshsh/recipes/default/templates/os_windows_default.tpl
{{ application|service("os_windows_default_check_nsclient") }}
host_name {{ application.host_name }}
use os_windows_default
check_command check_nrpe_arg!60!checkUpTime!MinWarn=5m MinCrit=1m
}
{{ application|service("os_windows_default_check_cpu") }}
host_name {{ application.host_name }}
use os_windows_default,srv-pnp
max_check_attempts 10
check_command check_nrpe_arg!60!checkCPU!warn=80 crit=90 time=5m time=1m time=30s
}
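The placeholder substitution can be imitated with a few lines of standard-library Python. Note that this is a deliberately simplified stand-in and not the template engine itself: it only understands `{{ application.<attr> }}` and the `service(...)` filter, and the expansion of that filter to `define service {` plus a `service_description` line is inferred from the generated os_windows_default.cfg shown earlier.

```python
import re
from types import SimpleNamespace

TEMPLATE = """{{ application|service("os_windows_default_check_nsclient") }}
    host_name {{ application.host_name }}
    use os_windows_default
}
"""


def render(template, application):
    """Replace the two placeholder forms used in this sketch."""
    def service(match):
        # Inferred from the generated file: the filter opens a service block.
        return "define service {\n    service_description " + match.group(1)

    def attribute(match):
        return str(getattr(application, match.group(1)))

    text = re.sub(r'\{\{\s*application\|service\("([^"]*)"\)\s*\}\}', service, template)
    return re.sub(r"\{\{\s*application\.(\w+)\s*\}\}", attribute, text)


app = SimpleNamespace(host_name="ynspc001")
rendered = render(TEMPLATE, app)
print(rendered)
```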
$ more coshsh/recipes/default/templates/os_windows_fs.tpl
{#
fs.path
fs.warning
fs.critical
fs.units
#}
{% for fs in application.filesystems %}
{{ application|service("os_windows_fs_check_" + fs.path) }}
host_name {{ application.host_name }}
use os_windows,srv-pnp
check_interval 15
check_command check_nrpe_arg!60!CheckDriveSize!ShowAll \
MinWarnFree={{ fs.warning }}{{ fs.units }} \
MinCritFree={{ fs.critical }}{{ fs.units }} \
Drive={{ fs.path }}:
}
{% endfor %}
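The rule with needsattr="filesystems" suggests that os_windows_fs is only rendered for applications that actually have filesystems, which fits the earlier run where each host got nothing but os_windows_default.cfg. A minimal sketch of that selection, assuming "attribute exists and is non-empty" as the condition; the `TemplateRule` class below is a reduced stand-in and `templates_to_render` is a hypothetical helper.

```python
from types import SimpleNamespace


class TemplateRule:
    """Stand-in for coshsh's TemplateRule, reduced to two fields."""
    def __init__(self, needsattr=None, template=None):
        self.needsattr = needsattr
        self.template = template


WINDOWS_RULES = [
    TemplateRule(needsattr=None, template="os_windows_default"),
    TemplateRule(needsattr="filesystems", template="os_windows_fs"),
]


def templates_to_render(application, rules):
    """Return the template names whose rule applies to this application."""
    selected = []
    for rule in rules:
        # Assumption: a rule without needsattr always applies; otherwise the
        # named attribute must exist on the application and be non-empty.
        if rule.needsattr is None or getattr(application, rule.needsattr, None):
            selected.append(rule.template)
    return selected


plain = SimpleNamespace(host_name="ynspc002")
with_fs = SimpleNamespace(host_name="ynspc001", filesystems=["C", "D"])
print(templates_to_render(plain, WINDOWS_RULES))
print(templates_to_render(with_fs, WINDOWS_RULES))
```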
How does an application of class Windows get its .filesystems?
$ cat recipes/yns/data/officepcs_applicationdetails.csv
host_name,application_name,application_type,monitoring_type,monitoring_0,monitoring_1,...
ynspc001,os,Windows,FILESYSTEM,C,20,10,,,
ynspc001,os,Windows,FILESYSTEM,D,20,10,,,
ynspc001,os,Windows,FILESYSTEM,F,10,5,,,
ynspc002,os,Windows,FILESYSTEM,C,25,10,,,
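One way to picture the step from these detail rows to the fs.path, fs.warning, fs.critical and fs.units used in the template is the sketch below. It rests on several assumptions that the slides do not confirm: the columns monitoring_2 and monitoring_3 are written out here although the original header is elided, the positional mapping (0 = path, 1 = warning, 2 = critical, 3 = units) is a guess based on the sample values, and "%" as the default unit is invented for the example. In coshsh this work is done by the detail classes.

```python
import csv
import io
from collections import defaultdict, namedtuple

Filesystem = namedtuple("Filesystem", "path warning critical units")

# Header columns beyond monitoring_1 are an assumption of this sketch.
DETAILS_CSV = """host_name,application_name,application_type,monitoring_type,monitoring_0,monitoring_1,monitoring_2,monitoring_3
ynspc001,os,Windows,FILESYSTEM,C,20,10,
ynspc001,os,Windows,FILESYSTEM,D,20,10,
ynspc001,os,Windows,FILESYSTEM,F,10,5,
ynspc002,os,Windows,FILESYSTEM,C,25,10,
"""


def read_filesystems(text):
    """Group FILESYSTEM details by (host_name, application_name)."""
    filesystems = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        if row["monitoring_type"] != "FILESYSTEM":
            continue
        key = (row["host_name"], row["application_name"])
        filesystems[key].append(Filesystem(
            path=row["monitoring_0"],
            warning=row["monitoring_1"],
            critical=row["monitoring_2"],
            units=row["monitoring_3"] or "%",  # assumed default unit
        ))
    return filesystems


filesystems = read_filesystems(DETAILS_CSV)
for fs in filesystems[("ynspc001", "os")]:
    print(fs.path, fs.warning, fs.critical, fs.units)
```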
$ cd coshsh/recipes/default/classes
$ ls detail*
detail_access.py detail_loginsnmpv2.py detail_socket.py
detail_datastore.py detail_loginsnmpv3.py detail_tablespace.py
detail_depth.py detail_nagiosconf.py detail_tag.py
detail_filesystem.py detail_nagios.py detail_url.py
detail_interface.py detail_port.py detail_volume.py
detail_keyvalues.py detail_process.py
detail_login.py detail_role.py
$ mkdir -p generator/recipes/yns/{classes,templates}
[datasource_officepcs]
type = csv
dir = %HOME%/generator/recipes/yns/data
[recipe_nsmuc]
objects_dir = %HOME%/siteconfigs/office
classes = %HOME%/generator/recipes/yns/classes
templates = %HOME%/generator/recipes/yns/templates
datasources = officepcs
$ cat generator/recipes/yns/classes/app_ns_suelzomat.py
from application import Application
from templaterule import TemplateRule
from util import compare_attr
def __mi_ident__(params={}):
    if compare_attr("type", params, ".*suelzomat.*"):
        return Suelzomat

class Suelzomat(Application):
    template_rules = [
        TemplateRule(needsattr=None,
            template="app_ns_suelzomat_default"),
    ]
$ cat generator/recipes/yns/templates/app_ns_suelzomat_default.tpl
{{ application|service("app_ns_suelzomat_default_check_cpu") }}
  host_name {{ application.host_name }}
  use app_ns_suelzomat_default,srv-pnp
  max_check_attempts 20
  check_command check_nrpe_arg!60!checkSUELZ!80!90
}
...
We have already copied datasource_csvfile.py to recipes/yns/classes. The file can be modified or used as a starting point.
With __ds_ident__ the datasource module declares: I am responsible for type=csv, and my class CsvFile knows how to handle it.
def __ds_ident__(params={}):
    if compare_attr("type", params, "csv"):  # type = csv in coshsh.cfg
        return CsvFile
CsvFile reads raw data and prepares it for coshsh
class CsvFile(Datasource):
    def __init__(self, **kwargs):
        super(CsvFile, self).__init__(**kwargs)
        #self.name = kwargs["name"]  # [datasource_officepcs] in coshsh.cfg
        self.dir = kwargs["dir"]     # dir = %COSHSH_HOME%/.../data in coshsh.cfg
        self.objects = {}            # important: storage for hosts, apps, ...
Open files, databases, ...
    def open(self):
        logger.info('open datasource %s' % self.name)
        ...
Read hosts and applications
    def read(self, filter=None, objects={}, force=False, **kwargs):
        self.objects = objects
        for row in ...:  # records of the datasource: rows of a DB table, CSV lines, ...
            ...  # read hosts
            h = Host(row)
            self.add('hosts', h)  # "register" this host
            ...  # read applications
            a = Application(row)
            self.add('applications', a)
            ...  # read details
            if self.find('applications', a.fingerprint()):
                self.get('applications',
                    a.fingerprint()).monitoring_details.append(detail)
Optionally close the datasource
    def close(self):
        # database.close()
        pass
Error handling
from datasource import Datasource, DatasourceNotAvailable, DatasourceNotCurrent
...
raise DatasourceNotAvailable  # if the datasource cannot be opened
raise DatasourceNotCurrent    # if no new data is available
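The open/read/close life cycle and the add/find/get bookkeeping can be tried out with a self-contained stand-in. Nothing below imports coshsh: `MiniCsvSource`, the reduced `Host` class and the local `DatasourceNotAvailable` are hypothetical substitutes, and the file name `<name>_hosts.csv` is an assumption made by analogy with officepcs_applications.csv.

```python
import csv
import os
import tempfile


class DatasourceNotAvailable(Exception):
    """Local stand-in for the coshsh exception of the same name."""


class Host:
    def __init__(self, params):
        self.host_name = params["host_name"]
        self.address = params.get("address")

    def fingerprint(self):
        return self.host_name


class MiniCsvSource:
    """Hypothetical, heavily reduced datasource for illustration only."""

    def __init__(self, name, dir):
        self.name = name
        self.dir = dir
        self.objects = {}  # storage for hosts, apps, ...

    def add(self, objtype, obj):
        self.objects.setdefault(objtype, {})[obj.fingerprint()] = obj

    def find(self, objtype, fingerprint):
        return fingerprint in self.objects.get(objtype, {})

    def get(self, objtype, fingerprint):
        return self.objects.get(objtype, {}).get(fingerprint)

    def open(self):
        if not os.path.isdir(self.dir):
            raise DatasourceNotAvailable(self.dir)

    def read(self):
        # Assumed naming scheme: <datasource name>_hosts.csv
        path = os.path.join(self.dir, self.name + "_hosts.csv")
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                self.add("hosts", Host(row))  # "register" this host

    def close(self):
        pass  # nothing to close for plain files


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "officepcs_hosts.csv"), "w", newline="") as handle:
        handle.write("host_name,address\nynspc001,192.168.10.101\n")
    source = MiniCsvSource("officepcs", tmp)
    source.open()
    source.read()
    source.close()

print(source.get("hosts", "ynspc001").address)
```

Raising a dedicated exception from open() lets the caller tell "the source is unreachable" apart from "the source is reachable but empty".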
cg = ContactGroup({ 'contactgroup_name' : 'admins' })
self.add('contactgroups', cg)
h = Host(row)
h.contactgroups.append(cg)
a = Application(row)
a.contactgroups.append(cg)
c = Contact('lausser', 'WEBREADWRITE', 'lausser@...', 'lausser', '7x24')
c.contactgroups.append('admins')