LOGSTASH
Aurélien Rougemont
Nicolas Szalay
logstash facts
Logstash is open source (Apache 2.0 license)
Logstash is distributed as a jar
Logstash is written in (J)Ruby
Unix pipes on steroids
Inputs | Codecs | Filters | Outputs
Inputs |
about 30 input plugins :
- tcp
- udp
- syslog
- amqp
- file
- redis
- [...]
| CODECS |
more and more codecs :
- graphite
- json
- msgpack
- multiline
- netflow
- plain
- rubydebug
- [...]
| FILTERS |
about forty filters
- date
- grok
- geoip
- useragent
- mutate
- noop
- [...]
| OUTPUTS
last but not least, fifty output plugins :
- es
- redis
- amqp
- syslog
- riemann
- nagios
- [...]
A log is ...
an event.
an event is ...
EVENT = [ DATETIME ] + [ DATA ]
or [ DATETIME ] + [ STRUCTURED DATA ]
Use standard datetime formats such as ISO 8601
2013-12-01T23:28:45.000Z
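As a rough illustration of "event = datetime + structured data", here is a minimal sketch in plain Ruby. It is not Logstash code: `build_event` is a hypothetical helper, and the only thing it shows is a timestamp normalised to ISO 8601 in UTC, kept next to arbitrary structured fields.

```ruby
require "json"
require "time"

# Hypothetical helper (not part of Logstash): an event is a timestamp
# plus structured data. The timestamp is converted to UTC and rendered
# as ISO 8601 with millisecond precision.
def build_event(time, data = {})
  { "@timestamp" => time.getutc.iso8601(3) }.merge(data)
end

event = build_event(Time.utc(2013, 12, 1, 23, 28, 45), { "message" => "hello" })
puts JSON.generate(event)
```

Normalising to UTC at the source avoids having to guess time zones later in the pipeline.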
GROK
is a regexp-like engine for dummies
logstash embeds over 120 predefined grok patterns
grok syntax
55.3.244.1 GET /index.html 15824 0.043
logstash.conf should contain
filter {
  grok {
    match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" ]
  }
}
and produces
{
  "client" => "55.3.244.1",
  "method" => "GET",
  "request" => "/index.html",
  "bytes" => "15824",
  "duration" => "0.043"
}
(grok captures are strings unless a type is requested, e.g. %{NUMBER:bytes:int})
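To see what grok does conceptually, the same sample line can be parsed with an ordinary Ruby regexp using named captures. This is only a sketch: the sub-patterns below are simplified stand-ins written for this example, not the real grok definitions of IP, WORD, URIPATHPARAM and NUMBER, and `parse_line` is a hypothetical helper.

```ruby
# Simplified stand-ins for the grok patterns; the predefined grok
# patterns are considerably more thorough than these.
LINE_PATTERN = /\A(?<client>\d{1,3}(?:\.\d{1,3}){3}) (?<method>\w+) (?<request>\S+) (?<bytes>\d+) (?<duration>\d+(?:\.\d+)?)\z/

# Returns a hash of field name => captured text, or nil when the line
# does not match. Every capture is a string.
def parse_line(line)
  match = LINE_PATTERN.match(line)
  match && match.named_captures
end

fields = parse_line("55.3.244.1 GET /index.html 15824 0.043")
puts fields["client"]
puts fields["request"]
```

Grok's value is that the named building blocks are already written and tested, so a configuration only has to combine them.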
real life setups
GANDI
fotolia
configuration examples
Syslog |
input {
  syslog {
    port => 1337
    type => "syslog"
    tags => [ "global" ]
  }
}
filter {
  noop {
    add_field => [ "lsprocessed" , "eventworker1" ]
  }
}
output {
  stdout { debug => true codec => "json" }
}
SYSLOG |
Dec 1 23:31:48 thrain su[5610]: FAILED su for root by beorn
logstash(SYSLOG)
{
"message" => "FAILED su for root by beorn",
"@timestamp" => "2013-12-01T22:31:48.000Z",
"@version" => "1",
"type" => "syslog",
"tags" => [
[0] "global"
],
"host" => "127.0.0.1",
"priority" => 13,
"timestamp" => "Dec 1 23:31:48",
"logsource" => "thrain",
"program" => "su",
"pid" => "5610",
"severity" => 5,
"facility" => 1,
"facility_label" => "user-level",
"severity_label" => "Notice",
"lsprocessed" => "eventworker1"
}
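The priority, facility and severity fields in this event are linked by the standard syslog PRI encoding (priority = facility * 8 + severity). A minimal sketch of that decoding in plain Ruby, with `decode_priority` as a hypothetical helper name rather than anything taken from the syslog input:

```ruby
# Split a syslog PRI value into its two components.
# PRI = facility * 8 + severity, so divmod(8) recovers both.
def decode_priority(priority)
  facility, severity = priority.divmod(8)
  { "facility" => facility, "severity" => severity }
end

decoded = decode_priority(13)
puts "facility=#{decoded["facility"]} severity=#{decoded["severity"]}"
```

For priority 13 this gives facility 1 (user-level) and severity 5 (Notice), matching the fields shown above.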
APACHE | logger
input {
  syslog {
    port => 1337
    type => "syslog"
    tags => [ "global" ]
  }
}
filter {
  noop {
    add_field => [ "lsprocessed" , "eventworker1" ]
  }
}
output {
  stdout { debug => true codec => "json" }
}
APACHE | logger
Dec 1 23:48:15 thrain sysadmin5: 127.0.0.1 - - [01/Dec/2013:23:48:15 +0100] "GET / HTTP/1.1" 200 482 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0 Iceweasel/24.0"
logstash(APACHE | logger)
{
"message" => "127.0.0.1 - - [01/Dec/2013:23:48:15 +0100] \"GET / HTTP/1.1\" 200 482 \"-\" \"Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0 Iceweasel/24.0\"",
"@timestamp" => "2013-12-01T22:48:15.000Z",
"@version" => "1",
"type" => "syslog",
"tags" => [
[0] "global"
],
"host" => "127.0.0.1",
"priority" => 13,
"timestamp" => "Dec 1 23:48:15",
"logsource" => "thrain",
"program" => "sysadmin5",
"severity" => 5,
"facility" => 1,
"facility_label" => "user-level",
"severity_label" => "Notice",
"lsprocessed" => "eventworker1"
}
apache | JSON
LogFormat "{ \
\"@timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
\"@version\": \"1\", \
\"clientip\": \"%a\", \
\"duration\": %D, \
\"status\": %>s, \
\"message\": \"%U%q\", \
\"urlpath\": \"%U\", \
\"urlquery\": \"%q\", \
\"bytes\": %B, \
\"method\": \"%m\", \
\"referer\": \"%{Referer}i\", \
\"useragent\": \"%{User-agent}i\", \
\"platform\": \"website\", \
\"role\": \"frontend\", \
\"environment\": \"prod\", \
\"vhost\": \"sysadmin5.binaries.fr\" }" logstash_json
apache | JSON
input {
  syslog {
    port => 1337
    type => "syslog"
    tags => [ "global" ]
  }
}
filter {
  noop {
    add_field => [ "lsprocessed" , "eventworker1" ]
  }
  json {
    source => "message"
  }
}
output {
  stdout { debug => true codec => "json" }
}
apache | json | logger
Dec 2 00:12:02 thrain sysadmin5: { "@timestamp": "2013-12-02T00:12:02+0100", "@version": "1", "clientip": "127.0.0.1", "duration": 1774, "status": 200, "message": "/index.html", "urlpath": "/index.html", "urlquery": "", "bytes": 146, "method": "GET", "referer": "-", "useragent": "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0 Iceweasel/24.0", "platform": "website", "role": "frontend", "environment": "prod", "vhost": "sysadmin5.binaries.fr" }
logstash(apache | json | logger)
{
"message" => "/index.html",
"@timestamp" => "2013-12-01T23:12:02.000Z",
"@version" => "1",
"type" => "syslog",
"tags" => [
[0] "global"
],
"host" => "127.0.0.1",
"priority" => 13,
"timestamp" => "Dec 2 00:12:02",
"logsource" => "thrain",
"program" => "sysadmin5",
"severity" => 5,
"facility" => 1,
"facility_label" => "user-level",
"severity_label" => "Notice",
"lsprocessed" => "eventworker1",
"clientip" => "127.0.0.1",
"duration" => 1774,
"status" => 200,
"urlpath" => "/index.html",
"urlquery" => "",
"bytes" => 146,
"method" => "GET",
"referer" => "-",
"useragent" => "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0 Iceweasel/24.0",
"platform" => "website",
"role" => "frontend",
"environment" => "prod",
"vhost" => "sysadmin5.binaries.fr"
}
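What the json filter with source => "message" does to this event can be pictured as "parse the text in one field, then merge the resulting keys into the event". The sketch below is a conceptual stand-in in plain Ruby, not the filter's actual code; `expand_json` and its handling of invalid input are assumptions made for the example.

```ruby
require "json"

# Conceptual stand-in for the json filter: parse the JSON text held in
# the source field and merge its keys into the event. Keys from the
# parsed document win, which is why "message" ends up as "/index.html".
def expand_json(event, source = "message")
  event.merge(JSON.parse(event[source]))
rescue JSON::ParserError, TypeError
  event # assumption for this sketch: leave the event untouched on bad input
end

raw = {
  "type" => "syslog",
  "message" => '{ "status": 200, "message": "/index.html", "bytes": 146 }'
}
expanded = expand_json(raw)
puts expanded["message"]
puts expanded["status"]
```

Because the application emits JSON directly, numeric fields such as status and bytes arrive already typed and no grok pattern is needed.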
apache | json | logger
logstash++(apache | json | logger)
{
"message" => "/index.html",
"@timestamp" => "2013-12-01T23:12:02.000Z",
"@version" => "1",
"type" => "syslog",
"tags" => [
[0] "global"
],
"host" => "127.0.0.1",
"priority" => 13,
"timestamp" => "Dec 2 00:12:02",
"logsource" => "thrain",
"program" => "sysadmin5",
"severity" => 5,
"facility" => 1,
"facility_label" => "user-level",
"severity_label" => "Notice",
"lsprocessed" => "eventworker1",
"clientip" => "127.0.0.1",
"duration" => 1774,
"status" => 200,
"urlpath" => "/index.html",
"urlquery" => "",
"bytes" => 146,
"method" => "GET",
"referer" => "-",
"useragent" => "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0 Iceweasel/24.0",
"platform" => "website",
"role" => "frontend",
"environment" => "prod",
"vhost" => "sysadmin5.binaries.fr",
"geoip" => {
"ip" => "127.0.0.1",
"country_code" => 0,
"country_code2" => "--",
"country_code3" => "--",
"country_name" => "N/A",
"continent_code" => "--"
},
"ua" => {
"name" => "Iceweasel",
"os" => "Linux",
"os_name" => "Linux",
"device" => "Other",
"major" => "24",
"minor" => "0"
}
}
apache | json | fleece
CustomLog "|| /usr/bin/fleece --host logstash --port 1338" logstash_json
ErrorLog "|| /usr/bin/fleece --host logstash --port 1339 --field vhost=sysadmin5.binaries.fr --field role=frontend --field environment=prod --field platform=webmail"
Fleece is a non-blocking, lightweight UDP JSONifier
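The idea behind such a tool can be sketched in a few lines of Ruby: wrap each log line and some static fields in one JSON document, then send it as a single UDP datagram without waiting for any reply. This is a hypothetical illustration, not Fleece's actual code; `jsonify` and `send_udp` are names invented for the example, and the host and port are placeholders.

```ruby
require "json"
require "socket"

# Hypothetical sketch: turn a raw log line plus extra fields (like the
# --field options above) into one JSON document.
def jsonify(line, fields = {})
  JSON.generate(fields.merge("message" => line.chomp))
end

# Fire-and-forget UDP send: no connection, no acknowledgement, so a
# slow or absent receiver does not block the sender.
def send_udp(payload, host, port)
  socket = UDPSocket.new
  socket.send(payload, 0, host, port)
ensure
  socket.close if socket
end

payload = jsonify("File does not exist: /var/www/favicon.ico\n",
                  { "vhost" => "sysadmin5.binaries.fr", "role" => "frontend" })
puts payload
# send_udp(payload, "logstash", 1339)  # placeholder host and port
```

The trade-off of UDP is that events can be lost silently, which is usually acceptable for logs as long as the web server is never slowed down by logging.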
Data mining
The most natural indexed storage engine for logstash is Elasticsearch
Kibana
is an AJAX web interface to ES
is an easy way to build and share dashboards
queries look like :
message: "/index.htm" AND tags: "apache" AND tags: "fleece"
KIBANA
a few numbers
Gandi
2000-3000 events/s steady
120 000 000 events / day
200 ms of search time per day of data
Fotolia
1000 events/s steady
90 GB / day of data indexed
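Figures like these are the raw material for capacity planning. As a back-of-the-envelope sketch, assuming the steady rate holds for the whole day and that 1 GB means 10**9 bytes (both simplifications), a rate and a daily volume can be turned into events per day and an average size per event. The helper names are hypothetical.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Events per day for a constant rate (a simplification: real traffic
# has peaks and quiet hours).
def events_per_day(events_per_second)
  events_per_second * SECONDS_PER_DAY
end

# Average bytes per event, assuming 1 GB = 10**9 bytes.
def average_event_bytes(gigabytes_per_day, events_per_second)
  gigabytes_per_day * 10**9 / events_per_day(events_per_second).to_f
end

# Sample inputs taken from the Fotolia figures above.
puts events_per_day(1000)
puts average_event_bytes(90, 1000).round
```

With those inputs the estimate is 86,400,000 events per day at roughly 1 kB per event, which gives a starting point for sizing storage and retention.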
Feedback
- KISS
- start with capacity planning
- Logstash's documentation has room for improvement
- read the code linked from the documentation
- secure your elasticsearch cluster
- understand how elasticsearch works (indices, mapping...)
- use grok the right way
- make consistent choices
- tune the jvm
- tune the IP stack (especially net_backlog)
Debug cmdline
/usr/bin/java \
-Dcom.sun.management.jmxremote.port=7199 \
-Dcom.sun.management.jmxremote.ssl=false \
-Dcom.sun.management.jmxremote.authenticate=false \
-Xmx1024m \
-Djava.io.tmpdir=/var/lib/logstash/ \
-jar /usr/share/logstash/logstash-1.2.2-flatjar.jar agent \
-f /etc/logstash/ \
--log /var/log/logstash/logstash.log \
--filterworkers 8 \
-vv
questions ?
really ?
Thanks for your attention
nico@fotolia.com
beorn@binaries.fr
Presentation made for the Sysadmin #5 conference at IRCAM, Paris