Elasticsearch

Kibana
Elasticsearch
- is real-time
- is a distributed search and analytics engine
- is a document store
Kibana
- Visualization of data in Elasticsearch
- Tables and charts
- Geo data
- Time series
- Relationships in graph data
Basics
[Diagram: Cluster 1 with three nodes (Node 1, Node 2, Node 3). Index A is split into Shard 1, Shard 2 and Shard 3, with Replica 1, Replica 2 and Replica 3 distributed over the nodes. Kibana is shown alongside the cluster.]
Index
- Index with static size
  - job, employee, candidate
- Continuously growing index
  - logs, transactions etc.
  - serilog-2018.01.01
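The name serilog-2018.01.01 suggests one index per day for continuously growing data. A minimal sketch of how such a name could be built, assuming a daily rollover; the helper name `daily_index_name` is hypothetical and not part of any library:

```python
from datetime import date


def daily_index_name(prefix: str, day: date) -> str:
    """Build a per-day index name in the style of 'serilog-2018.01.01'."""
    # Assumption: the suffix is year.month.day, zero-padded.
    return f"{prefix}-{day:%Y.%m.%d}"


print(daily_index_name("serilog", date(2018, 1, 1)))  # serilog-2018.01.01
```

With one index per day, old data can be dropped by deleting whole indices instead of individual documents.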
Data input
What's a document
{
"name": "John Doe",
"age": 42,
"confirmed": true,
"created": "2018-01-01T12:00:00",
"adress": {
"street": "Gatan 1",
"zip": "12344",
"city": "Farsta"
},
"tags": [
{ "type": "Category", "value": "IT" },
{ "type": "Employment period", "value": "Deltid" }
]
}
Document metadata
- _index: Name of the index the document lives in
- _type: Name of the type of the document (Elasticsearch versions before 6)
- _id: Unique id of a document
Schemas
- Settings
- Analyzers
- Mappings
Different types of search
Boolean searches
- Efficient
- Match or no match
- Like WHERE in SQL
- Does the data match?
Full text search
- Slower than boolean searches (but more efficient than LIKE '%...%' searches in SQL)
- Returns results ranked by relevance to the search
- How well does the data match?
Combinations
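A combination can be expressed with a `bool` query, where `filter` clauses give the yes/no match and `must` clauses contribute to the relevance score. The sketch below only builds the request body as a Python dict. The field names come from the candidate mapping in this deck, while the search text "developer" and the helper name `combined_query` are made up for illustration:

```python
import json


def combined_query(text_field, text, filter_field, filter_value):
    """Build a bool query: full text match (scored) plus an exact filter (not scored)."""
    return {
        "query": {
            "bool": {
                # Full text part: how well does the data match?
                "must": [{"match": {text_field: text}}],
                # Boolean part: does the data match at all?
                "filter": [{"term": {filter_field: filter_value}}],
            }
        }
    }


# Hypothetical search: candidates in Karlstad whose ambition mentions "developer".
body = combined_query("ambition", "developer", "adress.city", "Karlstad")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a `_search` request.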
Inverted index
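How data is stored in Elasticsearch explains how searches behave
Given the following documents:
- 1: Den snabba bruna räven hoppar över den lata hunden (The quick brown fox jumps over the lazy dog)
- 2: Snabba bruna rävar hoppar över lata hundar på sommaren (Quick brown foxes jump over lazy dogs in summer)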
Inverted index
Term     | 1 | 2 |
---------|---|---|
Den      | x |   |
snabba   | x |   |
bruna    | x | x |
räven    | x |   |
hoppar   | x | x |
över     | x | x |
den      | x |   |
lata     | x | x |
hunden   | x |   |
Snabba   |   | x |
rävar    |   | x |
hundar   |   | x |
på       |   | x |
sommaren |   | x |
Query: snabba bruna
Term   | 1 | 2 |
-------|---|---|
snabba | x |   |
bruna  | x | x |
Total  | 2 | 1 |
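The lookup above can be reproduced with a small sketch. It is a simplification that assumes splitting on whitespace and counting matching terms per document; Elasticsearch's real relevance scoring is more elaborate. The function names are illustrative only:

```python
from collections import defaultdict

DOCS = {
    1: "Den snabba bruna räven hoppar över den lata hunden",
    2: "Snabba bruna rävar hoppar över lata hundar på sommaren",
}


def build_inverted_index(docs):
    """Map each term, exactly as written, to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def count_matches(index, query, doc_ids):
    """For each document, count how many query terms it contains."""
    terms = query.split()
    return {d: sum(1 for t in terms if d in index.get(t, set())) for d in doc_ids}


index = build_inverted_index(DOCS)
totals = count_matches(index, "snabba bruna", DOCS)
print(totals)  # {1: 2, 2: 1}
```

Document 2 only gets one hit because its "Snabba" is capitalized and is therefore a different term from "snabba".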
Normalization
Term     | 1 | 2 |
---------|---|---|
den      | x |   |
snabb    | x | x |
brun     | x | x |
räv      | x | x |
hoppa    | x | x |
över     | x | x |
lata     | x | x |
hund     | x | x |
på       |   | x |
sommaren |   | x |
Query: snabba bruna
Term  | 1 | 2 |
------|---|---|
snabb | x | x |
brun  | x | x |
Total | 2 | 2 |
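The effect of normalization can be sketched the same way. The stem table below is a hand-written stand-in for a real Swedish stemmer and covers only the words in the two example documents; it is an assumption for illustration, not how Elasticsearch implements stemming:

```python
from collections import defaultdict

DOCS = {
    1: "Den snabba bruna räven hoppar över den lata hunden",
    2: "Snabba bruna rävar hoppar över lata hundar på sommaren",
}

# Hypothetical stem table, just enough to reproduce the normalized index above.
STEMS = {
    "snabba": "snabb", "bruna": "brun",
    "räven": "räv", "rävar": "räv",
    "hoppar": "hoppa",
    "hunden": "hund", "hundar": "hund",
}


def normalize(text):
    """Lowercase, split on whitespace, then reduce known words to their stem."""
    return [STEMS.get(word, word) for word in text.lower().split()]


def build_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in normalize(text):
            index[term].add(doc_id)
    return index


def count_matches(index, query, doc_ids):
    # The query goes through the same normalization as the documents.
    terms = normalize(query)
    return {d: sum(1 for t in terms if d in index.get(t, set())) for d in doc_ids}


index = build_index(DOCS)
totals = count_matches(index, "snabba bruna", DOCS)
print(totals)  # {1: 2, 2: 2}
```

Both documents now match both query terms, because the documents and the query are normalized in the same way.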
Analysis
- Character filters
  - Per-character transformation
  - strip HTML, w -> v
- Tokenizer
  - Split the text into words
- Token filters
  - lowercase, synonyms, stemming etc.
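The three stages run in the order listed. A rough sketch of that pipeline, assuming a regex-based HTML stripper and tokenizer and only a lowercase token filter; all function names are illustrative and the sample text is made up:

```python
import re


def char_filter(text):
    """Character filter: strip HTML tags, then map w -> v."""
    text = re.sub(r"<[^>]+>", " ", text)
    return text.replace("w", "v").replace("W", "V")


def tokenizer(text):
    """Tokenizer: split the text into words."""
    return re.findall(r"\w+", text)


def token_filters(tokens):
    """Token filters: only lowercasing here; synonyms and stemming would go in this stage."""
    return [token.lower() for token in tokens]


def analyze(text):
    return token_filters(tokenizer(char_filter(text)))


print(analyze("<p>Wilma Karlsson</p>"))  # ['vilma', 'karlsson']
```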
Built-in analyzers
- Standard analyzer
  - Splits on word boundaries as defined by Unicode, removes most punctuation, lowercases (generally the best choice)
- Simple analyzer
  - Splits the text on anything that isn't a letter and lowercases the terms
- Whitespace analyzer
  - Splits on whitespace, does not lowercase
- Language analyzers
  - Language-specific analyzers
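To make the difference concrete, the simple and whitespace analyzers can be approximated in a few lines. These are approximations written from the descriptions above, not the Elasticsearch implementations, and the standard analyzer's Unicode word-boundary rules are not reproduced here:

```python
import re


def simple_analyzer(text):
    """Approximation: split on anything that is not a letter, then lowercase."""
    # [^\W\d_]+ matches runs of Unicode letters (word characters minus digits and underscore).
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def whitespace_analyzer(text):
    """Approximation: split on whitespace only, keep case and punctuation."""
    return text.split()


sample = "Gatan 1, Farsta"
print(simple_analyzer(sample))      # ['gatan', 'farsta']
print(whitespace_analyzer(sample))  # ['Gatan', '1,', 'Farsta']
```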
When are analyzers used?
On all full text fields
They are applied to the document text at index time and to the search string at search time
Mapping
{
  "mappings": {
    "candidate": {
      "properties": {
        "name": { "type": "text" },
        "birth_date": { "type": "date" },
        "adress": {
          "properties": {
            "street": { "type": "text" },
            "zipcode": { "type": "long" },
            "city": { "type": "keyword" }
          }
        },
        "contacts": {
          "properties": {
            "home_phone": { "type": "keyword" },
            "mobile_phone": { "type": "keyword" },
            "email": { "type": "keyword" }
          }
        },
        "ambition": {
          "type": "text"
        }
      }
    }
  }
}
Index mapping
Queries
Match_All
// These three requests are equivalent
POST candidates/_search
POST candidates/_search
{}
POST candidates/_search
{
  "query": {
    "match_all": {}
  }
}
Match
// Match on full text
POST candidates/candidate/_search
{
  "query": {
    "match": {
      "name": "Maria"
    }
  }
}
POST candidates/candidate/_search
{
  "query": {
    "match": {
      "adress.city": "Karlstad"
    }
  }
}
Term/Terms
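// OK: "name" is a text field, so "Maria" was indexed as the lowercased term "maria"
POST candidates/candidate/_search
{
  "query": { "term": { "name": "maria" } }
}
// No hits: term queries are not analyzed, and no term "Maria" exists in the index
POST candidates/candidate/_search
{
  "query": { "term": { "name": "Maria" } }
}
// OK: "adress.city" is a keyword field, stored exactly as "Karlstad"
POST candidates/candidate/_search
{
  "query": { "term": { "adress.city": "Karlstad" } }
}
// No hits: keyword fields are case-sensitive
POST candidates/candidate/_search
{
  "query": { "term": { "adress.city": "karlstad" } }
}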
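The four term queries above behave differently because text fields are analyzed at index time while keyword fields are stored verbatim, and a term query does not analyze its input. A toy model of that behavior, assuming a lowercase-and-split analyzer and a made-up candidate named "Maria Svensson":

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


# "name" is a text field: the indexed terms are the analyzed tokens.
name_terms = set(analyze("Maria Svensson"))  # hypothetical document value
# "adress.city" is a keyword field: the whole value is indexed unchanged.
city_terms = {"Karlstad"}


def term_query(indexed_terms, value):
    """Term query: the value is looked up exactly as given."""
    return value in indexed_terms


def match_query(indexed_terms, value):
    """Match query: the value is analyzed first, then looked up."""
    return any(token in indexed_terms for token in analyze(value))


print(term_query(name_terms, "maria"))     # True
print(term_query(name_terms, "Maria"))     # False
print(term_query(city_terms, "Karlstad"))  # True
print(term_query(city_terms, "karlstad"))  # False
print(match_query(name_terms, "Maria"))    # True
```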
Range
POST candidates/candidate/_search
{
  "query": {
    "range": {
      "adress.zipcode": {
        "gte": 13501
      }
    }
  }
}
POST candidates/candidate/_search
{
  "query": {
    "range": {
      "adress.zipcode": {
        "lte": 13500
      }
    }
  }
}
Aggregations
Types
- Metrics
  - Min, max, percentiles etc.
- Buckets
  - Groupings such as terms, histogram, date histogram etc.
- Pipeline
  - Nested aggs
Aggregations together with search
- Faceted search
- Combination of filters and search
- Aggregations are used for filters
- Most common aggregation is grouping of tokens (terms aggregation)
- Histogram (distribution of numbers, e.g. age, salary or price)
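The two bucket aggregations named above can be sketched over an in-memory list. The sample documents are invented, the function names are illustrative, and the sketch only emits non-empty buckets; it shows the grouping logic, not Elasticsearch's response format:

```python
from collections import Counter


def terms_agg(docs, field):
    """Group documents by the exact value of a field and count each group."""
    return dict(Counter(doc[field] for doc in docs if field in doc))


def histogram_agg(docs, field, interval):
    """Group numeric values into fixed-width buckets keyed by the bucket's lower bound."""
    buckets = Counter()
    for doc in docs:
        if field in doc:
            key = (doc[field] // interval) * interval
            buckets[key] += 1
    return dict(sorted(buckets.items()))


# Hypothetical documents with the fields used in the requests in this deck.
sales = [
    {"genre": "rock", "price": 20},
    {"genre": "rock", "price": 70},
    {"genre": "jazz", "price": 120},
    {"genre": "pop", "price": 130},
]

print(terms_agg(sales, "genre"))          # {'rock': 2, 'jazz': 1, 'pop': 1}
print(histogram_agg(sales, "price", 50))  # {0: 1, 50: 1, 100: 2}
```

In a faceted search the bucket counts are what is shown next to each filter option.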
Terms
GET /_search
{
  "aggs": {
    "genres": {
      "terms": { "field": "genre" }
    }
  }
}
Histogram
POST /sales/_search?size=0
{
  "aggs": {
    "prices": {
      "histogram": {
        "field": "price",
        "interval": 50
      }
    }
  }
}
Kibana
Applications

Discover
Visualize
Dashboard
Timelion
Graph
Dev tools
Monitor
Management
Discover
A tool for exploring data through searching and filtering
https://lucene.apache.org/core/2_9_4/queryparsersyntax.html
Visualize
- Tables
- Charts
  - Pie
  - Bar
  - Line
  - Area
  - Histogram
  - Date histogram
  - Heatmap
- Time series charts
Dashboard
Grouping of searches and visualizations
Elasticsearch Kibana
By fhelje