elasticsearch

kibana

Elasticsearch

is real-time
is a distributed search and analytics engine
is a document store

Kibana

Visualization of data in Elasticsearch
- Tables and charts
- Geo data
- Time series
- Relationships in graph data

Basics

Cluster 1 (diagram)
- Nodes: Node 1, Node 2, Node 3
- Index A primary shards: Shard 1, Shard 2, Shard 3
- Index A replica shards: Replica 2, Replica 3, Replica 1
- Kibana

Index

Index with static size
- job, employee, candidate
Continuously growing index
- logs, transactions etc
  - serilog-2018.01.01
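
A continuously growing index is often split by date, as the name serilog-2018.01.01 suggests. A minimal sketch of building such a name, assuming one index per day; the helper name daily_index_name is made up for this example:

```python
from datetime import date


def daily_index_name(prefix: str, day: date) -> str:
    """Build a per-day index name such as 'serilog-2018.01.01'."""
    # The format spec uses strftime codes: four-digit year, month, day.
    return f"{prefix}-{day:%Y.%m.%d}"


print(daily_index_name("serilog", date(2018, 1, 1)))  # serilog-2018.01.01
```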

Data input

What's a document

{
    "name": "John Doe",
    "age": 42,
    "confirmed": true,
    "created": "2018-01-01T12:00:00",
    "adress": {
        "street": "Gatan 1",
        "zip": "12344",
        "city": "Farsta"
    },
    "tags": [
        { "type": "Category", "value": "IT" },
        { "type": "Employment period", "value": "Deltid" }
    ]
}

Document metadata

_index: Name of the index the document lives in
_type: Name of the document's type (Elasticsearch versions before 6)
_id: Unique ID of the document
Settings
Analyzers
Mappings

Schemas

Different types of search

Boolean searches

Efficient
Match or no match
Like WHERE in SQL
Does the data match?

Full text search

Slower than boolean searches (but more efficient than LIKE '%term%' searches in SQL)
Returns results with a relevance score for the search
How well does the data match?

Combinations

Inverted index

How data is stored in Elasticsearch explains how searches behave

Given the following documents:

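1: Den snabba bruna räven hoppar över den lata hunden (The quick brown fox jumps over the lazy dog)
2: Snabba bruna rävar hoppar över lata hundar på sommaren (Quick brown foxes jump over lazy dogs in the summer)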

Inverted index

         |  1  |  2  |
----------------------
Den      |  x  |     |
---------|-----|-----|
snabba   |  x  |     |
---------|-----|-----|
bruna    |  x  |  x  |
---------|-----|-----|
räven    |  x  |     |
---------|-----|-----|
hoppar   |  x  |  x  |
---------|-----|-----|
över     |  x  |  x  |
---------|-----|-----|
den      |  x  |     |
---------|-----|-----|
lata     |  x  |  x  |
---------|-----|-----|
hunden   |  x  |     |
---------|-----|-----|
Snabba   |     |  x  |
---------|-----|-----|
rävar    |     |  x  |
---------|-----|-----|
hundar   |     |  x  |
---------|-----|-----|
på       |     |  x  |
---------|-----|-----|
sommaren |     |  x  |
----------------------

Query: snabba bruna

Index

Terms  |  1  |  2  |
--------------------
snabba |  x  |     |
-------|-----|-----|
bruna  |  x  |  x  |
--------------------
Total  |  2  |  1  |
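
The lookup in the tables above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Elasticsearch or Lucene implement it: terms are split on whitespace with casing kept, and the "score" is simply the number of query terms a document contains.

```python
from collections import defaultdict

docs = {
    1: "Den snabba bruna räven hoppar över den lata hunden",
    2: "Snabba bruna rävar hoppar över lata hundar på sommaren",
}


def build_inverted_index(documents):
    """Map each term (whitespace split, case preserved) to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def count_matches(index, query):
    """Count, per document, how many of the query terms it contains."""
    totals = defaultdict(int)
    for term in query.split():
        for doc_id in index.get(term, ()):
            totals[doc_id] += 1
    return dict(totals)


index = build_inverted_index(docs)
totals = count_matches(index, "snabba bruna")
print(totals)  # document 1 matches both terms, document 2 only "bruna"
```

Because "snabba" and "Snabba" are stored as different terms, document 2 only gets one hit.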

Normalization

         |  1  |  2  |
----------------------
den      |  x  |     |
---------|-----|-----|
snabb    |  x  |  x  |
---------|-----|-----|
brun     |  x  |  x  |
---------|-----|-----|
räv      |  x  |  x  |
---------|-----|-----|
hoppa    |  x  |  x  |
---------|-----|-----|
över     |  x  |  x  |
---------|-----|-----|
lata     |  x  |  x  |
---------|-----|-----|
hund     |  x  |  x  |
---------|-----|-----|
på       |     |  x  |
---------|-----|-----|
sommaren |     |  x  |
----------------------

Query: snabba bruna

Index

Terms  |  1  |  2  |
--------------------
snabb  |  x  |  x  |
-------|-----|-----|
brun   |  x  |  x  |
--------------------
Total  |  2  |  2  |
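
A rough sketch of the same search with normalization applied to both the documents and the query. The STEMS lookup table is a made-up stand-in for this example; a real language analyzer uses a stemming algorithm rather than a fixed table.

```python
from collections import defaultdict

# Hypothetical stem table covering only the words in the two sample documents.
STEMS = {
    "snabba": "snabb", "bruna": "brun",
    "räven": "räv", "rävar": "räv",
    "hoppar": "hoppa",
    "hunden": "hund", "hundar": "hund",
}

docs = {
    1: "Den snabba bruna räven hoppar över den lata hunden",
    2: "Snabba bruna rävar hoppar över lata hundar på sommaren",
}


def normalize(text):
    """Lowercase, split on whitespace, and replace each token by its stem if one is known."""
    return [STEMS.get(token, token) for token in text.lower().split()]


def build_index(documents):
    """Inverted index over normalized terms."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in normalize(text):
            index[term].add(doc_id)
    return index


def count_matches(index, query):
    """The query goes through the same normalization as the documents."""
    totals = defaultdict(int)
    for term in normalize(query):
        for doc_id in index.get(term, ()):
            totals[doc_id] += 1
    return dict(totals)


normalized_index = build_index(docs)
normalized_totals = count_matches(normalized_index, "snabba bruna")
print(normalized_totals)  # both documents now match both terms
```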

Analysis

Character filters
- Per-character transformation
- strip HTML, w -> v
Tokenizer
- Split the text into words
Token filters
- lowercase, synonyms, stemming etc
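
The three stages can be sketched as plain functions chained together. This is an illustration of the pipeline shape only, not Elasticsearch's implementation; the function names and the SYNONYMS table are made up for the example, and stemming is left out.

```python
import html
import re

# Hypothetical synonym table, used only in this example.
SYNONYMS = {"quick": "fast"}


def char_filter(text):
    """Character filter: strip HTML tags, decode entities, and map w -> v."""
    text = re.sub(r"<[^>]+>", " ", text)
    text = html.unescape(text)
    return text.replace("w", "v").replace("W", "V")


def tokenizer(text):
    """Tokenizer: split the text into words (runs of letters, digits or underscore)."""
    return re.findall(r"\w+", text)


def token_filters(tokens):
    """Token filters: lowercase every token, then apply synonyms."""
    lowered = [token.lower() for token in tokens]
    return [SYNONYMS.get(token, token) for token in lowered]


def analyze(text):
    """Run the stages in order: character filters, tokenizer, token filters."""
    return token_filters(tokenizer(char_filter(text)))


print(analyze("<p>Wilhelm &amp; Quick Foxes</p>"))
```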

Standard analyzers

Standard analyzer
- Splits on word boundaries as defined by Unicode, removes most punctuation, lowercases (generally the best choice)
Simple analyzer
- Splits the text on anything that isn’t a letter, and lowercases the terms
Whitespace analyzer
- Splits on whitespace, does not lowercase
Language analyzers
- Language-specific analyzers
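
To make the difference concrete, here is an approximation of the simple and whitespace analyzers in Python. It follows the one-line descriptions above and is not the real implementation; the standard analyzer is left out because Unicode word segmentation needs more than a regular expression.

```python
import re


def simple_analyzer(text):
    """Split on anything that is not a letter, then lowercase the terms."""
    # [^\W\d_] matches a word character that is neither a digit nor an underscore, i.e. a letter.
    return [term.lower() for term in re.findall(r"[^\W\d_]+", text)]


def whitespace_analyzer(text):
    """Split on whitespace only; casing and punctuation are kept."""
    return text.split()


text = "Gatan 1, Farsta-Strand"
print(simple_analyzer(text))
print(whitespace_analyzer(text))
```

The simple analyzer drops the digit and splits the hyphenated name, while the whitespace analyzer keeps "1," and "Farsta-Strand" as they are.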

When are analyzers used?

On all full-text fields

They are applied when indexing and, when searching, to the search string

Mapping

{
    "mappings": {
        "candidate": {
            "properties": {
                "name": { "type": "text" },
                "birth_date": { "type": "date" },
                "adress": {
                    "properties": {
                        "street": { "type": "text" },
                        "zipcode": { "type": "long" },
                        "city": { "type": "keyword" }
                    }
                },
                "contacts": {
                    "properties": {
                        "home_phone": { "type": "keyword" },
                        "mobile_phone": { "type": "keyword" },
                        "email": { "type": "keyword" }
                    }
                },
                "ambition": {
                    "type": "text"
                }
            }
        }
    }
}

Index mapping

Queries

Match_All

POST candidates/_search

POST candidates/_search {}

POST candidates/_search
{
  "query": {
    "match_all": {}
  }
}

Match

// match on full text
POST candidates/candidate/_search
{
  "query": {
    "match": {
      "name": "Maria"
    }
  }
}

POST candidates/candidate/_search
{
  "query": {
    "match": {
      "adress.city": "Karlstad"
    }
  }
}

Term/Terms

// OK: "name" is a text field, so a value like "Maria" is indexed as the lowercased term "maria"
POST candidates/candidate/_search
{
  "query": { "term": { "name": "maria" } }
}

// Fails (wrong casing): term queries are not analyzed, and the index holds "maria"
POST candidates/candidate/_search
{
  "query": { "term": { "name": "Maria" } }
}

// OK: "adress.city" is a keyword field, so the value is indexed exactly as given
POST candidates/candidate/_search
{
  "query": { "term": { "adress.city": "Karlstad" } }
}

// Fails (wrong casing): keyword values are matched exactly
POST candidates/candidate/_search
{
  "query": { "term": { "adress.city": "karlstad" } }
}
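
Why the casing matters can be shown with a small simulation. It assumes one indexed candidate named "Maria" living in "Karlstad", a text field that is lowercased at index time, and a keyword field stored as-is; the function names are made up and the analyzers are simplified stand-ins.

```python
def analyze_text(value):
    """Stand-in for a text field's analyzer: split on whitespace and lowercase."""
    return value.lower().split()


def analyze_keyword(value):
    """Keyword fields are indexed as one unmodified term."""
    return [value]


# Field -> analyzer, mirroring the mapping: name is text, adress.city is keyword.
ANALYZERS = {"name": analyze_text, "adress.city": analyze_keyword}

# Assumed sample document for this example.
document = {"name": "Maria", "adress.city": "Karlstad"}

# Index time: each field value is run through its analyzer.
indexed_terms = {
    field: set(ANALYZERS[field](value)) for field, value in document.items()
}


def term_query(field, value):
    """Term query: the search value is compared to the indexed terms as-is."""
    return value in indexed_terms[field]


def match_query(field, value):
    """Match query: the search value is analyzed like the field before comparing."""
    return any(term in indexed_terms[field] for term in ANALYZERS[field](value))


print(term_query("name", "maria"), term_query("name", "Maria"))
print(term_query("adress.city", "Karlstad"), term_query("adress.city", "karlstad"))
print(match_query("name", "Maria"))
```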

Range

POST candidates/candidate/_search
{
  "query": { 
    "range": {
      "adress.zipcode": {
        "gte": 13501
      }
    }
  }
}

POST candidates/candidate/_search
{
  "query": { 
    "range": {
      "adress.zipcode": {
        "lte": 13500
      }
    }
  }
}
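
The bounds in a range query behave like ordinary comparisons. A minimal sketch that applies gt, gte, lt and lte to a list of made-up zip codes; in_range is a hypothetical helper, not part of any client library.

```python
import operator

# Range keyword -> comparison operator.
OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """Return True if the value satisfies every bound, e.g. {"gte": 13501}."""
    return all(OPS[name](value, limit) for name, limit in bounds.items())


zipcodes = [12344, 13500, 13501, 65224]  # sample data for this example

print([z for z in zipcodes if in_range(z, {"gte": 13501})])
print([z for z in zipcodes if in_range(z, {"lte": 13500})])
```

The two queries split the data without overlap: every zip code falls into exactly one of the two results.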

Aggregations

Types

Metrics
- Min, max, percentiles etc
Buckets
- Groupings such as: Terms, Histogram, Date histogram etc
Pipeline
- Aggregations on the output of other aggs

Aggregations together with search

Faceted search
Combination of filters and search
Aggregations are used for filters
Most common aggregation is grouping of terms (terms agg)
Histogram (distribution of numbers, e.g. age, salary or price)

Terms

GET /_search
{
    "aggs" : {
        "genres" : {
            "terms" : { "field" : "genre" }
        }
    }
}
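
Conceptually, a terms aggregation counts documents per distinct field value and returns the largest buckets first. A sketch over made-up in-memory documents; terms_agg is a hypothetical helper, and the output only imitates the key/doc_count shape of a bucket.

```python
from collections import Counter

# Sample documents for this example.
documents = [
    {"genre": "rock"}, {"genre": "jazz"}, {"genre": "rock"},
    {"genre": "pop"}, {"genre": "rock"}, {"genre": "jazz"},
]


def terms_agg(docs, field, size=10):
    """Group documents by the value of a field and count each group, largest first."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": count} for key, count in counts.most_common(size)]


genres = terms_agg(documents, "genre")
print(genres)
```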

Histogram

POST /sales/_search?size=0
{
    "aggs" : {
        "prices" : {
            "histogram" : {
                "field" : "price",
                "interval" : 50
            }
        }
    }
}
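
A histogram aggregation places each value in a fixed-width bucket whose key is the value rounded down to a multiple of the interval. A sketch with made-up prices; unlike Elasticsearch's default behaviour, it only returns buckets that contain at least one value.

```python
import math
from collections import Counter


def histogram_agg(values, interval):
    """Bucket key = floor(value / interval) * interval; count the values per bucket."""
    counts = Counter(math.floor(value / interval) * interval for value in values)
    return [{"key": key, "doc_count": counts[key]} for key in sorted(counts)]


prices = [10, 35, 60, 99, 120, 149, 150]  # sample data for this example
buckets = histogram_agg(prices, 50)
print(buckets)
```

With an interval of 50, the price 149 lands in the bucket with key 100 and the price 150 starts the next bucket.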

Kibana

Applications

Discover

Visualize

Dashboard

Timelion

Graph

Dev tools

Monitor

Management

Discover

A tool for exploring data through search and filtering

https://lucene.apache.org/core/2_9_4/queryparsersyntax.html

Visualize

Tables
Charts
- Pie
- Bar
- Line
- Area
- Histogram
- Date histogram
- Heatmap
Time series charts

Dashboard

Grouping of searches and visualizations

Elasticsearch Kibana

By fhelje
