Elasticsearch

(Attachment mapper)

What is Elasticsearch?

Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License. Elasticsearch is the most popular enterprise search engine followed by Apache Solr, also based on Lucene.[1]

Why use Elasticsearch

Conventional SQL database management systems aren't really designed for full-text search, and they certainly don't perform well against loosely structured raw data that resides outside the database. On the same hardware, queries that would take more than 10 seconds using SQL can return results in under 10 milliseconds in Elasticsearch.

Example of document indexing and search in Elasticsearch

Because Elasticsearch exposes an HTTP interface, the following examples are not written in any particular web programming language; they use plain curl commands in bash.

 

We are going to assume we have messages, and every message can have an attachment. Later we will search for the messages whose attachment contains a specific phrase.

Creating a new mapper

In order to store attachments in Elasticsearch we need to create a mapping, but first let's create the index that will hold it:

curl -X POST "http://localhost:9200/messages"

Now let's create our mapping (the attachment type comes from the mapper-attachments plugin, which must be installed):

curl -X POST "http://localhost:9200/messages/attachment/_mapping" -d '
{
    "attachment" : {
        "properties" : {
            "file" : { "type" : "attachment" }
        }
    }
}
'

The output we will get is

{"acknowledged":true}

Storing our new document

Let's assume we have a message with the id 1,

and a text file containing the words 'Nevo David'.

We take the file's contents and convert them to base64,

which gives us something like this:

TmV2byBEYXZpZAo=
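
The encoding step can be sketched in bash as well. This is only a minimal illustration: the path /tmp/attachment.txt is a made-up name for this example, and the file is assumed to end with a newline, as most text editors would write it.

```shell
# Hypothetical sample file; printf writes the trailing newline explicitly.
printf 'Nevo David\n' > /tmp/attachment.txt

# Encode the file as base64. tr removes the line wraps that some
# base64 implementations insert into longer output.
encoded=$(base64 < /tmp/attachment.txt | tr -d '\n')

echo "$encoded"
```

The trailing newline is part of the encoded data, which is why the string ends in "Ao=" rather than "A==".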

Now let's store the encoded string in message number one:

curl -X POST "http://localhost:9200/messages/attachment/1" -d '
{
    "file" : "TmV2byBEYXZpZAo="
}
'

Pretty simple, isn't it? The output looks like this:

{
    "_index":"messages",
    "_type":"attachment",
    "_id":"1",
    "_version":1,
    "_shards":{"total":2,"successful":1,"failed":0},
    "created":true
}
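
For real files it may be handier to encode and send in one step. The following is a rough sketch, not a definitive script: build_body and index_attachment are hypothetical helper names invented for this example, and the curl call assumes a node listening on localhost:9200 as in the commands above.

```shell
# Hypothetical helper: print the JSON request body for the file given as $1.
build_body() {
    printf '{ "file" : "%s" }' "$(base64 < "$1" | tr -d '\n')"
}

# Hypothetical helper: store file $2 as the attachment of message id $1.
# curl reads the request body from stdin because of "-d @-".
index_attachment() {
    build_body "$2" | curl -X POST "http://localhost:9200/messages/attachment/$1" -d @-
}

# Usage (needs a running Elasticsearch node):
#   index_attachment 1 /tmp/attachment.txt
```

Keeping the body construction in its own function makes it possible to inspect the JSON before anything is sent to the server.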

Searching in Elasticsearch

Now let's find the messages whose attachment contains a word starting with "Nev":

curl -X GET "http://localhost:9200/messages/attachment/_search" -d '
{
    "query" : {
        "query_string" : {
            "query" : "Nev*"
        }
    }
}
'

We used the wildcard '*' in our example so that "Nev*" matches Nevo David.

This gives us the final output, from which we can later extract the message id.

Final Output


{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 0.16273327,
      "hits": [
         {
            "_index": "messages",
            "_type": "attachment",
            "_id": "1",
            "_score": 0.16273327,
            "fields": {
               "file.content": "TmV2byBEYXZpZAo="
            }
         }
      ]
   }
}
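
Extracting the message id from such a response can be sketched with standard shell tools. This is a quick text-based approach that assumes ids contain no double quotes; a JSON-aware parser would be more robust. The response variable below is a hand-trimmed sample, not real server output.

```shell
# Sample response reduced to the fields needed here; in practice this
# would be the output of the curl search command.
response='{"hits":{"total":1,"hits":[{"_index":"messages","_type":"attachment","_id":"1"}]}}'

# Print every "_id" value in the response, one per line.
ids=$(printf '%s' "$response" \
    | grep -o '"_id" *: *"[^"]*"' \
    | sed 's/.*: *"\(.*\)"/\1/')

echo "$ids"
```

For the sample above this prints 1, the id of the message we stored earlier.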


By Nevo David
