Implementing Filter, Analyzer, and Tokenizer in Elasticsearch for Smart Search

Completion suggester limitations

  • Single Field
  • Prefix Query

Solutions

  • Filters
  • Analyzer
  • Tokenizer

Settings

Filters:

  1. synonym
  2. ngram

Tokenizer:

  1. whitespace

Analyzers:

  1. synonym_analyzer
  2. autocomplete_analyzer

PUT /synonym_test
{
  "settings": {
    "index": {
      "max_ngram_diff": 99,
      "analysis": {
        "analyzer": {
          "synonym_analyzer": {
            "tokenizer": "whitespace",
            "filter": [
              "synonym"
            ]
          },
          "autocomplete_analyzer": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": [
              "lowercase",
              "autocomplete_filter"
            ]
          }
        },
        "filter": {
          "synonym": {
            "type": "synonym",
            "synonyms": [
              "unlimited => infinity",
              "chaos => war"
            ]
          },
          "autocomplete_filter": {
            "type": "ngram",
            "min_gram": 1,
            "max_gram": 20
          }
        }
      }
    }
  }
}
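
To make the token flow of these settings easier to picture, here is a minimal pure-Python sketch that imitates the two analyzers. It is not Elasticsearch code and may differ from the real filters in token order and edge cases. The function names are made up for illustration; only the synonym rules and the min_gram/max_gram values come from the settings above.

```python
# Rough emulation of the analyzers defined in the settings above.
# Function names are hypothetical; this is not Elasticsearch's implementation.

SYNONYMS = {"unlimited": "infinity", "chaos": "war"}  # rules of the "synonym" filter


def whitespace_tokenize(text):
    """Split on whitespace only, like the whitespace tokenizer."""
    return text.split()


def ngrams(token, min_gram=1, max_gram=20):
    """Every substring of length min_gram..max_gram, like the ngram filter."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(token):
                break
            grams.append(token[start:start + size])
    return grams


def autocomplete_analyzer(text):
    """whitespace -> lowercase -> ngram (used at index time)."""
    tokens = []
    for token in whitespace_tokenize(text):
        tokens.extend(ngrams(token.lower()))
    return tokens


def synonym_analyzer(text):
    """whitespace -> synonym (used at search time); no lowercasing, as configured."""
    return [SYNONYMS.get(token, token) for token in whitespace_tokenize(text)]


print(autocomplete_analyzer("War"))    # ['w', 'wa', 'war', 'a', 'ar', 'r']
print(synonym_analyzer("chaos 2014"))  # ['war', '2014']
```

Because synonym_analyzer is configured without a lowercase filter, the sketch leaves a capitalized search term such as "Chaos" unchanged, so it would not hit the synonym rule.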

Mappings

  1. analyzer
  2. search_analyzer
  3. copy_to

PUT /synonym_test/_mapping/doc
{
  "properties": {
    "movie_name": {
      "type": "text",
      "analyzer": "autocomplete_analyzer",
      "search_analyzer": "synonym_analyzer",
      "copy_to": "query_text"
    },
    "year": {
      "type": "text",
      "analyzer": "autocomplete_analyzer",
      "search_analyzer": "synonym_analyzer",
      "copy_to": "query_text"
    },
    "subtitle": {
      "type": "text",
      "analyzer": "autocomplete_analyzer",
      "search_analyzer": "synonym_analyzer",
      "copy_to": "query_text"
    },
    "query_text": {
      "type": "text"
    }
  }
}
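
The split between analyzer (index time) and search_analyzer (search time) is what makes prefix, middle, and synonym matches work on the same field. The sketch below imitates that with a tiny in-memory inverted index in plain Python. The sample documents and helper names are hypothetical, and scoring, fuzziness, and copy_to are left out.

```python
from collections import defaultdict

SYNONYMS = {"unlimited": "infinity", "chaos": "war"}  # rules of the "synonym" filter


def index_terms(text, max_gram=20):
    """Index-time terms: lowercase n-grams (length 1..max_gram) of each token."""
    terms = set()
    for token in text.lower().split():
        for start in range(len(token)):
            for end in range(start + 1, min(len(token), start + max_gram) + 1):
                terms.add(token[start:end])
    return terms


def search_terms(text):
    """Search-time terms: whole tokens with synonyms applied, no n-grams."""
    return [SYNONYMS.get(token, token) for token in text.split()]


# Hypothetical sample documents, not taken from the original slides.
docs = {
    1: {"movie_name": "Infinity War", "year": "2018"},
    2: {"movie_name": "Civil War", "year": "2016"},
}

# field -> term -> set of document ids
inverted = defaultdict(lambda: defaultdict(set))
for doc_id, doc in docs.items():
    for field, value in doc.items():
        for term in index_terms(value):
            inverted[field][term].add(doc_id)


def search(field, query):
    """Ids of documents where any search term equals an indexed n-gram."""
    hits = set()
    for term in search_terms(query):
        hits |= inverted[field].get(term, set())
    return sorted(hits)


print(search("movie_name", "unlimited"))  # [1]     synonym: unlimited -> infinity
print(search("movie_name", "wa"))         # [1, 2]  prefix of "war"
print(search("movie_name", "ivi"))        # [2]     middle of "civil"
```

Only the indexed side is expanded into n-grams. The search side stays as whole tokens, so a partial term matches without every n-gram of the query matching unrelated documents.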

Query

  1. synonym
  2. typo (did you mean)
  3. autocomplete
  4. exact search
  5. middle search

GET /synonym_test/_search
{
  "query": {
    "multi_match": {
      "query": "2014 malay",
      "fields": ["movie_name", "year", "subtitle"],
      "fuzziness": "auto"
    }
  }
}
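
The typo tolerance in this query comes from "fuzziness": "auto", which picks the allowed edit distance from the length of each query term. With the default thresholds, terms of 1 to 2 characters must match exactly, 3 to 5 characters allow one edit, and longer terms allow two. The following is a simplified Python sketch of that rule, using an optimal string alignment distance. The function names are hypothetical, and the real matching happens inside Lucene.

```python
def edit_distance(a, b):
    """Edits needed to turn a into b: insert, delete, substitute, or swap neighbours."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution
            )
            # Swapping two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


def auto_fuzziness(term):
    """Allowed edits for "fuzziness": "auto" with the default length thresholds."""
    if len(term) < 3:
        return 0
    if len(term) < 6:
        return 1
    return 2


def fuzzy_match(query_term, indexed_term):
    """True when the indexed term is within the allowed distance of the query term."""
    return edit_distance(query_term, indexed_term) <= auto_fuzziness(query_term)


print(fuzzy_match("malya", "malay"))  # True: one transposition, one edit allowed
print(fuzzy_match("2014", "2015"))    # True: one substitution, one edit allowed
print(fuzzy_match("ab", "ac"))        # False: short terms must match exactly
```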

Implementing Filter, Analyzer and Tokenizer in Elasticsearch for Autocomplete

By Muhammad Izzuddin Abdul Mutalib
