Implementing Auto Complete & Did You Mean in Elasticsearch

Objectives

  • What is the end result?
  • Features and limitations
  • How to implement?
  • What do we need?
  • Demo

End Result

Features

  1. Handles misspelled queries. For example, 'kula' (meant to be 'kuala') still returns results when fuzziness is enabled, e.g. fuzziness(2).
  2. The number of results returned can be controlled.
  3. A context suggester can narrow the results. For example, querying 'kuala' with the state 'kedah' returns 'Kuala Ketil' in Kedah.
  4. A weight can be assigned to each document to control its ranking.

Limitations

  1. Duplicate results: elastic-builder offers no option to remove duplicate results. Elasticsearch itself does support "skip_duplicates": true in the request body, for example in a GET request sent from Kibana.
  2. The completion suggester only supports prefix matching, so it matches from the start of the string. To work around this, tokenize the text on spaces so that a later word can also start a match.
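
Where the query builder cannot request de-duplication, duplicates can also be dropped on the client once the response arrives. A minimal sketch, assuming each option carries a `text` property as in the suggest response; `dedupeByText` is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: keep only the first option for each distinct suggestion text.
function dedupeByText(options) {
  const seen = new Set();
  return options.filter((option) => {
    const key = option.text.toLowerCase(); // compare case-insensitively
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Made-up sample options, for illustration only.
const sampleOptions = [
  { text: 'kuala ketil' },
  { text: 'kuala ketil' },
  { text: 'kuala nerang' },
];
console.log(dedupeByText(sampleOptions).map((option) => option.text));
```

Filtering after the fact can leave fewer results than the requested size, so requesting a slightly larger size may be worthwhile.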

How to implement?

  • Mapping
  • Indexing
  • Querying
  • Front-end

What do we need?

  • Elasticsearch / Kibana (mapping & indexing)
  • elastic-builder (query)
  • React (front end)

Mapping

1. Create an index.

2. Create the mapping.

  • Under "properties", choose the field that will provide autosuggest. The type of that field must be "completion" (here, "bandar_kawasan").

PUT autosuggest_city

PUT autosuggest_city/_mapping/doc
{
  "properties": {
    "bandar_kawasan": {
      "type": "completion"
    },
    "state": {
      "type": "keyword"
    }
  }
}
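
To use the context suggester from the Features list, the completion field also needs a context declared in its mapping. The sketch below shows one possible shape as plain JavaScript objects. The context name `state` and the sample document are assumptions for illustration, not taken from the original index.

```javascript
// Sketch only: mapping body with a category context named "state" (assumed name).
const mappingWithContext = {
  properties: {
    bandar_kawasan: {
      type: 'completion',
      contexts: [
        { name: 'state', type: 'category' },
      ],
    },
    state: { type: 'keyword' },
  },
};

// A document then supplies its context value alongside the input.
const documentWithContext = {
  bandar_kawasan: {
    input: ['kuala ketil', 'ketil'],
    weight: 30,
    contexts: { state: ['kedah'] },
  },
  state: 'kedah',
};

console.log(JSON.stringify(mappingWithContext, null, 2));
console.log(JSON.stringify(documentWithContext, null, 2));
```

With such a mapping, a query can restrict suggestions by adding "contexts": { "state": ["kedah"] } inside "completion".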

Indexing

1. "input" : The text to store and match against. It can be a single string or an array of strings. This field is mandatory.

2. "weight" : A positive integer, or a string containing a positive integer, used to rank the suggestions. This field is optional.

3. The completion suggester only supports prefix matching, so it matches from the start of the string. To let a later word also start a match, split the input text on spaces and index the parts as additional inputs (for example, "kuala ketil" and "ketil").

POST autosuggest_city/doc
{
  "bandar_kawasan":{
    "input": "peringgit"
  },
  "state" : "melaka"
}

POST autosuggest_city/doc
{
  "bandar_kawasan":{
    "input": ["kuala ketil", "ketil"],
    "weight" : 30
  },
  "state" : "kedah"
}
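
The extra inputs can be generated instead of typed by hand. A minimal sketch, assuming names are split on whitespace; `buildInputs` is a hypothetical helper, not part of Elasticsearch or elastic-builder.

```javascript
// Hypothetical helper: build completion inputs so that every word can start a match.
// Each suffix of the word list becomes one input.
function buildInputs(name) {
  const tokens = name.trim().split(/\s+/).filter(Boolean);
  return tokens.map((_, i) => tokens.slice(i).join(' '));
}

console.log(buildInputs('kuala ketil'));       // inputs for the second sample document
console.log(buildInputs('bandar baru bangi')); // a three-word name, for illustration
```

The returned array can be used directly as the "input" value when indexing a document.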

Querying

1. "prefix" : The text to find suggestions for.

2. "completion" : The type of suggester to use.

3. "field" : The name of the field to search for suggestions.

4. "size" : The number of suggestions to return (defaults to 5).

5. "skip_duplicates" : Whether duplicate suggestions should be filtered out (defaults to false).

6. "fuzzy" : Allows typos in the search text while still returning results.

GET autosuggest_city/_search
{
    "suggest": {
        "autosuggest" : {
            "prefix" : "kuala",
            "completion" : {
                "field" : "bandar_kawasan",
                "size" : 2,
                "skip_duplicates" : true,
                "fuzzy" : {
                    "fuzziness" : 2
                }
            }
        }
    }
}
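
The same request body can be assembled in JavaScript as a plain object, which keeps "skip_duplicates" available. A minimal sketch: `buildSuggestBody` and its option names are hypothetical, and the suggestion name "autosuggest" simply mirrors the request above.

```javascript
// Hypothetical helper that assembles the suggest request body as a plain object.
function buildSuggestBody(prefix, options = {}) {
  const {
    field = 'bandar_kawasan',
    size = 5,               // same default as Elasticsearch
    skipDuplicates = false, // same default as Elasticsearch
    fuzziness,              // leave undefined to disable fuzzy matching
  } = options;

  const completion = { field, size, skip_duplicates: skipDuplicates };
  if (fuzziness !== undefined) {
    completion.fuzzy = { fuzziness };
  }
  return { suggest: { autosuggest: { prefix, completion } } };
}

const body = buildSuggestBody('kuala', { size: 2, skipDuplicates: true, fuzziness: 2 });
console.log(JSON.stringify(body, null, 2));
```

The resulting object can be passed as the `body` of a search call in place of a builder-generated one.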

Querying - Elastic Builder

1. Request body search:

  • "completionSuggester" : The type of suggester to use.
  • "field" : The name of the field to search for suggestions.
  • "prefix" : The text to find suggestions for.
  • "size" : The number of suggestions to return (defaults to 5).
  • "fuzziness" : Allows typos in the search text while still returning results.

Reference - Elastic Builder

const elasticsearch = require('elasticsearch');
const bob = require('elastic-builder');
const _ = require('lodash');
const client = new elasticsearch.Client({
    host : '103.245.90.189:3002',
});
const index = 'autosuggest_city';

const simpleQuery = async () => {

    const requestBody = bob.requestBodySearch()
    .suggest(
        bob.completionSuggester('autosuggest', 'bandar_kawasan')
        .prefix('kuala')
        .size(10)
        .fuzziness(2)
        // .contexts('state', [
        //     "terengganu" 
        //     {context: 'kedah'}, 
        //     {context: 'melaka', boost:2}
        // ])
    );

    try {
        // Run the search inside the try block so request failures are caught too.
        const response = await client.search({
            index: index,
            body: requestBody,
        });
        // Collect the text of every suggestion option.
        const res = _.map(response.suggest.autosuggest[0].options, option => option.text);
        console.log(res);
    } catch (error) {
        console.log(error.message);
    }

}

simpleQuery();
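
Reading `response.suggest.autosuggest[0].options` directly throws if any level is missing. A defensive variant is sketched below. `extractTexts` is a hypothetical helper, and the mock response is made up, containing only the fields the code above reads.

```javascript
// Hypothetical helper: return the suggestion texts, or an empty array if the
// response does not contain the named suggestion.
function extractTexts(response, name = 'autosuggest') {
  const entries = (response && response.suggest && response.suggest[name]) || [];
  const options = entries.length > 0 ? entries[0].options || [] : [];
  return options.map((option) => option.text);
}

// Mock response for illustration only.
const mockResponse = {
  suggest: {
    autosuggest: [
      { text: 'kuala', options: [{ text: 'kuala ketil' }, { text: 'kuala nerang' }] },
    ],
  },
};

console.log(extractTexts(mockResponse));
console.log(extractTexts({})); // no suggest section at all
```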

Front End

// issues: https://gitlab.com/gds_datamanagement/myipcs-dm/issues/157

import React, { Component } from 'react';
import {
  Row,
  Col,
  CardBody,
  Card,
  CardHeader,
  Input,
  InputGroup,
  InputGroupAddon,
  Button,
  ListGroup,
  ListGroupItem,
  Table,
} from 'reactstrap';
import elasticsearch from 'elasticsearch';
import bob from 'elastic-builder';
import _ from 'lodash';
import 'react-table/react-table.css';

const client = new elasticsearch.Client({
  host: '103.245.90.189:3002',
});
const index = 'autosuggest_city';

class AutoCom extends Component {
  constructor(props) {
    super(props);
    this.state = {
      didYouMean: null,
      respondedItems: null,
      results: [],
    };
  }

  termSearch(input) { // eslint-disable-line
    client.search({
      index,
      body: {
        suggest: {
          'auto-suggest': {
            prefix: input,
            completion: {
              field: 'bandar_kawasan',
              size: '10',
              skip_duplicates: true,
              fuzzy: {
                fuzziness: 1,
              },
              // contexts: {
              //   state: [this.state.inputTwo],
              // },
            },
          },
        },
      },
    }).then((resp) => {
      // console.log(resp.suggest['auto-suggest'][0].options); // eslint-disable-line
      // "input" may be stored as a string or as an array; for an array, show the first entry.
      const arrayPicked = _.map(resp.suggest['auto-suggest'][0].options, (item, indexnum) => {
        const { input } = item._source.bandar_kawasan; // eslint-disable-line
        return {
          No: indexnum + 1,
          bandar_kawasan: Array.isArray(input) ? input[0] : input,
          state: item._source.state, // eslint-disable-line
        };
      });
      this.setState({
        results: arrayPicked,
      });
    });
  }

  querySuggestions() { // eslint-disable-line
    const requestBody = bob.requestBodySearch()
      .suggest(bob.completionSuggester('autosuggest', 'bandar_kawasan')
        .prefix(document.getElementById('userInput').value)
        .size(10)
        .fuzziness(2));
    client.search({
      index,
      body: requestBody,
    }).then(({ suggest: autosuggest }) => {
      const response = autosuggest.autosuggest[0].options;
      this.setState({
        didYouMean: null,
        respondedItems: response,
      });
    });
  }

  searchSuggestions(dontmean) {
    if (document.getElementById('userInput').value === '') {
      this.setState({
        results: [],
      });
    } else {
      const requestBody = bob.requestBodySearch()
        .suggest(bob.completionSuggester('autosuggest', 'bandar_kawasan')
          .prefix(document.getElementById('userInput').value)
          .size(10)
          .fuzziness(2));
      client.search({
        index,
        body: requestBody,
      }).then(({ suggest: autosuggest }) => {
        const response = autosuggest.autosuggest[0].options;
        const testCase = _.find(response, ['text', document.getElementById('userInput').value]);
        if (typeof testCase === 'object' || dontmean || response.length === 0) {
          // Exact match, an explicit correction, or nothing to suggest.
          this.setState({
            didYouMean: null,
          });
        } else {
          // No exact match: offer the top suggestion as "did you mean".
          this.setState({
            didYouMean: response[0].text,
            respondedItems: null,
          });
        }
      });
    }
  }

  correctingInput(newInput) {
    document.getElementById('userInput').value = newInput;
    this.setState({
      didYouMean: null,
      respondedItems: null,
    }, () => this.searchSuggestions(true));
    this.termSearch(newInput);
  }

  render() {
    return (
      <div>
        <Row>
          <Col>
            <Card>
              <CardHeader>Search</CardHeader>
              <CardBody>
                <InputGroup>
                  <Input id="userInput" type="text" onChange={() => this.querySuggestions()} />
                  <InputGroupAddon addonType="append">
                    <Button onClick={() => {
                      this.searchSuggestions();
                      this.termSearch(document.getElementById('userInput').value);
                    }}
                    > Search
                    </Button>
                  </InputGroupAddon>
                </InputGroup>
                {this.state.respondedItems !== null && (
                <ListGroup>
                  {
                    _.map(this.state.respondedItems, ({ text, _id }) => (<ListGroupItem key={_id} onClick={() => this.correctingInput(text)} style={{ cursor: 'pointer' }}>{text}</ListGroupItem>))
                  }
                </ListGroup>
                )}
                {this.state.didYouMean !== null && (
                  <div>
                    <br />
                    <Button onClick={() => this.correctingInput(this.state.didYouMean)}>
                      did you mean: {this.state.didYouMean} ?
                    </Button>
                  </div>
                )}
              </CardBody>
            </Card>
            <Card>
              <CardHeader>Result</CardHeader>
              <CardBody>
                {this.state.results.length === 0 && (<span>No result found</span>)}
                {this.state.results.length > 0 && (
                <Table>
                  <thead>
                    <tr>
                      { _.map(Object.keys(this.state.results[0]), (item, innerIndex) => (
                        <th key={`${innerIndex}th`}>{item}</th>
                        ))}
                    </tr>
                  </thead>
                  <tbody>
                    { _.map(this.state.results, (item, innerIndex1) => (
                      <tr key={innerIndex1}>
                        {_.map(item, (innerItem, innerIndex2) => (
                          <td key={innerIndex2}>{innerItem}</td>
                          ))}
                      </tr>
                      ))}
                  </tbody>
                </Table>
                )
                }
              </CardBody>
            </Card>
          </Col>
        </Row>
      </div>
    );
  }
}

export default AutoCom;
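
The "did you mean" decision inside searchSuggestions can be isolated as a pure function, which makes it easy to test without React or Elasticsearch. A minimal sketch: `pickDidYouMean` is a hypothetical name, and `options` is assumed to be the options array from the suggest response.

```javascript
// Hypothetical helper: return the text to offer as "did you mean", or null when
// the input is empty, already matches a suggestion exactly, or nothing was suggested.
function pickDidYouMean(userInput, options) {
  if (userInput === '' || options.length === 0) return null;
  const hasExactMatch = options.some((option) => option.text === userInput);
  return hasExactMatch ? null : options[0].text;
}

// Made-up options, for illustration only.
const options = [{ text: 'kuala ketil' }, { text: 'kuala nerang' }];
console.log(pickDidYouMean('kula', options));        // misspelled input
console.log(pickDidYouMean('kuala ketil', options)); // exact match
```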

Demo
