paws on Elasticsearch

15.12.2017
torsti @ Wunderdog
github

Elasticsearch is good full-text search infrastructure



  • Wikipedia
  • The Guardian
  • StackOverflow
  • GitHub
  • many others

powered by Apache Lucene



Lucene
queries and index

Elasticsearch
RESTful interface
scale

a word on hosted ES


you can get things done using 

AWS Elasticsearch domains

elastic.co Elastic Cloud on AWS or GCP

you may never hit the limitations described in
"why you shouldn't use AWS Elasticsearch Service"

Index


inverted index
tokens point to the documents that contain them
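
A minimal sketch of the idea in Python, assuming whitespace tokenization and lowercase folding (both are simplifications; Lucene's real index structures are far more elaborate). The name `build_inverted_index` is made up for illustration.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: tokens are lowercase, whitespace-separated words.
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

docs = {
    1: "the cat sat on the mat",
    2: "the dog chased the cat",
}
index = build_inverted_index(docs)
print(sorted(index["cat"]))  # [1, 2]
print(sorted(index["dog"]))  # [2]
```

Looking up a token is a single dictionary access, which is why the lookup goes from token to documents rather than the other way round.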

Query



  1. make tokens (by some process)
  2. match tokens against all known tokens
  3. return documents where tokens match (by some algorithm)
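
The three steps above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how Elasticsearch does it: "some process" is lowercase whitespace splitting, and "some algorithm" is simply counting how many query tokens each document matches. The names `tokenize`, `build_index` and `search` are hypothetical.

```python
from collections import Counter, defaultdict

def tokenize(text):
    # Step 1: make tokens (assumption: lowercase + whitespace split).
    return text.lower().split()

def build_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    hits = Counter()
    for token in tokenize(query):
        # Step 2: match query tokens against all known tokens.
        for doc_id in index.get(token, ()):
            hits[doc_id] += 1
    # Step 3: return matching documents, best match first
    # (assumption: score = number of matching query tokens).
    return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))

docs = {
    1: "quick brown fox",
    2: "lazy brown dog",
    3: "quick red fox jumps",
}
index = build_index(docs)
print(search(index, "quick fox"))  # [1, 3]
print(search(index, "brown dog"))  # [2, 1]
```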

tf-idf, roughly



Term frequency
how often does this term appear in this document


Inverse document frequency
how rare is this term across all documents (the more documents contain it, the lower the idf)
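
A rough Python sketch of the two quantities, using a common textbook variant: tf is the raw count in the document and idf is log(N / df), where N is the number of documents and df the number of documents containing the term. This is an assumption for illustration and not the exact formula Lucene uses.

```python
import math

def tf(term, doc_tokens):
    """Term frequency: how often the term appears in this document."""
    return doc_tokens.count(term)

def idf(term, all_docs):
    """Inverse document frequency: high for rare terms, 0 for terms in every document."""
    df = sum(1 for doc in all_docs if term in doc)
    return math.log(len(all_docs) / df) if df else 0.0

def tf_idf(term, doc_tokens, all_docs):
    return tf(term, doc_tokens) * idf(term, all_docs)

docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]
print(round(tf_idf("bird", docs[2], docs), 3))  # 1.099 (rare term)
print(round(tf_idf("cat", docs[0], docs), 3))   # 0.405 (in 2 of 3 documents)
print(tf_idf("the", docs[0], docs))             # 0.0   (in every document)
```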

stop words


a stop word is a word so common that it is not indexed, even when it appears in the document

its idf is very small, so it would contribute almost nothing to the score anyway
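
A small Python sketch of dropping stop words at tokenization time. The `STOP_WORDS` set here is a made-up illustrative list, not the list any particular Elasticsearch analyzer ships with.

```python
# Assumption: a tiny hand-picked stop word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "on", "of", "and"}

def tokenize(text, stop_words=STOP_WORDS):
    """Lowercase, split on whitespace, and drop stop words before indexing."""
    return [token for token in text.lower().split() if token not in stop_words]

tokens = tokenize("The cat sat on the mat")
print(tokens)  # ['cat', 'sat', 'mat']
```

Because the stop words never reach the index, a query consisting only of stop words would match nothing in this sketch.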

paws on
