Playing with OpenSearch / Elasticsearch

What is Elasticsearch/OpenSearch?

  • Search engine built on Lucene
    • Solr is also built on top of Lucene
  • Pass it JSON documents
  • Native geosearch capabilities
  • Supposedly fast

Why?

  • Curious for a long time
  • Solr has been a pain for us (maintenance-wise)
  • AWS offers a hosted solution
    • OpenSearch 2.x, Elasticsearch 7.x
  • Well supported

What about MongoDB?

  • Conceptually similar
    • Document oriented storage
    • Sharding
    • Geosearch
    • Ways to reference other objects
  • Pros
    • Closer to schemaless
    • Store very large docs
  • Cons
    • Limited number of indexes (64 per collection vs ~1k indexed fields per ES index by default)
    • Cannot do fast full-text search

How

  • Searchkick
    • https://github.com/ankane/searchkick
  • Compatible with OpenSearch and Elasticsearch
  • "One line" integration (not really)
    • Create index
    • Prep data to send for indexing
  • Autocorrect, autoboost, and more
  • Integrates with Sidekiq
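
The "prep data to send for indexing" step can be sketched outside of Searchkick. Searchkick itself is a Ruby gem and handles this internally; the Python below is only an illustration of the newline-delimited JSON body that the `_bulk` endpoint of Elasticsearch/OpenSearch accepts. The index name `users_demo`, the sample records, and the helper `build_bulk_body` are all hypothetical.

```python
import json


def build_bulk_body(index_name, docs, id_field="id"):
    """Serialize docs into the newline-delimited JSON used by the _bulk endpoint."""
    lines = []
    for doc in docs:
        # Action line: which index and document id this write targets.
        action = {"index": {"_index": index_name, "_id": str(doc[id_field])}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample records, not taken from the demo data.
records = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Grace", "email": "grace@example.com"},
]

body = build_bulk_body("users_demo", records)
```

Each document becomes two lines (action, then source), so a batch of N records yields 2N lines in one request.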

Demo

Doin' it live with unoptimized, unrefined code

Demo (cont'd)

  • Users
    • Index/put subset of all data (no password or mfa info, for example)
    • User management MySQL query vs ES equivalent
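
A rough sketch of the two ideas on this slide: send only a safe subset of each user row to the index, then search it with a full-text query instead of a SQL `LIKE`. In Searchkick the subset is normally defined by a `search_data` method on the model; the Python below only mirrors the idea. The column names (`password_digest`, `mfa_secret`, `name`, `email`) and both helpers are assumptions, not the demo's actual schema or queries.

```python
# Hypothetical sensitive columns that should never reach the search index.
SENSITIVE_FIELDS = {"password_digest", "mfa_secret"}


def search_document(user_row):
    """Return a copy of the row with the sensitive fields removed."""
    return {k: v for k, v in user_row.items() if k not in SENSITIVE_FIELDS}


def user_search_query(term, size=25):
    """Build a request body that matches the term against name and email."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": term,
                "fields": ["name", "email"],  # assumed searchable fields
            }
        },
    }


row = {
    "id": 7,
    "name": "Ada Lovelace",
    "email": "ada@example.com",
    "password_digest": "not-a-real-hash",
    "mfa_secret": "not-a-real-secret",
}
doc = search_document(row)
query = user_search_query("ada")
```

The SQL counterpart would be something along the lines of `WHERE name LIKE '%ada%' OR email LIKE '%ada%'`, which cannot use an ordinary index because of the leading wildcard.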

Demo (cont'd)

  • App Variable Values
    • Create 'variable-type-specific' fields
      • 75 text fields
      • 75 keyword fields
      • 75 numeric fields
      • 75 date fields
      • 5 location fields
    • General query w/o app variable id vs ES equivalent
    • Location query w/ sorting by distance from point
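
The field layout and the location query above can be sketched as plain request bodies. This is a minimal illustration under assumptions: the field names (`text_1`, `location_1`, ...), the choice of `double` for numeric fields, the 50 km radius, and both helper functions are hypothetical and not the demo code.

```python
# (field name prefix, mapping type, how many fields of that type)
FIELD_PLAN = [
    ("text", "text", 75),
    ("keyword", "keyword", 75),
    ("number", "double", 75),  # assumption: numeric values stored as double
    ("date", "date", 75),
    ("location", "geo_point", 5),
]


def build_mapping():
    """Generate an index-creation body with one property per planned field."""
    properties = {}
    for prefix, es_type, count in FIELD_PLAN:
        for i in range(1, count + 1):
            properties[f"{prefix}_{i}"] = {"type": es_type}
    return {"mappings": {"properties": properties}}


def nearest_first_query(field, lat, lon, radius="50km", size=10):
    """Filter to documents within radius of a point, sorted nearest first."""
    point = {"lat": lat, "lon": lon}
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": {"geo_distance": {"distance": radius, field: point}}
            }
        },
        "sort": [
            {"_geo_distance": {field: point, "order": "asc", "unit": "km"}}
        ],
    }


mapping = build_mapping()
query = nearest_first_query("location_1", 40.7128, -74.0060)
```

The plan adds up to 305 mapped fields per index, which stays under the default limit of 1000 fields per index.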

Findings

  • Huge speed boost on user admin management page
    • 11-second queries down to < 0.2 s
  • Large speed/accuracy boosts on variable queries
    • 70-second queries down to < 0.4 s
  • No longer have to support/maintain Solr
  • Possibly cut DB server costs by over 75%
    • 128 GB RDS instance down to 16 GB
  • ES cost offsets this slightly, but net savings are still > 60%

Possible next steps

  • Replace components using DB search with Searchkick (user management page queries)
  • Search query builder?
  • Logs?
  • Apps?
  • Set up in AWS?

Thanks