NoSQL and Elasticsearch

  • What is NoSQL?
  • CAP Theorem
  • NoSQL Database Types
  • Elasticsearch Definition
  • Why We Use Elasticsearch
  • Elasticsearch Mapping and Query
Ali Balci

WHAT IS NOSQL?

NoSQL is a class of database management systems that do not follow the rules of relational DBMSs and generally do not use SQL to query data. These systems are usually not a replacement for RDBMS and SQL but rather a complement to them.

Database replication for read performance

Database partitioning for write performance
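The two ideas can be illustrated with a toy model in Python. This is only a sketch, not how any particular database implements them: the class name `ToyCluster`, the shard and replica counts, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib


class ToyCluster:
    """Toy model: writes are partitioned across shards, reads are spread over copies."""

    def __init__(self, num_shards=3, replicas_per_shard=2):
        self.num_shards = num_shards
        # Each shard is a list of copies: index 0 is the primary, the rest are replicas.
        self.shards = [
            [dict() for _ in range(1 + replicas_per_shard)]
            for _ in range(num_shards)
        ]
        self._next_copy = 0

    def shard_for(self, key):
        # A stable hash keeps a given key on the same shard across runs.
        return zlib.crc32(key.encode("utf-8")) % self.num_shards

    def write(self, key, value):
        # Partitioning: only one shard handles this write.
        copies = self.shards[self.shard_for(key)]
        for copy in copies:  # replicated synchronously to keep the sketch simple
            copy[key] = value

    def read(self, key):
        # Replication: reads rotate over all copies of the owning shard.
        copies = self.shards[self.shard_for(key)]
        self._next_copy = (self._next_copy + 1) % len(copies)
        return copies[self._next_copy].get(key)


cluster = ToyCluster()
cluster.write("user:1", {"name": "Ada"})
print(cluster.read("user:1"))
```

Adding shards spreads the write load, while adding replicas gives reads more copies to choose from.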

  • ACID (Atomicity, Consistency, Isolation, Durability)
  • Scale Up/Out

Scale Up/Out

Scalability is the ability of a system, network, or process to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth.

 

Building performance into a software system used to be simple: you either increased your hardware resources (scale up) or modified your application to run more efficiently (performance tuning). Today, there is a third option: horizontal scaling (scale out).

Vertical Scaling (Scale-up): Generally refers to adding more processors and RAM, buying a more expensive and robust server.

  • Pros
    • Lower power consumption than running multiple servers
    • Lower cooling costs than scaling horizontally
    • Generally less challenging to implement
    • Lower licensing costs
    • (Sometimes) less network hardware than scaling horizontally
  • Cons
    • PRICE, PRICE, PRICE
    • Greater risk of hardware failure causing bigger outages

Horizontal Scaling (Scale-out): Generally refers to adding more servers, each with fewer processors and less RAM. This is usually cheaper overall and can in principle scale almost without limit (in practice, limits are imposed by the software or by other attributes of the environment's infrastructure).

  • Pros
    • Much cheaper than scaling vertically
    • Easier to build in fault tolerance
    • Easy to upgrade
  • Cons
    • More licensing fees
    • Bigger footprint in the Data Center
    • Higher utility cost (Electricity and cooling)
    • Possible need for more networking equipment (switches/routers)

CAP THEOREM (Brewer's Theorem)

The CAP Theorem states that in a distributed system (a collection of interconnected nodes that share data) you can have only two of the following three guarantees across a write/read pair: Consistency, Availability, and Partition Tolerance. One of them must be sacrificed.

  • BASE (Basically Available, Soft state, Eventual consistency)
  • ACID (Atomicity, Consistency, Isolation, Durability)
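The trade-off in the CAP Theorem can be illustrated with a toy two-node store. This is only a sketch: the names `Node`, `TwoNodeStore`, and the `"CP"`/`"AP"` mode strings are hypothetical, and real systems handle partitions in far more nuanced ways.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Toy two-node store; `mode` is "CP" or "AP" (hypothetical labels)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Node("a"), Node("b")
        self.partitioned = False  # True when the nodes cannot reach each other

    def write(self, node, key, value):
        other = self.b if node is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Consistency preferred: refuse the write, so availability is lost.
                raise RuntimeError("unavailable during partition")
            # Availability preferred: accept locally, so the nodes may diverge.
            node.data[key] = value
            return
        # No partition: both nodes receive the write and stay consistent.
        node.data[key] = value
        other.data[key] = value


ap = TwoNodeStore("AP")
ap.partitioned = True
ap.write(ap.a, "x", 1)
print(ap.a.data.get("x"), ap.b.data.get("x"))  # 1 None

cp = TwoNodeStore("CP")
cp.partitioned = True
try:
    cp.write(cp.a, "x", 1)
except RuntimeError as err:
    print("CP store:", err)
```

During the partition the AP store keeps answering but its nodes disagree, while the CP store stays consistent by rejecting the write.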

Types of NoSQL Databases

  • Key-Value store: The main idea is a hash table in which each unique key points to a particular item of data. The key-value model is the simplest and easiest to implement.
  • Wide-Column store: Each record can vary in the number of columns that are stored, and columns can be nested inside other columns called super columns.
  • Graph database: Based on graph theory, these databases are designed for data whose relations are well represented as a graph: interconnected elements with an undetermined number of relations between them.
  • Document database: Expands on the basic idea of key-value stores. The values are "documents" that contain more complex, structured data, and each document is assigned a unique key, which is used to retrieve it.
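The four data models can be contrasted using plain Python dictionaries. This is a rough sketch of the shape of the data only, not of any real product; all keys, field names, and the helper `find_by_field` are made up for the illustration.

```python
# Key-value store: an opaque value looked up by a unique key.
kv_store = {}
kv_store["session:42"] = b"opaque bytes the store does not interpret"

# Wide-column store: each row may carry a different set of columns.
wide_rows = {
    "row1": {"name": "Ada", "email": "ada@example.com"},
    "row2": {"name": "Bob"},  # no email column for this row
}

# Document store: the value is a structured document, queryable by field.
doc_store = {
    "p1": {"name": "Shoe", "brandId": "421", "tags": ["sport", "running"]},
    "p2": {"name": "Hat", "brandId": "7"},  # documents need not share a schema
}


def find_by_field(store, field, value):
    """Return the keys of all documents whose `field` equals `value`."""
    return [key for key, doc in store.items() if doc.get(field) == value]


# Graph database: nodes plus an open-ended set of relations between them.
graph = {"alice": {"bob", "carol"}, "bob": {"alice"}, "carol": set()}

print(find_by_field(doc_store, "brandId", "421"))  # ['p1']
print(sorted(graph["alice"]))  # ['bob', 'carol']
```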

Elasticsearch

Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents. Elasticsearch is developed in Java and was originally released as open source under the terms of the Apache License.

Inverted index: an index data structure that stores a mapping from content, such as words or numbers, to its locations in a document or a set of documents.

T[0] = "it is what it is"
T[1] = "what is it"
T[2] = "it is a banana"
"a":      {2}
"banana": {2}
"is":     {0, 1, 2}
"it":     {0, 1, 2}
"what":   {0, 1}

A search for the terms "what", "is" and "it" returns the intersection of their sets: {0, 1} ∩ {0, 1, 2} ∩ {0, 1, 2} = {0, 1}
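The index above can be built and queried in a few lines of Python. This is a minimal sketch that splits on whitespace only; the function names are chosen for the example, and Lucene's real analysis and index structures are much more elaborate.

```python
from collections import defaultdict

texts = ["it is what it is", "what is it", "it is a banana"]


def build_inverted_index(docs):
    """Map each word to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.split():
            index[word].add(doc_id)
    return index


def search_all(index, terms):
    """Return the ids of documents containing every term (set intersection)."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(texts)
print(sorted(index["is"]))  # [0, 1, 2]
print(sorted(search_all(index, ["what", "is", "it"])))  # [0, 1]
```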

Full inverted index (each entry is a pair of document number and word position within that document):

"a":      {(2, 2)}
"banana": {(2, 3)}
"is":     {(0, 1), (0, 4), (1, 1), (2, 1)}
"it":     {(0, 0), (0, 3), (1, 2), (2, 0)} 
"what":   {(0, 2), (1, 0)}
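Storing positions makes phrase search possible, because the index can check that words appear next to each other. The sketch below rebuilds the full inverted index shown above and runs a phrase query against it; the function names are assumptions made for the example, and tokenization is a plain whitespace split.

```python
from collections import defaultdict

texts = ["it is what it is", "what is it", "it is a banana"]


def build_positional_index(docs):
    """Map each word to a list of (document id, word position) pairs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for position, word in enumerate(text.split()):
            index[word].append((doc_id, position))
    return index


def phrase_search(index, phrase):
    """Return ids of documents where the words of `phrase` appear consecutively."""
    words = phrase.split()
    if not words:
        return set()
    # Start from every occurrence of the first word, then require each
    # following word at the next position in the same document.
    candidates = set(index.get(words[0], []))
    for offset, word in enumerate(words[1:], start=1):
        occurrences = set(index.get(word, []))
        candidates = {(d, p) for (d, p) in candidates if (d, p + offset) in occurrences}
    return {d for d, _ in candidates}


positional = build_positional_index(texts)
print(positional["is"])  # [(0, 1), (0, 4), (1, 1), (2, 1)]
print(sorted(phrase_search(positional, "what is it")))  # [1]
```

Document 0 contains all three words but not in that order, so only document 1 matches the phrase.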

Why We Use Elasticsearch

  • Full Text Search

  • Clustering

  • Horizontal Scalability

  • Read and Write Efficiency

  • Fault Tolerant

  • REST API with JSON

  • Elasticsearch Query

POST /_search
{
    "query": {
        "match_all": {}
    }
}


POST /_search
{
    "query": {
        "match": {
           "brandId": "421"
        }
    }
}
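The same `match` request can be prepared from Python using only the standard library. This is a sketch under assumptions: the base URL `http://localhost:9200` assumes a local node on the default port, and the helper name `build_search_request` is made up for the example. The request is only built here; sending it is left commented out because it needs a running cluster.

```python
import json
import urllib.request


def build_search_request(body, base_url="http://localhost:9200"):
    """Build a POST /_search request carrying `body` as JSON."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_search",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


match_query = {"query": {"match": {"brandId": "421"}}}
request = build_search_request(match_query)
print(request.get_method(), request.full_url)

# To run it against a live cluster:
# with urllib.request.urlopen(request) as response:
#     hits = json.load(response)["hits"]["hits"]
```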

POST /_search
{
    "size": 20,
    "from": 10, 
    "query": {
        "filtered": {
           "query": {"match_all": {}},
           "filter": {
               "nested": {
                  "path": "navigations",
                  "query": {
                      "term": {
                         "navigations.navigationId": {
                            "value": "736"
                         }
                      }
                  }
               }
           }
        }
    }
}
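The `filtered` query used above belongs to older Elasticsearch releases; it was deprecated in 2.0 and removed in 5.0, where a `bool` query with a `filter` clause takes its place. The sketch below builds what should be an equivalent request body for newer versions. The helper name `nested_term_filter` and its parameters are assumptions made for the example, and the result should be checked against the version in use.

```python
import json


def nested_term_filter(path, field, value, start=0, size=10):
    """Build a paginated search body that filters on a term inside a nested field."""
    return {
        "from": start,  # offset of the first hit to return
        "size": size,   # maximum number of hits to return
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "nested": {
                        "path": path,
                        "query": {"term": {field: {"value": value}}},
                    }
                },
            }
        },
    }


body = nested_term_filter(
    "navigations", "navigations.navigationId", "736", start=10, size=20
)
print(json.dumps(body, indent=2))
```

With `from` set to 10 and `size` set to 20, the request skips the first 10 hits and returns up to the next 20.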

Query DSL