Introduction to NoSQL

Chandan Jhunjhunwal

Session-1: 14-05-2017

https://github.com/indyarocks/
@ChandanJ

Agenda

  • History of databases
  • CAP Theorem
  • RDBMS vs NoSQL
  • ACID vs BASE
  • NoSQL Solutions
  • DynamoDB

History of DB

Source: https://www.pinterest.com/pin/416653403000976761/

CAP Theorem (Distributed Data Store)

  1. Consistency: Every read returns the most recent write or an error
  2. Availability: Every request receives a non-error response, without a guarantee that it contains the most recent write
  3. Partition tolerance: The system keeps operating even when messages between nodes are dropped or delayed

Misconception: You do not have to give up one of the three at all times. The choice between consistency and availability arises only when a network partition or failure happens.
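The trade-off above can be sketched with a toy two-replica store. This is only an illustration under simplifying assumptions: the Replica class, the "CP"/"AP" mode labels, and the helper names are hypothetical and do not model any real database.

```python
class Unavailable(Exception):
    """Raised by a CP replica that cannot guarantee the latest write."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical labels)
        self.value = None
        self.peer = None
        self.partitioned = False  # True while the link to the peer is down

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse writes that cannot be replicated.
                raise Unavailable("cannot replicate write during partition")
            self.value = value    # Availability first: accept locally only.
            return
        self.value = value
        self.peer.value = value   # Healthy network: replicate synchronously.

    def read(self):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot confirm latest value during partition")
        return self.value


def make_pair(mode):
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    return a, b


def partition(a, b, down=True):
    a.partitioned = b.partitioned = down


# AP pair: stays available during the partition but may serve stale data.
a, b = make_pair("AP")
a.write("v1")
partition(a, b)
a.write("v2")
stale_read = b.read()  # "v1": b never saw the second write
```

With no partition both modes behave identically, which is the point of the misconception note: the choice only matters while the link is down.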

CAP Theorem (Distributed Data Store)

[Diagram: Availability vs. Consistency, a business decision]

CAP Theorem

Source: https://dzone.com/articles/better-explaining-cap-theorem


RDBMS vs NoSQL (Not only SQL)

RDBMS

  • Relational schema
  • Scalable reads
  • Custom high availability
  • Flexible queries
  • Consistency
  • ACID
  • Chooses consistency over availability

NoSQL

  • Schema free
  • Scalable writes/reads
  • Auto high availability
  • Limited queries
  • Eventual consistency
  • BASE
  • Chooses availability over consistency

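ACID

  • Atomicity - All or nothing
  • Consistency - Every transaction moves the database from one valid state to another valid state
  • Isolation - Concurrent transactions leave the same result as if they had run sequentially
  • Durability - Committed data persists even in case of failure

Example of a consistency constraint:

CREATE TABLE customer_account (CR INTEGER, DR INTEGER, CHECK(CR + DR = 100));
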
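To see atomicity and consistency working together, the customer_account table can be tried out in an in-memory SQLite database. This is a minimal sketch: the starting balances and the helper names apply_updates and balances are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_account "
    "(CR INTEGER, DR INTEGER, CHECK(CR + DR = 100))"
)
conn.execute("INSERT INTO customer_account VALUES (60, 40)")
conn.commit()


def apply_updates(conn, statements):
    """Run all statements as one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for sql in statements:
                conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return conn.execute("SELECT CR, DR FROM customer_account").fetchone()


# A balanced transfer keeps CR + DR = 100, so it commits.
ok = apply_updates(
    conn, ["UPDATE customer_account SET CR = CR - 10, DR = DR + 10"]
)
after_ok = balances(conn)

# The second statement breaks the CHECK, so the whole transaction,
# including the valid first statement, is rolled back.
failed = apply_updates(
    conn,
    [
        "UPDATE customer_account SET CR = CR - 20, DR = DR + 20",
        "UPDATE customer_account SET CR = CR - 5",
    ],
)
after_failed = balances(conn)
```

The row ends up unchanged after the failed transaction: the constraint enforces consistency, and the rollback of the already-applied first statement shows atomicity.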
BASE

  • Basic availability: The system stays available even if the data may be inconsistent
  • Soft state: The state of the system can change over time even without new input, because updates are still propagating under eventual consistency
  • Eventual consistency: The system becomes consistent eventually, once it stops receiving inputs
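The BASE properties can be illustrated with a small simulation. The sketch below assumes a last-write-wins rule based on a version number and a simple background sync round; the Node class and the anti_entropy function are hypothetical names, and real systems use more elaborate mechanisms.

```python
class Node:
    def __init__(self):
        self.store = {}  # key -> (version, value)

    def put(self, key, value, version):
        # Last write wins: keep the entry with the highest version.
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def get(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None


def anti_entropy(nodes):
    """One background sync round: each node pushes its entries to the others."""
    for src in nodes:
        for dst in nodes:
            if src is not dst:
                for key, (version, value) in list(src.store.items()):
                    dst.put(key, value, version)


nodes = [Node(), Node(), Node()]

# The write is acknowledged by one node only (basic availability).
nodes[0].put("cart", "book", version=1)

# Another node still answers, but with stale data (soft state).
stale = nodes[2].get("cart")

# Once writes stop and a sync round runs, all nodes agree
# (eventual consistency).
anti_entropy(nodes)
converged = [n.get("cart") for n in nodes]
```

Before the sync round the third node returns None for "cart"; afterwards every node returns "book".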

NoSQL Solutions

  • Column database: Cassandra, HBase
    Use case: Scaling, keeping unstructured, non-volatile information
  • Document database: MongoDB, CouchDB
    Use case: Nested information, JSON data
  • Key-value database: Redis, Memcached
    Use case: Caching, queueing
  • Graph database: Neo4j
    Use case: Handling complex relational information
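The four data models can be compared by writing the same kind of data as plain Python structures. These are in-memory stand-ins only, not calls to any of the databases named above, and every key, field, and name in them is made up for illustration.

```python
# Key-value: an opaque value looked up by a single key.
key_value = {"session:42": "user=7;expires=1700000000"}

# Document: a nested, JSON-like record.
document = {
    "_id": 7,
    "name": "Asha",
    "orders": [{"sku": "A1", "qty": 2}, {"sku": "B9", "qty": 1}],
}

# Column family: row key -> column family -> columns.
wide_row = {
    "user:7": {
        "profile": {"name": "Asha", "city": "Pune"},
        "activity": {"2017-05-14": "login"},
    }
}

# Graph: nodes connected by labelled edges.
edges = [
    ("Asha", "FOLLOWS", "Ravi"),
    ("Ravi", "FOLLOWS", "Meera"),
]


def follows_of_follows(edges, person):
    """People followed by the people that `person` follows."""
    direct = {dst for src, rel, dst in edges
              if src == person and rel == "FOLLOWS"}
    second = {dst for src, rel, dst in edges
              if src in direct and rel == "FOLLOWS"}
    return second - direct - {person}


total_qty = sum(order["qty"] for order in document["orders"])
suggested = follows_of_follows(edges, "Asha")
```

The document model makes the nested orders easy to read in one lookup, while the relationship query is the kind of traversal a graph database is built for.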

NoSQL Solutions

Source: Martin Fowler NoSQL Distilled

DynamoDB

  • A highly scalable, fully managed NoSQL database as a service from Amazon
  • No hassle of setting up replication (3x), cluster scaling, hardware provisioning, configuration, etc.
  • Supports document and key-value data
  • Scale a table's throughput capacity up or down without downtime or performance degradation
  • Monitoring through AWS CloudWatch
  • Seamless integration with DynamoDB Streams, EMR, Redshift, etc.
  • Simple API
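A rough sketch of what the API looks like from Python, assuming the boto3 low-level DynamoDB client. The table name Users, the user_id and name attributes, and the capacity numbers are hypothetical. The functions take the client as a parameter, so the block itself needs neither boto3 nor AWS credentials to load.

```python
# Assumed setup (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("dynamodb", region_name="us-east-1")


def create_users_table(client, table_name="Users"):
    """Create a table keyed on a string partition key named user_id."""
    return client.create_table(
        TableName=table_name,
        AttributeDefinitions=[
            {"AttributeName": "user_id", "AttributeType": "S"}
        ],
        KeySchema=[{"AttributeName": "user_id", "KeyType": "HASH"}],
        ProvisionedThroughput={
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5,
        },
    )


def scale_table(client, table_name, read_units, write_units):
    """Change provisioned throughput on an existing table."""
    return client.update_table(
        TableName=table_name,
        ProvisionedThroughput={
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    )


def put_user(client, table_name, user_id, name):
    """Store one item; values carry DynamoDB type tags such as "S"."""
    return client.put_item(
        TableName=table_name,
        Item={"user_id": {"S": user_id}, "name": {"S": name}},
    )


def get_user(client, table_name, user_id, consistent=False):
    """Read one item; consistent=True requests a strongly consistent read."""
    response = client.get_item(
        TableName=table_name,
        Key={"user_id": {"S": user_id}},
        ConsistentRead=consistent,
    )
    return response.get("Item")
```

The ConsistentRead flag ties back to the CAP discussion: reads are eventually consistent by default, and a strongly consistent read can be requested per call.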

 

Thank You!

 
