MongoDB

What are we covering?

  • Introduction
  • Installation
  • Create, Update & Delete Documents
  • Querying Data
  • Indexes
  • Aggregation
  • Advanced Topics

Introduction

What is MongoDB

  • A document-oriented database
  • Schema Free
  • Built for scalability
  • Developer Friendly
  • Easy to administer
  • Makes complex data modelling easier
  • Not a replacement for RDBMS
  • Indexing, Stored JavaScript, Aggregation, Capped Collections, File storage (GridFS)

Documents & Collections

  • A document is the basic unit of data in MongoDB
  • A document is the equivalent of a row in an RDBMS, but with a much richer structure
  • Documents are grouped together in a collection
  • A collection is the schema-free equivalent of an RDBMS table
  • Every document has a special key "_id" that is unique within its collection

More Documents

  • An ordered set of keys and their associated values
  • Apart from key-value pairs, a document can contain sub-documents
  • Keys are strings, specifically UTF-8 encoded strings
    • Keys must not contain the \0 (null) character
    • . and $ are special: . addresses nested fields and $ prefixes operators
    • Keys starting with _ should be treated as reserved (convention)
  • MongoDB is type-sensitive and case-sensitive: {"x": 3}, {"x": "3"} and {"X": 3} are all different
  • Keys must not be duplicated within a document
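
The key rules above can be sketched as a small check. This is a conservative illustration only: isValidKey is a hypothetical helper, not part of MongoDB or its drivers, and server versions differ in how strictly they enforce these rules.

```javascript
// Hypothetical helper: checks a field name against the rules on this slide.
function isValidKey(key) {
  if (typeof key !== 'string') return false; // keys are strings
  if (key.includes('\0')) return false;      // no null character
  if (key.includes('.')) return false;       // "." addresses nested fields
  if (key.startsWith('$')) return false;     // "$" prefixes operators
  return true;
}

console.log(isValidKey('firstName'));    // true
console.log(isValidKey('address.city')); // false
console.log(isValidKey('$set'));         // false
```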

Collections

  • Group of Documents
  • Schema free: each document within a collection can have a different structure
  • Use namespaced sub-collections to organise related data (projects.material, projects.managers)
  • Naming
    • Names are UTF-8 strings
      • Must not contain the null character
      • Must not start with the reserved prefix "system."

Database

  • Grouping of collections
  • Single instance can host several databases
  • Database names must be UTF-8 strings
    • No empty string
    • No null character "\0"
    • Should be lower case (convention)
    • Limited to 64 bytes
  • Some reserved databases
    • admin
    • local
    • config

Datatypes

  • JSON-like and conceptually similar to JSON
  • JSON has only six types
    • null, boolean, numeric, string, array & object
  • MongoDB adds support for additional data types such as dates

Mongo data type   Stores                             Example
null              null value & non-existent field    {"x": null}
boolean           true, false                        {"processed": true}
32-bit integer    JavaScript Number
64-bit integer    JavaScript Number
64-bit float      JavaScript Number
date              ISO date                           Use new Date(), e.g. ISODate("2017-07-24T10:21:33.430Z")
array                                                {"x": [1,2,3]}
object                                               {"x": {y: "z"}}
code              JavaScript code                    {"x": function() {}}
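
To see why a dedicated date type matters, here is a minimal plain JavaScript sketch (no database involved) showing that a Date does not survive a round trip through plain JSON. The field names are made up for illustration.

```javascript
// A Date survives in memory but becomes a plain string after a JSON round trip.
const original = { createdAt: new Date('2017-07-24T10:21:33.430Z'), x: [1, 2, 3] };
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.createdAt instanceof Date); // true
console.log(typeof roundTripped.createdAt);      // "string"
console.log(roundTripped.createdAt);             // 2017-07-24T10:21:33.430Z
```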

Primary Key & ObjectID

  • Every document must have a "_id" key
  • Value can be anything, but defaults to ObjectId
  • In a collection every document must have a unique value for "_id"
  • ObjectId is the default type for "_id", designed to be lightweight
    • 12 bytes of storage, laid out in this order:
      • 4-byte timestamp
      • 3-byte machine ID
      • 2-byte process ID
      • 3-byte auto-incrementing counter
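
Because the first 4 bytes hold a timestamp in seconds, the creation time can be read back from the hexadecimal form of an ObjectId. A minimal sketch in plain JavaScript, assuming the 24-character hex representation; objectIdTimestamp is a hypothetical helper and the sample id is only an illustrative value.

```javascript
// Reads the leading 4-byte timestamp (8 hex characters) of an ObjectId string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected a 24-character hex string');
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // seconds since the Unix epoch
  return new Date(seconds * 1000);
}

const sampleId = '507f1f77bcf86cd799439011'; // illustrative value
console.log(objectIdTimestamp(sampleId).toISOString());
```

In the mongo shell, ObjectId values expose the same information through getTimestamp().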

Installation

Basic MongoDB Shell Commands

> help
        db.help()                    help on db methods
        db.mycoll.help()             help on collection methods
        sh.help()                    sharding helpers
        rs.help()                    replica set helpers
        help admin                   administrative help
        help connect                 connecting to a db help
        help keys                    key shortcuts
        help misc                    misc things to know
        help mr                      mapreduce

        show dbs                     show database names
        show collections             show collections in current database
        show users                   show users in current database
        show profile                 show most recent system.profile entries with time >= 1ms
        show logs                    show the accessible logger names
        show log [name]              prints out the last segment of log in memory, 'global' is default
        use <db_name>                set current database
        db.foo.find()                list objects in collection foo
        db.foo.find( { a : 1 } )     list objects in foo where a == 1
        it                           result of the last line evaluated; use to further iterate
        DBQuery.shellBatchSize = x   set default number of items to display on shell
        exit                         quit the mongo shell

Execute Code

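// insert_records.js
// Inserts 9,999 log documents, each with a timestamp and a random value

var i = 10000;

while (--i > 0) {
  db.logs.insert({createdAt: new Date(), value: Math.random() * i});
}

// Execute the script from the command line
mongo <database> <jsfile>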

Create, Update, Delete

Create

// Create a single document
db.persons.insert({firstName: 'John', lastName: 'Doe', isVerified: false})

// Bulk insert documents
db.persons.insert([
    { firstName: 'Jack', lastName: 'Doe', isVerified: false },
    { firstName: 'Jill', lastName: 'Doe', isVerified: false }
])

Delete

// Remove all documents (an empty query matches everything)
db.persons.remove({})

var persons = [
    {firstName: 'John', lastName: 'Doe', isVerified: false },
    {firstName: 'Jill', lastName: 'Doe', isVerified: false },
    {firstName: 'John', lastName: 'Doe', isVerified: true },
    {firstName: 'Jill', lastName: 'Doe', isVerified: true }
]

db.persons.insert(persons)

// remove using a query
db.persons.remove({isVerified: true })


// Drop the collection
db.persons.drop()

Update

// Replace a whole document (every field except _id is overwritten)
db.persons.update({firstName: 'John'}, {age: 27})
db.persons.update({_id: ObjectId(<your _id here>)}, {age: 27})

// Update parts of a document
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$set: {age: 27}})
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$unset: {age: 1}})

// Update sub-documents (dotted paths must be quoted)
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$set: {'friends.0.name': 'Mira'}})

// What happens ?

db.persons.update({}, {n: 33})
db.persons.update({}, {n: 33}, {multi: true})
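
The difference between replacing a document and modifying it with operators can be modelled in plain JavaScript. This is a simplified in-memory sketch, not how the server implements updates; applyUpdate is a hypothetical helper that handles only $set, $unset and $inc on top-level fields.

```javascript
// Simplified model: no "$" operators means replacement, otherwise modify in place.
function applyUpdate(doc, update) {
  const usesOperators = Object.keys(update).some((k) => k.startsWith('$'));
  if (!usesOperators) {
    // Replacement: every field except _id is discarded.
    return Object.assign({ _id: doc._id }, update);
  }
  const result = Object.assign({}, doc);
  Object.entries(update.$set || {}).forEach(([k, v]) => { result[k] = v; });
  Object.keys(update.$unset || {}).forEach((k) => { delete result[k]; });
  Object.entries(update.$inc || {}).forEach(([k, n]) => { result[k] = (result[k] || 0) + n; });
  return result;
}

const john = { _id: 1, firstName: 'John', lastName: 'Doe' };
console.log(applyUpdate(john, { age: 27 }));           // { _id: 1, age: 27 }
console.log(applyUpdate(john, { $set: { age: 27 } })); // keeps firstName and lastName
```

For the two calls on this slide: the first replaces one matching document with {n: 33}, and the second is rejected, because multi-document updates require update operators.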

Modifiers

// $inc
db.persons.update({}, {$inc: {age: 1}}, {multi: true})

// Array Modifiers
// $push
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$push: {friends: "Jill Doe"}})

// $ne -- don't push if the value is already present
db.persons.update({firstName: 'John', lastName: 'Doe', friends: {$ne: "Jill Doe"}}, {$push: {friends: "Jill Doe"}})

// $addToSet prevents duplicates
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$addToSet: {friends: "Jill Doe"}})

// Multiple values
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$addToSet: {friends: {$each: ["Jill Doe", "Jack Doe"]}}})

Update - Continued

// $pop -- remove by position: 1 removes the last element, -1 removes the first
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$pop: {friends: 1}})
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$pop: {friends: -1}})

// $pull -- remove by criteria
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$pull: {friends: 'Jill Doe'}})


// Upsert
db.persons.update({firstName: 'Jeff', lastName: 'Doe'}, {firstName: 'Jeff', lastName: 'Doe', age: 32}, {upsert: true})
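
The array modifiers can be pictured as plain array operations. The helpers below are hypothetical stand-ins written for illustration, operating on in-memory arrays rather than on a collection.

```javascript
// In-memory equivalents of the array modifiers (each returns a new array).
const push = (arr, v) => arr.concat([v]);                                        // $push
const addToSet = (arr, v) => (arr.includes(v) ? arr.slice() : arr.concat([v])); // $addToSet
const pop = (arr, dir) => (dir === 1 ? arr.slice(0, -1) : arr.slice(1));         // $pop: 1 or -1
const pull = (arr, v) => arr.filter((item) => item !== v);                       // $pull

const friends = ['Jill Doe', 'Jack Doe', 'Jim Doe'];
console.log(push(friends, 'Jill Doe').length);     // 4, duplicate allowed
console.log(addToSet(friends, 'Jill Doe').length); // 3, duplicate skipped
console.log(pop(friends, 1));                      // [ 'Jill Doe', 'Jack Doe' ]
console.log(pop(friends, -1));                     // [ 'Jack Doe', 'Jim Doe' ]
console.log(pull(friends, 'Jill Doe'));            // [ 'Jack Doe', 'Jim Doe' ]
```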

Import / Export

Command line tools

mongoexport --db myapp --collection persons --out persons_export.json
mongoimport --db myapp --collection persons --file persons.json
mongoimport --db myapp --drop --collection persons --file persons.json
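
By default mongoexport writes one JSON document per line, and mongoimport reads the same layout. A minimal sketch of parsing that layout in plain JavaScript; the sample text and parseLines are made up for illustration.

```javascript
// Sample text in the one-document-per-line layout (values are made up).
const exported = [
  '{"firstName":"John","lastName":"Doe","age":27}',
  '{"firstName":"Jill","lastName":"Doe","age":25}',
].join('\n') + '\n';

// Parses each non-empty line as its own JSON document.
function parseLines(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

const docs = parseLines(exported);
console.log(docs.length);       // 2
console.log(docs[1].firstName); // Jill
```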

Query

// Find All
db.persons.find({})

// Find using queries
db.persons.find({age: 27})
db.persons.find({age: 27, firstName: 'Joe'})


// Projections
db.persons.find({}, {firstName: 1, age: 1})

// Conditions
// $lt, $lte, $gt, $gte, $ne

db.persons.find({age: {$gte: 21, $lte: 30}})
db.persons.find({age: {$ne: 0}})

// OR Queries

// $in, $nin
db.persons.find({lastName: {$in: ['Doe', 'Miller']}})
db.persons.find({lastName: {$nin: ['Miller']}})


// $or
// Or with multiple keys
db.persons.find({$or: [{lastName: 'Doe'}, {age: {$gte: 10}}]})

// $not
db.persons.find({lastName: {$not: {$ne: 'Doe'}}})

// Find nulls
db.persons.find({z: null})
db.persons.find({z: {$eq: null, $exists: true}})

// Regular Expression
db.persons.find({firstName: /john/i})

// Arrays
db.persons.find({friends: 'Jill'})

// Arrays more than one match
db.persons.find({friends: {$all: ['Jill', 'Jim']}})

// Size
db.persons.find({friends: {$size: 3}})

// Slice (a projection operator, so it goes in the second argument)
db.persons.find({}, {friends: {$slice: 1}})
db.persons.find({}, {friends: {$slice: -1}})
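
The array queries above can be read as plain array checks. The helpers below are hypothetical, written only to show what each operator tests on a single in-memory document.

```javascript
const person = { firstName: 'John', friends: ['Jill', 'Jim', 'Jack'] };

// {friends: 'Jill'} matches when the array contains the value.
const containsValue = (arr, v) => arr.includes(v);
// $all matches when every listed value is present, in any order.
const containsAll = (arr, values) => values.every((v) => arr.includes(v));
// $size matches on the exact array length.
const hasSize = (arr, n) => arr.length === n;
// $slice (projection) keeps the first n elements, or the last |n| when n is negative.
const slice = (arr, n) => (n >= 0 ? arr.slice(0, n) : arr.slice(n));

console.log(containsValue(person.friends, 'Jill'));       // true
console.log(containsAll(person.friends, ['Jill', 'Jim'])); // true
console.log(hasSize(person.friends, 3));                  // true
console.log(slice(person.friends, 1));                    // [ 'Jill' ]
console.log(slice(person.friends, -1));                   // [ 'Jack' ]
```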



// Querying Embedded documents

db.persons.find({'address.postCode': 600020})

// $where queries
db.persons.find({$where: function () { return this.firstName === 'Jack'; }})
db.persons.find({$where: "this.firstName === 'Jack'"})

// Limits, skips and sorts

db.persons.find({}).limit(10)

db.persons.find({}).skip(10)

// Pagination (second page of 10; skip is applied before limit regardless of call order)
db.persons.find({}).limit(10).skip(10)

// Sorting
db.persons.find({}).sort({age: 1})
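
Sorting, skipping and limiting combine into pagination. A minimal in-memory sketch, assuming 1-based page numbers; paginate and the sample data are hypothetical. It applies the operations in the order sort, skip, limit.

```javascript
// Returns one page of documents ordered by age ascending.
function paginate(docs, page, pageSize) {
  const sorted = docs.slice().sort((a, b) => a.age - b.age); // sort({age: 1})
  const skip = (page - 1) * pageSize;                         // skip(skip)
  return sorted.slice(skip, skip + pageSize);                 // limit(pageSize)
}

const people = [30, 22, 41, 27, 35].map((age) => ({ age }));
console.log(paginate(people, 1, 2).map((p) => p.age)); // [ 22, 27 ]
console.log(paginate(people, 2, 2).map((p) => p.age)); // [ 30, 35 ]
console.log(paginate(people, 3, 2).map((p) => p.age)); // [ 41 ]
```

Large skip values still make the server walk past the skipped documents, so deep pages get slower as the offset grows.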

Indexes

Generate data for indexes

const fs = require('fs');
const Chance = require('chance');
const chance = new Chance();

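// Builds one random person document.
function generateRandomPerson () {
  const gender = chance.gender();
  const first = chance.first({gender: gender.toLowerCase()});
  const last = chance.last();
  const dob = chance.date({year: chance.pickone([1981, 1983, 1985])});
  return {
    firstname: first,
    lastname: last,
    email:  first + '.' + last + '@example.com',
    // Extended JSON, so that mongoimport stores dob as a date rather than a string
    dob: {$date: dob.toISOString()}
  };
}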

var stream = fs.createWriteStream("persons.json");
stream.once('open', function(fd){

  // stream.write('[');
    for(var i=0; i < 5000; i++) {
      // i === 0 ? stream.write('') : stream.write(',');
      stream.write(JSON.stringify(generateRandomPerson()));
      stream.write('\n');
    }

  // stream.write(']');
  stream.end();
});

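Index Operations

  • Creating indexes: ensureIndex (an alias of createIndex since MongoDB 3.0)
  • Unique index: {unique: true}
  • Re-indexing
  • Building in the background: {background: true}
  • Geospatial indexes: {gps: "2d"} or {gps: "2dsphere"}, where gps is the field holding the coordinates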
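
What a unique index provides can be sketched with an in-memory Map: direct lookup by key instead of a scan, and rejection of duplicate keys. This is only a conceptual model (real MongoDB indexes are B-tree structures maintained by the server), and UniqueIndex is a hypothetical class.

```javascript
// Conceptual model of a unique index on a single field.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.entries = new Map();
  }

  // Registers a document, refusing a second document with the same key.
  insert(doc) {
    const key = doc[this.field];
    if (this.entries.has(key)) {
      throw new Error('duplicate key: ' + key);
    }
    this.entries.set(key, doc);
  }

  // Looks a document up by key without scanning every document.
  find(key) {
    return this.entries.get(key);
  }
}

const byEmail = new UniqueIndex('email');
byEmail.insert({ email: 'john.doe@example.com', firstname: 'John' });
console.log(byEmail.find('john.doe@example.com').firstname); // John
```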

Import data

// mongoimport 
mongoimport --db myapp --drop --collection persons --file persons.json

Aggregation

  • count
  • distinct (runCommand / db.collection.distinct)
  • group
  • aggregate
  • map reduce

// group: count persons by first name (group is deprecated since MongoDB 3.4; prefer aggregate)
db.persons.group(
   {
     key: { firstname: 1 },
     reduce: function( curr, result ) {
                 result.total += 1;
             },
     initial: { total : 0 }
   }
)


db.persons.group({
  keyf: function(doc) {
      return { year: doc.dob.getFullYear() + '_' + doc.firstname };
  },
  cond: { dob: { $gt: new Date('1980/01/01') } },
  reduce: function( curr, result ) {
       result.total += 1;
       result.count++;
   },
  initial: { total : 0, count: 0 }
})


// aggregate: count persons born on or after 1985-01-01, grouped by birth year and first name
db.persons.aggregate([
  {$match: { dob: {$gte: new Date('1985/01/01')}}},
  {$project: { year: {$year: "$dob"}, firstname: "$firstname"}},
  {$group: {_id: {"year": "$year", firstname: "$firstname"}, count: {$sum: 1}}}
])

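
The three pipeline stages map onto familiar array operations. A plain JavaScript sketch over a made-up in-memory sample, shown only to illustrate what each stage computes.

```javascript
// Made-up sample data standing in for the persons collection.
const persons = [
  { firstname: 'Ann', dob: new Date('1985-03-01T00:00:00Z') },
  { firstname: 'Ann', dob: new Date('1985-09-12T00:00:00Z') },
  { firstname: 'Bob', dob: new Date('1985-06-30T00:00:00Z') },
  { firstname: 'Cid', dob: new Date('1981-01-15T00:00:00Z') },
];
const cutoff = new Date('1985-01-01T00:00:00Z');

const counts = persons
  .filter((p) => p.dob >= cutoff)                                          // $match
  .map((p) => ({ year: p.dob.getUTCFullYear(), firstname: p.firstname })) // $project
  .reduce((acc, p) => {                                                    // $group with $sum: 1
    const key = p.year + '|' + p.firstname;
    acc[key] = (acc[key] || 0) + 1;
    return acc;
  }, {});

console.log(counts); // { '1985|Ann': 2, '1985|Bob': 1 }
```
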
var mapper = function () {
   emit(this.lastname, 1)
}

var reducer = function (name, instances) {
    return Array.sum(instances)
}

db.persons.mapReduce(
 mapper, 
 reducer,
 {
    out: "mr_bylastname"
 }
)
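
The flow of a map-reduce job can be modelled in memory: the mapper emits key/value pairs per document, values are grouped by key, and the reducer collapses each group. This is a hypothetical sketch; unlike the shell, where emit is a global, it is passed to the mapper as an argument here.

```javascript
// Minimal in-memory model of map-reduce.
function mapReduce(docs, mapper, reducer) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Run the mapper with each document bound to "this", as in the shell.
  docs.forEach((doc) => mapper.call(doc, emit));
  const out = {};
  groups.forEach((values, key) => { out[key] = reducer(key, values); });
  return out;
}

const sample = [{ lastname: 'Doe' }, { lastname: 'Miller' }, { lastname: 'Doe' }];
const byLastName = mapReduce(
  sample,
  function (emit) { emit(this.lastname, 1); },
  (name, instances) => instances.reduce((a, b) => a + b, 0)
);
console.log(byLastName); // { Doe: 2, Miller: 1 }
```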

Advanced Topics

Advanced

  • Administration
  • Replication
  • Sharding

MongoDB

By Hari Narasimhan
