MongoDB
What are we covering?
- Introduction
- Installation
- Create, Update & Delete Documents
- Querying Data
- Indexes
- Aggregation
- Advanced Topics
Introduction
What is MongoDB
- A Document oriented database
- Schema Free
- Built for scalability
- Developer Friendly
- Easy to administer
- Makes complex data modelling easier
- Not a replacement for RDBMS
- Indexing, Stored JavaScript, Aggregation, Capped Collections, File storage (GridFS)
Documents & Collections
- A document is the basic unit of data in MongoDB
- A document is the equivalent of a row in an RDBMS, but it can hold much richer, nested data
- Documents are grouped together in a collection
- A collection is the schema-free equivalent of an RDBMS table
- Every document has a special key "_id" that is unique within its collection
More Documents
- A document is an ordered set of keys and their associated values
- Apart from key-value pairs, a document can contain sub-documents
- Keys are strings, specifically UTF-8 encoded strings
- Keys must not contain the \0 (null) character
- . and $ are special: . addresses nested fields and $ prefixes operators
- Keys starting with _ should be treated as reserved
- MongoDB is type sensitive and case sensitive: {"x": 3} differs from {"x": "3"} and from {"X": 3}
- Keys must not be duplicated within a document
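The key rules above can be sketched as a small check in plain JavaScript. This is only an illustration: `isValidKey` is a hypothetical helper, not part of MongoDB or its drivers, and it is deliberately conservative (some newer server versions relax the rules for . and $).

```javascript
// Hypothetical helper: checks a field name against the rules listed above.
function isValidKey(key) {
  if (typeof key !== 'string') return false;  // keys are strings
  if (key.includes('\0')) return false;       // no null character
  if (key.startsWith('$')) return false;      // $ prefixes operators
  if (key.includes('.')) return false;        // . addresses nested fields
  return true;
}

// A document with an embedded sub-document (sample data for illustration).
const person = {
  _id: 1,
  name: 'John Doe',
  address: { postCode: 600020 }
};

const goodKeysOk = Object.keys(person).every(isValidKey);
const badKeys = ['$set', 'address.postCode', 'bad\0key'].map(isValidKey);

console.log(goodKeysOk); // true
console.log(badKeys);    // [ false, false, false ]
```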
Collections
- Group of Documents
- Schema Free, each document within a collection can have different structure
- Use namespaced sub-collection to organise related data (projects.material, projects.managers)
- Naming
  - Must be a UTF-8 string
  - Must not contain the null character
  - Must not start with the system. prefix (reserved for internal collections)
Database
- Grouping of collections
- Single instance can host several databases
- Database names must be UTF-8 strings
  - No empty string
  - No null character "\0"
  - Lower case by convention
  - Limited to 64 bytes
- Some databases are reserved
  - admin
  - local
  - config
Datatypes
- Documents are JSON-like and conceptually similar to JSON objects
- JSON has only six types
  - null, boolean, numeric, string, array and object
- Mongo adds support for additional data types such as dates
Mongo data type | Stores | Example |
---|---|---|
null | null value or non-existent field | {"x": null} |
boolean | true, false | {"processed": true} |
32 Bit Integer | 32-bit integer (NumberInt in the shell) | {"x": NumberInt(3)} |
64 Bit Integer | 64-bit integer (NumberLong in the shell) | {"x": NumberLong(3)} |
64 Bit Float | JavaScript Number (the shell default) | {"x": 3.14} |
date | ISO Date | {"x": new Date()} or {"x": ISODate("2017-07-24T10:21:33.430Z")} |
array | List of values | {"x": [1, 2, 3]} |
object | Embedded document | {"x": {"y": "z"}} |
code | JavaScript code | {"x": function() {}} |
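A quick way to see why the extra types matter: plain JSON has no date type, so a date does not survive a JSON round trip. The sketch below uses only standard JavaScript and a made-up sample document.

```javascript
// Sample document with a date value (illustration only).
const doc = { createdAt: new Date(0), count: 3, tags: ['a', 'b'] };

// Serialise to JSON and parse it back.
const roundTripped = JSON.parse(JSON.stringify(doc));

console.log(doc.createdAt instanceof Date);  // true
console.log(typeof roundTripped.createdAt);  // string
console.log(roundTripped.createdAt);         // 1970-01-01T00:00:00.000Z
```

After the round trip the date is just a string, which is the gap that Mongo's native date type closes.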
Primary Key & ObjectID
- Every document must have a "_id" key
- Value can be anything, but defaults to ObjectId
- In a collection every document must have a unique value for "_id"
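- ObjectId is the default type for "_id", designed to be lightweight
- 12 bytes of storage, laid out in this order
  - 4 bytes: timestamp (seconds since the epoch)
  - 3 bytes: machine ID
  - 2 bytes: process ID
  - 3 bytes: auto-incrementing counter
- Newer MongoDB versions replace the machine ID and process ID with a single 5-byte random value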
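Because the 4-byte timestamp comes first, the creation time can be read from the first 8 hex characters of an ObjectId. The following is a minimal sketch in plain JavaScript; `objectIdTimestamp` is a hypothetical helper name and the id is a sample value. In the mongo shell, `ObjectId(...).getTimestamp()` gives the same information.

```javascript
// Hypothetical helper: extracts the creation time from an ObjectId hex string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected 24 hex characters');
  }
  // The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp('507f1f77bcf86cd799439011');
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```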
Installation
Installation
- https://docs.mongodb.com/manual/administration/install-community/
- Either configure mongod as a service or start the mongod daemon directly
- Start the shell client with the mongo command (mongosh in newer releases)
Basic MongoDB Shell Commands
> help
    db.help()                    help on db methods
    db.mycoll.help()             help on collection methods
    sh.help()                    sharding helpers
    rs.help()                    replica set helpers
    help admin                   administrative help
    help connect                 connecting to a db help
    help keys                    key shortcuts
    help misc                    misc things to know
    help mr                      mapreduce
    show dbs                     show database names
    show collections             show collections in current database
    show users                   show users in current database
    show profile                 show most recent system.profile entries with time >= 1ms
    show logs                    show the accessible logger names
    show log [name]              prints out the last segment of log in memory, 'global' is default
    use <db_name>                set current database
    db.foo.find()                list objects in collection foo
    db.foo.find( { a : 1 } )     list objects in foo where a == 1
    it                           result of the last line evaluated; use to further iterate
    DBQuery.shellBatchSize = x   set default number of items to display on shell
    exit                         quit the mongo shell
Execute Code
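// insert_records.js
// Inserts 9,999 log documents, each with a timestamp and a random value
var i = 10000;
while (--i > 0) {
  db.logs.insert({createdAt: new Date(), value: Math.random() * i});
}
// Execute the script from the operating system command line
mongo <database> <jsfile>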
Create, Update, Delete
Create
// Create a single document
db.persons.insert({firstName: 'John', lastName: 'Doe', verified: false})
// Bulk insert documents
db.persons.insert([
  { firstName: 'Jack', lastName: 'Doe', verified: false },
  { firstName: 'Jill', lastName: 'Doe', verified: false }
])
Delete
// Remove all documents (an empty query matches everything)
db.persons.remove({})
// Insert sample documents to delete
var persons = [
  {firstName: 'John', lastName: 'Doe', isVerified: false },
  {firstName: 'Jill', lastName: 'Doe', isVerified: false },
  {firstName: 'John', lastName: 'Doe', isVerified: true },
  {firstName: 'Jill', lastName: 'Doe', isVerified: true }
]
db.persons.insert(persons)
// Remove using a query
db.persons.remove({isVerified: true })
// Drop the collection
db.persons.drop()
Update
// Replace a document: every field except _id is replaced by {age: 27}
db.persons.update({firstName: 'John'}, {age: 27})
db.persons.update({_id: ObjectId('<your _id here>')}, {age: 27})
// Update parts of a document
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$set: {age: 27}})
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$unset: {age: 1}})
// Update subdocuments (a dotted path must be quoted)
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$set: {'friends.0.name': 'Mira'}})
// What happens ?
// Replaces the first matching document with {n: 33}
db.persons.update({}, {n: 33})
// Fails: a multi update only works with $ operators such as $set
db.persons.update({}, {n: 33}, {multi: true})
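The difference between replacing a document and modifying parts of it can be modelled in plain JavaScript. The sketch below is an illustration only, not MongoDB's implementation: `applyUpdate` is a hypothetical function that handles top-level fields and just the $set, $unset and $inc operators.

```javascript
// Hypothetical model of how update() interprets its second argument.
function applyUpdate(doc, update) {
  const usesOperators = Object.keys(update).some(k => k.startsWith('$'));
  if (!usesOperators) {
    // Replacement: only _id survives, everything else comes from `update`.
    return Object.assign({ _id: doc._id }, update);
  }
  // Operator update: start from a copy and change only the named fields.
  const result = Object.assign({}, doc);
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete result[field];
  }
  for (const [field, n] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + n;
  }
  return result;
}

const john = { _id: 1, firstName: 'John', lastName: 'Doe' };
const replaced = applyUpdate(john, { age: 27 });           // names are lost
const modified = applyUpdate(john, { $set: { age: 27 } }); // names are kept
console.log(replaced);
console.log(modified);
```

In this model the first call leaves only `_id` and `age`, while the second keeps `firstName` and `lastName` and adds `age`.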
Modifiers
// $inc
db.persons.update({}, {$inc: {age: 1}}, {multi: true})
// Array Modifiers
// $push
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$push: {friends: "Jill Doe"}})
// $ne -- Don't push if the value is already present
db.persons.update({firstName: 'John', lastName: 'Doe', friends: {$ne: "Jill Doe"}}, {$push: {friends: "Jill Doe"}})
// $addToSet prevent duplicates
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$addToSet: {friends: "Jill Doe"}})
// Multiple values
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$addToSet: {friends: {$each: ["Jill Doe", "Jack Doe"]}}})
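The effect of $push versus $addToSet can be modelled on a plain JavaScript array. This is a simplified sketch with hypothetical helper names; it compares values with strict equality, which is enough for the string values used above.

```javascript
// $push: always appends, even if the value is already present.
const push = (arr, value) => arr.concat([value]);

// $addToSet: appends only when the value is not already in the array.
const addToSet = (arr, value) =>
  arr.includes(value) ? arr.slice() : arr.concat([value]);

// $addToSet with $each: applies the same rule to every value in turn.
const addEachToSet = (arr, values) => values.reduce(addToSet, arr);

const friends = ['Jill Doe'];
const pushed = push(friends, 'Jill Doe');
const added = addToSet(friends, 'Jill Doe');
const addedEach = addEachToSet(friends, ['Jill Doe', 'Jack Doe']);

console.log(pushed);    // [ 'Jill Doe', 'Jill Doe' ]
console.log(added);     // [ 'Jill Doe' ]
console.log(addedEach); // [ 'Jill Doe', 'Jack Doe' ]
```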
Update - Continued
// Array modifiers, continued
// $pop -- remove by position: 1 removes the last element, -1 the first
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$pop: {friends: 1}})
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$pop: {friends: -1}})
// $pull -- remove every element that matches the criteria
db.persons.update({firstName: 'John', lastName: 'Doe'}, {$pull: {friends: 'Jill Doe'}})
// Upsert -- insert the document when the query matches nothing
db.persons.update({firstName: 'Jeff', lastName: 'Doe'}, {firstName: 'Jeff', lastName: 'Doe', age: 32}, {upsert: true})
Import / Export
Command line tools
mongoexport --db myapp --collection persons --out persons_export.json
mongoimport --db myapp --collection persons --file persons.json
mongoimport --db myapp --drop --collection persons --file persons.json
Query
// Find All
db.persons.find({})
// Find using queries
db.persons.find({age: 27})
db.persons.find({age: 27, firstName: 'Joe'})
// Projections
db.persons.find({}, {firstName: 1, age: 1})
// Conditions
// $lt, $lte, $gt, $gte, $ne
db.persons.find({age: {$gte: 21, $lte: 30}})
db.persons.find({age: {$ne: 0}})
// OR Queries
// $in, $nin
db.persons.find({lastName: {$in: ['Doe', 'Miller']}})
db.persons.find({lastName: {$nin: ['Miller']}})
// $or
// Or with multiple keys
db.persons.find({$or: [{lastName: 'Doe'}, {age: {$gte: 10}}]})
// $not
db.persons.find({lastName: {$not: {$ne: 'Doe'}}})
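// Find nulls: {z: null} also matches documents where z is missing
db.persons.find({z: null})
// Match only documents where z exists and is null
db.persons.find({z: {$eq: null, $exists: true}})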
// Regular Expression
db.persons.find({firstName: /john/i})
// Arrays
db.persons.find({friends: 'Jill'})
// Arrays more than one match
db.persons.find({friends: {$all: ['Jill', 'Jim']}})
// Size
db.persons.find({friends: {$size: 3}})
// Slice -- a projection operator, so it goes in the second argument
db.persons.find({}, {friends: {$slice: 1}})
db.persons.find({}, {friends: {$slice: -1}})
// Querying Embedded documents
db.persons.find({'address.postCode': 600020})
// $where queries -- run JavaScript per document, so use them sparingly
db.persons.find({$where: function () { return this.firstName === 'Jack'}})
db.persons.find({$where: "this.firstName === 'Jack'"})
// Limits, skips and sorts
db.persons.find({}).limit(10)
db.persons.find({}).skip(10)
// Pagination
db.persons.find({}).limit(10).skip(10)
// Sorting
db.persons.find({}).sort({age: 1})
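For pagination, the skip and limit values follow from a page number and a page size. The helper below is a small sketch in plain JavaScript; `pageOptions` is a hypothetical name and pages are assumed to be numbered from 1.

```javascript
// Hypothetical helper: converts a 1-based page number into skip/limit values.
function pageOptions(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be an integer >= 1');
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError('pageSize must be an integer >= 1');
  }
  return { skip: (page - 1) * pageSize, limit: pageSize };
}

const secondPage = pageOptions(2, 10);
console.log(secondPage); // { skip: 10, limit: 10 }

// In the shell the values would be used as:
// db.persons.find({}).sort({age: 1}).skip(secondPage.skip).limit(secondPage.limit)
```

Sorting on a field before paginating keeps the pages stable between requests; large skip values still make the server walk past the skipped documents.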
Indexes
Generate data for indexes
// Node.js script; requires the chance package (npm install chance)
const fs = require('fs');
const Chance = require('chance');
const chance = new Chance();

function generateRandomPerson () {
  const gender = chance.gender();
  const first = chance.first({gender: gender.toLowerCase()});
  const last = chance.last();
  const dob = chance.date({year: chance.pickone([1981, 1983, 1985])});
  return {
    firstname: first,
    lastname: last,
    email: first + '.' + last + '@example.com',
    // Extended JSON, so that mongoimport stores dob as a date and not a string
    dob: {$date: dob.toISOString()}
  };
}

// Write one JSON document per line, the default format mongoimport expects
const stream = fs.createWriteStream('persons.json');
stream.once('open', function () {
  for (let i = 0; i < 5000; i++) {
    stream.write(JSON.stringify(generateRandomPerson()) + '\n');
  }
  stream.end();
});
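Index Operations
- ensureIndex (an older alias of createIndex, deprecated in newer versions)
- Unique index {unique: true}
- Re-indexing (reIndex)
- Building in the background {background: true}
- Geospatial indexes {gps: "2d"}, where gps is the indexed field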
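The options above could be applied to the generated persons data roughly as follows. This is a sketch under assumptions: the choice of indexed fields is illustrative, and the `typeof db` guard only exists so that the snippet does nothing when run outside the mongo shell.

```javascript
// Index definitions as plain data: key pattern plus options.
const indexSpecs = [
  // Compound index for lookups by last name, then first name
  { keys: { lastname: 1, firstname: 1 }, options: {} },
  // Built in the background so the build does not block other operations
  { keys: { dob: 1 }, options: { background: true } },
  // Unique index: creation fails if two existing documents share an email,
  // which can happen with randomly generated names
  { keys: { email: 1 }, options: { unique: true } }
];

// `db` and `printjson` only exist inside the mongo shell.
if (typeof db !== 'undefined') {
  indexSpecs.forEach(function (spec) {
    printjson(db.persons.createIndex(spec.keys, spec.options));
  });
}
```

Inside the shell, `db.persons.getIndexes()` lists what was created.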
Import data
// mongoimport
mongoimport --db myapp --drop --collection persons --file persons.json
Aggregation
Aggregation
- count
- distinct (runCommand / db.collection.distinct)
- group (removed in MongoDB 4.2; prefer aggregate)
- aggregate
- map reduce (deprecated in MongoDB 5.0; prefer aggregate)
// Count persons per first name
db.persons.group(
  {
    key: { firstname: 1 },
    reduce: function( curr, result ) {
      result.total += 1;
    },
    initial: { total : 0 }
  }
)
// Group by a computed key (birth year + first name) for persons born after 1980
db.persons.group({
  keyf: function(doc) {
    return { year: doc.dob.getFullYear() + '_' + doc.firstname };
  },
  cond: { dob: { $gt: new Date('1980/01/01') } },
  reduce: function( curr, result ) {
    result.total += 1;
    result.count++;
  },
  initial: { total : 0, count: 0 }
})
// Aggregation pipeline: filter, reshape, then count per year and first name
db.persons.aggregate([
  {$match: { dob: {$gte: new Date('1985/01/01')}}},
  {$project: { year: {$year: "$dob"}, firstname: "$firstname"}},
  {$group: {_id: {"year": "$year", firstname: "$firstname"}, count: {$sum: 1}}}
])
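The three pipeline stages map closely onto ordinary array operations. The sketch below replays $match, $project and $group on a small in-memory array in plain JavaScript; the sample data is made up, and the code is an analogy rather than how the server evaluates a pipeline.

```javascript
// Made-up sample documents
const persons = [
  { firstname: 'John', dob: new Date('1985-03-01T00:00:00Z') },
  { firstname: 'John', dob: new Date('1985-07-15T00:00:00Z') },
  { firstname: 'Jill', dob: new Date('1985-02-10T00:00:00Z') },
  { firstname: 'Jack', dob: new Date('1981-05-05T00:00:00Z') }
];

const cutoff = new Date('1985-01-01T00:00:00Z');
const counts = {};

persons
  // $match: keep persons born on or after the cutoff
  .filter(p => p.dob >= cutoff)
  // $project: keep only the birth year and the first name
  .map(p => ({ year: p.dob.getUTCFullYear(), firstname: p.firstname }))
  // $group: count documents per (year, firstname) pair
  .forEach(p => {
    const key = p.year + '|' + p.firstname;
    counts[key] = (counts[key] || 0) + 1;
  });

console.log(counts); // { '1985|John': 2, '1985|Jill': 1 }
```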
// Map reduce: count persons per last name
var mapper = function () {
  emit(this.lastname, 1)
}
var reducer = function (name, instances) {
  return Array.sum(instances)
}
// Results are written to the mr_bylastname collection
db.persons.mapReduce(
  mapper,
  reducer,
  {
    out: "mr_bylastname"
  }
)
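What map reduce does with the mapper and reducer can be modelled in a few lines of plain JavaScript. This is a simplified sketch with made-up data: the hypothetical `mapReduce` function passes `emit` as an argument instead of providing it as a global, and it runs everything in one pass, whereas the server may call the reducer several times per key.

```javascript
// Hypothetical in-memory model of map reduce.
function mapReduce(docs, mapper, reducer) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // Map phase: the mapper sees each document as `this`
  docs.forEach(doc => mapper.call(doc, emit));
  // Reduce phase: keys with a single value skip the reducer
  const results = [];
  for (const [key, values] of emitted) {
    const value = values.length === 1 ? values[0] : reducer(key, values);
    results.push({ _id: key, value: value });
  }
  return results;
}

const people = [{ lastname: 'Doe' }, { lastname: 'Doe' }, { lastname: 'Miller' }];
const byLastname = mapReduce(
  people,
  function (emit) { emit(this.lastname, 1); },
  function (name, instances) { return instances.reduce((a, b) => a + b, 0); }
);

console.log(byLastname);
// [ { _id: 'Doe', value: 2 }, { _id: 'Miller', value: 1 } ]
```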
Advanced Topics
Advanced
- Administration
- Replication
- Sharding
MongoDB
By Hari Narasimhan