MongoDB
South America road trip
Matias Cascallares
April 2014
Agenda
- What is MongoDB?
- When should I use MongoDB?
- Genius Bar session
- What's new in MongoDB 2.6?
Who am I?
- Matias Cascallares
- Solutions Architect @ MongoDB Inc based in Singapore
- Software Engineer, University of Buenos Aires
- Experience mostly in web environments
- In my toolbox I have Java, Python and Node.js
Where have I been working?
What is MongoDB?
Open source Database
https://www.flickr.com/photos/dasprid/8148007408/
Written in C++
positioning
Full featured
- Flexible/Dynamic schema
- Ad Hoc queries
- Real time aggregation
- Rich query capabilities
- Strongly consistent
- Geospatial features
- Support for most programming languages
Object semantics
var mybeer = {
  name: "Lagunitas",
  type: "Indian Pale Ale",
  barrels: 106000,
  alcohol: 5.7,
  manufacturer: {
    name: "Lagunitas Brewing Company",
    address: "1280 N McDowell Blvd, Petaluma, CA 94954, US"
  },
  tasting_notes: ["sweet", "fruit"]
};
db.beers.insert(mybeer);
Object semantics
// single condition
db.beers.find( { "name": "Lagunitas" } );
// AND condition
db.beers.find( { "type": "Indian Pale Ale", alcohol: { "$gte": 5 } } );
// OR condition
db.beers.find( { "$or": [
  { "type": "Indian Pale Ale" },
  { "alcohol": { "$gte": 5 } }
]});
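To make the matching rules behind these queries concrete, here is a minimal sketch in plain JavaScript that runs in Node.js without a server. The `matches` helper is hypothetical and not part of MongoDB. It only understands equality, `$gte` and `$or` on top-level fields, and the two extra beers are sample data added for the illustration.

```javascript
// Hypothetical helper, not a MongoDB API: evaluates a tiny subset of the
// query language (equality, $gte, $or) against one in-memory document.
function matches(doc, query) {
  return Object.keys(query).every(function (key) {
    var cond = query[key];
    if (key === "$or") {
      // $or: at least one of the sub-queries must match
      return cond.some(function (sub) { return matches(doc, sub); });
    }
    if (cond !== null && typeof cond === "object" && "$gte" in cond) {
      return doc[key] >= cond["$gte"];
    }
    // plain value: equality
    return doc[key] === cond;
  });
}

// Sample data (assumed for this sketch)
var beers = [
  { name: "Lagunitas", type: "Indian Pale Ale", alcohol: 5.7 },
  { name: "Leffe", type: "Abbey Ale", alcohol: 6.5 },
  { name: "London Pride", type: "Bitter", alcohol: 4.7 }
];

// Several keys in one query document are combined with AND
var andResult = beers.filter(function (b) {
  return matches(b, { type: "Indian Pale Ale", alcohol: { "$gte": 5 } });
});

// $or keeps documents that satisfy any of the listed conditions
var orResult = beers.filter(function (b) {
  return matches(b, { "$or": [
    { type: "Indian Pale Ale" },
    { alcohol: { "$gte": 5 } }
  ]});
});

console.log(andResult.map(function (b) { return b.name; }));
console.log(orResult.map(function (b) { return b.name; }));
```

With this sample data the AND query keeps only Lagunitas, while the OR query also keeps Leffe because its alcohol value passes the `$gte` test.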
Highly available
SCALABLE
When should I use MongoDB?
high volume data feeds
- Social Media
- Machine to Machine
- High Frequency Trading
operational intelligence
- Ad Targeting
- Monitoring
- Ticking Database
content management
- Product Catalogue
- Mobile apps
- Biometric
- Data Aggregation
MONGODB 2.6
What's new?
main improvement areas
- Operations
- Text Search
- Query System Improvements
- Security
- MMS and Automation
Operations
DEVOPS, DEVOPS, DEVOPS!!
new wire protocol
Bulk Writes - ORDERED
var bulk = db.beers.initializeOrderedBulkOp();
// insert three new beers
bulk.insert( { name: "Lagunitas", alcohol: 5.7 } );
bulk.insert( { name: "Leffe", alcohol: 6.5 } );
bulk.insert( { name: "London Pride", alcohol: 4.7 } );
bulk.execute();
Bulk Writes - UNORDERED
var bulk = db.beers.initializeUnorderedBulkOp();
// update one beer
bulk.find( { name: "Corona" } ).update( { $set: { alcohol: 4.3 } } );
// .. and delete another one
bulk.find( { name: "Chimay" } ).remove();
// execute everything as a single bulk operation
bulk.execute();
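The practical difference between the two modes shows up when one operation fails. The sketch below simulates it in plain JavaScript, assuming the documented behaviour that an ordered bulk stops at the first error while an unordered bulk still attempts the remaining operations. `runBulk`, `makeOps` and the duplicate-name check are hypothetical stand-ins; no server is involved, and a real unordered bulk may also run its operations in a different order.

```javascript
// Hypothetical simulation: each "operation" is a function that either
// returns normally or throws.
function runBulk(ops, ordered) {
  var result = { executed: 0, errors: [] };
  for (var i = 0; i < ops.length; i++) {
    try {
      ops[i]();
      result.executed++;
    } catch (e) {
      result.errors.push({ index: i, message: e.message });
      if (ordered) { break; } // ordered: stop at the first failure
      // unordered: record the error and keep going
    }
  }
  return result;
}

// Builds three inserts against a fresh in-memory "collection" in which
// the name must be unique, so the second insert fails.
function makeOps() {
  var seen = {};
  function insertBeer(name) {
    return function () {
      if (seen[name]) { throw new Error("duplicate key: " + name); }
      seen[name] = true;
    };
  }
  return [insertBeer("Lagunitas"), insertBeer("Lagunitas"), insertBeer("Leffe")];
}

var orderedResult = runBulk(makeOps(), true);
var unorderedResult = runBulk(makeOps(), false);

console.log(orderedResult.executed, orderedResult.errors.length);
console.log(unorderedResult.executed, unorderedResult.errors.length);
```

In the ordered run the third insert is never attempted. In the unordered run it still goes through, and the error is reported alongside the successful writes.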
Max time per query
// expensive query with regex without anchor
db.articles.find( { "description": /August [0-9]+ 1969/ } ).maxTimeMS(30000)
// it also works with aggregation framework!
db.articles.aggregate([ {
  "$match": {
    "$text": {
      "$search": "chien",
      "$language": "fr"
    }
  }
}]).maxTimeMS(100);
Building an index
- Foreground
- Background
building an index on the foreground
- Faster
- More compact
- Blocking operation
// foreground by default
db.beers.ensureIndex( { name: 1} );
Building an index on the background
- Slower
- Sparser
- Non-blocking operation
// background is an optional argument
db.beers.ensureIndex( { name: 1}, { background: true } );
index build in 2.4
FG in primary -> FG in secondary
BG in primary -> FG in secondary
index build in 2.6
FG in primary -> FG in secondary
BG in primary -> BG in secondary
Storage allocation
- usePowerOf2Sizes is the default allocation strategy for new collections
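The idea behind usePowerOf2Sizes is that every record gets a slot whose size is a power of two, so a document has room to grow and a freed slot is easier to reuse. A minimal sketch of the rounding follows. The `powerOf2Size` helper and its 32-byte minimum are assumptions made for this illustration, and any special handling of very large documents is left out.

```javascript
// Hypothetical helper: rounds a record size (in bytes) up to the next
// power of two. The 32-byte floor is an assumption of this sketch.
function powerOf2Size(bytes) {
  var size = 32;
  while (size < bytes) {
    size *= 2;
  }
  return size;
}

var sizes = [20, 100, 129, 1024].map(powerOf2Size);
console.log(sizes); // [ 32, 128, 256, 1024 ]
```

A 100-byte document therefore has 28 spare bytes to grow into before it needs to move.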
Sharding new commands
- mergeChunks
- cleanupOrphaned
Text search - GA
...Text Search was there in 2.4
Integrated within find
db.articles.ensureIndex( { body: "text" } );
db.articles.insert(
{ body: "the quick brown fox jumped over the lazy dog" }
);
db.articles.find( { "$text" : { $search : "quickly" } } );
Integrated within the Aggregation Framework
db.articles.aggregate([
{ "$match": { "$text": { $search: "bRoWN"} } }
]);
playing with texts
// search for a single word
db.articles.find( { "$text": { "$search": "coffee" } } );
// search for any of these words
db.articles.find( { "$text": { "$search": "bake coffee cake" } } );
// search for a phrase
db.articles.find( { "$text": { "$search": "\"coffee cake\"" } } );
// excluding some terms
db.articles.find( { "$text": { "$search": "bake coffee -cake" } } );
Top 3 relevant documents
db.articles.find(
{ "$text": { "$search": "cake" } },
{ "score": { "$meta": "textScore" } }
).sort( { "score": { "$meta": "textScore" } } ).limit(3)
language support
db.articles.find(
{ "$text": { "$search": "leche", $language: "es" } }
);
Query system
improvements
Aggregation Framework
- Returns a cursor
- Can output to a collection
- New operators
- explain()
Aggregation Framework
// using a cursor
db.beers.aggregate(
  [
    { "$match": { barrels: { "$gte": 10000 } } }
  ],
  { cursor: { batchSize: 1 } }
);
// output to a collection
db.beers.aggregate([
  { "$match": { barrels: { "$gte": 10000 } } },
  { "$out": "my_output_collection" }
]);
Aggregation Framework - explain
db.beers.aggregate(
[
{ "$match": { barrels: { "$gte": 500 } } },
{ "$group": { "_id": "$type", count: { "$sum":1 } } }
],
{ explain: true }
);
aggregation framework - explain
{
  "stages" : [ // one entry per pipeline stage
    {
      "$cursor" : {
        "query" : { "barrels" : { "$gte" : 500 } },
        "fields" : { "type" : 1, "_id" : 0 },
        "plan" : {
          "cursor" : "BtreeCursor barrels_1",
          "isMultiKey" : false,
          "scanAndOrder" : false,
          "indexBounds" : { "barrels" : [ [500, Infinity] ] }
        }
      }
    }
    // ... remaining stages omitted
  ],
  "ok" : 1
}
update operators - mul
db.beers.insert(
{ _id: 1, name: "Lagunitas", price: 10.99 }
);
// to increase the price by 20%
db.beers.update(
{ _id: 1 },
{ "$mul" : { price: 1.2 } }
);
UPDATE operators - bit
db.beers.update(
{_id: 1},
{ "$bit": { mask: { and: NumberInt(10) } } }
);
db.beers.update(
{_id: 1},
{ "$bit": { mask: { or: NumberInt(10) } } }
);
db.beers.update(
{_id: 1},
{ "$bit": { mask: { xor: NumberInt(10) } } }
);
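The three operators apply ordinary bitwise arithmetic to the stored integer. The same operations in plain JavaScript look like this, where 13 is an assumed starting value for the `mask` field, chosen only for the illustration.

```javascript
// 13 (binary 1101) is an assumed stored value; 10 (binary 1010) is the
// operand used in the updates above.
var stored = 13;
var operand = 10;

var andResult = stored & operand; // 1101 & 1010 = 1000
var orResult = stored | operand;  // 1101 | 1010 = 1111
var xorResult = stored ^ operand; // 1101 ^ 1010 = 0111

console.log(andResult, orResult, xorResult); // 8 15 7
```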
update operators - min/max
db.scores.insert(
{ _id: 1, low_score: 200, high_score: 400 }
);
db.scores.update(
{ _id: 1 },
{ "$min": { low_score: 250 } }
);
db.scores.update(
{ _id: 1 },
{ "$max": { high_score: 450 } }
);
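Both operators only write when the comparison succeeds: `$min` when the new value is lower than the stored one, `$max` when it is higher. A small sketch of that rule follows. `applyMin` and `applyMax` are hypothetical helpers that handle existing numeric fields only, and the last call uses a value of 150 that is not in the example above.

```javascript
// Hypothetical helpers mimicking the comparison rule on numeric fields.
function applyMin(doc, field, value) {
  if (value < doc[field]) { doc[field] = value; }
  return doc;
}

function applyMax(doc, field, value) {
  if (value > doc[field]) { doc[field] = value; }
  return doc;
}

var score = { _id: 1, low_score: 200, high_score: 400 };

applyMin(score, "low_score", 250);  // 250 is not lower than 200: unchanged
applyMax(score, "high_score", 450); // 450 is higher than 400: updated
var afterSlideUpdates = { low: score.low_score, high: score.high_score };

applyMin(score, "low_score", 150);  // 150 is lower than 200: updated
console.log(afterSlideUpdates, score);
```

Note that with the values shown in the example, the `$min` update leaves `low_score` at 200. Only the `$max` update changes the document.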
Index intersection
// index creation
db.beers.ensureIndex( { barrels: 1 } );
db.beers.ensureIndex( { alcohol: 1 } );
// retrieval
db.beers.find( { barrels: { "$gte": 100 }, alcohol: { "$gte": 5.5 } } );
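Conceptually, each single-field index produces the ids of the documents that satisfy its own condition, and the query result is the intersection of the two lists. The sketch below illustrates that idea only. It is a simplification, not MongoDB's actual execution plan, and `scan`, `intersect` and the sample documents are assumptions.

```javascript
// Sample data (assumed for this sketch)
var beers = [
  { id: 1, barrels: 50, alcohol: 6.0 },
  { id: 2, barrels: 500, alcohol: 4.5 },
  { id: 3, barrels: 106000, alcohol: 5.7 },
  { id: 4, barrels: 300, alcohol: 6.5 }
];

// Stand-in for an index range scan: ids of documents whose field is
// greater than or equal to min, in ascending id order.
function scan(field, min) {
  return beers
    .filter(function (b) { return b[field] >= min; })
    .map(function (b) { return b.id; })
    .sort(function (x, y) { return x - y; });
}

// Merge-style intersection of two ascending id lists.
function intersect(a, b) {
  var out = [], i = 0, j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) { out.push(a[i]); i++; j++; }
    else if (a[i] < b[j]) { i++; }
    else { j++; }
  }
  return out;
}

var byBarrels = scan("barrels", 100); // ids 2, 3, 4
var byAlcohol = scan("alcohol", 5.5); // ids 1, 3, 4
var result = intersect(byBarrels, byAlcohol);
console.log(result); // [ 3, 4 ]
```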
index intersection
- Less Index Maintenance
- Smaller Working Set
- Lower Write Overhead
- More Adaptive
query optimizer - new concepts
- Query Shape
- Plan Cache
what is a query shape?
db.beers.find(
{ barrels: { "$gte": 300 } },
{ _id: -1, name: 1, barrels: 1}
).sort( { alcohol: -1 } );
db.runCommand( { planCacheListQueryShapes: "beers"});
{
"shapes" : [
{
"query" : { barrels: { "$gte": 300 } },
"sort" : { alcohol: -1 },
"projection" : { _id: -1, name: 1, barrels: 1}
}
]
}
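A query shape groups queries that differ only in the values they compare against. The sketch below shows one way to picture that: predicate values are replaced by a placeholder, while field names, operators, sort and projection are kept. `abstractValues` and `shapeKey` are hypothetical helpers. The real plan cache does not necessarily compute its keys this way.

```javascript
// Hypothetical helper: replaces every leaf value in the predicate with "?"
// while keeping field names and operators.
function abstractValues(pred) {
  if (pred === null || typeof pred !== "object") { return "?"; }
  if (Array.isArray(pred)) { return pred.map(abstractValues); }
  var out = {};
  Object.keys(pred).forEach(function (k) { out[k] = abstractValues(pred[k]); });
  return out;
}

// Hypothetical shape key: predicate structure plus sort and projection.
function shapeKey(query, sort, projection) {
  return JSON.stringify({
    query: abstractValues(query),
    sort: sort,
    projection: projection
  });
}

var a = shapeKey({ barrels: { "$gte": 300 } }, { alcohol: -1 }, { name: 1 });
var b = shapeKey({ barrels: { "$gte": 9000 } }, { alcohol: -1 }, { name: 1 });
var c = shapeKey({ barrels: { "$gte": 300 } }, { alcohol: 1 }, { name: 1 });

console.log(a === b); // true: same shape, different value
console.log(a === c); // false: the sort differs
```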
... let's set an index filter
db.runCommand({
  "planCacheSetFilter": "beers",
  "query" : { barrels: { "$gte": 300 } },
  "sort" : { alcohol: -1 },
  "projection" : { _id: -1, name: 1, barrels: 1},
  "indexes": [
    { barrels: 1 },
    { alcohol: -1, barrels: 1 }
  ]
});
what do i get?
- Better control on which indexes are going to be evaluated
- Ability to predefine a set of candidate indexes
- ... but it is still an empirical query optimizer
Geospatial enhancements
- Added support for multipart geometries
- MultiPoint
- MultiLineString
- MultiPolygon
- GeometryCollection
Security
Authentication
- LDAP (Enterprise)
- x509
Authorization
- User defined roles
creating new roles
db.createRole({
  role: "MMSMonitoringRole",
  privileges: [], // required field, may be empty
  roles: ["clusterAdmin", "readAnyDatabase"]
});
db.createRole({
  role: "MMSBackupRole",
  privileges: [], // required field, may be empty
  roles: ["clusterAdmin", "readAnyDatabase", "userAdminAnyDatabase"]
});
using my new roles
// db.createUser() replaces the deprecated db.addUser() in 2.6
db.createUser({
  "user": "mms-monitoring",
  "pwd": "abcd1234",
  "roles": [
    "MMSMonitoringRole"
  ]
});
db.createUser({
  "user": "mms-backup",
  "pwd": "efgh5678",
  "roles": [
    "MMSBackupRole"
  ]
});
creating custom privileges
// run this against the myApp database
db.createRole({
  "role": "appUser",
  "privileges": [
    {
      "resource": { "db": "myApp", "collection": "" },
      "actions": [ "find", "dbStats", "collStats" ]
    },
    {
      "resource": { "db": "myApp", "collection": "beers" },
      "actions": [ "insert" ]
    }
  ],
  "roles": [] // required field, empty because no roles are inherited
});
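To see what the privileges in this role add up to, here is a small sketch of an access check in plain JavaScript. `isAllowed` is a hypothetical helper, not MongoDB's implementation. It assumes that an empty `collection` string in a resource stands for every collection in that database.

```javascript
// The privileges from the appUser role above, as a plain object.
var appUser = {
  privileges: [
    {
      resource: { db: "myApp", collection: "" },
      actions: ["find", "dbStats", "collStats"]
    },
    {
      resource: { db: "myApp", collection: "beers" },
      actions: ["insert"]
    }
  ]
};

// Hypothetical check: a privilege applies when the database matches, the
// collection matches (or the resource names no specific collection), and
// the action is listed.
function isAllowed(role, db, collection, action) {
  return role.privileges.some(function (p) {
    var sameDb = p.resource.db === db;
    var sameColl = p.resource.collection === "" ||
                   p.resource.collection === collection;
    return sameDb && sameColl && p.actions.indexOf(action) !== -1;
  });
}

console.log(isAllowed(appUser, "myApp", "beers", "insert"));     // true
console.log(isAllowed(appUser, "myApp", "breweries", "find"));   // true
console.log(isAllowed(appUser, "myApp", "breweries", "insert")); // false
console.log(isAllowed(appUser, "other", "beers", "find"));       // false
```

The collection name `breweries` and the database name `other` are made up for the example.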
Auditing
- Schema actions
- Replica Set actions
- Authentication & Authorization actions
- Other actions
output
mongod --dbpath data/db --auditDestination syslog
- syslog
- console
- JSON/BSON file
dropping a collection
var auditEntry = {
  "atype" : "dropCollection",
  "ts" : { "$date" : "2014-04-08T16:48:34.333+1000" },
  "local" : { "ip" : "127.0.0.1", "port" : 27017 },
  "remote" : { "ip" : "127.0.0.1", "port" : 55771 },
  "users" : [
    { "user": "matias", "db": "test" }
  ],
  "param" : { "ns" : "test.beers" },
  "result" : 0
};
shutting down the server
var auditEntry = {
  "atype" : "shutdown",
  "ts" : { "$date" : "2014-04-08T16:54:24.373+1000" },
  "local" : { "ip" : "127.0.0.1", "port" : 27017 },
  "remote" : { "ip" : "127.0.0.1", "port" : 55771 },
  "users" : [
    { "user": "matias", "db": "admin" }
  ],
  "param" : {},
  "result" : 0
};
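Because audit entries are plain JSON, they are easy to post-process. The sketch below filters entries by their `atype` field, assuming the audit output holds one JSON document per line. `byType` is a hypothetical helper, and the two lines are trimmed copies of the entries shown above.

```javascript
// Two audit lines, trimmed to the fields used in this sketch.
var lines = [
  '{ "atype": "dropCollection", "users": [ { "user": "matias", "db": "test" } ], "param": { "ns": "test.beers" }, "result": 0 }',
  '{ "atype": "shutdown", "users": [ { "user": "matias", "db": "admin" } ], "param": {}, "result": 0 }'
];

// Hypothetical helper: parses each line and keeps entries of one type.
function byType(auditLines, atype) {
  return auditLines
    .map(function (line) { return JSON.parse(line); })
    .filter(function (entry) { return entry.atype === atype; });
}

var drops = byType(lines, "dropCollection");
drops.forEach(function (entry) {
  console.log(entry.users[0].user + " dropped " + entry.param.ns);
});
```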
MMS & automation
MMS
- Monitoring and backup service
- Cloud-based and on-premise
- Easy to set up
cloud numbers
- Monitoring: 75K updates/sec
- Backup: 100 GB/hr of new data
monitoring
alerts
BACKUP
- Backup a replica set or sharded cluster
- Initial sync + incremental
- Snapshots generated every 6 hours
- Restore via HTTPS or SCP
- Restore replica sets to any point in time (last 24 hours)
- Restore sharded clusters to any 15-minute interval (last 24 hours)
automation
- Provision, create, upgrade and maintain MongoDB deployments
- Hide complex stuff, just use your browser
- Initial supported platforms: AWS and OpenStack
- Alpha/Beta stage
what can i do?
- Create your deployment (replica set or sharded)
- Add/remove shards and replica sets
- Resize oplog
- Specify users and roles
- Provision new machines (AWS only)
provisioning new servers
creating your replica set
creating your cluster
50 shards, one click
See it in action...
https://www.youtube.com/watch?v=nSJiVXNsPHk&feature=youtu.be
THE END
Questions?
http://slid.es/mcascallares/mongodb-sa-road-trip
MongoDB - South America Road Trip - Buenos Aires
By Matias Cascallares