Changing for MongoDB
Thierry Delprat
tdelprat@nuxeo.com
https://github.com/tiry/
Switching to MongoDB
What does the switch to MongoDB change?
For developers & architects
For ops and users of the product
Some Context
We provide a platform that developers can use to build highly customized content applications
We provide components, and the tools to assemble them
Everything we do is open source
https://github.com/nuxeo
Nuxeo Platform & Storage
Content Repository
Storage
Simplify software architecture
Offer easy scalability options
Small impact on development
Simplify Architecture
Making work easier for Ops & Architects
IMPEDANCE ISSUE
Object
Coding & Maintenance Impact
No Lazy Loading
No Cache / No Invalidations
A lot of complexity and problems avoided!
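The impedance issue can be illustrated with a minimal sketch. It is not Nuxeo code: the table names, fields and lookup counter are hypothetical, and plain Python dictionaries stand in for the backends. It only shows why an object spread over several tables needs more backend calls than an object stored as one document.

```python
# Relational-style layout: one logical document is split across several tables.
# Table and column names are hypothetical.
hierarchy_table = {"doc1": {"parent": "root", "name": "contract"}}
properties_table = {"doc1": {"title": "Contract", "creator": "alice"}}
files_table = [("doc1", 0, "scan.pdf"), ("doc1", 1, "annex.pdf")]


def load_from_tables(doc_id):
    """Rebuild the object from three tables, counting backend lookups."""
    lookups = 0
    doc = {"id": doc_id}
    doc.update(hierarchy_table[doc_id])
    lookups += 1
    doc.update(properties_table[doc_id])
    lookups += 1
    rows = sorted(row for row in files_table if row[0] == doc_id)
    doc["files"] = [name for _, _, name in rows]
    lookups += 1
    return doc, lookups


# Document-style layout: the same object is stored as a single nested structure.
documents = {
    "doc1": {
        "id": "doc1",
        "parent": "root",
        "name": "contract",
        "title": "Contract",
        "creator": "alice",
        "files": ["scan.pdf", "annex.pdf"],
    }
}


def load_from_documents(doc_id):
    """Fetch the whole object in one lookup."""
    return documents[doc_id], 1


table_doc, table_lookups = load_from_tables("doc1")
doc_doc, doc_lookups = load_from_documents("doc1")
print(table_lookups, doc_lookups)  # prints "3 1"
```

Both loaders return the same object. With the table layout, an ORM typically hides the extra lookups behind lazy loading and a cache, which is the complexity the slide refers to.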
Example: Impact on Nuxeo deployment
Simplify deployment architecture
Hybrid Storage
Complex structures (schema) - R/W - Synchronous: Document properties and hierarchy
Large Streams - Large Storage: attached Blobs
Flexible Schema - Write Once/Read Many: Audit log, Activity log
Flexible Schema - Search: Search index
Hybrid Storage
GridFS
CONSOLIDATED Storage
Single Consolidated Storage
Structure, Blobs, Audit & Index
Fewer building blocks to provision & configure
Easier to deploy
EASIER to Deploy a Robust Architecture
"built-in" - data redundancy & fault tolerance
active
active
Simplicity?
No ORM Hell
Single storage
Out-of-the-box robust deployment
Scalability
Avoid headaches at deployment time
Improve end-user experience
Will I Be Faster with MongoDB?
Built for SPEED
No Impedance issue
fewer backend calls
no invalidation cost
Document level locking
no table level concurrency
Native distributed architecture
Easy scale-out of reads
SPEED
Significant RAW Speed improvements for all use cases
More importantly: some use cases are much better handled
https://benchmarks.nuxeo.com/continuous/index.html
More than RAW Performances
Handle more concurrent connections
No Cache
Less memory per Connection
Can handle more connections
Can handle more concurrent Users
MORE THAN RAW PERFORMANCES
Read & Write Operations are competing
Write Operations are not blocked
C4.xlarge (nuxeo)
C4.2Xlarge (DB)
SQL
WRITEs are not blocked by READs
More than RAW Performances
Processing on large Objects sets is challenging with ORM
No side effects of impedance mismatch
Sample batch on 100,000 documents
750 documents/s with SQL backend (cold cache)
11,500 documents/s with MongoDB / WiredTiger: x15
lazy loading
cache thrashing
Will I SCALE BETTER with MongoDB?
Scalability options
Scale out READs: leverage ReplicaSets (read from secondaries)
Scale out WRITEs: leverage sharding (spread writes)
No impact at application level!
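One way to see why reading from secondaries need not touch application code: with MongoDB, the read preference can be set in the connection string. The sketch below only builds such a URI. `replicaSet` and `readPreference` are standard MongoDB connection string options, while the host names, the replica set name `rs0` and the helper `build_mongo_uri` are assumptions made for illustration. How Nuxeo itself passes this setting is not shown here.

```python
from urllib.parse import urlencode


def build_mongo_uri(hosts, replica_set, read_preference="secondaryPreferred"):
    """Build a MongoDB connection URI that targets a replica set.

    The read preference is carried by the URI, so it is a deployment
    setting rather than something the application code has to know about.
    """
    options = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return "mongodb://" + ",".join(hosts) + "/?" + options


# Hypothetical hosts and replica set name.
uri = build_mongo_uri(["db1.example.com:27017", "db2.example.com:27017"], "rs0")
print(uri)
```

Switching between reading from the primary and reading from secondaries then comes down to changing one option value at deployment time.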
Scale out Test
Use massive read operations and queries.
1 Nuxeo node + 1 MongoDB node: 1900 docs/s
MongoDB CPU is the bottleneck (800%)
2 Nuxeo nodes + 1 MongoDB node: 1850 docs/s
MongoDB CPU is the bottleneck (800%)
2 Nuxeo nodes + 2 MongoDB nodes: 3400 docs/s (using read preferences)
SHARDING TEST
Use bulk import.
2 Nuxeo nodes + 1 MongoDB ReplicaSet: 11,000 docs/s
2 Nuxeo nodes + 3 MongoDB Sharded ReplicaSet: 27,400 docs/s
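The figures from the scale out and sharding tests can be turned into speedup and scaling-efficiency ratios. The sketch below uses only the numbers quoted on the slides. The efficiency figure divides the speedup by the number of MongoDB nodes or shards, and it assumes that "3 MongoDB Sharded ReplicaSet" means three shards.

```python
def speedup(baseline, improved):
    """Ratio between two throughputs expressed in documents per second."""
    return improved / baseline


# Read scale out: 2 Nuxeo nodes, going from 1 to 2 MongoDB nodes.
read_speedup = speedup(1_850, 3_400)
read_efficiency = read_speedup / 2

# Write scale out: 2 Nuxeo nodes, going from 1 replica set to 3 shards (assumed).
write_speedup = speedup(11_000, 27_400)
write_efficiency = write_speedup / 3

print(f"reads:  x{read_speedup:.2f} ({read_efficiency:.0%} of linear)")
print(f"writes: x{write_speedup:.2f} ({write_efficiency:.0%} of linear)")
# prints:
# reads:  x1.84 (92% of linear)
# writes: x2.49 (83% of linear)
```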
DEVELOPMENT IMPACT
Changes from a development point of view
New Storage Model
Document level transactions
No MVCC isolation
Provide shared mitigation policies for critical use cases
Different transaction paradigm
Consistency in our Context
Atomic Document Operations are safe
Large batch updates cannot be atomic
Find a way to mitigate application level impact
Transactions cannot span multiple documents
Multi-document transactions can be problematic
Workflows or custom event handlers
Ensuring consistency
Transient State Manager
Run all operations in Memory
Populate an Undo Log
- Recover application-level transaction management
- Commit / Rollback model
- "Read uncommitted" isolation
- Need to flush transient state for queries
- "Uncommitted" changes are visible to others
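The transient state and undo log idea can be sketched as follows. This is not Nuxeo's implementation: the class name `TransientState`, its methods, and the plain dictionary standing in for MongoDB are all assumptions chosen to show the mechanism. Writes stay in memory, a flush pushes them to the shared store while recording the previous values, and a rollback replays that undo log in reverse.

```python
import copy

_MISSING = object()  # marks "document did not exist before the write"


class TransientState:
    """Hypothetical application-level transaction on top of a document store."""

    def __init__(self, store):
        self.store = store   # shared backend; a dict stands in for MongoDB
        self.pending = {}    # doc_id -> new state, held in memory only
        self.undo_log = []   # (doc_id, previous state) for every flushed write

    def write(self, doc_id, doc):
        self.pending[doc_id] = copy.deepcopy(doc)

    def flush(self):
        """Push pending writes to the store, recording how to undo them."""
        for doc_id, doc in self.pending.items():
            if doc_id in self.store:
                previous = copy.deepcopy(self.store[doc_id])
            else:
                previous = _MISSING
            self.undo_log.append((doc_id, previous))
            self.store[doc_id] = doc
        self.pending.clear()

    def query(self, predicate):
        """Queries run against the store, so transient state is flushed first."""
        self.flush()
        return sorted(d for d, doc in self.store.items() if predicate(doc))

    def commit(self):
        self.flush()
        self.undo_log.clear()

    def rollback(self):
        """Drop pending writes and undo flushed ones, newest first."""
        self.pending.clear()
        for doc_id, previous in reversed(self.undo_log):
            if previous is _MISSING:
                self.store.pop(doc_id, None)
            else:
                self.store[doc_id] = previous
        self.undo_log.clear()


store = {"doc1": {"title": "Draft"}}
tx = TransientState(store)
tx.write("doc1", {"title": "Final"})
tx.write("doc2", {"title": "New"})
visible_before_flush = "doc2" in store                  # False: still in memory
found = tx.query(lambda doc: doc["title"] == "New")     # flushes, then queries
visible_after_flush = "doc2" in store                   # True: "read uncommitted"
tx.rollback()
```

After the rollback the store is back to its initial content. Between the flush and the rollback, other readers of the store could see the uncommitted changes, which is the trade-off the slide describes.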
Inertia
New Model
New API
New Query system
Provide an easy migration path
Nuxeo Approach
High level API + Encapsulation
Storage Adapters
DOCUMENT REPOSITORY
Helps transition between storages
No impact at application level
Can be a deployment-time choice
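The "High level API + Encapsulation" approach can be sketched with a small adapter pattern. All names here (`StorageAdapter`, `RowStorageAdapter`, `DocumentStorageAdapter`, `DocumentRepository`, the `ADAPTERS` registry) are hypothetical and the backends are in-memory stand-ins. The point is only that application code talks to the repository API and never to the storage directly.

```python
from abc import ABC, abstractmethod


class StorageAdapter(ABC):
    """Storage contract that the repository relies on."""

    @abstractmethod
    def save(self, doc_id, properties): ...

    @abstractmethod
    def load(self, doc_id): ...


class RowStorageAdapter(StorageAdapter):
    """SQL-like stand-in: one (doc_id, name, value) row per property."""

    def __init__(self):
        self.rows = []

    def save(self, doc_id, properties):
        self.rows = [row for row in self.rows if row[0] != doc_id]
        self.rows.extend((doc_id, name, value) for name, value in properties.items())

    def load(self, doc_id):
        return {name: value for d, name, value in self.rows if d == doc_id}


class DocumentStorageAdapter(StorageAdapter):
    """MongoDB-like stand-in: one nested object per document."""

    def __init__(self):
        self.documents = {}

    def save(self, doc_id, properties):
        self.documents[doc_id] = dict(properties)

    def load(self, doc_id):
        return dict(self.documents.get(doc_id, {}))


# The backend is picked by name, for example from a deployment setting.
ADAPTERS = {"sql": RowStorageAdapter, "mongodb": DocumentStorageAdapter}


class DocumentRepository:
    """High-level API used by application code."""

    def __init__(self, backend):
        self._storage = ADAPTERS[backend]()

    def create_document(self, doc_id, **properties):
        self._storage.save(doc_id, properties)

    def get_document(self, doc_id):
        return self._storage.load(doc_id)


def application_code(repo):
    """Application logic that is unaware of the storage in use."""
    repo.create_document("doc1", title="Contract", state="draft")
    return repo.get_document("doc1")


results = {name: application_code(DocumentRepository(name)) for name in ADAPTERS}
```

`application_code` returns the same document for both backends, so switching storage is a matter of changing the backend name rather than the application.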
TakeAways
Changing for MongoDB can:
Simplify architecture
Offer simple scalability options
Be an easy migration
Content Management + MongoDB
You should try Nuxeo!
Any Questions?
Thank You!
https://github.com/nuxeo
http://www.nuxeo.com/careers/