Thierry Delprat
tdelprat@nuxeo.com
https://github.com/tiry/
What does the switch to MongoDB change?
For developers & architects
For ops and users of the product
we provide a Platform that developers can use to
build highly customized Content Applications
we provide components, and the tools to assemble them
everything we do is open source
https://github.com/nuxeo
Content Repository
Storage
Content Repository
Simplify software architecture
Offer easy scalability options
Small impact on development
?
Making work easier for Ops & Architects
Object
No Lazy Loading
No Cache / No Invalidations
A lot of complexity and problems avoided!
Simplify deployment architecture
Complex structures (schema) - R/W - Synchronous
Document properties and hierarchy
Large Streams - Large Storage
attached Blobs
Flexible Schema - Write Once/Read Many
Audit log, Activity log
Flexible Schema - Search
Search index
GridFS
Single Consolidated Storage
Structure, Blobs, Audit & Index
Fewer building blocks to provision & configure
Easier to deploy
Built-in data redundancy & fault tolerance
active
active
No ORM Hell
Single storage
Robust deployment out of the box
Avoid headaches at deployment time
Improve end-user experience
No impedance mismatch
fewer backend calls
no invalidation cost
Document level locking
no table level concurrency
Native distributed architecture
Easy scale out of read
Significant raw speed improvements for all use cases
More importantly: some use cases are much better handled
https://benchmarks.nuxeo.com/continuous/index.html
Handle more concurrent connections
No Cache
Less memory per Connection
Can handle more connections
Can handle more concurrent Users
Read & Write operations are competing
Write operations are not blocked
c4.xlarge (Nuxeo)
c4.2xlarge (DB)
SQL
WRITEs are not blocked by READs
Processing large object sets is challenging with an ORM
No side effects of impedance mismatch
Sample batch on 100,000 documents
750 documents/s with SQL backend (cold cache)
11,500 documents/s with MongoDB / WiredTiger: about 15x faster
lazy loading
cache thrashing
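As a back-of-the-envelope check on the figures above, the following sketch converts both throughputs into wall-clock time for the 100,000-document batch. Only the three numbers come from the slide; the function name is illustrative and a steady throughput is assumed.

```python
# Figures quoted on the slide.
BATCH_SIZE = 100_000        # documents in the sample batch
SQL_DOCS_PER_S = 750        # SQL backend, cold cache
MONGO_DOCS_PER_S = 11_500   # MongoDB / WiredTiger


def batch_duration_s(documents: int, docs_per_s: float) -> float:
    """Seconds needed to process `documents` at a steady throughput."""
    return documents / docs_per_s


sql_s = batch_duration_s(BATCH_SIZE, SQL_DOCS_PER_S)
mongo_s = batch_duration_s(BATCH_SIZE, MONGO_DOCS_PER_S)
speedup = MONGO_DOCS_PER_S / SQL_DOCS_PER_S

print(f"SQL:     {sql_s:.0f} s")
print(f"MongoDB: {mongo_s:.1f} s")
print(f"Speedup: {speedup:.1f}x")
```

Under that assumption the batch drops from a little over two minutes to under ten seconds.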
Scale out READs
Leverage ReplicaSets
(Read from secondaries)
Scale out WRITEs
Leverage Sharding
(Spread Writes)
No impact at application level!
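One way to see why reading from secondaries need not touch application code: in MongoDB the read preference can be set in the connection string. The sketch below only builds such a string with the standard `replicaSet` and `readPreference` options. The host names, the replica set name `rs0` and the database name are assumptions for illustration; the slides do not show how Nuxeo itself is configured.

```python
from urllib.parse import urlencode

# Hypothetical host names, for illustration only.
HOSTS = [
    "mongo1.example.com:27017",
    "mongo2.example.com:27017",
    "mongo3.example.com:27017",
]


def replica_set_uri(hosts, database, replica_set,
                    read_preference="secondaryPreferred"):
    """Build a MongoDB connection string that prefers secondaries for reads."""
    options = urlencode({
        "replicaSet": replica_set,
        "readPreference": read_preference,
    })
    return f"mongodb://{','.join(hosts)}/{database}?{options}"


uri = replica_set_uri(HOSTS, "nuxeo", "rs0")
print(uri)
```

With `secondaryPreferred`, reads go to a secondary when one is available and fall back to the primary otherwise, so the change stays a deployment setting.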
1 Nuxeo node + 1 MongoDB node
1900 docs/s
MongoDB CPU is the bottleneck (800%)
Use massive read operations and queries.
2 Nuxeo nodes + 1 MongoDB node
1850 docs/s
MongoDB CPU is the bottleneck (800%)
2 Nuxeo nodes + 2 MongoDB nodes
3400 docs/s
(using read preferences)
2 Nuxeo nodes + 1 MongoDB ReplicaSet
11,000 docs/s
2 Nuxeo nodes + 3 sharded MongoDB ReplicaSets
27,400 docs/s
Use bulk import.
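To make the scaling in the two benchmarks easier to compare, this sketch expresses each throughput as a multiple of the first run in its series. The numbers are the ones quoted above; grouping them into a "read" series and a "write" (bulk import) series is my reading of the slides, and the helper name is illustrative.

```python
# Throughputs quoted on the slides, in documents per second.
READ_RUNS = {
    "1 Nuxeo + 1 MongoDB": 1900,
    "2 Nuxeo + 1 MongoDB": 1850,
    "2 Nuxeo + 2 MongoDB": 3400,
}
WRITE_RUNS = {
    "1 ReplicaSet": 11_000,
    "3 sharded ReplicaSets": 27_400,
}


def relative_to_first(runs):
    """Return each throughput as a multiple of the first entry."""
    baseline = next(iter(runs.values()))
    return {name: rate / baseline for name, rate in runs.items()}


read_ratios = relative_to_first(READ_RUNS)
write_ratios = relative_to_first(WRITE_RUNS)

for label, ratios in (("read", read_ratios), ("write", write_ratios)):
    for name, ratio in ratios.items():
        print(f"{label:5} {name:22} {ratio:.2f}x")
```

Adding a Nuxeo node alone changes nothing while MongoDB CPU is the bottleneck (0.97x). Adding a second MongoDB node gives about 1.79x on reads, and going from one to three shards gives about 2.49x on writes.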
Changes from a development point of view
Document level transactions
No MVCC isolation
Provide shared mitigation policies for critical use cases
Different transaction paradigm
Atomic Document Operations are safe
Large batch updates cannot be atomic
Find a way to mitigate application level impact
Transactions cannot span multiple documents
Multi-document transactions can be problematic
Workflows or custom event handlers
Transient State Manager
Run all operations in Memory
Populate an Undo Log
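The slides do not detail how the Transient State Manager works, so the following is only a generic sketch of the undo-log idea: each per-document write is applied immediately and the previous state is recorded so a failed multi-document operation can be reverted. The class name, the in-memory `dict` standing in for a collection, and the sample documents are all assumptions, not Nuxeo code.

```python
import copy


class UndoLogSession:
    """Applies per-document writes and records how to revert them."""

    _MISSING = object()  # marks "document did not exist before the write"

    def __init__(self, store: dict):
        self.store = store    # stand-in for a collection: id -> document
        self.undo_log = []    # (doc_id, previous state), in write order

    def save(self, doc_id, document):
        previous = self.store.get(doc_id, self._MISSING)
        if previous is not self._MISSING:
            previous = copy.deepcopy(previous)
        self.undo_log.append((doc_id, previous))
        self.store[doc_id] = copy.deepcopy(document)

    def commit(self):
        # Nothing to revert any more.
        self.undo_log.clear()

    def rollback(self):
        # Revert newest first, so the oldest recorded state wins.
        while self.undo_log:
            doc_id, previous = self.undo_log.pop()
            if previous is self._MISSING:
                self.store.pop(doc_id, None)
            else:
                self.store[doc_id] = previous


store = {"doc-1": {"title": "Draft", "state": "project"}}
session = UndoLogSession(store)
session.save("doc-1", {"title": "Draft", "state": "approved"})
session.save("doc-2", {"title": "Review task"})
session.rollback()  # e.g. a workflow step failed half-way
print(store)
```

An undo log restores consistency after a failure but does not provide isolation: other readers can still observe the intermediate states before the rollback.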
New Model
New API
New Query system
Provide an easy migration path
High level API + Encapsulation
Storage Adapters
Helps transitioning between storages
No Impact at application level
Can be deployment time choice
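The storage adapter idea can be sketched as an interface that the application codes against, with the concrete backend picked from a deployment setting. Everything below is hypothetical (interface, class names, the two in-memory stand-ins); it illustrates the pattern, not Nuxeo's actual API.

```python
from abc import ABC, abstractmethod


class DocumentStorage(ABC):
    """Hypothetical adapter interface the application codes against."""

    @abstractmethod
    def save(self, doc_id: str, properties: dict) -> None: ...

    @abstractmethod
    def load(self, doc_id: str) -> dict: ...


class SqlLikeStorage(DocumentStorage):
    """Stand-in for a relational backend: one row per property."""

    def __init__(self):
        self.rows = []  # (doc_id, property name, value)

    def save(self, doc_id, properties):
        self.rows = [row for row in self.rows if row[0] != doc_id]
        self.rows.extend((doc_id, k, v) for k, v in properties.items())

    def load(self, doc_id):
        return {k: v for d, k, v in self.rows if d == doc_id}


class DocumentLikeStorage(DocumentStorage):
    """Stand-in for a document backend: one whole object per document."""

    def __init__(self):
        self.documents = {}

    def save(self, doc_id, properties):
        self.documents[doc_id] = dict(properties)

    def load(self, doc_id):
        return dict(self.documents.get(doc_id, {}))


ADAPTERS = {"sql": SqlLikeStorage, "mongodb": DocumentLikeStorage}


def open_storage(backend: str) -> DocumentStorage:
    """Pick the adapter from a deployment-time setting."""
    return ADAPTERS[backend]()


for backend in ("sql", "mongodb"):
    storage = open_storage(backend)
    storage.save("doc-1", {"dc:title": "Specs", "dc:creator": "tiry"})
    print(backend, storage.load("doc-1"))
```

Both backends return the same document to the caller, which is what lets the storage choice move from development time to deployment time.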
Switching to MongoDB can
Simplify architecture
Offer simple scalability options
Be an easy migration
Content Management + MongoDB
You should try Nuxeo!
Thank You!
https://github.com/nuxeo
http://www.nuxeo.com/careers/