Changing for MongoDB

Thierry Delprat
tdelprat@nuxeo.com
https://github.com/tiry/


Switching to MongoDB

What does the switch to MongoDB change?
For developers & architects
For ops and users of the product

Some Context


We provide a platform that developers can use to build highly customized content applications.
We provide components and the tools to assemble them.
Everything we do is open source:
https://github.com/nuxeo

Nuxeo Platform & Storage


Content Repository
Storage


Content Repository
Simplify software architecture
Offer easy scalability options
Small impact on development


Simplify Architecture
Making work easier for Ops & Architects

IMPEDANCE ISSUE


Object
Coding & Maintenance Impact


No Lazy Loading
No Cache / No Invalidations
A lot of complexity and problems avoided!
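To make the impedance point concrete, here is a minimal sketch in plain Python (no database involved). The table names, field names and sample values are hypothetical and only stand in for a relational layout and a document layout of the same content; this is an illustration, not the Nuxeo storage code.

```python
# Hypothetical relational layout: one logical document spread over three "tables".
ROWS = {
    "hierarchy": [{"id": "doc1", "parent": "root", "name": "report"}],
    "dublincore": [{"id": "doc1", "title": "Q3 report", "creator": "alice"}],
    "files": [
        {"doc_id": "doc1", "pos": 1, "filename": "annex.xls"},
        {"doc_id": "doc1", "pos": 0, "filename": "q3.pdf"},
    ],
}

# The same content held as a single nested document (hypothetical shape).
DOCUMENTS = {
    "doc1": {
        "id": "doc1",
        "parent": "root",
        "name": "report",
        "dc": {"title": "Q3 report", "creator": "alice"},
        "files": [{"filename": "q3.pdf"}, {"filename": "annex.xls"}],
    }
}


def load_from_rows(doc_id):
    """Rebuild the object from rows. Returns (object, number of lookups)."""
    node = next(r for r in ROWS["hierarchy"] if r["id"] == doc_id)      # lookup 1
    dc_row = next(r for r in ROWS["dublincore"] if r["id"] == doc_id)   # lookup 2
    file_rows = sorted(                                                 # lookup 3
        (r for r in ROWS["files"] if r["doc_id"] == doc_id),
        key=lambda r: r["pos"],
    )
    obj = dict(node)
    obj["dc"] = {k: v for k, v in dc_row.items() if k != "id"}
    obj["files"] = [{"filename": r["filename"]} for r in file_rows]
    return obj, 3


def load_from_document(doc_id):
    """Fetch the object as stored. Returns (object, number of lookups)."""
    return DOCUMENTS[doc_id], 1


if __name__ == "__main__":
    from_rows, row_lookups = load_from_rows("doc1")
    from_doc, doc_lookups = load_from_document("doc1")
    print(from_rows == from_doc, row_lookups, doc_lookups)
```

Both paths yield the same object, but the row-based path needs one lookup per table plus reassembly. That reassembly is where an ORM typically introduces lazy loading and a cache to hide the cost.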
Example: Impact on Nuxeo deployment




Simplify deployment architecture
Hybrid Storage
Complex structures (schema) - R/W - Synchronous: document properties and hierarchy
Large Streams - Large Storage: attached Blobs
Flexible Schema - Write Once/Read Many: audit log, activity log
Flexible Schema - Search: search index






GridFS: attached Blobs
CONSOLIDATED Storage
Single Consolidated Storage
Structure, Blobs, Audit & Index

Fewer building blocks to provision & configure
Easier to deploy

EASIER to Deploy a Robust Architecture

"built-in" - data redundancy & fault tolerance

active
active
Simplicity?
No ORM Hell
Single storage
Out-of-the-box robust deployment

Scalability
Avoid headaches at deployment time

Improve end-user experience

Will I Be Faster with MongoDB?
Built for SPEED
No Impedance issue
fewer backend calls
no invalidation cost
Document-level locking
no table-level lock contention
Native distributed architecture
Easy scale-out of reads

SPEED

Significant RAW Speed improvements for all use cases
More importantly: some use cases are much better handled

https://benchmarks.nuxeo.com/continuous/index.html
More than RAW Performance

Handle more concurrent connections
No Cache
Less memory per Connection
Can handle more connections
Can handle more concurrent Users

SQL: Read & Write operations are competing
MongoDB: Write operations are not blocked
Test setup: C4.xlarge (Nuxeo), C4.2Xlarge (DB)
WRITEs are not blocked by READs
More than RAW Performance
Processing large object sets is challenging with an ORM
lazy loading
cache thrashing

No side effects of impedance mismatch
Sample batch on 100,000 documents
750 documents/s with SQL backend (cold cache)
11,500 documents/s with MongoDB / WiredTiger: x15
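The batch figures quoted for this slide can be turned into wall-clock time with a few lines of arithmetic. The sketch below uses only the numbers from the slide (100,000 documents, 750 and 11,500 documents/s); the variable and function names are ours.

```python
BATCH_SIZE = 100_000   # documents in the sample batch
SQL_RATE = 750         # documents/s, SQL backend (cold cache)
MONGO_RATE = 11_500    # documents/s, MongoDB / WiredTiger


def batch_seconds(documents, rate):
    """Time needed to process `documents` at `rate` documents per second."""
    return documents / rate


speedup = MONGO_RATE / SQL_RATE
sql_seconds = batch_seconds(BATCH_SIZE, SQL_RATE)
mongo_seconds = batch_seconds(BATCH_SIZE, MONGO_RATE)

if __name__ == "__main__":
    print(f"speedup: x{speedup:.1f}")
    print(f"SQL: {sql_seconds:.0f} s, MongoDB: {mongo_seconds:.1f} s")
```

In other words, the same batch drops from a bit over two minutes to under ten seconds, which is the "x15" on the slide.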

Will I SCALE BETTER with MongoDB?
Scalability options
Scale out READs
- Leverage ReplicaSets (read from secondaries)

Scale out WRITEs
- Leverage Sharding (spread writes)

No impact at application level!
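One reason these options need not touch application code is that they can be expressed in the connection string. The sketch below builds a MongoDB connection URI with the standard `replicaSet` and `readPreference` options. The host names, the database name `nuxeo` and the replica set name `rs0` are placeholders, and the helper `mongo_uri` is ours, not part of any driver or of Nuxeo.

```python
from urllib.parse import urlencode


def mongo_uri(hosts, database, replica_set=None,
              read_preference="secondaryPreferred"):
    """Build a MongoDB connection URI (illustrative helper).

    hosts           -- list of "host:port" strings (placeholders here)
    replica_set     -- replica set name, or None when connecting to routers
    read_preference -- one of the standard modes, e.g. "secondaryPreferred"
    """
    options = {}
    if replica_set:
        options["replicaSet"] = replica_set
    options["readPreference"] = read_preference
    return f"mongodb://{','.join(hosts)}/{database}?{urlencode(options)}"


if __name__ == "__main__":
    # Scale out reads: replica set members, reads allowed on secondaries.
    print(mongo_uri(["db1.example.com:27017", "db2.example.com:27017"],
                    "nuxeo", replica_set="rs0"))
    # Scale out writes: the application talks to the sharded cluster's routers.
    print(mongo_uri(["router1.example.com:27017"], "nuxeo"))
```

With sharding, the client connects to the cluster's query routers rather than to individual shards, so the only change on the application side is again the list of hosts.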
Scale out Test
Use massive read operations and queries.

1 Nuxeo node + 1 MongoDB node
1900 docs/s
MongoDB CPU is the bottleneck (800%)

2 Nuxeo nodes + 1 MongoDB node
1850 docs/s
MongoDB CPU is the bottleneck (800%)

2 Nuxeo nodes + 2 MongoDB nodes
3400 docs/s
(using read preferences)
SHARDING TEST



Use bulk import.

2 Nuxeo nodes + 1 MongoDB ReplicaSet
11,000 docs/s

2 Nuxeo nodes + 3 MongoDB Sharded ReplicaSets
27,400 docs/s
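The throughput figures from the two scale-out slides can be compared as ratios. The sketch below uses only the numbers quoted above. Reading "3 MongoDB Sharded ReplicaSet" as three shards is our assumption, and the names are ours.

```python
# Read test (docs/s), as quoted on the scale-out slide.
READ_1_NUXEO_1_MONGO = 1900
READ_2_NUXEO_1_MONGO = 1850
READ_2_NUXEO_2_MONGO = 3400

# Bulk import test (docs/s), as quoted on the sharding slide.
WRITE_1_REPLICA_SET = 11_000
WRITE_3_SHARDS = 27_400
SHARDS = 3  # assumption: "3 MongoDB Sharded ReplicaSet" means three shards

# Adding a Nuxeo node alone does not help: the database is the bottleneck.
extra_app_node_gain = READ_2_NUXEO_1_MONGO / READ_1_NUXEO_1_MONGO
# Adding a second MongoDB node (reads on secondaries) nearly doubles reads.
extra_db_node_gain = READ_2_NUXEO_2_MONGO / READ_1_NUXEO_1_MONGO
# Sharding multiplies write throughput, at a bit less than linear efficiency.
sharding_gain = WRITE_3_SHARDS / WRITE_1_REPLICA_SET
sharding_efficiency = sharding_gain / SHARDS

if __name__ == "__main__":
    print(f"extra Nuxeo node:   x{extra_app_node_gain:.2f}")
    print(f"extra MongoDB node: x{extra_db_node_gain:.2f}")
    print(f"3 shards:           x{sharding_gain:.2f} "
          f"({sharding_efficiency:.0%} of linear)")
```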
DEVELOPMENT IMPACT
Changes from a development point of view

New Storage Model
Document level transactions
No MVCC isolation

Provide shared mitigation policies for critical use cases
Different transaction paradigm
Consistency in our Context
Atomic document operations are safe
Large batch updates cannot be atomic

Find a way to mitigate application-level impact
Transactions cannot span multiple documents
Multi-document transactions can be problematic
e.g. workflows or custom event handlers

Ensuring consistency
Transient State Manager
Run all operations in Memory
Populate an Undo Log


- Recover application-level transaction management
- Commit / Rollback model
- "Read uncommitted" isolation
- Need to flush transient state for queries
- "Uncommitted" changes are visible to others
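The points above (work in memory, keep an undo log, flush before queries, commit or roll back) can be sketched in a few lines. This is a toy model under our own assumptions: a plain dict stands in for the database, and the class and method names are hypothetical. It is not Nuxeo's Transient State Manager.

```python
class TransientSession:
    """Toy application-level transaction over a store without multi-document
    transactions. `store` is a dict (doc_id -> dict) standing in for the DB."""

    def __init__(self, store):
        self.store = store
        self.transient = {}   # pending changes, held in memory
        self.undo_log = []    # (doc_id, previous stored state or None)

    def write(self, doc_id, doc):
        """Record a change in memory only."""
        self.transient[doc_id] = dict(doc)

    def flush(self):
        """Push pending changes to the store, logging how to undo each one."""
        for doc_id, doc in self.transient.items():
            previous = self.store.get(doc_id)
            self.undo_log.append(
                (doc_id, dict(previous) if previous is not None else None))
            self.store[doc_id] = doc
        self.transient.clear()

    def query(self, predicate):
        """Queries run against the store, so transient state is flushed first.
        From this point the changes are visible to other sessions."""
        self.flush()
        return sorted(i for i, d in self.store.items() if predicate(d))

    def commit(self):
        self.flush()
        self.undo_log.clear()

    def rollback(self):
        """Drop pending changes and replay the undo log in reverse order."""
        self.transient.clear()
        for doc_id, previous in reversed(self.undo_log):
            if previous is None:
                self.store.pop(doc_id, None)
            else:
                self.store[doc_id] = previous
        self.undo_log.clear()


if __name__ == "__main__":
    store = {"a": {"state": "draft"}}
    session = TransientSession(store)
    session.write("a", {"state": "approved"})
    session.write("b", {"state": "draft"})
    print(session.query(lambda d: d["state"] == "draft"))  # forces a flush
    print(store["a"])   # the uncommitted change is visible in the store
    session.rollback()
    print(store)        # back to the initial content
```

The trade-off is visible in the usage: the query forces a flush, so another reader of `store` sees the uncommitted change until the rollback replays the undo log.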
Inertia
New Model
New API
New Query system

Provide an easy migration path


Nuxeo Approach
High level API + Encapsulation
Storage Adapters


DOCUMENT REPOSITORY



Helps with transitioning between storages
No Impact at application level
Can be a deployment-time choice
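The "high level API + storage adapters" idea can be sketched as follows. All class and function names here are hypothetical and in-memory only. They illustrate the encapsulation pattern, not the actual Nuxeo repository API.

```python
from abc import ABC, abstractmethod


class StorageAdapter(ABC):
    """What the repository needs from any backend (hypothetical interface)."""

    @abstractmethod
    def save(self, doc_id, properties): ...

    @abstractmethod
    def load(self, doc_id): ...


class RowStorage(StorageAdapter):
    """Stands in for a relational backend: one (doc_id, key, value) row per property."""

    def __init__(self):
        self.rows = []

    def save(self, doc_id, properties):
        self.rows = [r for r in self.rows if r[0] != doc_id]
        self.rows.extend((doc_id, k, v) for k, v in properties.items())

    def load(self, doc_id):
        return {k: v for d, k, v in self.rows if d == doc_id}


class DocumentStorage(StorageAdapter):
    """Stands in for a document backend: the properties are stored as one object."""

    def __init__(self):
        self.documents = {}

    def save(self, doc_id, properties):
        self.documents[doc_id] = dict(properties)

    def load(self, doc_id):
        return dict(self.documents.get(doc_id, {}))


class DocumentRepository:
    """High-level API used by application code. It never sees the backend."""

    def __init__(self, adapter):
        self._adapter = adapter

    def create(self, doc_id, **properties):
        self._adapter.save(doc_id, properties)

    def get(self, doc_id):
        return self._adapter.load(doc_id)


ADAPTERS = {"sql": RowStorage, "mongodb": DocumentStorage}


def open_repository(backend):
    """The backend name would come from deployment configuration."""
    return DocumentRepository(ADAPTERS[backend]())


if __name__ == "__main__":
    for backend in ("sql", "mongodb"):
        repo = open_repository(backend)
        repo.create("doc1", title="Q3 report", creator="alice")
        print(backend, repo.get("doc1"))
```

Application code only calls `create` and `get`, so switching `backend` is a configuration change rather than a code change.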
Takeaways
Changing for MongoDB can
Simplify architecture
Offer simple scalability options
Be an easy migration

Content Management + MongoDB
You should try Nuxeo!
Any Questions?
Thank You!

https://github.com/nuxeo
http://www.nuxeo.com/careers/