Nuxeo Platform
Building blocks
Scalability & Scale out
Architecture Goals
Ensure Scalability and Elasticity
- Allow addressing very large projects (processing & data)
  - without having performance issues
  - without requiring crazy hardware
- Allow scaling progressively
Build custom tailored DAM applications
- Allow the application to evolve with Business
- without creating a maintenance nightmare
- without generating delays
Architecture principles
- Make implementation pluggable
  - choose the right implementation depending on constraints
  - build custom tailored implementation if needed
  - keep a clear separation between platform code and custom code
- Identify and isolate subsystems
  - be able to segregate traffic and load
  - add resources where they are needed
- Leverage Scale out and AutoScaling
  - Scale-out at backend-level or at application level
  - Leverage elasticity provided by Cloud and Container technologies
Component Model
- In Nuxeo architecture everything is a plugin
- Everything is configurable
- Logic and Data Structures depend on configuration


see Explorer to browse Extension Points
Building with Nuxeo


Continuous deployment


Logical Architecture

Scaling Processing
Scale-out subsystems independently as needed

Scale interactive processing
Scale batch processing

Auto-Scaling
- Nuxeo exposes metrics
- JMX system and application metrics
- application level probes
- status
- Metrics can be used to automatically adjust infrastructure
- Auto-Scaling in AWS
- Pod autoscaling in OpenShift / K8S
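
To make the "metrics adjust infrastructure" idea concrete, here is a minimal sketch of the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current x currentMetric / targetMetric). The function name, the replica bounds, and the sample metric values are assumptions for illustration; nothing here is taken from Nuxeo itself.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Proportional scaling rule in the style of the Kubernetes HPA.

    Names and default bounds are illustrative assumptions.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Inside the tolerance band, keep the current size to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))
```

For example, 2 nodes observing a metric of 90 against a target of 60 would be scaled to 3 nodes, and the same rule scales back in when the metric drops.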

Importer & BackPressure
nuxeo-mq-importer
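
As a generic illustration of back pressure (not a description of how nuxeo-mq-importer is implemented), the sketch below connects a producer and a consumer through a bounded buffer. When the buffer is full the producer blocks, so it can never run ahead of the consumer by more than the buffer size. The function name and the buffer size are hypothetical.

```python
import queue
import threading


def run_import(documents, buffer_size=4):
    """Move documents from a producer to a consumer through a bounded buffer."""
    buffer = queue.Queue(maxsize=buffer_size)
    imported = []
    max_depth = 0
    done = object()  # sentinel marking the end of the stream

    def producer():
        for doc in documents:
            # put() blocks while the buffer is full: this is the back pressure.
            buffer.put(doc)
        buffer.put(done)

    def consumer():
        nonlocal max_depth
        while True:
            max_depth = max(max_depth, buffer.qsize())
            item = buffer.get()
            if item is done:
                break
            imported.append(item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return imported, max_depth
```

The observed queue depth stays at or below `buffer_size` regardless of how many documents are pushed through.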

Nuxeo Platform
Scaling I/O
Scaling storage services
Leverage scalable storage backends
- MongoDB
  - Scale reads by adding replicas
  - Scale writes by using sharding
- Elasticsearch
  - Add nodes & shards to the cluster

Scaling storage services
Leverage Nuxeo capabilities
- Support for multiple repositories
  - Application level sharding
- Offload searches
  - from Repository to ES
- HSM and multi-BlobStores
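
A minimal sketch of what application-level sharding across several repositories could look like, assuming a stable hash of some routing key (a tenant, a project, a document path) selects the repository. The function name, the key, and the repository names are hypothetical; the slides do not say which routing rule is used.

```python
import hashlib


def pick_repository(routing_key, repositories):
    """Map a routing key to one repository using a stable hash.

    Hypothetical routing rule, for illustration only.
    """
    if not repositories:
        raise ValueError("at least one repository is required")
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(repositories)
    return repositories[index]
```

Note that a plain modulo remaps most keys when the number of repositories changes, so a real deployment would more likely use an explicit mapping or consistent hashing.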

Storage Adapters

About LARGE Binary Storage
- Backend storage is pluggable
- several implementations (FS, S3, GridFS, GDoc, DropBox ...)
- easy to implement
- Can do partitions
- Can do HSM
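
The sketch below illustrates the "pluggable backend" and "partitions" ideas with a small storage interface and a dispatcher that routes blobs to one of two stores by size. All class and method names are hypothetical and are not the Nuxeo BlobStore API; the in-memory store stands in for a real backend such as FS or S3.

```python
import hashlib
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Minimal storage contract (hypothetical, not the Nuxeo API)."""

    @abstractmethod
    def write(self, data: bytes) -> str:
        """Store the bytes and return a key."""

    @abstractmethod
    def read(self, key: str) -> bytes:
        """Return the bytes stored under the key."""


class InMemoryBlobStore(BlobStore):
    """Stand-in for a real backend."""

    def __init__(self):
        self._blobs = {}

    def write(self, data):
        key = hashlib.sha256(data).hexdigest()  # content-addressed key
        self._blobs[key] = data
        return key

    def read(self, key):
        return self._blobs[key]


class SizeDispatcher(BlobStore):
    """Partition blobs between two stores based on their size."""

    def __init__(self, small, large, threshold):
        self._stores = {"small": small, "large": large}
        self._threshold = threshold

    def write(self, data):
        name = "large" if len(data) >= self._threshold else "small"
        # Prefix the key so that reads know which store to ask.
        return f"{name}:{self._stores[name].write(data)}"

    def read(self, key):
        name, _, inner_key = key.partition(":")
        return self._stores[name].read(inner_key)
```

Because the dispatcher implements the same contract as the stores it wraps, callers do not need to know how many backends exist, which is the property that makes tiering (HSM) possible behind a single interface.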

Binary Store & Federation

Http / Upload / Download - optimize
- Leverage Reverse proxy to protect server
  - Download / Upload buffering
  - Caching
  - Traffic prioritization/throttling if needed
- Direct upload/download with CDN support
  - depending on backend (e.g. AWS S3)

Upload / Download Acceleration
- Downloads are pluggable
  - Redirect
- FileStorage is pluggable
  - Read/Write from any Storage API
- Upload is pluggable
  - BlobStore abstraction
  - Upload UI component
The solution depends on the available infrastructure

Downloads Acceleration
Leverage existing CDN and Edge servers

Downloads Acceleration
Build dedicated CDN and Edge servers

Uploads Acceleration
Leverage AWS Infrastructure

Completely transparent for users
Nuxeo Platform
Performance
Factors Impacting Performance
- Business Model
  - Content Model
  - Security
  - Workflows
- Volume
  - Number of documents
  - Size of files is usually not significant
- Types of access
  - Throughput
  - Search
  - Write operations
  - Rendering technologies
Lots of different factors
Depends on the application
No simple sizing
Scalable Architecture: scale the part of your application that needs it
Benchmarks: baseline and reusable tools
Benchmarking
- Performance tests are part of the development effort
  - part of the nuxeo-platform source code
  - run on a nightly basis via the CI chain
- We leverage Gatling
  - test Web UI
  - test REST API
  - test mass import
- Publish results
  - https://benchmarks.nuxeo.com/
LTS benchmarks
https://benchmarks.nuxeo.com/

Nightly benchmarks
https://benchmarks.nuxeo.com/

1B benchmark
- Bulk Import
  - 32680 docs/s with peak at 40400 docs/s.
- Indexing
  - 18660 docs/s with peak at 27400 docs/s.
- CRUD via REST
  - 1000+ Requests/s
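
To put these rates in perspective, the short calculation below converts them into wall-clock time for one billion documents. It assumes the quoted average rates hold across the whole run, which the slides do not state explicitly; the helper name is made up for this example.

```python
ONE_BILLION = 1_000_000_000


def hours_to_process(doc_count, docs_per_second):
    """Wall-clock hours needed at a constant throughput."""
    return doc_count / docs_per_second / 3600


import_hours = hours_to_process(ONE_BILLION, 32680)  # average bulk import rate
index_hours = hours_to_process(ONE_BILLION, 18660)   # average indexing rate

print(f"bulk import: {import_hours:.1f} h")  # about 8.5 h
print(f"indexing:    {index_hours:.1f} h")   # about 14.9 h
```

Under that assumption, importing one billion documents takes roughly a working day and indexing them takes a little under 15 hours.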

Nuxeo Platform
HA & DRP
High Availability
- Each component can be clustered
- Nuxeo Cluster: 2+ nodes
- Elasticsearch Cluster: 3+ nodes
- MongoDB Cluster: 3+ nodes
- Kafka / Zookeeper Cluster: 3+ nodes
- Can spread nodes across availability zones
- 3 AZs is ideal, 2 AZs is supported
- ensure a low-latency network between DCs/AZs
- HA and fail-over automatically tested
- Kubernetes deployment template
- Gatling tests + Chaos Monkey
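
The slides do not explain the node counts, but one common reason for 3-node minimums and for preferring 3 AZs is majority-based election, as used by ZooKeeper, MongoDB replica sets, and Elasticsearch master election. The sketch below, written under that assumption, checks whether a cluster keeps a majority after losing any single zone. Function names are illustrative.

```python
def majority(node_count):
    """Smallest number of nodes that forms a strict majority."""
    return node_count // 2 + 1


def survives_zone_loss(nodes_per_zone):
    """True if a majority remains after losing any one zone."""
    total = sum(nodes_per_zone)
    needed = majority(total)
    return all(total - lost >= needed for lost in nodes_per_zone)


print(survives_zone_loss([1, 1, 1]))  # 3 nodes over 3 AZs
print(survives_zone_loss([2, 1]))     # 3 nodes over 2 AZs
```

With three zones, any single zone can fail and two of three nodes remain. With two zones, losing the zone that holds two nodes leaves a single node, which is not a majority, so a manual or tie-breaking step is needed.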

DRP - Multi-Regions
- In case of high latency network
- distributed storage cannot be used across DCs
- Deploy 2 Nuxeo clusters in 2 different regions
- 1 region is "master"
- 1 region is standby / fallback
- Leverage Kafka to handle asynchronous replication
- MongoDB + Kafka
- ES + Kafka
- Nuxeo Computations + Kafka Mirror Maker
- Simple FS sync

Nuxeo Platform
Multi-tenants
Nuxeo Based application

Multi-Tenants - Nuxeo Based application

2 possible approaches
Multi-tenant application
All clients share the same application.
Application manages data & configuration partitioning.
vs
Multi-tenant infrastructure
All clients share the same infrastructure.
Deploy isolated customized applications on PaaS.
Decision points
- What do you actually want to share
- What is the deployment infrastructure
- How different must the different tenants be
- Synchronized deployment
APPLICATION LEVEL MULTI-TENANTS

Document Store
Security
Life Cycle
Indexing
Versioning
all clients share the same application
application manages data and configuration partitioning
Application level Multi-Tenancy - Data Isolation
- Data Partitioning
  - Repository
    - Security Policy
    - "Domain based"
  - Elasticsearch
    - same index
  - Users/Groups
    - filtering on a per-tenant basis

Logical isolation
Application level Multi-Tenancy - Data Isolation
- Data Partitioning
  - Repository
    - Separated repositories
      - MongoDB
      - Separated Blob Stores
  - Elasticsearch
    - per-tenant index
  - Users/Groups
    - different directories

Physical isolation
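
The difference between the two isolation modes can be sketched for the search layer. With logical isolation every query against the shared index carries a tenant filter; with physical isolation the tenant selects the index and the query is left untouched. The index names and the `tenant_id` field are hypothetical, and the slides do not specify how the shared index is filtered; only the `bool` / `filter` / `term` structure is standard Elasticsearch query DSL.

```python
def tenant_search(tenant_id, user_query, isolation="logical"):
    """Return (index_name, request_body) for a tenant-scoped search.

    Index names and the tenant_id field are illustrative assumptions.
    """
    if isolation == "logical":
        # Shared index: the tenant filter is added to every request.
        body = {
            "query": {
                "bool": {
                    "must": [user_query],
                    "filter": [{"term": {"tenant_id": tenant_id}}],
                }
            }
        }
        return "nuxeo", body
    if isolation == "physical":
        # One index per tenant: the index name carries the isolation.
        return f"nuxeo-{tenant_id}", {"query": user_query}
    raise ValueError(f"unknown isolation mode: {isolation}")
```

In the logical mode a forgotten filter leaks data across tenants, which is why the filter belongs in one shared code path rather than in each caller.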
Application level Multi-Tenancy - Configuration
- Share everything: 1 Application / 1 configuration
  - All tenants share the same technical configuration
  - extension points contributions
- Tenant isolation is done via filtering
  - filter docTypes per tenant
  - define a per tenant facet/schema
  - UI filters access / hides part of it
- WebUI and multi-tenants
  - deploy one customized webapp per tenant
  - share the same server side
APPLICATION LEVEL MULTI-TENANCY
- Limitations
  - Shallow isolation in terms of resources
  - Monolithic deployment
  - Customization needs to be carefully done
  - Scaling the number of tenants can be challenging
- Pros
  - well adapted for lightweight/standardized customization
  - one application with multiple flavors
  - Splitting the client side may be enough
Infrastructure Level Multi-tenants

Share the infrastructure - not the application
rely on infrastructure to provide tenant isolation
application does not need to be impacted
INFRASTRUCTURE LEVEL MULTI-TENANTS
- Application template
  - Application is deployed as a set of containers
- Customization
  - Each customer/department builds its own image
    - custom configuration
    - additional components if needed
- Leverage Container Platform
  - Automated Build and Deployment
  - Resource allocation and Scale out
  - Isolation
INFRASTRUCTURE LEVEL MULTI-TENANTS
- Leverage IaaS or VM-based infrastructure
  - Terraform & Ansible
  - Nuxeo Cloud
- Leverage Container Platform
  - OpenShift & Kubernetes




Unlimited Customization
Flexibility of isolated deployments
Full security Isolation & Quotas
Fully automated deployment
Dynamic Scale out
Bake custom images

Deploy custom images

Nuxeo & IPV
DAM & MAM
Principles
- Leverage IPV features
  - Video Transcoding
  - Video Logging
  - Frame accurate Proxies & IPV player
  - Adobe Premiere integration
- Limit data duplication
  - use IPV Storage for
    - "Work In Progress" / raw format Archival
  - Reference from Nuxeo using BlobStore
    - i.e. raw format
  - Publish completed work to Nuxeo
    - i.e. Proxy

By Thierry Delprat