Nuxeo Platform

Building blocks

Scalability & Scale out

Architecture Goals

Ensure Scalability and Elasticity

  • Allow addressing very large projects (processing & data)
    • without performance issues
    • without requiring oversized hardware
    • while scaling progressively
       

Build custom-tailored DAM applications

  • Allow the application to evolve with the business
    • without creating a maintenance nightmare
    • without generating delays

Architecture principles

  • Make implementations pluggable
    • choose the right implementation depending on constraints
    • build a custom-tailored implementation if needed
    • keep a clear separation between platform code and custom code
       
  • Identify and isolate subsystems
    • be able to segregate traffic and load
    • add resources where they are needed
       
  • Leverage scale-out and auto-scaling
    • Scale out at the backend level or at the application level
    • Leverage elasticity provided by Cloud and Container technologies
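
The "pluggable implementation" principle can be sketched as a small registry: platform code depends only on a contract, and configuration decides which implementation is created. This is a generic illustration, not Nuxeo's component model; the names `BlobStore`, `REGISTRY`, `register` and `create_store` are hypothetical.

```python
from typing import Callable, Dict, Protocol


class BlobStore(Protocol):
    """Contract that platform code depends on; implementations are swappable."""

    def describe(self) -> str: ...


class FileSystemStore:
    def describe(self) -> str:
        return "filesystem"


class ObjectStore:
    def describe(self) -> str:
        return "object-storage"


# Registry mapping a configuration key to a factory (hypothetical keys).
REGISTRY: Dict[str, Callable[[], BlobStore]] = {
    "fs": FileSystemStore,
    "object": ObjectStore,
}


def register(name: str, factory: Callable[[], BlobStore]) -> None:
    """Custom code contributes an implementation without touching platform code."""
    REGISTRY[name] = factory


def create_store(config: Dict[str, str]) -> BlobStore:
    """Create the implementation named in configuration, defaulting to 'fs'."""
    return REGISTRY[config.get("blobstore", "fs")]()
```

A custom implementation would be added with `register("archive", MyArchiveStore)` and selected with `create_store({"blobstore": "archive"})`, which keeps custom code separate from platform code.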

Component Model

  • In the Nuxeo architecture, everything is a plugin
  • Everything is configurable
    • Logic and data structures depend on configuration

See Explorer to browse extension points

Building with Nuxeo

Continuous deployment

Logical Architecture

Scaling Processing

Scale-out subsystems independently as needed

Scale interactive processing

Scale batch processing

Auto-Scaling

  • Nuxeo exposes metrics
    • JMX system and application metrics
    • application level probes
    • status
       
  • Metrics can be used to automatically adjust infrastructure
    • Auto-Scaling in AWS
    • Pod autoscaling in OpenShift / Kubernetes
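
How a metric turns into a scaling decision can be sketched with the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The function below is a simplified sketch; the clamping bounds are illustrative defaults, and the tolerance band and stabilization windows of a real autoscaler are left out.

```python
import math


def desired_replicas(current: int, metric: float, target: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Proportional scaling rule, clamped to [min_replicas, max_replicas]."""
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    raw = math.ceil(current * metric / target)
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 nodes averaging 90% CPU against a 60% target give `desired_replicas(4, 90.0, 60.0)`, which asks for 6 nodes.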

Importer & BackPressure

nuxeo-mq-importer
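
Back-pressure can be illustrated with a bounded buffer: when the consumer falls behind, the producer blocks instead of piling up work in memory. This is a generic sketch using Python's standard library, not the nuxeo-mq-importer API; `run_import` and `capacity` are hypothetical names.

```python
import queue
import threading
from typing import Iterable, List


def run_import(documents: Iterable[int], capacity: int = 4) -> List[int]:
    """Push documents through a bounded buffer to a single consumer thread."""
    buffer: "queue.Queue" = queue.Queue(maxsize=capacity)
    imported: List[int] = []

    def consumer() -> None:
        while True:
            doc = buffer.get()
            if doc is None:  # sentinel: the producer is done
                break
            imported.append(doc)

    worker = threading.Thread(target=consumer)
    worker.start()
    for doc in documents:
        buffer.put(doc)  # blocks while the buffer is full: this is the back-pressure
    buffer.put(None)
    worker.join()
    return imported
```

The buffer never holds more than `capacity` items, so the producer's speed is bounded by the consumer's.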

Scaling I/O

Scaling storage services

Leverage scalable storage backends
 

  • MongoDB
    • Scale reads by adding replicas
    • Scale writes by using sharding

       
  • Elasticsearch
    • Add nodes & shards to the cluster

Scaling storage services

Leverage Nuxeo capabilities

  • Support for multiple repositories
    • Application-level sharding
       
  • Offload searches
    • from the repository to Elasticsearch
       
  • HSM and multiple BlobStores
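
Application-level sharding across several repositories can be sketched as a stable mapping from a partition key to a repository name. The repository names and the choice of key (a tenant or project identifier) are assumptions for illustration, not Nuxeo configuration.

```python
import hashlib
from typing import Sequence

REPOSITORIES = ("repo-a", "repo-b", "repo-c")  # hypothetical repository names


def repository_for(key: str, repositories: Sequence[str] = REPOSITORIES) -> str:
    """Map a partition key to one repository using a stable hash."""
    if not repositories:
        raise ValueError("at least one repository is required")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(repositories)
    return repositories[index]
```

A plain modulo remaps most keys when the number of repositories changes, so a real deployment would more likely use an explicit key-to-repository table or consistent hashing.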

Storage Adapters

About Large Binary Storage

  • Backend storage is pluggable
    • several implementations (FS, S3, GridFS, GDoc, Dropbox ...)
    • easy to implement
  • Supports partitioning
  • Supports HSM (hierarchical storage management)
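
The HSM idea can be sketched as a two-tier store: new blobs land on a fast tier and are later moved to a cheaper archive tier, while reads stay transparent. `InMemoryStore` and `TieredStore` are hypothetical stand-ins, not Nuxeo classes; a real backend would be a filesystem or an object store.

```python
from typing import Dict, Optional


class InMemoryStore:
    """Stand-in for one storage backend (filesystem, object store, ...)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._blobs.get(key)

    def delete(self, key: str) -> None:
        self._blobs.pop(key, None)


class TieredStore:
    """Two tiers: new blobs go to the fast tier, older ones move to the archive."""

    def __init__(self, fast: InMemoryStore, archive: InMemoryStore) -> None:
        self.fast = fast
        self.archive = archive

    def put(self, key: str, data: bytes) -> None:
        self.fast.put(key, data)

    def archive_blob(self, key: str) -> bool:
        """Move a blob to the archive tier; False if it is not on the fast tier."""
        data = self.fast.get(key)
        if data is None:
            return False
        self.archive.put(key, data)
        self.fast.delete(key)
        return True

    def get(self, key: str) -> Optional[bytes]:
        """Read from the fast tier first, then fall back to the archive tier."""
        data = self.fast.get(key)
        return data if data is not None else self.archive.get(key)
```

Callers only use `put` and `get`; which tier currently holds the blob is an internal detail.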

Binary Store & Federation

HTTP Upload / Download Optimization

  • Leverage a reverse proxy to protect the server
    • Download / Upload buffering
    • Caching
    • Traffic prioritization/throttling if needed

  • Direct upload/download with CDN support
    • depending on the backend (e.g. AWS S3)

Upload / Download Acceleration

  • Downloads are pluggable
    • Redirect
       
  • FileStorage is pluggable
    • Read/Write from any Storage API
       
  • Upload is pluggable
    • BlobStore abstraction
    • Upload UI component

 

The solution depends on the available infrastructure

Downloads Acceleration

Leverage existing CDN and Edge servers 

Downloads Acceleration

Build dedicated CDN and Edge servers 

Uploads Acceleration

Leverage AWS Infrastructure

Completely transparent for users

Performance

Performance Impacting Factors

  • Business Model
    • Content Model
    • Security
    • Workflows
  • Volume
    • Number of documents
    • File size is usually not significant
  • Types of access
    • Throughput
    • Search
    • Write operations
    • Rendering technologies

Performance Impacting Factors

Many different factors

Depends on the application

No simple sizing

Scalable Architecture

Benchmarks

Scale the part of your application that needs it

Baseline and reusable tools.

Benchmarking

  • Performance tests are part of the development effort
    • part of the nuxeo-platform source code
    • run nightly via the CI chain
       
  • We leverage Gatling
    • test Web UI
    • test REST API
    • test mass import
       
  • Publish results
    • https://benchmarks.nuxeo.com/  

LTS benchmarks

https://benchmarks.nuxeo.com/

Nightly benchmarks

https://benchmarks.nuxeo.com/

1B Benchmark

  • Bulk Import
    • 32,680 docs/s, with a peak at 40,400 docs/s
       
  • Indexing
    • 18,660 docs/s, with a peak at 27,400 docs/s
       
  • CRUD via REST
    • 1000+ Requests/s
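
To put these rates in perspective, the arithmetic below converts them to wall-clock time. It assumes that "1B" means one billion documents and that the average rates are sustained for the whole run; both are assumptions, not figures from the benchmark report.

```python
TOTAL_DOCS = 1_000_000_000  # assumption: "1B" means one billion documents


def hours_to_process(total_docs: int, docs_per_second: float) -> float:
    """Wall-clock hours needed at a constant throughput."""
    return total_docs / docs_per_second / 3600


import_hours = hours_to_process(TOTAL_DOCS, 32_680)
index_hours = hours_to_process(TOTAL_DOCS, 18_660)
print(f"bulk import: about {import_hours:.1f} h")
print(f"indexing:    about {index_hours:.1f} h")
```

Under those assumptions, the import takes roughly 8.5 hours and the indexing roughly 14.9 hours.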


HA & DRP

High Availability

  • Each component can be clustered
    • Nuxeo Cluster: 2+ nodes
    • Elasticsearch Cluster: 3+ nodes
    • MongoDB Cluster: 3+ nodes
    • Kafka / Zookeeper Cluster: 3+ nodes
       
  • Nodes can be spread across availability zones
    • 3 AZs is ideal, 2 AZs is supported
    • ensure a low-latency network between DCs/AZs
       
  • HA and failover are tested automatically
    • Kubernetes deployment template
    • Gatling tests + Chaos Monkey
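
The "3+ nodes" and "3 AZs is ideal" guidance follows from majority quorum arithmetic, which applies to majority-based clusters such as ZooKeeper ensembles or MongoDB replica sets. The sketch below is a generic calculation with hypothetical helper names, not a Nuxeo tool.

```python
from typing import Sequence


def quorum(nodes: int) -> int:
    """Smallest majority of a cluster of the given size."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Number of nodes that can fail while a majority remains."""
    return nodes - quorum(nodes)


def survives_zone_loss(nodes_per_zone: Sequence[int]) -> bool:
    """True if losing any single zone still leaves a majority of nodes."""
    total = sum(nodes_per_zone)
    return all(total - lost >= quorum(total) for lost in nodes_per_zone)
```

With 3 nodes the quorum is 2, so one failure is tolerated; with 2 nodes no failure is. Three nodes over three zones (`[1, 1, 1]`) survive the loss of any zone, while the same nodes over two zones (`[2, 1]`) do not survive the loss of the larger one.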

DRP - Multi-Regions

  • With a high-latency network
    • distributed storage cannot be used across DCs
       
  • Deploy 2 Nuxeo clusters in 2 different regions
    • 1 region is "master"
    • 1 region is standby / fallback
       
  • Leverage Kafka to handle asynchronous replication
    • MongoDB + Kafka
    • ES + Kafka
    • Nuxeo Computations + Kafka Mirror Maker
       
  • Simple FS sync 

Multi-tenancy

Nuxeo-based application

2 possible approaches

Multi-tenant application

All clients share the same application.
The application manages data & configuration partitioning.

Multi-tenant infrastructure

All clients share the same infrastructure.
Deploy isolated, customized applications on a PaaS.

vs

Decision points

  • What do you actually want to share?
     
  • What is the deployment infrastructure?
     
  • How different must the tenants be?
     
  • Must deployments be synchronized across tenants?

Application-Level Multi-Tenancy

Document Store
Security
Life Cycle
Indexing
Versioning

All clients share the same application

The application manages data and configuration partitioning

Application level Multi-Tenancy - Data Isolation

 

  • Data Partitioning
    • Repository
      • Security Policy
      • "Domain based"
    • Elasticsearch
      • same index
    • Users/Groups
      • filtering on per tenant basis

Logical isolation
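
With a shared index, logical isolation means every search is wrapped in a tenant filter. The sketch below builds such a request body with the Elasticsearch bool query; the field name `tenantId` is a hypothetical mapping, not Nuxeo's actual schema.

```python
from typing import Any, Dict


def tenant_scoped_query(user_query: Dict[str, Any], tenant_id: str,
                        tenant_field: str = "tenantId") -> Dict[str, Any]:
    """Wrap a user query so it only matches documents of one tenant."""
    return {
        "query": {
            "bool": {
                "must": [user_query],
                # A filter clause restricts results without affecting scoring.
                "filter": [{"term": {tenant_field: tenant_id}}],
            }
        }
    }
```

For example, `tenant_scoped_query({"match": {"title": "report"}}, "acme")` returns a body that can only match documents tagged with the tenant `acme`.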

Application level Multi-Tenancy - Data Isolation

 

  • Data Partitioning
    • Repository
      • Separated repositories
        • MongoDB
      • Separated Blob Stores
    • Elasticsearch
      • per tenant index
    • Users/Groups
      • different directories

Physical isolation

Application level Multi-Tenancy - Configuration

  • Share everything: 1 Application / 1 configuration
     
  • All tenants share the same technical configuration
    • extension points contributions
       
  • Tenant isolation is done via filtering
    • filter docTypes per tenant
    • define a per-tenant facet/schema
    • the UI filters access and hides parts of the application
       
  • Web UI and multi-tenancy
    • deploy one customized web app per tenant
    • share the same server side

Application-Level Multi-Tenancy

  • Limitations
    • Shallow isolation in terms of resources
    • Monolithic deployment
    • Customization needs to be carefully done
    • Scaling the number of tenants can be challenging
       
  • Pros
    • well suited to lightweight/standardized customization
    • one application with multiple flavors
    • splitting the client side may be enough

Infrastructure-Level Multi-Tenancy

Share the infrastructure, not the application

Infrastructure-Level Multi-Tenancy

Rely on the infrastructure to provide tenant isolation

The application does not need to be changed

Infrastructure-Level Multi-Tenancy

  • Application template
    • Application is deployed as a set of containers
       
  • Customization
    • Each customer/department builds its own image
      • custom configuration
      • additional components if needed
         
  • Leverage Container Platform
    • Automated Build and Deployment
    • Resource allocation and Scale out
    • Isolation
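
The "one image per tenant" approach can be sketched by generating a Kubernetes Deployment manifest per tenant from a shared template. The manifest fields are standard `apps/v1` Deployment fields; the naming scheme, the one-namespace-per-tenant layout, the resource limits and the image names are assumptions for illustration.

```python
from typing import Any, Dict


def tenant_deployment(tenant: str, image: str, replicas: int = 2,
                      cpu_limit: str = "2", memory_limit: str = "4Gi") -> Dict[str, Any]:
    """Build a Deployment manifest for one tenant's custom image."""
    labels = {"app": "nuxeo", "tenant": tenant}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        # Assumption: one namespace per tenant provides the isolation boundary.
        "metadata": {"name": f"nuxeo-{tenant}", "namespace": tenant, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "nuxeo",
                        "image": image,  # tenant-specific image (hypothetical name)
                        "resources": {"limits": {"cpu": cpu_limit, "memory": memory_limit}},
                    }]
                },
            },
        },
    }
```

Calling `tenant_deployment("acme", "registry.example.com/acme/nuxeo:1.0")` yields a dictionary that can be serialized to JSON or YAML and applied by the container platform's deployment tooling.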

Infrastructure-Level Multi-Tenancy

  • Leverage IaaS or VM-based infrastructure
    • Terraform & Ansible
    • Nuxeo Cloud
       
  • Leverage Container Platform
    • OpenShift & Kubernetes

Unlimited Customization
Flexibility of isolated deployments
Full security Isolation & Quotas

Fully automated deployment
Dynamic Scale out

 

Bake custom images

Deploy custom images

Nuxeo & IPV

DAM & MAM

Principles

  • Leverage IPV features
    • Video Transcoding
    • Video Logging
    • Frame accurate Proxies & IPV player
    • Adobe Premiere integration
       
  • Limit data duplication
    • use IPV storage for
      • "Work In Progress" / raw format archival
    • Reference from Nuxeo using BlobStore
      • i.e. raw format
    • Publish completed work to Nuxeo
      • i.e. Proxy

Principles