Nuxeo Platform
Building blocks
Scalability & Scale out
Architecture Goals
Ensure Scalability and Elasticity
- Allow addressing very large projects (processing & data)
  - without having performance issues
  - without requiring crazy hardware
- Allow scaling progressively
Build custom tailored DAM applications
- Allow the application to evolve with the business
- without creating a maintenance nightmare
- without generating delays
Architecture principles
- Make implementation pluggable
  - choose the right implementation depending on constraints
  - build custom tailored implementation if needed
- Keep a clear separation between platform code and custom code
- Identify and isolate subsystems
  - be able to segregate traffic and load
  - add resources where they are needed
- Leverage Scale out and AutoScaling
  - Scale-out at backend-level or at application level
  - Leverage elasticity provided by Cloud and Container technologies
Component Model
- In the Nuxeo architecture, everything is a plugin
- Everything is configurable
  - Logic and Data Structures depend on configuration
See Explorer to browse Extension Points
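The plugin idea above can be sketched as a small registry in which platform code and custom code both contribute to a named extension point. This is a minimal illustration only: Nuxeo is a Java platform, and `ComponentRegistry`, `contribute` and the `"doctype"` extension point name are hypothetical names, not its API.

```python
from collections import defaultdict


class ComponentRegistry:
    """Toy registry: plugins contribute entries to named extension points.

    Hypothetical illustration, not the Nuxeo component model API.
    """

    def __init__(self):
        self._contributions = defaultdict(list)

    def contribute(self, extension_point, contribution):
        # Contributions are appended in registration order.
        self._contributions[extension_point].append(contribution)

    def contributions(self, extension_point):
        # Return a copy so callers cannot mutate the registry.
        return list(self._contributions[extension_point])


registry = ComponentRegistry()
# Platform code registers a default document type ...
registry.contribute("doctype", {"name": "File", "schemas": ["dublincore", "file"]})
# ... and custom code adds its own type without touching platform code.
registry.contribute("doctype", {"name": "Asset", "schemas": ["dublincore", "file", "rights"]})

doc_types = [c["name"] for c in registry.contributions("doctype")]
```

Keeping contributions in a registry rather than in code is what lets custom types stay separate from platform types.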
Building with Nuxeo
Continuous deployment
Logical Architecture
Scaling Processing
Scale-out subsystems independently as needed
Scale interactive processing
Scale batch processing
Auto-Scaling
- Nuxeo exposes metrics
  - JMX system and application metrics
  - application-level probes
  - status
- Metrics can be used to automatically adjust infrastructure
  - Auto-Scaling in AWS
  - Pod autoscaling in OpenShift / K8S
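As an illustration of how an exposed metric can drive the infrastructure, the sketch below applies the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas x current metric / target metric). The function name, the bounds and the sample values are assumptions, and the real autoscaler adds tolerances and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=2, max_replicas=10):
    """Proportional scaling rule, clamped to [min_replicas, max_replicas].

    Simplified sketch: no tolerance band, no cooldown.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 3 nodes at 90% average CPU with a 60% target: scale out to 5 nodes.
scale_out = desired_replicas(3, 90, 60)
# 4 nodes at 20% average CPU with a 60% target: scale in to the minimum of 2.
scale_in = desired_replicas(4, 20, 60)
```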
Importer & BackPressure
nuxeo-mq-importer
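The slides do not show how nuxeo-mq-importer implements back pressure, so the following is only a generic sketch of the idea: a bounded queue makes the producer block whenever the consumer falls behind, which caps the number of in-flight documents. `run_import` and `max_pending` are hypothetical names.

```python
import queue
import threading


def run_import(documents, max_pending=100):
    """Push documents through a bounded queue to a single consumer thread."""
    pending = queue.Queue(maxsize=max_pending)
    imported = []

    def consumer():
        while True:
            doc = pending.get()
            if doc is None:  # sentinel: the producer is done
                break
            imported.append(doc)  # stand-in for the real write to the repository

    worker = threading.Thread(target=consumer)
    worker.start()
    for doc in documents:
        # put() blocks while the queue is full: this is the back pressure.
        pending.put(doc)
    pending.put(None)
    worker.join()
    return imported


result = run_import(range(1000), max_pending=10)
```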
Nuxeo Platform
Scaling I/O
Scaling storage services
Leverage scalable storage backends
- MongoDB
  - Scale reads by adding replicas
  - Scale writes by using sharding
- Elasticsearch
  - Add nodes & shards to the cluster
Scaling storage services
Leverage Nuxeo capabilities
- Support for multiple repositories
  - Application-level sharding
- Offload searches
  - from Repository to ES
- HSM and multi-BlobStores
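Application-level sharding over multiple repositories can be sketched as a deterministic routing function. The example below is a minimal sketch assuming a hash of a shard key (for instance a customer or project identifier); `pick_repository` and the repository names are hypothetical.

```python
import hashlib


def pick_repository(shard_key, repositories):
    """Route a shard key to one repository using a stable hash."""
    if not repositories:
        raise ValueError("at least one repository is required")
    # hashlib gives the same digest across processes, unlike the built-in hash().
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(repositories)
    return repositories[index]


repositories = ["repo-a", "repo-b", "repo-c"]
first = pick_repository("customer-42", repositories)
second = pick_repository("customer-42", repositories)
```

A plain modulo remaps most keys when a repository is added, so a lookup table or consistent hashing is usually preferable when the number of repositories is expected to change.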
Storage Adapters
About LARGE Binary Storage
- Backend storage is pluggable
  - several implementations (FS, S3, GridFS, GDoc, DropBox ...)
  - easy to implement
- Can do partitions
- Can do HSM
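To make the pluggable storage idea concrete, here is a minimal sketch of a storage contract with an HSM-style wrapper that keeps new blobs on a fast tier and demotes cold ones to an archive tier. All class and method names are hypothetical, and the in-memory store only stands in for real backends such as a filesystem or S3.

```python
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Minimal storage contract; read() and delete() raise KeyError if absent."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...


class InMemoryBlobStore(BlobStore):
    """Stand-in for a real backend."""

    def __init__(self):
        self.blobs = {}

    def write(self, key, data):
        self.blobs[key] = data

    def read(self, key):
        return self.blobs[key]

    def delete(self, key):
        del self.blobs[key]


class TieredBlobStore(BlobStore):
    """HSM-style store: writes go to the fast tier, cold blobs are demoted."""

    def __init__(self, fast: BlobStore, archive: BlobStore):
        self.fast = fast
        self.archive = archive

    def write(self, key, data):
        self.fast.write(key, data)

    def read(self, key):
        try:
            return self.fast.read(key)
        except KeyError:
            return self.archive.read(key)

    def delete(self, key):
        for tier in (self.fast, self.archive):
            try:
                tier.delete(key)
            except KeyError:
                pass

    def demote(self, key):
        # Copy to the archive tier first, then free the fast tier.
        self.archive.write(key, self.fast.read(key))
        self.fast.delete(key)


fast, archive = InMemoryBlobStore(), InMemoryBlobStore()
store = TieredBlobStore(fast, archive)
store.write("video-1", b"raw bytes")
store.demote("video-1")
```

Callers keep using `store.read("video-1")` after the demotion, which is the point of hiding the tiers behind one interface.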
Binary Store & Federation
Http / Upload / Download - optimize
- Leverage a reverse proxy to protect the server
  - Download / Upload buffering
  - Caching
- Traffic prioritization/throttling if needed
- Direct upload/download with CDN support
  - depending on backend (e.g. AWS S3)
Upload / Download Acceleration
- Downloads are pluggable
  - Redirect
- FileStorage is pluggable
  - Read/Write from any Storage API
- Upload is pluggable
  - BlobStore abstraction
  - Upload UI component
The solution depends on the available infrastructure
Downloads Acceleration
Leverage existing CDN and Edge servers
Build dedicated CDN and Edge servers
Uploads Acceleration
Leverage AWS Infrastructure
Completely transparent for users
Nuxeo Platform
Performance
Factors Impacting Performance
- Business Model
  - Content Model
  - Security
  - Workflows
- Volume
  - Number of documents
  - Size of files is usually not significant
- Types of access
  - Throughput
  - Search
  - Write operations
  - Rendering technologies
Many different factors
Depends on the application
No simple sizing
Scalable Architecture
Benchmarks
Scale the part of your application that needs it
Baseline and reusable tools.
Benchmarking
- Performance tests are part of the development effort
  - part of the nuxeo-platform source code
  - run on a nightly basis via the CI chain
- We leverage Gatling
  - test Web UI
  - test REST API
  - test mass import
- Publish results
  - https://benchmarks.nuxeo.com/
LTS benchmarks
https://benchmarks.nuxeo.com/
Nightly benchmarks
https://benchmarks.nuxeo.com/
1B benchmark
- Bulk Import
  - 32680 docs/s with peak at 40400 docs/s.
- Indexing
  - 18660 docs/s with peak at 27400 docs/s.
- CRUD via REST
  - 1000+ Requests/s
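To put these rates in perspective, the arithmetic below estimates how long one billion documents would take at the quoted average throughput. It assumes that "1B" means 1,000,000,000 documents and that the average rate holds for the whole run; the slides do not state the total durations, so these are derived estimates, not published results.

```python
def hours_to_process(doc_count, docs_per_second):
    """Duration in hours at a constant average throughput."""
    return doc_count / docs_per_second / 3600


ONE_BILLION = 1_000_000_000
import_hours = hours_to_process(ONE_BILLION, 32_680)  # bulk import rate
index_hours = hours_to_process(ONE_BILLION, 18_660)   # indexing rate
```

Under those assumptions the bulk import takes roughly 8.5 hours and the indexing roughly 14.9 hours.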
Nuxeo Platform
HA & DRP
High Availability
- Each component can be clustered
  - Nuxeo Cluster: 2+ nodes
  - Elasticsearch Cluster: 3+ nodes
  - MongoDB Cluster: 3+ nodes
  - Kafka / Zookeeper Cluster: 3+ nodes
- Can spread nodes across availability zones
  - 3 AZs is ideal, 2 AZs is supported
  - ensure a low-latency network between DCs/AZs
- HA and fail-over are automatically tested
  - Kubernetes deployment template
  - Gatling tests + Chaos Monkey
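The 3+ node minimums are consistent with majority-based election, which Elasticsearch, MongoDB replica sets and ZooKeeper all rely on. The slides do not spell out this reasoning, so the sketch below is an assumption about why three is the floor: a cluster stays available only while a strict majority of its nodes is reachable.

```python
def majority(nodes):
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes can be lost while a majority remains."""
    return nodes - majority(nodes)


two_nodes = tolerated_failures(2)    # a 2-node cluster cannot lose any node
three_nodes = tolerated_failures(3)  # a 3-node cluster survives one failure
five_nodes = tolerated_failures(5)   # a 5-node cluster survives two failures
```

The same arithmetic applies to availability zones, which is one way to read "3 AZs is ideal": with nodes spread evenly over three zones, losing one zone still leaves a majority.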
DRP - Multi-Regions
- In case of a high-latency network
  - distributed storage cannot be used across DCs
- Deploy 2 Nuxeo clusters in 2 different regions
  - 1 region is "master"
  - 1 region is standby / fallback
- Leverage Kafka to handle asynchronous replication
  - MongoDB + Kafka
  - ES + Kafka
  - Nuxeo Computations + Kafka Mirror Maker
- Simple FS sync
Nuxeo Platform
Multi-tenants
Nuxeo Based application
Multi-Tenants - Nuxeo Based application
2 possible approaches
Multi-tenant application
All clients share the same application.
The application manages data & configuration partitioning.
vs
Multi-tenant infrastructure
All clients share the same infrastructure.
Deploy isolated, customized applications on a PaaS.
Decision points
- What do you actually want to share
- What is the deployment infrastructure
- How different must the different tenants be
- Synchronized deployment
APPLICATION LEVEL MULTI-TENANTS
Document Store
Security
Life Cycle
Indexing
Versioning
all clients share the same application
application manages data and configuration partitioning
Application level Multi-Tenancy - Data Isolation
- Data Partitioning
- Repository
  - Security Policy
  - "Domain based"
- Elasticsearch
  - same index
- Users/Groups
  - filtering on a per-tenant basis
Logical isolation
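Logical isolation on a shared repository and index comes down to adding a tenant restriction to every query. The sketch below builds an NXQL query restricted by path, assuming each tenant owns one domain at the repository root named after its identifier; `tenant_scoped_query` is a hypothetical helper, and in a real deployment the restriction would be enforced server-side by the security policy rather than by the caller.

```python
def tenant_scoped_query(doc_type, tenant_id, extra_predicate=None):
    """Build an NXQL query limited to one tenant's domain.

    Assumes the tenant's documents live under '/<tenant_id>'.
    """
    # Reject anything that could break out of the quoted path literal.
    if not tenant_id.isalnum():
        raise ValueError("tenant_id must be alphanumeric")
    predicates = [f"ecm:path STARTSWITH '/{tenant_id}'"]
    if extra_predicate:
        predicates.append(f"({extra_predicate})")
    return f"SELECT * FROM {doc_type} WHERE " + " AND ".join(predicates)


all_docs = tenant_scoped_query("Document", "acme")
titled = tenant_scoped_query("File", "acme", "dc:title = 'Report'")
```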
Application level Multi-Tenancy - Data Isolation
- Data Partitioning
- Repository
  - Separated repositories
  - MongoDB
  - Separated Blob Stores
- Elasticsearch
  - per-tenant index
- Users/Groups
  - different directories
Physical isolation
Application level Multi-Tenancy - Configuration
- Share everything: 1 Application / 1 configuration
  - All tenants share the same technical configuration
  - extension points contributions
- Tenant isolation is done via filtering
  - filter docTypes per tenant
  - define a per tenant facet/schema
  - UI filters access / hides part of it
- WebUI and multi-tenants
  - deploy one customized webapp per tenant
  - share the same server side
APPLICATION LEVEL MULTI-TENANCY
- Limitations
  - Shallow isolation in terms of resources
  - Monolithic deployment
  - Customization needs to be carefully done
  - Scaling the number of tenants can be challenging
- Pros
  - well adapted for lightweight/standardized customization
  - one application with multiple flavors
  - Splitting the client side may be enough
Infrastructure Level Multi-tenants
Share the infrastructure - not the application
rely on infrastructure to provide tenant isolation
application does not need to be impacted
Infrastructure LEVEL MULTI-TENANTS
- Application template
  - Application is deployed as a set of containers
- Customization
  - Each customer/department builds its own image
    - custom configuration
    - additional components if needed
- Leverage Container Platform
  - Automated Build and Deployment
  - Resource allocation and Scale out
  - Isolation
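The per-tenant deployment described above can be sketched as a small function that derives each tenant's isolated settings from a shared template. The result is a plain dictionary for illustration, not a valid Kubernetes or OpenShift manifest, and the naming scheme, image tag and default limits are all assumptions.

```python
def tenant_deployment(tenant, base_image, replicas=2,
                      cpu_limit="2", memory_limit="4Gi"):
    """Describe one tenant's deployment: own namespace, own image, own quotas."""
    return {
        "namespace": f"nuxeo-{tenant}",             # isolation boundary
        "image": f"{base_image}-{tenant}:latest",   # tenant-specific custom image
        "replicas": replicas,                       # scale out per tenant
        "resources": {"limits": {"cpu": cpu_limit, "memory": memory_limit}},
    }


marketing = tenant_deployment("marketing", "registry.example.com/nuxeo")
legal = tenant_deployment("legal", "registry.example.com/nuxeo", replicas=4)
```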
INFRASTRUCTURE LEVEL MULTI-TENANTS
- Leverage IaaS or VM-based infrastructure
  - Terraform & Ansible
  - Nuxeo Cloud
- Leverage Container Platform
  - OpenShift & Kubernetes
Unlimited Customization
Flexibility of isolated deployments
Full security Isolation & Quotas
Fully automated deployment
Dynamic Scale out
Bake custom images
Deploy custom images
Nuxeo & IPV
DAM & MAM
Principles
- Leverage IPV features
  - Video Transcoding
  - Video Logging
  - Frame accurate Proxies & IPV player
  - Adobe Premiere integration
- Limit data duplication
  - use IPV Storage for "Work In Progress" / raw format Archival
  - Reference from Nuxeo using BlobStore
    - i.e. raw format
  - Publish completed work to Nuxeo
    - i.e. Proxy
By Thierry Delprat