Thierry Delprat
td@nuxeo.com
https://github.com/tiry/
About Nuxeo Platform
we provide a Platform that developers can use to
build highly customized Content Applications
we provide components, and the tools to assemble them
everything we do is open source
various customers - various use cases
Track game builds
Electronic Flight Bags
Central repository for Models
Food industry PLM
https://github.com/nuxeo
Storing objects (think {JSON} object)
Custom Domain Model
Conversions & Previews
Security Policies
on any field
At SCALE
Application Log
Repository
Services
Custom tailored Business application
Application must evolve with Business
Principles and technologies used
One plugin model for
FrontEnd - Clients
Build - Test - Ship - Deploy
FrontEnd - Clients
Building a business App with Nuxeo
Pretty much everything inside the Platform can be customized
Easy maintenance and upgrade
Clear separation between infrastructure provided by Nuxeo
and the custom components
Nuxeo Studio configuration is transparently upgraded
Create a new Addon
This addon is a first class citizen
It can receive additional configuration
it can be mixed with other addons
You can use Studio to configure your addons
override or extend domain model
extend default configuration
Share common code and configuration
Share common code
Different branches
of configuration
Nuxeo Storage Model
a “Document” is not a simple file
one document = a persistent object with properties (String, Date, File, Complex types ...)
properties are defined by Schemas
Document types
a document type is defined by a set of schemas, inheritance is supported
Lifecycle
document type is associated with states and transitions
Facets
can be used to associate behavior (Folderish, Hidden, Commentable …)
can be associated with a schema (Mixins) and with a Business Object adapter
Document Schemas are based on XSD
a field can be a Binary Stream
a field can have constraints
required, validation pattern
reference to a document or a user
custom constraint
Scalar properties and arrays:
dc:title = "My Document"
dc:contributors = ["bob", "pete", "mary"]
dc:created = 2014-07-03T12:15:07+0200
ecm:uuid = 52a7352b-041e-49ed-8676-328ce90cc103
Complex properties and lists of them:
primaryAddress = { street = "1 rue René Clair", zip = "75018",
city = "Paris", country = "France" }
files = [
{ name = "doc.txt", length = 1234, mime-type = "text/plain",
data = 0111fefdc8b14738067e54f30e568115 },
{ name = "doc.pdf", length = 29344, mime-type = "application/pdf",
data = 20f42df3221d61cb3e6ab8916b248216 }
]
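The storage model above (schemas define properties, a document type is a set of schemas with inheritance, facets mark behavior) can be sketched in a few lines. This is an illustrative Python model only, not Nuxeo's implementation: the classes `Schema` and `DocumentType` and the `movie` schema are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Schema:
    """A named set of fields, addressed as prefix:field (e.g. dc:title)."""
    prefix: str
    fields: tuple


@dataclass
class DocumentType:
    """A document type: its own schemas plus those inherited from a parent."""
    name: str
    schemas: list = field(default_factory=list)
    parent: Optional["DocumentType"] = None
    facets: list = field(default_factory=list)

    def all_schemas(self):
        # Inherited schemas come first, then the ones declared on this type.
        inherited = self.parent.all_schemas() if self.parent else []
        return inherited + self.schemas

    def property_names(self):
        return [f"{s.prefix}:{f}" for s in self.all_schemas() for f in s.fields]


dublincore = Schema("dc", ("title", "contributors", "created"))
movie = Schema("movie", ("episodes",))  # hypothetical custom schema

base = DocumentType("Document", schemas=[dublincore])
collection = DocumentType("MovieCollection", schemas=[movie], parent=base,
                          facets=["Folderish"])

print(collection.property_names())
```

The child type exposes the inherited `dc:` properties alongside its own, which is the behavior the slide describes as "a set of schemas, inheritance is supported".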
MongoDB
store structure & streams in a
BASE way
elasticsearch
provide powerful and scalable queries
SQL DB
store structures in an ACID way
Storage does not impact application : this can be a deployment choice!
Atomic
Consistent
Isolated
Durable
Basic Availability
Soft state
Eventually consistent
depends on availability & performance requirements
Send queries to the repository (here SQL)
or send queries to elasticsearch
Store structures in SQL database
or store structures in MongoDB
store streams in MongoDB too
store streams in S3
HSM
Leverage
Google Drive & Google Doc integration
Can mix all storage types
Audit Log too can be configured to use elasticsearch
loaded from Extension Points
(Startup time)
ORM like mapper
SQL
Tables are created at startup time
according to configuration
SQL
Fields removed from configuration
are simply ignored (no data loss)
SQL
Added fields are added to the tables.
Old entries get default values.
SQL
In case of incompatible change,
an error is raised at startup time.
SQL
No Schema Check at startup.
NoSQL
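The SQL startup rules above (removed fields are ignored, added fields become new columns, incompatible changes raise an error) can be illustrated with a small comparison function. This is a sketch under simplifying assumptions, not the actual mapper: `plan_schema_update` is a hypothetical name, and types are compared as plain strings.

```python
def plan_schema_update(table_columns, configured_fields):
    """Compare existing table columns with the configured fields.

    Both arguments map a column/field name to a type name.
    Returns (to_add, ignored); raises ValueError on an incompatible change.
    """
    to_add = {}
    for name, field_type in configured_fields.items():
        if name not in table_columns:
            # Added field: a new column, old rows get default values.
            to_add[name] = field_type
        elif table_columns[name] != field_type:
            # Same name, different type: refuse to start.
            raise ValueError(
                f"incompatible change on {name!r}: "
                f"{table_columns[name]} -> {field_type}"
            )
    # Columns no longer in the configuration stay in place and are ignored.
    ignored = sorted(set(table_columns) - set(configured_fields))
    return to_add, ignored


to_add, ignored = plan_schema_update(
    {"title": "varchar", "legacy": "varchar"},
    {"title": "varchar", "rating": "integer"},
)
print(to_add, ignored)
```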
Searching the repository
SELECT * FROM Document WHERE dc:title = 'Sections' OR dc:title = 'Workspaces'
Fast indexing
No ACID constraints / No impedance issue
3,500 documents/s when using SQL backend
10,000 documents/s when using MongoDB
Super query performance
query on term using inverted index
very efficient caching
native full text support &
distributed architecture
3,000 queries/s with 1 elasticsearch node
6,000 queries/s with 2 elasticsearch nodes
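"Query on term using inverted index" is the core idea behind these numbers: each term points directly at the documents that contain it, so a term query is a lookup and not a scan. The toy index below is only a sketch of that idea; it says nothing about how elasticsearch is actually implemented, and the sample documents are made up.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, *terms):
    """Return ids of documents containing every term (AND query)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


docs = {
    "doc1": "Sydney opera house",
    "doc2": "Sydney harbour bridge",
    "doc3": "Paris opera",
}
index = build_inverted_index(docs)
print(sorted(search(index, "sydney", "opera")))
```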
-- Use an explicit Elasticsearch field
SELECT * FROM Document WHERE /*+ES: INDEX(dc:title.ngram) */ dc:title = 'foo'
-- Use ES operators not present in NXQL
SELECT * FROM Document WHERE /*+ES: OPERATOR(regex) */ dc:title = 's.*y'
SELECT * FROM Document WHERE /*+ES: OPERATOR(fuzzy) */ dc:title = 'zorkspaces'
-- Use ES for GeoQuery based on geo_hash_cell location near a point using geohash;
SELECT * FROM Document WHERE /*+ES: OPERATOR(geo_hash_cell)*/ osm:location IN ('40','-74','5')
leverage what comes for free with elasticsearch
Easy real time data analytics on business data
Importing data in the repository
I/O bound / depends on backend
Single Server
6 cores HT 3.5 GHz
126 GB RAM
std hdd
Flexible, Extensible, Composable
"One API" but multiple combinations of services, plugins and Domain Models
Expose a Platform: not an application
developers using the platform
want to expose the API of
their Application
GET /nuxeo/api/v1/path/movies/star-wars HTTP/1.1
{
"entity-type": "document",
"repository": "default",
"uid": "5b352650-e49e-48cf-a4e3-bf97b518e7bf",
"path": "/movies/star-wars",
"type": "MovieCollection",
"isCheckedOut": true,
"title": "Star Wars",
"facets": [
"Folderish"
]
}
Server returns a minimal payload
Clients need to control which data schemas are sent
GET /nuxeo/api/v1/path/movies/star-wars HTTP/1.1
X-NXProperties: dublincore, common
{
"entity-type": "document",
"repository": "default",
"uid": "5b352650-e49e-48cf-a4e3-bf97b518e7bf",
"path": "/movies/star-wars",
"type": "MovieCollection",
"isCheckedOut": true,
"title": "Star Wars",
"properties": {
...
"common:icon": "/icons/movieCollection.png",
"dc:description": "Star Wars collection",
"dc:creator": "tiry",
"dc:modified": "2015-10-22T02:12:59.07Z",
"dc:lastContributor": "tiry",
"dc:created": "2015-10-22T02:12:59.07Z",
"dc:title": "Star Wars",
...
"dc:contributors": ["tiry", "system"]
},
"facets": [
"Folderish"
]
}
Marshaling registry is pluggable
custom Enrichers can be contributed
"How the data is fetched"
is a server side matter
GET /nuxeo/api/v1/path/movies/star-wars HTTP/1.1
X-NXenrichers.document: thumbnail
{
"entity-type": "document",
"repository": "default",
"uid": "5b352650-e49e-48cf-a4e3-bf97b518e7bf",
"path": "/movies/star-wars",
"type": "MovieCollection",
"isCheckedOut": true,
"title": "Star Wars",
"contextParameters":
{
"thumbnail":
{
"url": "/nuxeo/nxthumb/default/5b352650-e49e-48cf-a4e3-bf97b518e7bf/thumb:thumbnail/Small_photo.jpg"
}
},
"facets": [
"Folderish"
]
}
GET /nuxeo/api/v1/path/movies/star-wars?enrichers.document=thumbnail HTTP/1.1
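The requests above differ only in what the client asks the server to add to the minimal payload: schemas through a header, enrichers through a header or a query parameter. The helper below is a sketch that builds those requests without sending them; `build_document_request` is a hypothetical name, and the header names are copied from the slides as shown.

```python
from urllib.parse import urlencode

BASE = "/nuxeo/api/v1/path"


def build_document_request(path, schemas=(), enrichers=(), enrichers_in_query=False):
    """Return (url, headers) for fetching one document by path."""
    url = f"{BASE}{path}"
    headers = {}
    if schemas:
        # Ask for the properties of the listed schemas.
        headers["X-NXProperties"] = ", ".join(schemas)
    if enrichers:
        value = ",".join(enrichers)
        if enrichers_in_query:
            # Same request expressed as a query parameter.
            url += "?" + urlencode({"enrichers.document": value})
        else:
            headers["X-NXenrichers.document"] = value
    return url, headers


url, headers = build_document_request(
    "/movies/star-wars", schemas=["dublincore", "common"], enrichers=["thumbnail"]
)
print(url, headers)
```

The client states what it wants; how the data is fetched stays a server-side matter.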
Implicit JOIN
fetch.objectType=fieldToFetch
translate.objectType=fieldToTranslate
depth=children
GET /nuxeo/api/v1/path/movies/star-wars@acl HTTP/1.1
GET /nuxeo/api/v1/path/movies/star-wars@audit HTTP/1.1
GET /nuxeo/api/v1/path/movies/star-wars@bo/MyBusinessObject HTTP/1.1
{
"entity-type": "MovieCollection",
"id": "5b352650-e49e-48cf-a4e3-bf97b518e7bf",
"title": "Star Wars",
"episodes": 7
}
GET /nuxeo/api/v1/path/movies/star-wars@bo/MovieCollection HTTP/1.1
{"entity-type": "document",
"properties": {
"file:content" : {
"upload-batch" : "0b0061d48f69b072",
"upload-fileId" : 0,
"type" : "blob"
}
}}
POST /api/v1/upload/{batchId}/{fileIdx} HTTP/1.1
X-Upload-Chunk-Index: 0
X-Upload-Chunk-Count: 5
PUT /nuxeo/api/v1/path/movies/star-wars HTTP/1.1
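The upload flow shown here has two parts: a file is sent in chunks to a batch, then the document is updated with a reference to that batch entry. The sketch below only prepares the chunk requests and the blob reference, reusing the header and key names from the slides; `chunk_requests` and `blob_reference` are hypothetical helper names, and nothing is sent over the network.

```python
def chunk_requests(data, batch_id, file_idx, chunk_size):
    """Yield (path, headers, body) for each chunk of one file in a batch."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for index, body in enumerate(chunks):
        headers = {
            "X-Upload-Chunk-Index": str(index),
            "X-Upload-Chunk-Count": str(len(chunks)),
        }
        yield f"/api/v1/upload/{batch_id}/{file_idx}", headers, body


def blob_reference(batch_id, file_idx):
    """Property value pointing a document field at an uploaded file."""
    return {"upload-batch": batch_id, "upload-fileId": file_idx, "type": "blob"}


requests = list(chunk_requests(b"0123456789", "0b0061d48f69b072", 0, chunk_size=2))
print(len(requests), requests[0][1])
print({"file:content": blob_reference("0b0061d48f69b072", 0)})
```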
Without creating 100 endpoints!
Need another paradigm!
Command
INPUT
(Doc, Blob, User ...)
OUTPUT
(Doc, Blob, User ...)
Parameters
Context
(User, Doc ...)
WebUI.AddErrorMessage WebUI.AddInfoMessage WebUI.AddMessage Document.AddPermission Document.AddToCollection DocumentMultivaluedProperty.addItem Task.ApplyDocumentMapping Blob.AttachOnDocument BlobHolder.AttachOnCurrentDocument AttachFiles Audit.QueryWithPageProvider Blob.ImportClipboard Blob.ImportWorklist Blob.RunConverter Document.BlockPermissionInheritance WorkflowModel.BulkRestartInstances Business.BusinessCreateOperation Business.BusinessFetchOperation Business.BusinessUpdateOperation Navigation.GoBack WorkflowInstance.Cancel Navigation.ChangeCurrentTab Document.CheckIn Document.CheckOut Update.NextStep.ConditionalFolder WebUI.ClearClipboard WebUI.ClearSelectedDocuments WebUI.ClearWorklist WorkflowTask.Complete Blob.ConcatenatePDFs Context.FetchDocument Context.FetchFile Blob.ToPDF Blob.Convert Document.Copy Document.Create FileManager.Import UserWorkspace.CreateDocumentFromBlob Seam.CreateDocumentInUI Picture.Create Document.CreateLiveProxy Document.AddRelation Collection.Create Workflow.CreateRoutingTask Task.Create Directory.CreateEntries Document.Delete Document.DeleteRelation Directory.DeleteEntries WebUI.DestroySeamContext Repository.GetDocument Document.Export WebUI.DownloadFile Blob.ExportToFS Document.FetchByProperty Blob.CreateFromURL FileManager.ImportInSeam FileManager.ImportWithMetaData FileManager.ImportWithMetaDataInSeam Document.Filter Document.FollowLifecycleTransition Comment.Moderate Document.GetBlobs Document.GetChild Document.GetChildren Document.GetBlob Document.GetBlobsByProperty User.GetUserWorkspace Document.GetLinkedDocuments Proxy.GetSourceDocument User.Get Document.GetParent Context.GetEmailsWithPermissionOnDoc Context.GetTaskNames Context.GetUsersGroupIdsWithPermissionOnDoc Document.GetVersions Directory.Projection Collection.Suggestion User.GetCollections Directory.Entries Directory.SuggestEntries Collection.GetDocumentsFromCollection Favorite.GetDocuments Document.Routing.GetGraph Picture.GetView Workflow.GetOpenTasks 
Tag.Suggestion Task.GetAssigned UserGroup.Suggestion Document.GetRendition Blob.PostToURL Image.Blob.Resize WebUI.InitSeamContext JsonStack.ToggleDisplay Actions.GET GetRepositories Document.Lock Log Audit.LogEvent Auth.LoginAs Auth.Logout Document.Move Document.PublishToSections NRD-AC-PR-ChooseParticipants-Output NRD-AC-PR-LockDocument NRD-AC-PR-UnlockDocument NRD-AC-PR-ValidateNode-Output NRD-AC-PR-force-validate NRD-AC-PR-storeTaskInfo WebUI.NavigateTo NuxeoDrive.SetActiveFactories NuxeoDrive.AddToLocallyEditedCollection NuxeoDrive.AttachBlob NuxeoDrive.CanMove NuxeoDrive.CreateFile NuxeoDrive.CreateFolder NuxeoDrive.CreateTestDocuments NuxeoDrive.Delete NuxeoDrive.FileSystemItemExists NuxeoDrive.GenerateConflictedItemName NuxeoDrive.GetRoots NuxeoDrive.GetChangeSummary NuxeoDrive.GetChildren NuxeoDrive.GetClientUpdateInfo NuxeoDrive.GetFileSystemItem NuxeoDrive.GetTopLevelFolder NuxeoDrive.GetTopLevelChildren NuxeoDrive.Move NuxeoDrive.SetSynchronization NuxeoDrive.Rename NuxeoDrive.SetVersioningOptions NuxeoDrive.SetupIntegrationTests NuxeoDrive.TearDownIntegrationTests NuxeoDrive.UpdateFile NuxeoDrive.WaitForElasticsearchCompletion NuxeoDrive.WaitForAsyncCompletion Repository.PageProvider Context.PopDocument Context.PopDocumentList Context.PopBlob Context.PopBlobList Document.PublishToSection Context.PullDocument Context.PullDocumentList Context.PullBlob Context.PullBlobList Context.PushDocument Context.PushDocumentList Context.PushBlob Context.PushBlobList WebUI.AddToClipboard WebUI.PushDocumentToSeamContext WebUI.AddToWorklist LocalConfiguration.PutSimpleConfigurationParameters LocalConfiguration.PutSimpleConfigurationParameter Repository.Query Audit.Query Repository.ResultSetPageProvider WebUI.RaiseSeamEvents Blob.ReadMetadata Context.SetMetadataFromBlob Directory.ReadEntries WebUI.Refresh WebUI.Refresh Document.RemoveACL Services.RemoveDocumentTags Document.RemoveEntryOfMultivaluedProperty Blob.RemoveFromDocument Document.RemovePermission 
Document.RemoveProperty Collection.RemoveFromCollection Render.Document Render.DocumentFeed TemplateProcessor.Render Document.ReplacePermission Document.Reload Picture.Resize Context.RestoreDocumentInput Context.RestoreDocumentsInput Context.RestoreBlobInput Context.RestoreBlobsInput Document.RestoreVersion Context.RestoreBlobInputFromScript Context.RestoreBlobsInputFromScript Context.RestoreDocumentInputFromScript Context.RestoreDocumentsInputFromScript Repository.ResultSetQuery Document.Routing.Resume.Step Workflow.ResumeNode Counters.GET RunOperation RunDocumentOperation Context.RunDocumentOperationInNewTx RunFileOperation RunOperationOnList RunOperationOnProvider RunOperationOnListInNewTx RunInputScript RunScript WebUI.RunOperationInSeam Document.Save Seam.SaveDocumentInUI Repository.SaveSession SeamActions.GET Document.Mail Event.Fire Document.AddACE Context.SetVar Context.SetInputAsVar LocalConfiguration.SetSimpleConfigurationParameterAsVar Document.Routing.SetRunningStepFromTask Document.SetBlob Document.SetBlobName WebUI.SetJSFOutcome Workflow.SetNodeVariable Document.Routing.Step.Done Document.Routing.BackToReady Document.Routing.EvaluateCondition Context.SetWorkflowVar WebUI.ShowCreateForm Document.CreateVersion Context.StartWorkflow Search.SuggestersLauncher Services.TagDocument Traces.Get Traces.ToggleRecording Document.SetMetadataFromBlob Seam.GetChangeableDocument Seam.FetchFromClipboard Seam.GetCurrentDocument Seam.GetCurrentDomain Seam.GetCurrentWorkspace Seam.FetchDocument Seam.GetSelectedDocuments Seam.GetDocumentsFromSelectionList Seam.FetchFromWorklist Document.Unlock Document.UnblockPermissionInheritance Services.UntagDocument Document.Update Document.SetProperty Document.Routing.UpdateCommentsInfoOnDocument Directory.UpdateEntries Workflow.UserTaskPageProvider VersionAndAttachFile VersionAndAttachFiles Blob.SetMetadataFromDocument Blob.SetMetadataFromContext Blob.CreateZip acceptComment addCurrentDocumentToWorklist blobToPDF cancelWorkflow 
conditionalTask decideNextStepAndSimpleValidate downloadFilesZip evaluateCondition followLifeCycleTransition followLifeCycleTransitionTask initInitiatorComment logInAudit nextAssignee notifyInitiatorEndOfWorkflow publishDocument publishTask reinitAssigneeComment rejectComment Workflow.RemoveRoutingTask sendTaskCreatedNotificationMail setDone setNextStep setTaskDone simpleChooseNextOption1AndDone simpleChooseNextOption2AndDone simpleRefuse simpleTask simpleUndo simpleValidate terminateWorkflow undoRunningTask updateCommentsOnDoc validateDocument voidChain xmlExportRendition zipTreeExportRendition
Favorite.GetDocuments
Blob.ToPDF
Image.Blob.Resize
Document.AddRelation
Workflow.CreateRoutingTask
lots of contributed operations
Commands as REST resources
GET to retrieve definition
POST to execute
GET /nuxeo/api/v1/automation/Document.PageProvider HTTP/1.1
HTTP/1.1 200 OK
Content-Type: application/json
{
"id":"Document.PageProvider",
"label":"PageProvider",
"description":"Perform a query ...",
"signature":[ "void", "documents" ],
"params":[
{ "name":"page",
"type":"integer",
"required":false
},{
"name":"query",
"type":"string",
"required":false },
... ]
}
POST /nuxeo/api/v1/automation/Document.PageProvider HTTP/1.1
Content-Type: application/json+nxrequest
{ "params" :
{ "query" : "select * from Note",
"page" : 0
}
}
HTTP/1.1 200 OK
Content-Type: application/json
{
"entity-type": "documents",
"pageIndex": 0,
"pageSize": 2,
"pageCount": 2,
"entries": [
{
"entity-type": "document",
"repository": "default",
"uid": "3f76a415-ad73-4522-9450-d12af25b7fb4",
...
}, { ...}, ...
]
}
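"GET to retrieve definition, POST to execute" means both requests target the same resource and differ only in method and body. The sketch below builds the two requests from the exchange above as plain dictionaries; `describe_request` and `execute_request` are hypothetical helper names, and no HTTP call is made.

```python
import json

AUTOMATION_ROOT = "/nuxeo/api/v1/automation"


def describe_request(operation_id):
    """GET request that retrieves the definition of an operation."""
    return {"method": "GET", "path": f"{AUTOMATION_ROOT}/{operation_id}"}


def execute_request(operation_id, **params):
    """POST request that executes an operation with the given parameters."""
    return {
        "method": "POST",
        "path": f"{AUTOMATION_ROOT}/{operation_id}",
        "headers": {"Content-Type": "application/json+nxrequest"},
        "body": json.dumps({"params": params}),
    }


request = execute_request("Document.PageProvider", query="select * from Note", page=0)
print(request["path"])
print(request["body"])
```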
> cat /doc/path/somedoc | command(p3,p4)
POST /nuxeo/api/v1/path/somePath/@op/Blob.ToPDF HTTP/1.1
HTTP/1.1 200 OK
Content-Type: application/pdf
...
assemble API blocks without having to code
build business API
Server side assembly
One Context
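The command paradigm (input, output, parameters, context) and its server-side assembly can be sketched as a pipeline: each operation receives the previous output as its input, plus its own parameters and one shared context. This is a minimal illustration, not the Automation engine; `run_chain` and the three operations are hypothetical stand-ins for real operations such as Blob.ToPDF.

```python
def run_chain(chain, initial_input, context=None):
    """Run operations in order, piping each output into the next input.

    `chain` is a list of (operation, params) pairs; every operation is a
    callable taking (input, params, context). One context dict is shared.
    """
    context = {} if context is None else context
    current = initial_input
    for operation, params in chain:
        current = operation(current, params, context)
    return current, context


# Hypothetical operations used only for this example.
def fetch_document(_input, params, context):
    context["path"] = params["path"]
    return {"path": params["path"], "title": "Star Wars"}


def set_property(doc, params, context):
    return {**doc, params["name"]: params["value"]}


def to_summary(doc, params, context):
    return f"{doc['title']} ({doc['episodes']} episodes) from {context['path']}"


chain = [
    (fetch_document, {"path": "/movies/star-wars"}),
    (set_property, {"name": "episodes", "value": 7}),
    (to_summary, {}),
]
result, ctx = run_chain(chain, None)
print(result)
```

Because the chain is data (a list of operation references and parameters), it can be assembled by configuration, which is the point of "assemble API blocks without having to code".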
Scale out Architecture
Scale Interactive Processing
Scale Batch Processing
Scale
Queries
Scale out Storage
Scale Storage
with NoSQL
Geographical redundancy & disaster recovery
Supercharging Nuxeo Repository
SELECT "hierarchy"."id" AS "_C1" FROM "hierarchy"
JOIN "fulltext" ON "fulltext"."id" = "hierarchy"."id"
LEFT JOIN "misc" "_F1" ON "hierarchy"."id" = "_F1"."id"
LEFT JOIN "dublincore" "_F2" ON "hierarchy"."id" = "_F2"."id"
WHERE
("hierarchy"."primarytype" IN ('Video', 'Picture', 'File'))
AND ((TO_TSQUERY('english', 'sydney')
@@NX_TO_TSVECTOR("fulltext"."fulltext")))
AND ("hierarchy"."isversion" IS NULL)
AND ("_F1"."lifecyclestate" <> 'deleted')
AND ("_F2"."created" IS NOT NULL )
ORDER BY "_F2"."created" DESC
LIMIT 201 OFFSET 0;
some types of queries simply cannot be fast in SQL
Repository & Audit Trail
Fast indexing
No ACID constraints / No impedance issue
Append only index
Super query performance
query on term using inverted index
very efficient caching
native full text support
distributed architecture
Search & Audit Trail
MongoDB
store structure & streams in a
BASE way
elasticsearch
provide powerful and scalable queries
SQL DB
store structures in an ACID way
Storage does not impact application : this can be a deployment choice!
Atomic
Consistent
Isolated
Durable
Basic Availability
Soft state
Eventually consistent
depends on availability & performance requirements
SQL DB collapses
(on commodity hardware)
MongoDB handles the volume
Transactions cannot span multiple documents
Sample use case:
Press Agency
production system
mixed
requirements
Understanding Nuxeo Security
Nuxeo defines a set of Atomic permissions
Nuxeo defines groups of permissions
Repository always checks the Atomic permissions
You can define custom permissions and groups of permissions
You can use Core API to check permissions explicitly
READ_PROPERTIES, ADD_CHILDREN, READ_LIFECYCLE ...
READ, WRITE, MANAGE ...
session.hasPermission(Document, Perm)
(also available in Directories)
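The relation between atomic permissions and groups of permissions can be sketched as a recursive expansion: a grant of a group covers every atomic permission it contains, and the check is always made on an atomic permission. The group contents below are hypothetical and simplified, and real ACLs also have ordering and deny entries, which this sketch omits.

```python
# Hypothetical, simplified permission groups; names follow the slide.
PERMISSION_GROUPS = {
    "READ": {"READ_PROPERTIES", "READ_LIFECYCLE"},
    "WRITE": {"READ", "ADD_CHILDREN"},
}


def expand(permission, groups=PERMISSION_GROUPS):
    """Return the set of atomic permissions covered by a permission."""
    if permission not in groups:
        return {permission}  # already atomic
    atoms = set()
    for member in groups[permission]:
        atoms |= expand(member, groups)
    return atoms


def has_permission(acl, principals, atomic_permission):
    """Check an atomic permission against (principal, permission) grants."""
    return any(
        principal in principals and atomic_permission in expand(granted)
        for principal, granted in acl
    )


acl = [("members", "READ")]
print(has_permission(acl, {"bob", "members"}, "READ_PROPERTIES"))
print(has_permission(acl, {"bob", "members"}, "ADD_CHILDREN"))
```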
Security is checked at Document Level
field/schemas do not hold ACLs
No field level security
Download action is checked by a custom Download Policy
depending on Document, File, XPath, User
Can view document meta-data without being able
to download or preview
Compound Documents
use nested documents with different ACLs
Handle finer grained security
Custom API for custom visibility
Leverage Custom indexing in Elasticsearch
Custom marshaling layer in the Rest API
Expose data that would otherwise not be accessible
ACL based
Computed Groups
i.e. compute groups based on user attributes
Automatically apply ACLs
i.e. Listeners and Automation
Complex to manage, test and update
Security Policies
Integrate custom logic at the core of the security system
Initially introduced for "military" use cases
Low administration + Good testability
Atomic permission check: Checkperm
Override or complement ACL based security
Java-based logic to grant or deny access based on
Document (including attributes)
User (including attributes)
Search security filtering: QueryTransformer
Avoid post-filtering in search
generate additional where clause
Allows custom security to scale with large queries
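The two halves of a security policy, the per-document check and the matching query restriction, can be sketched together. The example is written in Python for brevity although the slides describe Java-based logic, and everything specific in it is an assumption made for illustration: the clearance rule, the `sec:classification` field, and the function names.

```python
from enum import Enum


class Access(Enum):
    GRANT = "grant"
    DENY = "deny"
    UNKNOWN = "unknown"  # no opinion: fall back to ACL-based security


def clearance_policy(document, user, permission):
    """Hypothetical policy: deny when the user's clearance is too low."""
    if user["clearance"] < document.get("classification", 0):
        return Access.DENY
    return Access.UNKNOWN


def transform_query(base, predicate, user):
    """Add the same restriction to a query so results need no post-filtering.

    `base` is the SELECT ... FROM part, `predicate` the existing WHERE
    condition (or None). The original condition is parenthesized so that
    an OR inside it cannot bypass the added clause.
    """
    clause = f"sec:classification <= {int(user['clearance'])}"
    if predicate:
        return f"{base} WHERE ({predicate}) AND {clause}"
    return f"{base} WHERE {clause}"


user = {"name": "bob", "clearance": 2}
print(clearance_policy({"classification": 3}, user, "READ"))
print(transform_query("SELECT * FROM Document",
                      "dc:title = 'Sections' OR dc:title = 'Workspaces'", user))
```

Expressing the rule once as a check and once as a where clause is what lets custom security scale with large queries: the search backend does the filtering.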
Listeners & Queues
Scale out Architecture
Provide sample dashboard for Graphite
Nuxeo
Metrics
System
Metrics
Integrate with DataDog
Deploying Nuxeo on AWS
API driven provisioning and deployment
transparent fail-over
easy scalability
Isolated configuration in Nuxeo
Document Store
Security
Life Cycle
Indexing
Versioning
all clients share the same application
application manages data and configuration partitioning
Shallow isolation
Monolithic
Not even simple
Cannot leverage
OSGi / Extension Point model
Not
"Cloud Native approach"
rely on infrastructure to provide tenants isolation
application does not need to be impacted
Flexible
Unlimited
Customization
Full
Isolation &
Quotas
Create "on demand" application for each customer
Build Your Own Application
Deploy & Run !
What we are working on
http://roadmap.nuxeo.com/
Comments,
Thank You !