June 25-28, 2019
iRODS User Group Meeting 2019
Utrecht, Netherlands
Justin James
Applications Engineer
iRODS Consortium
Administration
Resource Hierarchies
and Composition
iRODS Resource Plugins
Motivation
Many iRODS users have spent considerable time implementing the same basic use cases as policy in their rule bases
Resource hierarchies provide an out-of-the-box means to implement the majority of these use cases, while remaining future-proof
Introduction to Resource Hierarchies
Capture the implementation of various policies as nodes in a decision tree, a well known metaphor
Two types of nodes: Coordinating Resources (branches) and Storage Resources (leaves)
By convention, Coordinating Resources do not manage storage
- this is not enforced
Coordinating Resources - Branches
Compound - provide POSIX interface to alternative storage
Deferred - defer to children regarding voting behavior
Load Balanced - use gathered load values to determine choices
Passthru - weight, then delegate operations to a child resource
Random - randomly choose a child for a write operation
Replication - ensure all data objects are consistent across children
Purely virtual, in memory - they are not pinned to any given server, so their plugins must exist on every server in the grid
Storage Resources - Leaves
Storage resources which provide POSIX semantics
- do not require a compound resource hierarchy and a cache
Unix File System - surfaces any mount point
Ceph-RADOS - Ceph object storage
HPSS - access to IBM High Performance Storage System
Cacheless S3 - resource for Amazon S3 service (soon to be released)
Usually pinned to a given server by hostname, as they are expected to access storage provided by the server - plugins do not need to be available on every server.
Exception: The soon to be released cacheless S3 resource implements a detached mode which is not pinned to a server.
Archive Storage Resources - Leaves
Storage resources which take the role of the Archive Resource in a Compound Resource composition
S3 - archive resource for Amazon S3
WOS - DDN Web Object Scaler
Universal MSS - script based access to generic object storage
Must be pinned to the servers which host the cache in order to synchronize data to the archive resource
The Voting Mechanism
Voting - Weighted Passthru
For both Put and Get operations
This resource has many uses, such as disabling writes or reads to a given resource, or providing an abstraction to the users
The weights also may be overridden by the rule engine which allows for dynamically influencing votes based on policy
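The weighting can be sketched in a few lines of Python. This is only an illustration, not the plugin's code: the `Passthru` class, its field names, and the operation labels are hypothetical, and the sketch assumes the weight is multiplied into the child's vote, which matches the put example later in this deck where a write weight of 0.25 turns a vote of 1.0 into 0.25.

```python
from dataclasses import dataclass


@dataclass
class Passthru:
    """Hypothetical model of a weighted passthru node (not the real plugin)."""

    read_weight: float = 1.0   # applied to votes for Get operations
    write_weight: float = 1.0  # applied to votes for Put operations

    def vote(self, operation: str, child_vote: float) -> float:
        # Assumption: the child's vote is scaled by the matching weight.
        weight = self.write_weight if operation == "put" else self.read_weight
        return child_vote * weight


# A write weight of 0.25 scales a child vote of 1.0 down to 0.25.
damped = Passthru(write_weight=0.25)
print(damped.vote("put", 1.0))

# A write weight of 0.0 removes the child from consideration for writes,
# while reads through the same node are unaffected.
read_only = Passthru(write_weight=0.0)
print(read_only.vote("put", 1.0), read_only.vote("get", 1.0))
```

Under this model, setting a weight to 0.0 is what "disabling writes or reads" amounts to: the child still votes, but its vote never wins.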
Voting - Unix File System Resource
For a Put operation
Voting - Unix File System Resource
For a Get operation
Voting - Random Resource
For a Put Operation
Voting - Random Resource
For a Get Operation
Voting - Replication Resource
For a Put Operation
Voting - Replication Resource
For a Get Operation
Note that given the behavior of the Unix File System, locality of reference significantly affects the behavior of reads and writes for this Resource
Voting - A Put Example
pt1 - passthru
pt2 - passthru
pt3 - passthru
rnd1 - random
rnd2 - random
repl1 - replication
ufsN
ufs0
Voting - A Put Example
Votes 1.0 - Connected
Votes 0.5 - Not Connected
Votes 0.0 - Marked Down
Randomly chooses ufs0 for vote of 1.0
Passes ufsN-1 vote of 0.5
Write weight of 0.25, passes ufs0 as a vote of 0.25
Randomly chooses ufsN-1 vote of 0.5
Chooses ufsN-1 with a vote of 0.5 > 0.25
Passes ufsN-1
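The arithmetic of this example can be replayed with a short Python sketch. It is a simplified model, not iRODS code: the helper names are hypothetical, the random selections are taken as given rather than simulated, and the layout of the two branches is an assumed reading of the annotations, since the diagram itself is not reproduced here.

```python
# Leaf votes taken from the slide legend.
LEAF_VOTES = {"connected": 1.0, "not_connected": 0.5, "marked_down": 0.0}


def leaf_vote(name, state):
    """Return (resource name, vote) for a storage leaf in the given state."""
    return (name, LEAF_VOTES[state])


def passthru(candidate, weight=1.0):
    """Scale the vote by the passthru weight (assumed to be multiplicative)."""
    name, vote = candidate
    return (name, vote * weight)


def pick_highest(candidates):
    """Choose the candidate carrying the highest vote."""
    return max(candidates, key=lambda candidate: candidate[1])


# Branch A: a random resource picked ufs0 (vote 1.0); the passthru above it
# has a write weight of 0.25, so the vote is passed up as 0.25.
branch_a = passthru(leaf_vote("ufs0", "connected"), weight=0.25)

# Branch B: a random resource picked ufsN-1 (vote 0.5); the passthru above it
# passes the vote through unchanged.
branch_b = passthru(leaf_vote("ufsN-1", "not_connected"))

winner = pick_highest([branch_a, branch_b])
print(branch_a, branch_b, winner)
```

The sketch shows why the weaker leaf wins: ufs0 starts with the stronger vote, but the write weight on its branch reduces it to 0.25, below the 0.5 that ufsN-1 carries up unchanged.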
Compound Resources
Necessary to provide POSIX semantics for tape, object storage, etc.
For a Put
For a Get
By default, the compound resource synchronously replicates to the archive
Compound Resources - An S3 Example
iadmin mkresc resc_name type context_string
iadmin mkresc comp_resc compound
iadmin mkresc cache_resc unixfilesystem `hostname`:/tmp/cache_resc
iadmin addchildtoresc parent_name child_name parent_child_context_string
iadmin addchildtoresc comp_resc cache_resc cache
The compound resource plugin honors two parent-child context values: "cache" and "archive"
These are used internally to identify the resources by role
Compound Resources - An S3 Example
iadmin mkresc arch_resc s3 `hostname`:/bucket/name <context_string>
iadmin mkresc arch_resc s3 `hostname`:/bucket/name "S3_DEFAULT_HOSTNAME=s3.amazonaws.com;S3_AUTH_FILE=/etc/irods/auth_file;S3_RETRY_COUNT=3;S3_WAIT_TIME_SEC=1;S3_PROTO=HTTP"
iadmin addchildtoresc comp_resc arch_resc archive
Add the archive resource to the compound resource
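The S3 context string above is a semicolon-separated list of key=value pairs, which is easy to mistype by hand. A minimal Python sketch for assembling it from a dictionary follows; the helper `build_context_string` is hypothetical and not part of iRODS, and the keys and values are copied from the example above.

```python
def build_context_string(settings):
    """Join key=value pairs with semicolons, in insertion order."""
    return ";".join(f"{key}={value}" for key, value in settings.items())


# Settings copied from the S3 example; adjust them for a real deployment.
s3_settings = {
    "S3_DEFAULT_HOSTNAME": "s3.amazonaws.com",
    "S3_AUTH_FILE": "/etc/irods/auth_file",
    "S3_RETRY_COUNT": 3,
    "S3_WAIT_TIME_SEC": 1,
    "S3_PROTO": "HTTP",
}

context = build_context_string(s3_settings)
print(context)
```

The resulting string can be passed, quoted, as the final argument to iadmin mkresc in place of <context_string>.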
Compound Resources - Delayed Replication
Set the compound resource context string to "auto_repl=off"
iadmin modresc comp_resc context "auto_repl=off"
Leverage the rule engine for replication via acPostProcForPut
acPostProcForPut() {
    if("ufs_cache" == $KVPairs.rescName) {
        delay("<PLUSET>1s</PLUSET><EF>1h DOUBLE UNTIL SUCCESS OR 6 TIMES</EF>") {
            *CacheRescName = "comp_resc;ufs_cache";
            msisync_to_archive("*CacheRescName", $filePath, $objPath);
        }
    }
}
Excellent examples of delayed replication and cache purging can be found here:
https://github.com/trel/irods-compound-resource/blob/master/rules/SaraRules.re
Demonstration