Policy Training
Rule Engine Plugins
Jason Coposky
@jason_coposky
Executive Director, iRODS Consortium
August 3-6, 2020
KU Leuven Training
Webinar Presentation
Anatomy of a Rule Engine Plugin
Each plugin must define seven operations:
- start
- stop
- rule_exists
- list_rules
- exec_rule
- exec_rule_text
- exec_rule_expression
The Rule Engine Plugin Framework
Manages multiple instantiations of rule engine plugins
Delegates policy invocation to rule engine plugins
Walks plugins sequentially looking for a plugin to satisfy the invocation (order matters!)
- First by regex match
- Second by rule_exists()
If a policy returns success, meaning a zero, the framework stops
If a policy returns an error, meaning a negative value, the framework stops
If a policy returns RULE_ENGINE_CONTINUE, the process continues, which allows for a clean separation of concerns
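The dispatch behaviour described on this slide can be sketched in plain Python. This is only a simulation under stated assumptions, not the framework's actual code: the `Plugin` class and `invoke` function are hypothetical stand-ins, full-string regex matching is assumed, and the numeric value used for RULE_ENGINE_CONTINUE is a placeholder for illustration.

```python
import re

# Placeholder value for illustration only; the real constant comes from the iRODS server.
RULE_ENGINE_CONTINUE = 5000000


class Plugin:
    """Hypothetical stand-in for one configured rule engine plugin instance."""

    def __init__(self, name, pep_regexes, rules):
        self.name = name
        self.pep_regexes = [re.compile(p) for p in pep_regexes]
        self.rules = rules  # maps rule name -> callable returning an integer code

    def supports(self, rule_name):
        # Assumption: the whole rule name must match one of the regexes.
        return any(p.fullmatch(rule_name) for p in self.pep_regexes)

    def rule_exists(self, rule_name):
        return rule_name in self.rules

    def exec_rule(self, rule_name):
        return self.rules[rule_name]()


def invoke(plugins, rule_name, log):
    """Walk the plugins in configured order until one settles the invocation."""
    for plugin in plugins:
        if not plugin.supports(rule_name):      # first: regex match
            continue
        if not plugin.rule_exists(rule_name):   # second: rule_exists()
            continue
        code = plugin.exec_rule(rule_name)
        log.append((plugin.name, code))
        if code == RULE_ENGINE_CONTINUE:
            continue                            # let the next plugin run as well
        return code                             # zero or a negative value stops the walk
    return None                                 # nothing settled the invocation in this sketch


PEP = "pep_api_data_obj_put_post"
DYNAMIC_PEPS = [r"[^ ]*pep_[^ ]*_(pre|post)"]
plugins = [
    Plugin("metadata", DYNAMIC_PEPS, {PEP: lambda: RULE_ENGINE_CONTINUE}),
    Plugin("checksum", DYNAMIC_PEPS, {PEP: lambda: 0}),
    Plugin("access_time", DYNAMIC_PEPS, {PEP: lambda: 0}),
]
log = []
result = invoke(plugins, PEP, log)
print(result, log)
```

In this run the first plugin returns RULE_ENGINE_CONTINUE, so the second one also fires; the second returns zero, so the third is never reached. Reordering the list changes which policies run, which is why order matters.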
Rule Engine Continuation
acPostProcForPut() {
if($rescName == "demoResc") {
# extract and apply metadata
}
else if($rescName == "cacheResc") {
# async replication to archive
}
else if($objPath like "/tempZone/home/alice/*" &&
$rescName == "indexResc") {
# launch an indexing job
}
else if(xyz) {
# compute checksums ...
}
# and so on ...
}
The original approach to policy implementation
Rule Engine Continuation
Separate the implementation into several rule bases:
pep_api_data_obj_put_post(*INSTANCE_NAME, *COMM, *DATAOBJINP, *BUFFER, *PORTAL_OPR_OUT) {
# metadata extraction and application code
RULE_ENGINE_CONTINUE
}
/etc/irods/metadata.re
pep_api_data_obj_put_post(*INSTANCE_NAME, *COMM, *DATAOBJINP, *BUFFER, *PORTAL_OPR_OUT) {
# checksum code
RULE_ENGINE_CONTINUE
}
/etc/irods/checksum.re
pep_api_data_obj_put_post(*INSTANCE_NAME, *COMM, *DATAOBJINP, *BUFFER, *PORTAL_OPR_OUT) {
# access time application code
RULE_ENGINE_CONTINUE
}
/etc/irods/access_time.re
Configuration of Rule Engine Plugins
Server configuration is found in /etc/irods/server_config.json
...
"federation": [],
"server_control_plane_port": 1248,
"plugin_configuration": {
"rule_engines": [
{
// native rule engine configuration
},
{
// python rule engine configuration
},
{
// C++ default rule engine configuration
}
...
]
}
All plugins may have configuration parameters in the 'plugin_configuration' object
All instances of rule engine plugins must be configured in the 'rule_engines' array
Configuration of Rule Engine Plugins
Anatomy of a rule engine plugin instance
...
"federation": [],
"server_control_plane_port": 1248,
"plugin_configuration": {
"rule_engines": [
{
"instance_name": "<UNIQUE NAME>",
"plugin_name": "<DERIVED FROM SHARED OBJECT>",
"plugin_specific_configuration": {
<ANYTHING GOES HERE>
},
"shared_memory_instance": "<UNIQUE SHM NAME>"
},
...
]
}
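A quick sanity check of the 'rule_engines' array can be sketched in Python. This is an illustration, not an official validator: the function name `check_rule_engines` is hypothetical, and the set of required keys is an assumption drawn from the three keys that appear in every example in this deck.

```python
# Assumption: these three keys are expected on every rule engine plugin instance.
REQUIRED_KEYS = ("instance_name", "plugin_name", "plugin_specific_configuration")


def check_rule_engines(server_config):
    """Return a list of human-readable problems found in the rule_engines array."""
    problems = []
    seen = set()
    engines = server_config.get("plugin_configuration", {}).get("rule_engines", [])
    for index, engine in enumerate(engines):
        for key in REQUIRED_KEYS:
            if key not in engine:
                problems.append("entry {0}: missing '{1}'".format(index, key))
        name = engine.get("instance_name")
        if name is not None:
            if name in seen:
                problems.append(
                    "entry {0}: duplicate instance_name '{1}'".format(index, name))
            seen.add(name)
    return problems


# Made-up configuration with two deliberate mistakes in the second entry.
sample_config = {
    "plugin_configuration": {
        "rule_engines": [
            {
                "instance_name": "example-instance",
                "plugin_name": "irods_rule_engine_plugin-python",
                "plugin_specific_configuration": {},
            },
            {
                "instance_name": "example-instance",
                "plugin_specific_configuration": {},
            },
        ]
    }
}
problems = check_rule_engines(sample_config)
for problem in problems:
    print(problem)
```

For a real server, the dictionary could be loaded with `json.load` from /etc/irods/server_config.json instead of being written inline.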
iRODS Rule Language Configuration
{
"instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
"plugin_name": "irods_rule_engine_plugin-irods_rule_language",
"plugin_specific_configuration": {
"re_data_variable_mapping_set": [
"core"
],
"re_function_name_mapping_set": [
"core"
],
"re_rulebase_set": [
"example_custom_rule_base_0",
"example_custom_rule_base_1",
"example_custom_rule_base_2",
"core"
],
"regexes_for_supported_peps": [
"ac[^ ]*",
"msi[^ ]*",
"[^ ]*pep_[^ ]*_(pre|post)"
]
},
"shared_memory_instance": "upgraded_legacy_re"
}
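To get a feel for which rule names the three 'regexes_for_supported_peps' entries cover, the patterns can be tried out in Python. This is a sketch that assumes the whole rule name must match a pattern; the helper name `is_supported` is hypothetical and the server's own regex engine may differ in details.

```python
import re

# Patterns copied from the configuration above.
SUPPORTED_PEP_REGEXES = [
    "ac[^ ]*",
    "msi[^ ]*",
    "[^ ]*pep_[^ ]*_(pre|post)",
]


def is_supported(rule_name, patterns=SUPPORTED_PEP_REGEXES):
    """True if the full rule name matches at least one configured pattern."""
    return any(re.fullmatch(pattern, rule_name) for pattern in patterns)


names = [
    "acPostProcForPut",
    "msiDataObjChksum",
    "pep_api_data_obj_put_post",
    "pep_resource_resolve_hierarchy_pre",
    "example_rule",
]
supported = {name: is_supported(name) for name in names}
for name in names:
    print(name, supported[name])
```

Under that assumption, static PEPs (ac...), microservices (msi...), and dynamic PEPs ending in _pre or _post are all covered, while an arbitrary name such as example_rule is not.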
iRODS Rule Language Configuration
/etc/irods/core.re is the default implementation and should remain unchanged
Rule implementations are found in /etc/irods/
Custom rule implementations reside in /etc/irods/ and should be configured above core in re_rulebase_set
Basic iRODS Rule Language Example
Static (Legacy) Policy Enforcement Points
Dynamic Policy Enforcement Points
acPostProcForPut() {
if("ufs_cache" == $KVPairs.rescName) {
delay("<PLUSET>1s</PLUSET><EF>1h DOUBLE UNTIL SUCCESS OR 6 TIMES</EF>") {
*CacheRescName = "comp_resc;ufs_cache";
msisync_to_archive("*CacheRescName", $filePath, $objPath );
}
}
}
pep_resource_resolve_hierarchy_pre(*INST_NAME, *CTX, *OUT, *OP_TYPE, *HOST, *RESC_HIER, *VOTE)
{
if( "CREATE" == *OP_TYPE ) {
if( "pt1" == *INST_NAME) {
*OUT = "read=1.0;write=0.5"
}
else if ( "pt2" == *INST_NAME ) {
*OUT = "read=1.0;write=1.0"
}
}
}
Installing the Python Rule Engine Plugin
sudo apt-get -y install irods-rule-engine-plugin-python
ls /usr/lib/irods/plugins/rule_engines/
libirods_rule_engine_plugin-cpp_default_policy.so
libirods_rule_engine_plugin-irods_rule_language.so
libirods_rule_engine_plugin-python.so
Install the plugin
See the new shared objects
Python Rule Engine Configuration
Create /etc/irods/core.py from packaged template file
irods $ cp /etc/irods/core.py.template /etc/irods/core.py
"rule_engines": [
{
"instance_name" : "irods_rule_engine_plugin-python-instance",
"plugin_name" : "irods_rule_engine_plugin-python",
"plugin_specific_configuration" : {}
},
{
"instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
...
Add the plugin configuration to /etc/irods/server_config.json
Python Rule Language Example
Static (Legacy) Policy Enforcement Points
import session_vars

def acPostProcForPut(rule_args, callback, rei):
    Map = session_vars.get_map(rei)
    Kvp = { str(a):str(b) for a,b in Map['key_value_pairs'].items() }
    if 'ufs_cache' == Kvp['rescName']:
        callback.delayExec(
            ("<PLUSET>1s</PLUSET><EF>1h DOUBLE UNTIL SUCCESS OR 6 TIMES</EF>" +
             "<INST_NAME>irods_rule_engine_plugin-python-instance</INST_NAME>" ),
            "callback.msisync_to_archive('{cacheResc}','{file_path}','{object_path}')".format(
                cacheResc='comp_resc;ufs_cache', **Map['data_object'] ),
            "")
Dynamic Policy Enforcement Points
def pep_resource_resolve_hierarchy_pre(rule_args, callback, rei):
    (INST_NAME, CTX, OUT, OP_TYPE, HOST, RESC_HIER, VOTE) = rule_args
    if "CREATE" == OP_TYPE :
        if "pt1" == INST_NAME:
            # output parameters are returned by writing back into rule_args
            rule_args[2] = "read=1.0;write=0.5"
        elif "pt2" == INST_NAME:
            rule_args[2] = "read=1.0;write=1.0"
A Combination Use of Both Python and iRODS Rules
Add a custom rulebase to /etc/irods/server_config.json
"rule_engines": [
{
"instance_name" : "irods_rule_engine_plugin-python-instance",
"plugin_name" : "irods_rule_engine_plugin-python",
"plugin_specific_configuration" : {}
},
{
"instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
"plugin_name": "irods_rule_engine_plugin-irods_rule_language",
"plugin_specific_configuration": {
"re_data_variable_mapping_set": [
"core"
],
"re_function_name_mapping_set": [
"core"
],
"re_rulebase_set": [
----> "training", <----
"core"
],
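The same edit can be scripted. The sketch below inserts a rulebase name ahead of "core" in an in-memory copy of the configuration; the helper name `add_rulebase_before_core` is hypothetical, and the sample dictionary is a trimmed-down stand-in for the real file.

```python
def add_rulebase_before_core(server_config, instance_name, rulebase):
    """Insert rulebase ahead of 'core' for the named instance; True if the instance was found."""
    for engine in server_config["plugin_configuration"]["rule_engines"]:
        if engine["instance_name"] != instance_name:
            continue
        rulebases = engine["plugin_specific_configuration"]["re_rulebase_set"]
        if rulebase not in rulebases:
            # Fall back to appending if 'core' is not listed at all.
            position = rulebases.index("core") if "core" in rulebases else len(rulebases)
            rulebases.insert(position, rulebase)
        return True
    return False


INSTANCE = "irods_rule_engine_plugin-irods_rule_language-instance"
config = {
    "plugin_configuration": {
        "rule_engines": [
            {
                "instance_name": INSTANCE,
                "plugin_name": "irods_rule_engine_plugin-irods_rule_language",
                "plugin_specific_configuration": {"re_rulebase_set": ["core"]},
            }
        ]
    }
}
found = add_rulebase_before_core(config, INSTANCE, "training")
rulebase_set = config["plugin_configuration"]["rule_engines"][0][
    "plugin_specific_configuration"]["re_rulebase_set"]
print(found, rulebase_set)
```

Placing the custom rulebase before "core" means its rules are consulted first, and calling the helper a second time leaves the list unchanged.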
A Combination Use of Both Python and iRODS Rules
Create /etc/irods/training.re rulebase
add_metadata_to_objpath(*str, *objpath, *objtype) {
msiString2KeyValPair(*str, *kvp);
msiAssociateKeyValuePairsToObj(*kvp, *objpath, *objtype);
}
getSessionVar(*name,*output) {
*output = eval("str($"++*name++")");
}
A Combination Use of Both Python and iRODS Rules
Copy core.py and python_storage_balancing.py into /etc/irods
This will overwrite the default core.py
sudo cp ~/irods_training/advanced/python_storage_balancing.py /etc/irods/
sudo cp ~/irods_training/advanced/core.py /etc/irods/
The python instantiation of the static PEP now in core.py:
def acPostProcForPut(rule_args, callback, rei):
    sv = session_vars.get_map(rei)
    phypath = sv['data_object']['file_path']
    objpath = sv['data_object']['object_path']
    exiflist = []
    with open(phypath, 'rb') as f:
        tags = EXIF.process_file(f, details=False)
        for (k, v) in tags.iteritems():
            if k not in ('JPEGThumbnail', 'TIFFThumbnail', 'Filename', 'EXIF MakerNote'):
                exifpair = '{0}={1}'.format(k, v)
                exiflist.append(exifpair)
    exifstring = '%'.join(exiflist)
    callback.add_metadata_to_objpath(exifstring, objpath, '-d')
    callback.writeLine('serverLog', 'PYTHON - acPostProcForPut() complete')
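The string handed from the Python rule to add_metadata_to_objpath is a list of key=value pairs joined with '%'. The sketch below shows that format outside of iRODS. Both helper names are hypothetical, and `split_pairs` is only an approximation of how such a string might be split back into pairs, not the microservice's actual implementation.

```python
def join_pairs(pairs):
    """Build a 'key=value%key=value' string from (key, value) tuples."""
    return "%".join("{0}={1}".format(key, value) for key, value in pairs)


def split_pairs(text):
    """Approximate inverse of join_pairs; assumes no '%' inside keys or values."""
    result = {}
    for item in text.split("%"):
        if not item:
            continue
        key, _, value = item.partition("=")
        result[key] = value
    return result


tags = [
    ("Image Orientation", "Horizontal (normal)"),
    ("EXIF ColorSpace", "sRGB"),
]
exifstring = join_pairs(tags)
parsed = split_pairs(exifstring)
print(exifstring)
print(parsed)
```

One consequence of this encoding is worth keeping in mind: a tag value that itself contains '%' would be split in the wrong place, so such values would need to be filtered or escaped before joining.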
Test our combination of rules
Make sure earlier stickers example is removed
irm -f stickers.jpg
Get some test data into iRODS
wget https://github.com/irods/irods_training/raw/master/stickers.jpg
iput stickers.jpg
Confirm the EXIF metadata was harvested
imeta ls -d stickers.jpg
AVUs defined for dataObj stickers.jpg:
attribute: Image Orientation
value: Horizontal (normal)
units:
----
attribute: EXIF ColorSpace
value: sRGB
units:
----
Microservices
We were using the word before it was cool...
- C++ plugins bound into the rule languages
- Necessary to reach certain custom libraries
- Useful for complex or compute intensive applications
Many are provided by the server but additional plugins may be installed
We will walk through the statically linked plugins here:
Microservices
Can be invoked directly by the native rule engine
example_rule()
{
msiDataObjChksum("/tempZone/home/rods/example.txt", "verifyChksum=++++ChksumAll=", *result)
}
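May be reached in Python through the callback mechanism
def example_rule(rule_args, callback, rei):
    # output parameters are passed as placeholders; results come back in ret['arguments']
    ret = callback.msiDataObjChksum("/tempZone/home/rods/example.txt", "verifyChksum=++++ChksumAll=", "")
    result = ret['arguments'][2]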
Delayed Execution
iRODS provides an asynchronous means to execute rules
The delay() directive will schedule its body of code into the delayed execution queue
The delay() directive is built into the iRODS rule language and available using the callback mechanism in Python
Delayed Execution
The delay is configured via a string using XML parameter syntax
Options include:
- EA: execution address, host where the delayed execution needs to be performed
- ET: execution time, absolute time when it needs to be performed
- PLUSET: relative execution time to current time when it needs to execute
- EF: execution frequency (in time widths) it needs to be performed
- INST_NAME: rule engine plugin instance name to target for this rule
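Since the configuration is just a string of tagged values, it can be assembled with a small helper instead of being concatenated by hand. This is a convenience sketch; the function name `build_delay_hints` is hypothetical and not part of iRODS.

```python
# Tags listed on this slide, in the order they will be emitted.
KNOWN_TAGS = ("EA", "ET", "PLUSET", "EF", "INST_NAME")


def build_delay_hints(**options):
    """Return a '<TAG>value</TAG>...' string for the given delay options."""
    unknown = sorted(set(options) - set(KNOWN_TAGS))
    if unknown:
        raise ValueError("unknown delay option(s): " + ", ".join(unknown))
    return "".join(
        "<{0}>{1}</{0}>".format(tag, options[tag])
        for tag in KNOWN_TAGS
        if tag in options
    )


hints = build_delay_hints(
    PLUSET="1s",
    EF="1h DOUBLE UNTIL SUCCESS OR 6 TIMES",
    INST_NAME="irods_rule_engine_plugin-python-instance",
)
print(hints)
```

Rejecting unknown tags up front catches typos such as PLUSSET before the string ever reaches the delay queue.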
Delayed Execution
The EF value is of the format: nnnnU <directive>
- nnnn is a number
- U is the unit of the number (s-sec,m-min,h-hour,d-day,y-year)
Where <directive> can be of the form:
- <empty-directive> same as REPEAT FOR EVER
- REPEAT FOR EVER
- REPEAT UNTIL SUCCESS
- REPEAT nnnn TIMES where nnnn is an integer
- REPEAT UNTIL <time>
- REPEAT UNTIL SUCCESS OR UNTIL <time>
- REPEAT UNTIL SUCCESS OR nnnn TIMES
- DOUBLE FOR EVER
- DOUBLE UNTIL SUCCESS where delay is doubled every time.
- DOUBLE nnnn TIMES
- DOUBLE UNTIL <time>
- DOUBLE UNTIL SUCCESS OR UNTIL <time>
- DOUBLE UNTIL SUCCESS OR nnnn TIMES
- DOUBLE UNTIL SUCCESS UPTO <time>
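To make the nnnnU <directive> layout concrete, the sketch below splits an EF value into its parts and lists the waits a DOUBLE directive would produce if every attempt failed. The function names are hypothetical, and the schedule rests on an assumption: that the first wait equals the base interval and each later wait is twice the previous one. The server's exact behaviour may differ.

```python
import re

# nnnn, then a one-letter unit, then an optional directive.
EF_PATTERN = re.compile(r"^\s*(\d+)([smhdy])\s*(.*)$")


def split_ef(ef):
    """Split an EF value into (number, unit, directive)."""
    match = EF_PATTERN.match(ef)
    if match is None:
        raise ValueError("not of the form nnnnU <directive>: {0!r}".format(ef))
    number, unit, directive = match.groups()
    # An empty directive is the same as REPEAT FOR EVER.
    return int(number), unit, directive.strip() or "REPEAT FOR EVER"


def doubled_intervals(number, unit, attempts):
    """Waits for a DOUBLE directive, assuming the delay doubles after each attempt."""
    return ["{0}{1}".format(number * 2 ** i, unit) for i in range(attempts)]


number, unit, directive = split_ef("1h DOUBLE UNTIL SUCCESS OR 6 TIMES")
schedule = doubled_intervals(number, unit, 6)
print(number, unit, directive)
print(schedule)
```

Under that assumption, "1h DOUBLE UNTIL SUCCESS OR 6 TIMES" backs off from one hour to 32 hours before giving up, which is gentler on a struggling archive than retrying every hour.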
Delayed Execution
The <time> format may be one of three forms:
- nnnn: an integer, assumed to be in seconds
- nnnns: an integer followed by 's', meaning in seconds
- nnnnm: an integer followed by 'm', meaning in minutes
- nnnnh: an integer followed by 'h', meaning in hours
- nnnnd: an integer followed by 'd', meaning in days
- nnnny: an integer followed by 'y', meaning in years
- <dd.hh:mm:ss> where dd, hh, mm and ss are 2-digit integers representing days, hours, minutes and seconds
  Truncation from the end is allowed. e.g. 20:40 means mm:ss
The input can also be full calendar time in the form:
YYYY-MM-DD.hh:mm:ss
Truncation from the beginning is allowed.
e.g., 2007-07-29.12 means noon of July 29, 2007.
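The integer forms listed above can be normalised to seconds with a few lines of Python. This sketch covers only those forms (a bare integer or an integer with a one-letter unit), not the dd.hh:mm:ss or calendar forms. The function name `to_seconds` is hypothetical, and treating a year as 365 days is an assumption.

```python
# Assumption: a year is counted as 365 days.
SECONDS_PER_UNIT = {
    "s": 1,
    "m": 60,
    "h": 60 * 60,
    "d": 24 * 60 * 60,
    "y": 365 * 24 * 60 * 60,
}


def to_seconds(value):
    """Convert 'nnnn' or 'nnnnU' (U in s, m, h, d, y) to a number of seconds."""
    text = value.strip()
    if text.isdigit():
        return int(text)  # a bare integer is assumed to be in seconds
    number, unit = text[:-1], text[-1:]
    if not number.isdigit() or unit not in SECONDS_PER_UNIT:
        raise ValueError("unsupported time value: {0!r}".format(value))
    return int(number) * SECONDS_PER_UNIT[unit]


examples = ["90", "90s", "2m", "1h", "1d"]
converted = [to_seconds(example) for example in examples]
print(converted)
```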
Delayed Execution
For the native rule language it takes one parameter:
example_delayed_rule() {
    delay("<EF>REPEAT FOR EVER</EF><INST_NAME>irods-rule-language-instance</INST_NAME>") {
        msiObjStat("/tempZone/home/rods/example.txt", *stat_out)
    }
}
For the Python rule language it takes two parameters, the configuration and the rule body:
def example_python_rule(rule_args, callback, rei):
    rule_body = """callback.msiObjStat("/tempZone/home/rods/example.txt", stat_out)"""
    callback.delayExec(
        ("<PLUSET>1s</PLUSET><EF>1h DOUBLE UNTIL SUCCESS OR 6 TIMES</EF>" +
         "<INST_NAME>irods_rule_engine_plugin-python-instance</INST_NAME>" ),
        rule_body)
Remote Execution
The remote directive executes the body of its code on another iRODS server with a signature of:
remote("server.host.name", "<ZONE>zone_name</ZONE>")
Remote Execution
The native rule engine plugin usage is similar to the delay()
example_delayed_rule() {
    remote("irods.example.org", "<ZONE>tempZone</ZONE>") {
        msiObjStat("/tempZone/home/rods/example.txt", *stat_out)
    }
}
For the Python rule language there is a separate implementation
def example_python_rule(rule_args, callback, rei):
    rule_code = """
def main(rule_args, callback, rei):
    print('This is a test of the Python Remote Rule Execution')"""
    callback.py_remote('irods.example.org', '', rule_code, '')
Questions?