Advanced Training:
Rule Engine Plugins
June 17-20, 2025
iRODS User Group Meeting 2025
Durham, NC
Anatomy of a Rule Engine Plugin
Represents the last of the core plugin interfaces
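Each plugin must define nine operations: setup, teardown, start, stop, rule_exists, list_rules, exec_rule, exec_rule_text, and exec_rule_expression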
Configuration of Rule Engine Plugins
Within /etc/irods/server_config.json, "rule_engines" is a JSON array holding the configuration for the rule engine(s):
"plugin_configuration": {
    "rule_engines": [
        { ... },
        ...
    ]
}
Anatomy of a Rule Engine Plugin JSON object
{
    "instance_name": "<UNIQUE_NAME>",
    "plugin_name": "<DERIVED_FROM_SHARED_OBJECT>",
    "plugin_specific_configuration": {
        <ANYTHING_GOES_HERE>
    },
    "shared_memory_instance": "<UNIQUE_SHARED_MEMORY_NAME>"
}
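As a rough illustration of the shape described above, the following sketch checks a list of rule engine entries for the three keys that appear in every example in this deck, and for unique instance names. The helper name check_rule_engines and the decision to treat "shared_memory_instance" as optional are assumptions made for illustration; this is not how the iRODS server validates its configuration.

```python
# Hypothetical helper (not part of iRODS): sanity-check "rule_engines" entries.
REQUIRED_KEYS = ("instance_name", "plugin_name", "plugin_specific_configuration")


def check_rule_engines(rule_engines):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    seen = set()
    for index, entry in enumerate(rule_engines):
        # Every example entry in the slides carries these three keys.
        for key in REQUIRED_KEYS:
            if key not in entry:
                problems.append(f"entry {index}: missing {key}")
        # Instance names are described as unique, so flag repeats.
        name = entry.get("instance_name")
        if name is not None:
            if name in seen:
                problems.append(f"entry {index}: duplicate instance_name {name}")
            seen.add(name)
    return problems
```

Running it over the parsed "rule_engines" array before reloading the server could catch a copy-and-paste slip such as two entries sharing one instance name.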
iRODS Rule Language Configuration
{
    "instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
    "plugin_name": "irods_rule_engine_plugin-irods_rule_language",
    "plugin_specific_configuration": {
        "re_data_variable_mapping_set": [
            "core"
        ],
        "re_function_name_mapping_set": [
            "core"
        ],
        "re_rulebase_set": [
            "core"
        ],
        "regexes_for_supported_peps": [
            "ac[^ ]*",
            "msi[^ ]*",
            "[^ ]*pep_[^ ]*_(pre|post|except|finally)"
        ]
    },
    "shared_memory_instance": "irods_rule_language_rule_engine"
},
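To get a feel for which rule names the "regexes_for_supported_peps" entries cover, the sketch below tries the three patterns against a few names used elsewhere in this deck. It assumes that a name counts as supported only when a pattern matches it in full; that matching mode, and the helper name is_supported, are assumptions rather than a statement of how the plugin evaluates the list.

```python
import re

# Patterns copied from "regexes_for_supported_peps" in the configuration above.
SUPPORTED_PEP_REGEXES = [
    r"ac[^ ]*",
    r"msi[^ ]*",
    r"[^ ]*pep_[^ ]*_(pre|post|except|finally)",
]


def is_supported(rule_name):
    """Assumption: a rule name is supported when one pattern matches it entirely."""
    return any(re.fullmatch(pattern, rule_name) for pattern in SUPPORTED_PEP_REGEXES)
```

Under that assumption, pep_resource_resolve_hierarchy_pre and msiString2KeyValPair are covered, while a helper rule such as findRescType is not.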
Basic iRODS Rule Language Example
pep_resource_resolve_hierarchy_pre(*INST_NAME, *CTX, *OUT, *OP_TYPE, *HOST, *RESC_HIER, *VOTE) {
    if ("CREATE" == *OP_TYPE) {
        if ("pt1" == *INST_NAME) {
            *OUT = "read=1.0;write=0.5";
        }
        else if ("pt2" == *INST_NAME) {
            *OUT = "read=1.0;write=1.0";
        }
    }
}
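The value assigned to *OUT in the rule above is a semicolon-separated list of key=value weights. A minimal sketch of reading such a string back into numbers follows; the function name parse_weights is hypothetical, and the format is inferred only from the two example strings in the rule.

```python
def parse_weights(out_value):
    """Split a string like 'read=1.0;write=0.5' into {'read': 1.0, 'write': 0.5}."""
    weights = {}
    for pair in out_value.split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        weights[key.strip()] = float(value)
    return weights
```

Parsing the two strings from the rule shows that pt1 is given half the write weight of pt2 on CREATE, while both keep the same read weight.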
Dynamic Policy Enforcement Points
Questions?
Installing the Python Rule Engine Plugin
As the ubuntu user, install the plugin and a Python module.
sudo apt-get -y install irods-rule-engine-plugin-python python3-exif
See the new shared object.
$ ls /usr/lib/irods/plugins/rule_engines/
...
libirods_rule_engine_plugin-python.so
Python Rule Engine Configuration
As the irods user, create /etc/irods/core.py from the packaged template file.
cp /etc/irods/core.py.template /etc/irods/core.py
Edit /etc/irods/server_config.json, adding the Python instance ahead of the iRODS Rule Language instance.
"rule_engines": [
    {
        "instance_name": "irods_rule_engine_plugin-python-instance",
        "plugin_name": "irods_rule_engine_plugin-python",
        "plugin_specific_configuration": {}
    },
    {
        "instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
        ...
Set up a Custom Rulebase (training.re)
Add a custom rulebase to /etc/irods/server_config.json by listing "training" ahead of "core" in re_rulebase_set.
"rule_engines": [
    ...
    {
        "instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
        "plugin_name": "irods_rule_engine_plugin-irods_rule_language",
        "plugin_specific_configuration": {
            "re_data_variable_mapping_set": [
                "core"
            ],
            "re_function_name_mapping_set": [
                "core"
            ],
            "re_rulebase_set": [
                "training",
                "core"
            ],
            ...
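The same edit can be sketched programmatically. The function below takes an already parsed server_config.json as a dictionary and returns a copy in which the named rulebase comes first in re_rulebase_set of the iRODS Rule Language instance. The function name add_rulebase is hypothetical, and reading and writing the file itself is deliberately left out.

```python
import copy

RULE_LANGUAGE_PLUGIN = "irods_rule_engine_plugin-irods_rule_language"


def add_rulebase(server_config, rulebase):
    """Return a copy of server_config with rulebase first in re_rulebase_set."""
    updated = copy.deepcopy(server_config)  # leave the caller's dictionary untouched
    for engine in updated["plugin_configuration"]["rule_engines"]:
        if engine["plugin_name"] != RULE_LANGUAGE_PLUGIN:
            continue
        rulebases = engine["plugin_specific_configuration"]["re_rulebase_set"]
        if rulebase in rulebases:
            rulebases.remove(rulebase)  # avoid listing the rulebase twice
        rulebases.insert(0, rulebase)
    return updated
```

Calling add_rulebase(config, "training") on the stock configuration yields ["training", "core"], matching the listing above.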
A Combination Use of Both Python and iRODS Rules
Create the /etc/irods/training.re rulebase.
add_metadata_to_objpath(*str, *objpath, *objtype) {
    msiString2KeyValPair(*str, *kvp);
    msiAssociateKeyValuePairsToObj(*kvp, *objpath, *objtype);
}
getSessionVar(*name, *output) {
    *output = eval("str($"++*name++")");
}
As the ubuntu user, copy core.py and python_storage_balancing.py into /etc/irods, overwriting the default core.py, and stage stickers.jpg for the irods user.
sudo cp ~/irods_training/advanced/python_storage_balancing.py /etc/irods/
sudo cp ~/irods_training/advanced/core.py /etc/irods/
sudo cp ~/irods_training/stickers.jpg /var/lib/irods/
A Python Dynamic Policy Enforcement Point
The Python implementation of the dynamic PEP, now in core.py:
# Assumption: core.py imports these at module level, since the function uses both names.
#   import exifread
#   from genquery import Query
def pep_api_data_obj_put_post(rule_args, callback, rei):
    import os

    data_obj_inp = rule_args[2]
    obj_path = str(data_obj_inp.objPath)
    resc_hier = str(data_obj_inp.condInput['resc_hier'])

    query_condition_string = f'COLL_NAME = \'{os.path.dirname(obj_path)}\' and ' \
                             f'DATA_NAME = \'{os.path.basename(obj_path)}\' and ' \
                             f'DATA_RESC_HIER = \'{resc_hier}\''

    # Note: The physical path fetched by the query may not exist on the host executing this
    # bit of policy. In a real deployment, the policy implementer should consider the hostname
    # of the resource on which the data resides and consider using the remote() microservice.
    phypath = list(Query(callback, 'DATA_PATH', query_condition_string))[0]

    # Collect the EXIF tags as key=value pairs, skipping bulky or binary tags.
    exiflist = []
    with open(phypath, 'rb') as f:
        tags = exifread.process_file(f, details=False)
        for (k, v) in tags.items():
            if k not in ('JPEGThumbnail', 'TIFFThumbnail', 'Filename', 'EXIF MakerNote'):
                exifpair = '{0}={1}'.format(k, v)
                exiflist.append(exifpair)

    # Hand the '%'-delimited pairs to the rule defined in training.re.
    exifstring = '%'.join(exiflist)
    callback.add_metadata_to_objpath(exifstring, obj_path, '-d')
    callback.writeLine('serverLog', 'PYTHON - pep_api_data_obj_put_post() complete')
As the irods user, reload the iRODS server configuration.
kill -HUP $(cat /var/run/irods/irods-server.pid)
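The string passed from the Python PEP to add_metadata_to_objpath is a '%'-delimited list of key=value pairs. The sketch below isolates that step so it can be tried without an iRODS server. Both function names are hypothetical, and split_kvp_string only illustrates how such a string could be read back; it is not the implementation of msiString2KeyValPair.

```python
# Tags the PEP above leaves out of the metadata.
SKIPPED_TAGS = ("JPEGThumbnail", "TIFFThumbnail", "Filename", "EXIF MakerNote")


def build_kvp_string(tags):
    """Join tag/value pairs as 'k1=v1%k2=v2', skipping the tags listed above."""
    pairs = [f"{key}={value}" for key, value in tags.items() if key not in SKIPPED_TAGS]
    return "%".join(pairs)


def split_kvp_string(kvp_string):
    """Illustrative inverse: split on '%', then on the first '=' of each pair."""
    result = {}
    for pair in kvp_string.split("%"):
        if pair:
            key, _, value = pair.partition("=")
            result[key] = value
    return result
```

One consequence of this encoding is that a tag value containing '%' would be split into extra pairs, so real policy may need to escape or filter such values.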
Test our combination of rules
Switch to the irods user. Make sure the earlier stickers example is removed, then put some test data into iRODS.
irm -f stickers.jpg
iput stickers.jpg
Confirm the EXIF metadata was extracted and applied.
imeta ls -d stickers.jpg
AVUs defined for dataObj stickers.jpg:
attribute: Image Orientation
value: Horizontal (normal)
units:
----
attribute: EXIF ColorSpace
value: sRGB
units:
----
Storage Balancing
Create a resource hierarchy for our storage balancing example.
iadmin mkresc def_resc deferred
iadmin mkresc ufs1 unixfilesystem $(hostname -f):/tmp/ufs1
iadmin mkresc ufs2 unixfilesystem $(hostname -f):/tmp/ufs2
iadmin mkresc pt1 passthru
iadmin mkresc pt2 passthru
iadmin addchildtoresc def_resc pt1
iadmin addchildtoresc def_resc pt2
iadmin addchildtoresc pt1 ufs1
iadmin addchildtoresc pt2 ufs2
iadmin modresc pt1 context "max_bytes=20000000"
iadmin modresc pt2 context "max_bytes=20000000"
$ ilsresc
def_resc:deferred
├── pt1:passthru
│   └── ufs1
└── pt2:passthru
    └── ufs2
demoResc: unixfilesystem
Implementing the Storage Balancing Plugin
A few helper functions to add to /etc/irods/training.re.
Code can be found at ~/irods_training/advanced/python_storage_balancing.re.
findRescType(*INST_NAME, *OUT) {
    foreach (*ROW in SELECT RESC_TYPE_NAME WHERE RESC_NAME = '*INST_NAME') {
        *OUT = *ROW.RESC_TYPE_NAME;
    }
}
findInstId(*INST_NAME, *OUT) {
    foreach (*ROW in SELECT RESC_ID WHERE RESC_NAME = '*INST_NAME') {
        *OUT = *ROW.RESC_ID;
    }
}
findBytesUsed(*INST_ID, *OUT) {
    foreach (*ROW1 in SELECT RESC_NAME WHERE RESC_PARENT = '*INST_ID') {
        *STORAGE_RESC = *ROW1.RESC_NAME;
        *TEMP = 0;
        foreach (*ROW2 in SELECT sum(DATA_SIZE) WHERE RESC_NAME = '*STORAGE_RESC') {
            *TEMP = *TEMP + int(*ROW2.DATA_SIZE);
        }
        *OUT = "*TEMP";
    }
}
findContextString(*INST_NAME, *OUT) {
    foreach (*ROW in SELECT RESC_CONTEXT WHERE RESC_NAME = '*INST_NAME') {
        *OUT = *ROW.RESC_CONTEXT;
    }
}
The Python Storage Balancing Rule
The definition in /etc/irods/python_storage_balancing.py:
import re  # needed for the max_bytes lookup below

def pep_resource_resolve_hierarchy_pre(rule_args, callback, rei):
    if rule_args[3] == 'CREATE':
        ret = callback.findRescType(rule_args[0], '')
        resc_type = ret['arguments'][1]
        if resc_type == 'passthru':
            ret = callback.findInstId(rule_args[0], '')
            inst_id = ret['arguments'][1]
            ret = callback.findBytesUsed(inst_id, '')
            bytes_used = ret['arguments'][1]
            ret = callback.findContextString(rule_args[0], '')
            context_string = ret['arguments'][1]

            # -1 means "no max_bytes configured in the context string".
            max_bytes = -1
            max_bytes_index = context_string.find('max_bytes')
            if max_bytes_index != -1:
                max_bytes_re = r'max_bytes=(\d+)'
                max_bytes_search = re.search(max_bytes_re, context_string)
                max_bytes_str = max_bytes_search.group(1)
                # Convert to int so the comparisons with -1 and 0 below behave as intended.
                max_bytes = int(max_bytes_str)

            percent_full = 0.0
            if max_bytes == -1:
                percent_full = 0.0
            elif max_bytes == 0:
                percent_full = 1.0
            else:
                percent_full = float(bytes_used) / float(max_bytes)

            write_weight = 1.0 - percent_full
            rule_args[2] = 'read=1.0;write=' + str(write_weight)
Uncomment the definition for pep_resource_resolve_hierarchy_pre in /etc/irods/core.py.
#def pep_resource_resolve_hierarchy_pre(rule_args, callback, rei):
#    return python_storage_balancing.pep_resource_resolve_hierarchy_pre(rule_args, callback, rei)
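The max_bytes lookup in the rule above can be tried on its own. This sketch uses the same regular expression but returns None instead of -1 when the context string carries no usable max_bytes entry; the function name parse_max_bytes and the None convention are choices made here for illustration, not taken from the training code.

```python
import re


def parse_max_bytes(context_string):
    """Return max_bytes as an int, or None when the context string does not set it."""
    match = re.search(r"max_bytes=(\d+)", context_string)
    return int(match.group(1)) if match else None
```

With the context set earlier by iadmin modresc, parse_max_bytes("max_bytes=20000000") gives the integer 20000000, and an empty context string gives None.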
Testing the Storage Balancing Rule
As the irods user, reload the iRODS server configuration.
kill -HUP $(cat /var/run/irods/irods-server.pid)
Put the data. New replicas should alternate between the two resources.
iput -R def_resc version.json f1
iput -R def_resc version.json f2
iput -R def_resc version.json f3
iput -R def_resc version.json f4
$ ils -l
/tempZone/home/rods:
  rods              0 def_resc;pt2;ufs2          240 2025-05-20.15:39 & f1
  rods              0 def_resc;pt1;ufs1          240 2025-05-20.15:39 & f2
  rods              0 def_resc;pt2;ufs2          240 2025-05-20.15:39 & f3
  rods              0 def_resc;pt1;ufs1          240 2025-05-20.15:39 & f4
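The alternation in the listing follows from the weights: each put makes the chosen branch slightly fuller, so the other branch has the higher write weight next time. The toy simulation below reproduces that pattern under simplifying assumptions: both branches start empty, the branch with the highest write weight always wins, and ties go to the first branch in iteration order. The server's actual tie-breaking is not modeled, which is why the simulation starts on pt1 while the listing starts on pt2.

```python
MAX_BYTES = 20_000_000  # max_bytes from the pt1/pt2 context strings
OBJECT_SIZE = 240       # size of version.json in the listing


def write_weight(bytes_used, max_bytes=MAX_BYTES):
    """Weight used for voting: the emptier the branch, the higher the weight."""
    return 1.0 - bytes_used / max_bytes


def simulate_puts(count):
    """Place each object on the branch with the highest write weight."""
    used = {"pt1": 0, "pt2": 0}
    placements = []
    for _ in range(count):
        # Assumption: ties go to the first branch in iteration order.
        target = max(used, key=lambda name: write_weight(used[name]))
        used[target] += OBJECT_SIZE
        placements.append(target)
    return placements
```

Four simulated puts land on pt1, pt2, pt1, pt2, the same alternating shape as the ils -l output.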
Questions?
A Storage Balancing C++ Rule Engine Plugin
Code can be found at
~/irods_training/advanced/irods_rule_engine_plugin_storage_balancing/src/libirods_rule_engine_plugin-cpp-storage-balancing.cpp
As the ubuntu user, install the required build tools.
sudo apt-get -y install irods-dev cmake
Start with the Factory
extern "C"
irods::pluggable_rule_engine<irods::default_re_ctx>* plugin_factory(
    const std::string& _instance_name,
    const std::string& _context)
{
    auto* re{new irods::pluggable_rule_engine<irods::default_re_ctx>(_instance_name, _context)};

    re->add_operation(
        "setup",
        std::function<irods::error(irods::default_re_ctx&, const std::string&)>(setup));
    re->add_operation(
        "teardown",
        std::function<irods::error(irods::default_re_ctx&, const std::string&)>(teardown));
    re->add_operation(
        "start",
        std::function<irods::error(irods::default_re_ctx&, const std::string&)>(start));
    re->add_operation(
        "stop",
        std::function<irods::error(irods::default_re_ctx&, const std::string&)>(stop));
    re->add_operation(
        "rule_exists",
        std::function<irods::error(irods::default_re_ctx&, const std::string&, bool&)>(rule_exists));
    re->add_operation(
        "list_rules",
        std::function<irods::error(irods::default_re_ctx&, std::vector<std::string>&)>(list_rules));
    re->add_operation(
        "exec_rule",
        std::function<irods::error(
            irods::default_re_ctx&, const std::string&, std::list<boost::any>&, irods::callback)>(exec_rule));
    re->add_operation(
        "exec_rule_text",
        std::function<irods::error(
            irods::default_re_ctx&, const std::string&, msParamArray_t*, const std::string&, irods::callback)>(exec_rule_text));
    re->add_operation(
        "exec_rule_expression",
        std::function<irods::error(
            irods::default_re_ctx&, const std::string&, msParamArray_t*, irods::callback)>(exec_rule_expression));

    return re;
}
Anatomy of the Plugin Factory
extern "C"
irods::pluggable_rule_engine<irods::default_re_ctx>* plugin_factory(
    const std::string& _instance_name,
    const std::string& _context)
{
    ...
}
Must have C linkage
Returns an irods::pluggable_rule_engine<>*
Accepts two const std::string& as instance name and context
Instantiate a new rule engine plugin
extern "C"
irods::pluggable_rule_engine<irods::default_re_ctx>*
plugin_factory(
    const std::string& _instance_name,
    const std::string& _context)
{
    auto* re{new irods::pluggable_rule_engine<irods::default_re_ctx>(
        _instance_name,
        _context)};
    ...
}
Allocate a raw pointer to an irods::pluggable_rule_engine
Pass the instance name and context to the constructor
Both are stored as attributes of the irods::plugin_base class
Wire the plugin operations
...
re->add_operation(
    "start",
    std::function<irods::error(irods::default_re_ctx&, const std::string&)>(start));
...
Template parameters are the parameter types of the operation's function
First parameter is the calling name of the operation (e.g. "start")
Second parameter is a std::function wrapping the local function definition
Takes the full signature of the function as a template parameter
Takes the function pointer as an argument
All operations attached to the instance must be wrapped in an anonymous namespace or be marked as static
Similar treatment for the other operations
setup
teardown
stop
rule_exists
list_rules
exec_rule
exec_rule_text
exec_rule_expression
re->add_operation(
    "setup",
    std::function<irods::error(irods::default_re_ctx&, const std::string&)>(setup));
re->add_operation(
    "teardown",
    std::function<irods::error(irods::default_re_ctx&, const std::string&)>(teardown));
re->add_operation(
    "stop",
    std::function<irods::error(irods::default_re_ctx&, const std::string&)>(stop));
re->add_operation(
    "rule_exists",
    std::function<irods::error(irods::default_re_ctx&, const std::string&, bool&)>(rule_exists));
re->add_operation(
    "list_rules",
    std::function<irods::error(irods::default_re_ctx&, std::vector<std::string>&)>(list_rules));
re->add_operation(
    "exec_rule",
    std::function<irods::error(
        irods::default_re_ctx&, const std::string&, std::list<boost::any>&, irods::callback)>(exec_rule));
re->add_operation(
    "exec_rule_text",
    std::function<irods::error(
        irods::default_re_ctx&, const std::string&, msParamArray_t*, const std::string&, irods::callback)>(exec_rule_text));
re->add_operation(
    "exec_rule_expression",
    std::function<irods::error(
        irods::default_re_ctx&, const std::string&, msParamArray_t*, irods::callback)>(exec_rule_expression));
setup, teardown, start, stop, rule_exists, and list_rules
static irods::error setup(irods::default_re_ctx&, const std::string&)
{
    return SUCCESS();
}

static irods::error teardown(irods::default_re_ctx&, const std::string&)
{
    return SUCCESS();
}

static irods::error start(irods::default_re_ctx&, const std::string&)
{
    return SUCCESS();
}

static irods::error stop(irods::default_re_ctx&, const std::string&)
{
    return SUCCESS();
}

static irods::error rule_exists(irods::default_re_ctx&, const std::string& _rule_name, bool& _ret)
{
    _ret = (_rule_name == "pep_resource_resolve_hierarchy_pre");
    return SUCCESS();
}

static irods::error list_rules(irods::default_re_ctx&, std::vector<std::string>& _rules)
{
    _rules.emplace_back("pep_resource_resolve_hierarchy_pre");
    return SUCCESS();
}
setup, teardown, start, and stop are no-ops
rule_exists matches only pep_resource_resolve_hierarchy_pre, the only rule this rule engine plugin supports
list_rules simply responds with the single dynamic PEP
exec_rule
static irods::error exec_rule(
    irods::default_re_ctx&,
    const std::string& _rule_name,
    std::list<boost::any>& _rule_arguments,
    irods::callback _effect_handler)
{
    try {
        auto it_args{std::begin(_rule_arguments)};
        const auto& arg_resource_name{boost::any_cast<std::string&>(*it_args)};
        auto& arg_plugin_context{boost::any_cast<irods::plugin_context&>(*++it_args)};
        auto& arg_out{*boost::any_cast<std::string*>(*++it_args)};
        const auto& arg_operation_type{*boost::any_cast<const std::string*>(*++it_args)};
        const auto& arg_host{*boost::any_cast<const std::string*>(*++it_args)};
        auto& arg_hierarchy_parser{*boost::any_cast<irods::hierarchy_parser*>(*++it_args)};
        auto& arg_vote{*boost::any_cast<float*>(*++it_args)};
Begin by extracting all of the arguments from the list
Rule arguments are packed into a std::list for every operation
Target specific operations
        if (arg_operation_type != "CREATE") {
            return SUCCESS();
        }

        ruleExecInfo_t& rei{get_rei(_effect_handler)};

        const std::string resource_type{get_resource_type(arg_resource_name, *rei.rsComm)};
        if (resource_type != "passthru") {
            return SUCCESS();
        }

        const boost::optional<uint64_t> max_bytes{get_max_bytes(arg_resource_name, *rei.rsComm)};
Skip non-CREATE operations
Fetch the ruleExecInfo_t from the framework
Fetch other values via helper functions
Handle max_bytes edge cases
        if (!max_bytes) {
            arg_out = "read=1.0;write=1.0";
            return SUCCESS();
        }
        else if (*max_bytes == 0) {
            arg_out = "read=1.0;write=0.0";
            return SUCCESS();
        }
The context string may not be defined
Short circuit if max_bytes is set to 0 explicitly
Compute the write_weight
        const uint64_t bytes_used_by_children{
            get_bytes_used_by_all_children(arg_resource_name, *rei.rsComm)};
        const uint64_t bytes_required_for_new_data_object{
            get_bytes_of_incoming_data_object(arg_plugin_context)};
        const uint64_t hypothetical_bytes_used{
            bytes_used_by_children + bytes_required_for_new_data_object};
        const double percent_used{
            std::max(0.0, std::min(1.0, static_cast<double>(hypothetical_bytes_used) / *max_bytes))};
        const double write_weight{1.0 - percent_used};
Fetch bytes used by resource and new data object
Compute new total bytes used
Compute write_weight given max_bytes
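For readers more comfortable in Python, the weight computation and the two max_bytes edge cases can be restated as one small function. This is a sketch that mirrors the C++ shown on these slides, with None standing in for an unset boost::optional; the function and parameter names are chosen here for illustration.

```python
from typing import Optional


def write_weight(max_bytes: Optional[int], bytes_used: int, incoming_bytes: int) -> float:
    """Mirror of the C++ logic: weight in [0.0, 1.0] for a passthru branch."""
    if max_bytes is None:
        return 1.0  # no max_bytes in the context string: do not penalize writes
    if max_bytes == 0:
        return 0.0  # max_bytes explicitly 0: short circuit, never prefer this branch
    # Count the incoming object as if it were already stored.
    hypothetical_bytes_used = bytes_used + incoming_bytes
    # Clamp to [0.0, 1.0] so an over-full branch yields a weight of 0.0, not a negative one.
    percent_used = max(0.0, min(1.0, hypothetical_bytes_used / max_bytes))
    return 1.0 - percent_used
```

Counting the incoming object up front means a branch that would overflow after the put already votes with weight 0.0.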
Wrapping up exec_rule
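        std::stringstream out_stream;
        // Stream the write_weight computed above into the weight string.
        out_stream << "read=1.0;write=" << write_weight;
        arg_out = out_stream.str();
        return SUCCESS();
    }
    catch (const irods::exception& e) {
        // Pass the message as an argument so it is never interpreted as a format string.
        rodsLog(LOG_ERROR, "%s", e.what());
        return ERROR(e.code(), "irods exception in exec_rule");
    }

    return SUCCESS();
}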
Build the weight string given the computed write weight
Set the out variable - arg_out
Return SUCCESS() if all goes well
Catch the exception, log and return an error otherwise
exec_rule_text and exec_rule_expression
static irods::error exec_rule_text(
    irods::default_re_ctx&,
    const std::string&,
    msParamArray_t*,
    const std::string&,
    irods::callback)
{
    return ERROR(SYS_NOT_SUPPORTED, "not supported");
}

static irods::error exec_rule_expression(
    irods::default_re_ctx&,
    const std::string&,
    msParamArray_t*,
    irods::callback)
{
    return ERROR(SYS_NOT_SUPPORTED, "not supported");
}
Both return SYS_NOT_SUPPORTED
Another Rule Engine Plugin could pick them up
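The idea that another rule engine plugin can pick up what this one does not handle can be pictured with a toy dispatcher. The sketch assumes the framework consults the configured rule engines in order and hands a rule to the first one whose rule_exists answers true. That is a simplified model for illustration, and the class and function names below are hypothetical, not iRODS APIs.

```python
class StorageBalancingEngine:
    """Stand-in for the C++ plugin: it knows a single rule."""
    name = "cpp-storage-balancing"

    def rule_exists(self, rule_name):
        return rule_name == "pep_resource_resolve_hierarchy_pre"

    def list_rules(self):
        return ["pep_resource_resolve_hierarchy_pre"]


class CatchAllEngine:
    """Stand-in for a later engine that claims every rule name."""
    name = "fallback"

    def rule_exists(self, rule_name):
        return True


def pick_engine(engines, rule_name):
    """Return the name of the first engine, in configured order, that has the rule."""
    for engine in engines:
        if engine.rule_exists(rule_name):
            return engine.name
    return None
```

In this model, placing the storage balancing engine first means it handles pep_resource_resolve_hierarchy_pre, while every other rule name falls through to the engines listed after it.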
Building and Installing the Package
As the ubuntu user:
mkdir ~/build_storage_cpp
cd ~/build_storage_cpp
cmake ../irods_training/advanced/irods_rule_engine_plugin_storage_balancing/
make package
sudo dpkg -i ./irods_rule_engine_plugin-cpp-storage-balancing*.deb
See the newly installed rule engine plugin.
$ ls /usr/lib/irods/plugins/rule_engines/ | grep storage
libirods_rule_engine_plugin-cpp-storage-balancing.so
Configuring the Rule Engine Plugin
Edit /etc/irods/server_config.json
{
    "instance_name": "irods_rule_engine_plugin-cpp-storage-balancing-instance",
    "plugin_name": "irods_rule_engine_plugin-cpp-storage-balancing",
    "plugin_specific_configuration": {}
},
{
    "instance_name": "irods_rule_engine_plugin-python-instance",
    "plugin_name": "irods_rule_engine_plugin-python",
    "plugin_specific_configuration": {}
},
Reload the iRODS server configuration and then run ils to confirm that the configuration syntax is correct.
Comment out definition for pep_resource_resolve_hierarchy_pre in /etc/irods/core.py again
#def pep_resource_resolve_hierarchy_pre(rule_args, callback, rei):
#    return python_storage_balancing.pep_resource_resolve_hierarchy_pre(rule_args, callback, rei)
Testing the Rule Engine Plugin
Remove all data from def_resc, then put it again.
irm -f f1 f2 f3 f4
iput -R def_resc version.json f1
iput -R def_resc version.json f2
iput -R def_resc version.json f3
iput -R def_resc version.json f4
$ ils -l
/tempZone/home/rods:
  rods              0 def_resc;pt2;ufs2          240 2025-05-20.18:14 & f1
  rods              0 def_resc;pt1;ufs1          240 2025-05-20.18:14 & f2
  rods              0 def_resc;pt2;ufs2          240 2025-05-20.18:14 & f3
  rods              0 def_resc;pt1;ufs1          240 2025-05-20.18:14 & f4
Questions?