Python Rule Engine

August 3-6, 2020

KU Leuven Training

Webinar Presentation

Daniel Moore

Applications Engineer, iRODS Consortium

The Python Rule Engine

Writing iRODS Rules in Python

With the Python Rule Engine Plugin loaded and configured:

Python functions within the /etc/irods/core.py module's namespace can be called as iRODS rules.
If a function's name matches the conventional name of a static (old-style) or dynamic (new-style) Policy Enforcement Point, we can program iRODS system policy directly in Python (see the default core.py.template file for examples).

Translating basic iRODS Rule Language Examples

A Static (Legacy) Policy Enforcement Point:

acPostProcForPut() {
    if("ufs_cache" == $KVPairs.rescName) {
        delay("<PLUSET>1s</PLUSET><EF>1h DOUBLE UNTIL SUCCESS OR 6 TIMES</EF>") {
            *CacheRescName = "comp_resc;ufs_cache";
            msisync_to_archive("*CacheRescName", $filePath, $objPath );
        }
    }
}

The Python version:

import session_vars
def acPostProcForPut(rule_args, callback, rei):
    Map = session_vars.get_map(rei)
    Kvp = { str(a):str(b) for a,b in Map['key_value_pairs'].items() }
    if 'ufs_cache' == Kvp['rescName']:
        callback.delayExec( 
          ("<PLUSET>1s</PLUSET><EF>1h DOUBLE UNTIL SUCCESS OR 6 TIMES</EF>" +
           "<INST_NAME>irods_rule_engine_plugin-python-instance</INST_NAME>" ),
         "callback.msisync_to_archive('{cacheResc}','{file_path}','{object_path}')".format(
             cacheResc='comp_resc;ufs_cache',**Map['data_object'] ),
         "")

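The second argument to delayExec is a string of Python source that runs later, so it can be assembled and inspected outside iRODS. A minimal sketch, assuming the data_object map carries file_path and object_path keys as the example above implies; the helper name build_sync_call and the sample paths are hypothetical:

```python
def build_sync_call(cache_resc, data_object):
    """Return the Python source string handed to delayExec as the delayed rule body."""
    template = "callback.msisync_to_archive('{cacheResc}','{file_path}','{object_path}')"
    # str.format ignores extra keyword arguments, so unused keys in data_object are harmless.
    return template.format(cacheResc=cache_resc, **data_object)

# Hypothetical stand-in for session_vars.get_map(rei)['data_object']
sample_data_object = {
    "file_path": "/var/lib/irods/cache/home/rods/a.dat",
    "object_path": "/tempZone/home/rods/a.dat",
    "data_size": "1024",
}

delayed_rule = build_sync_call("comp_resc;ufs_cache", sample_data_object)
print(delayed_rule)
```

Printing the string before deploying the rule is a cheap way to check quoting in the generated call.
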
Translating basic iRODS Rule Language Examples

Dynamic Policy Enforcement Point:

pep_resource_resolve_hierarchy_pre(
  *INST_NAME,*CTX,*OUT,*OP_TYPE,*HOST,*RESC_HIER,*VOTE){
    if( "CREATE" == *OP_TYPE ) {
        if( "pt1" == *INST_NAME) {
            *OUT = "read=1.0;write=0.5"
        }
        else if ( "pt2" == *INST_NAME ) {
            *OUT = "read=1.0;write=1.0"
        }
    }
}

The Python version:

def pep_resource_resolve_hierarchy_pre (rule_args, callback, rei):
   (INST_NAME, CTX, OUT, OP_TYPE, HOST, RESC_HIER, VOTE) = rule_args
   if "CREATE" == OP_TYPE :
       if   "pt1" == INST_NAME:
           rule_args[2] = "read=1.0;write=0.5"
       elif "pt2" == INST_NAME:
           rule_args[2] = "read=1.0;write=1.0"
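
Because results leave the rule through rule_args, the Python PEP can be exercised with an ordinary list. A minimal sketch, assuming the arguments arrive as plain strings; the argument values below are hypothetical placeholders, and callback and rei are unused here:

```python
def pep_resource_resolve_hierarchy_pre(rule_args, callback, rei):
    (INST_NAME, CTX, OUT, OP_TYPE, HOST, RESC_HIER, VOTE) = rule_args
    if "CREATE" == OP_TYPE:
        if "pt1" == INST_NAME:
            rule_args[2] = "read=1.0;write=0.5"
        elif "pt2" == INST_NAME:
            rule_args[2] = "read=1.0;write=1.0"

# Positions follow the native signature: index 2 corresponds to *OUT.
args = ["pt1", "ctx", "", "CREATE", "host", "", ""]
pep_resource_resolve_hierarchy_pre(args, None, None)
print(args[2])

# An instance name that matches neither branch leaves the output slot untouched.
untouched = ["pt3", "ctx", "", "CREATE", "host", "", ""]
pep_resource_resolve_hierarchy_pre(untouched, None, None)
```

Note that assigning to the local name OUT would not change the caller's view; the write has to go through rule_args[2].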

Composing a Rule-base in Python

Creating an iRODS rule in Python

Rules are defined in the top-level module /etc/irods/core.py, or imported into it from other modules.
By convention, they are declared with three input parameters: rule_args, callback, and rei.

Below is an example rule in Python. In the course of its execution, it calls a different rule named get_complex_result:

import helpers  # --> utility module, also in /etc/irods

def top_level_rule(rule_args, callback, rei ):
  input_ = rule_args[0]
  rv = callback.get_complex_result( input_ )  # Execute another rule.
  rule_output = rv['arguments'][0]                      # --> get that rule's output ...
  end_result = helpers.to_fortran(complex(rule_output)) # --> Format complex value for use in FORTRAN
  rule_args[0] = end_result                             # --> and return the resulting string.

#import another set of rules
from rulebase1 import *

Composing a Rule-base in Python

The parameters of the rule signature are as follows:

rule_args: Conveys rule arguments into, and results out of, the rule.

callback: Allows calling other rules and microservices via the framework.

rei: The rule execution instance; contains session variables (as seen in the static PEP example).
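
The calling convention can be tried outside iRODS with a stand-in for callback. A minimal sketch, not the plugin's implementation: the FakeCallback class and the double_it and caller rules are hypothetical, and the returned dictionary models only the 'arguments' key that the examples in this deck read.

```python
class FakeCallback(object):
    """Hypothetical stand-in for the callback object; NOT the plugin's implementation."""
    def __init__(self, rules):
        self._rules = rules  # maps rule name -> Python rule function

    def __getattr__(self, name):
        # Only reached for names that are not regular attributes, i.e. rule names.
        rule = self._rules[name]
        def invoke(*args):
            rule_args = list(args)           # the callee may overwrite entries in place
            rule(rule_args, self, None)      # rei is not modelled in this sketch
            return {"arguments": rule_args}  # shape read by the examples in this deck
        return invoke

def double_it(rule_args, callback, rei):
    rule_args[0] = str(2 * int(rule_args[0]))

def caller(rule_args, callback, rei):
    rv = callback.double_it(rule_args[0])
    rule_args[0] = rv["arguments"][0]

cb = FakeCallback({"double_it": double_it})
outer_args = ["21"]
caller(outer_args, cb, None)
print(outer_args[0])
```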

Suppose that we make the auxiliary rule get_complex_result a Python rule as well, and we place it in /etc/irods/rulebase1.py:

__all__ = ['get_complex_result']  #minimize core.py namespace pollution
import helpers
def get_complex_result (rule_args, callback, rei):
  radian_angle = float(rule_args[0])
  result =  str(helpers.rotate_unit_phasor(radian_angle))
  callback.writeLine ("serverLog", "Result was: " + result)
  rule_args[0] = result

And then our utility module /etc/irods/helpers.py, which contains only traditional, "direct-call" Python functions, can be very simple:

import cmath  # -- any standard Python library module can be used
def to_fortran (cplx): return "({0.real},{0.imag})".format(cplx)
def rotate_unit_phasor(radn): return cmath.exp(1j * radn)
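
Since the helpers are ordinary Python functions, they can be checked in a plain interpreter. The sketch below repeats the two helpers and shows why top_level_rule wraps the returned string in complex(): the value crosses the rule boundary as a string and is parsed back into a number. The sample values are illustrative only.

```python
import cmath
import math

def to_fortran(cplx):
    return "({0.real},{0.imag})".format(cplx)

def rotate_unit_phasor(radn):
    return cmath.exp(1j * radn)

quarter_turn = rotate_unit_phasor(math.pi / 2)    # approximately 0+1j
formatted = to_fortran(complex(1.5, -2.0))
print(formatted)

# Rule results travel as strings; complex() parses the string form back to a number.
as_string = str(rotate_unit_phasor(1.0))
round_trip = complex(as_string)
```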

Composing a Rule-base in Python

Finally, we can test our rulebase with the following launch script, call_top_level.r:

def main(rule_args, callback, rei):
  x = global_vars ['*x'][1:-1]
  returned = callback.top_level_rule(x)
  callback.writeLine( "stdout", "call to top_level_rule gives:" +  
                                str( returned['arguments'][0] )
                    )
input *x="3.1416"
output ruleExecOut

Using this irule command:

irule -r irods_rule_engine_plugin-python-instance -F call_top_level.r

Calling from Python rules into Native rules

Python rules can call rules written in the native rule language through the callback object. If we have a Python rule:

      def myrule (rule_args, callback, rei ):

            arg0 = str( rule_args [0] )

            return_val = callback.error_msg_from_index (arg0)
            msg = return_val ['arguments'] [0]

            rule_args [0] = ("System Code {} - {}".format(arg0,msg))

And the following is defined in the native rulebase /etc/irods/core.re:

      error_types() { list("Unsupported operation", "Syntax", "Runtime") }

      error_msg_from_index (*io) { *io = elem(error_types(), int(*io)) }

We can run the Python rule as follows. (Note that, even when targeting the iRODS rule language, the call to myrule still resolves properly via the rule framework.)

$ irule -r irods_rule_engine_plugin-irods_rule_language-instance \
    "*x='1'; myrule(*x) writeLine('stderr',*x)" null ruleExecOut
System Code 1 - Syntax
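
The data flow of this example can be traced without a server. A minimal sketch, assuming a stand-in callback whose error_msg_from_index method mimics the native rule; StubCallback is hypothetical and models only the 'arguments' key of the return value:

```python
ERROR_TYPES = ["Unsupported operation", "Syntax", "Runtime"]

class StubCallback(object):
    """Hypothetical stand-in that mimics the native rule error_msg_from_index."""
    def error_msg_from_index(self, io):
        # Native rule: *io = elem(error_types(), int(*io))
        return {"arguments": [ERROR_TYPES[int(io)]]}

def myrule(rule_args, callback, rei):
    arg0 = str(rule_args[0])
    return_val = callback.error_msg_from_index(arg0)
    msg = return_val["arguments"][0]
    rule_args[0] = "System Code {} - {}".format(arg0, msg)

args = ["1"]
myrule(args, StubCallback(), None)
print(args[0])
```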

Calling from Native rules into Python rules

Conversely, rules written in the native rule language can call Python rules. If we have a Python rule:

      def math_function ( rule_args, callback, rei):
            import math
            funcname = str( rule_args [0] )  # e.g. "sin", "atan", "sqrt"
            argument =  rule_args [1]
            fHandle = getattr(math, funcname )
            result = fHandle (float(argument))
            rule_args [1] =  str(result)

And a native rule:

do_math(*fn,*x) {
  *intm = str(eval(*x))
  *y= math_function(*fn,  *intm)
  double(*intm)
}

We see that Python rules can add utility (even system calls) to the native rule language, much like microservices:

$ irule -r irods_rule_engine_plugin-irods_rule_language-instance \
    " writeLine('stdout',do_math('cos','355.0/452.0'))" null ruleExecOut
0.707107
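
The numeric result can be reproduced by calling the Python rule directly with a plain list. In this sketch the division is done in Python rather than by the native eval, and callback and rei are passed as None because the rule does not use them:

```python
import math

def math_function(rule_args, callback, rei):
    funcname = str(rule_args[0])       # e.g. "sin", "atan", "sqrt"
    argument = rule_args[1]
    fHandle = getattr(math, funcname)  # look the function up by name in the math module
    rule_args[1] = str(fHandle(float(argument)))

args = ["cos", str(355.0 / 452.0)]     # 355/452 is close to pi/4
math_function(args, None, None)
rounded = round(float(args[1]), 6)
print(rounded)
```

Since getattr accepts any attribute name, a production rule would probably want to restrict funcname to a known set of function names.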

Calling from Python rules into iRODS microservices

As with Native rule language, we can call iRODS microservices from a Python rule. Call this rule script test_msvc.r

from irods_types import (objType, char_array, RodsObjStat)

def main(rule_args,callback,rei):   # <-- rule under test
  path = rule_args[0] if rule_args else global_vars['*x'][1:-1]
  description = ""
  st = RodsObjStat()
  try:
    rv = callback.msiObjStat(path,st)
    st = rv['arguments'][1]
  except RuntimeError: #if the microservice fails
    callback.writeLine("stderr","msiObjStat failed on path '{}'".format(path))
  else:
    if st.objType == objType.DATA_OBJ:   
      description = 'data object; size = ' + str(st.objSize)
    elif st.objType == objType.COLL_OBJ: 
      description = 'collection'
  callback.writeLine("stderr","object type ["+description+"]")
  if len(rule_args) > 0: rule_args[0] = description

INPUT *x=$"/tempZone/home/rods"
OUTPUT ruleExecOut

Again using irule (you will have the option of changing *x if the default is inaccurate). Because the script is written in Python, we target the Python rule engine instance:

$ irule -r irods_rule_engine_plugin-python-instance -F test_msvc.r

Calling from Python rules into iRODS microservices

Things to note with regard to the test script:

While under test, our Python rule must be named 'main'.
Rule and microservice calls can always fail, so we anticipate failure by wrapping such calls in a try/except clause.
Input variables ( *x in this example ) are accessible through the global_vars dictionary object. This is only true in a Python rule script, not within the rule-base proper.
Rule scripts may only be executed by a rodsadmin user. This is because the default python interpreter gives us access to the filesystem and potentially other system resources required by iRODS.
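
The argument handling used in the test script can be isolated into a small helper. A minimal sketch, assuming (as the [1:-1] slicing in the scripts suggests) that values in global_vars arrive wrapped in double quotes; read_input and fake_global_vars are hypothetical names, and the real global_vars object exists only inside a rule script:

```python
def read_input(rule_args, global_vars, name="*x"):
    """Prefer a positional rule argument; otherwise fall back to the script input variable."""
    if rule_args:
        return rule_args[0]
    raw = global_vars[name]
    # Strip one pair of surrounding double quotes, if present (assumed quoting convention).
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        return raw[1:-1]
    return raw

fake_global_vars = {"*x": '"/tempZone/home/rods"'}  # hypothetical stand-in
from_script = read_input([], fake_global_vars)
from_caller = read_input(["/tempZone/home/alice"], fake_global_vars)
print(from_script)
```

Written this way, the same function body keeps working after the rule is renamed and moved into the rulebase, where only rule_args is available.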

After the main rule is tested to our satisfaction, we can rename it and relocate it to our permanent rulebase.

Failing in Python rules

As in Native rules, in Python we have the option of issuing a fail condition. This is done by raising RuntimeError.

Place the following rule code into /etc/irods/core.py :

from irods_types import (objType, char_array, RodsObjStat)

def testy(rule_args,callback,rei):   # <-- rule under test
  if int( rule_args[0]) % 2   : raise RuntimeError( 'Not an even number') # fail
  # exit naturally, ie success

And try it out using irule:

$ irule -r irods_rule_engine_plugin-irods_rule_language-instance \
    "writeLine('stdout', errorcode( testy('*x') ))"  "*x=5" ruleExecOut

Python REP's Built-in Utility Modules

session_vars

Exports a get_map function for access to the session variables stored in the third rule parameter, rei.
These correspond to the session variables in the legacy rule language, i.e. the "dollar-variables" listed in /etc/irods/core.dvm or (as in the earlier acPostProcForPut example) the fields of the $KVPairs structure.

An example usage:

import session_vars

def pep_resource_create_post (rule_args, callback, rei):

  # get info from Rule Execution Instance
  svars = session_vars.get_map( rei )
  proxy_user = svars['proxy_user']['user_name']
  callback.writeLine("serverLog","proxy_user "+  proxy_user)

  # get info from PluginContext object
  d=(rule_args[1].map())
  callback.writeLine("serverLog","physical_path "+  d['physical_path'])

Python REP's Built-in Utility Modules

genquery

Exports a Query class with functionality similar to the Language Integrated Query in the native rule language.
By default, result rows are packed as tuples.

from genquery import Query

def data_ids_from_file_extension (rule_args,callback,rei):
   ext = rule_args[0]
   mylist = []
   for data_id in Query(callback,
     "DATA_ID", "DATA_NAME like '%.{ext}'".format(**locals())):
        mylist.append(data_id)
   rule_args[0] = ",".join(mylist)

Example with dictionary result rows and "auto-joined" tables:

from genquery import Query, AS_DICT
# ...
   # Essential arguments are: (1) callback, (2) selected columns, (3) condition, (4) row_type

   for row in Query(callback,                                                    #(1)
                    [ "DATA_NAME", "COLL_NAME", "META_DATA_ATTR_VALUE"],         #(2)
                    ("META_DATA_ATTR_NAME = "
                     " 'irods::netcdf::group_{}_variable_{}'".format(group,var)), #(3)
                    AS_DICT                                                      #(4)
       ):
       key = "{COLL_NAME}/{DATA_NAME}".format(**row)
       subhash = netcdf_vars.setdefault( key, {} )
       subhash[group,var] = row["META_DATA_ATTR_VALUE"]
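
The row handling in the dictionary example does not depend on iRODS and can be tried with sample rows. A minimal sketch: build_condition and collect are hypothetical helper names, and sample_rows stands in for the dictionaries that an AS_DICT query would yield.

```python
def build_condition(group, var):
    # Adjacent string literals are joined first, so .format() applies to the whole condition.
    return ("META_DATA_ATTR_NAME = "
            " 'irods::netcdf::group_{}_variable_{}'".format(group, var))

def collect(rows, group, var, netcdf_vars):
    """Group attribute values by logical path, keyed by (group, var) within each path."""
    for row in rows:
        key = "{COLL_NAME}/{DATA_NAME}".format(**row)
        subhash = netcdf_vars.setdefault(key, {})
        subhash[group, var] = row["META_DATA_ATTR_VALUE"]
    return netcdf_vars

# Hypothetical rows, shaped like AS_DICT results
sample_rows = [
    {"COLL_NAME": "/tempZone/home/rods", "DATA_NAME": "a.nc", "META_DATA_ATTR_VALUE": "float"},
]

condition = build_condition("0", "temp")
netcdf_vars = collect(sample_rows, "0", "temp", {})
print(condition)
```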

Sample equivalent rules in Native Rule Lang vs Python

Sample rule in the Legacy iRODS Rule Language:

verify_replica_placement(*violations) {
    *attr = verify_replicas_attribute
    # get a list of all matching collections given the metadata attribute
    foreach(*row0 in SELECT COLL_NAME, META_COLL_ATTR_VALUE WHERE META_COLL_ATTR_NAME = "*attr") {
        *resource_list = *row0.META_COLL_ATTR_VALUE
        *number_of_resources = size(split(*resource_list, ","))
        *coll_name = *row0.COLL_NAME
        foreach(*row1 in SELECT COLL_NAME, DATA_NAME WHERE COLL_NAME like "*coll_name%") {
            *matched = 0 
            *coll_name = *row1.COLL_NAME
            *data_name = *row1.DATA_NAME
            foreach(*row2 in SELECT RESC_NAME WHERE COLL_NAME = "*coll_name" AND DATA_NAME = "*data_name") {
                *resource_name = *row2.RESC_NAME
                 *split_list = split(*resource_list, ",")
                 while(size(*split_list) > 0) {
                     *name = str(hd(*split_list))
                     *split_list = tl(*split_list)
                     *name = triml(trimr(*name,' '),' ')
                     # ( set write permission for collaborator - ?? )
                     if(*name == *resource_name) { *matched = *matched + 1 }
                 }
            } # for resources
            if(*matched < *number_of_resources) {
                *violations = cons("*coll_name/*data_name violates the placement policy "++*resource_list,*violations)
            }
        } # for objects
    } # for collections
} # verify_replica_placement

Implementing the replica placement policy

The rule translated into Python :

__all__ = ['verify_replica_placement']
from genquery import Query, AS_DICT
from irods_capability_integrity_utils import *

def verify_replica_placement (rule_args, callback, rei):
    violations = split_text_lines( rule_args[0] )
    attr = verify_replicas_attribute()
    for row0 in Query (callback, ['COLL_NAME','META_COLL_ATTR_VALUE'],
                       "META_COLL_ATTR_NAME = '{}'".format(attr), AS_DICT):

        resource_list = row0 ['META_COLL_ATTR_VALUE']
        split_resource_list = set(r.strip() for r in resource_list.split(","))

        number_of_resources = len( split_resource_list )
        coll_name = row0 ['COLL_NAME']

        for row1 in Query (callback, ['COLL_NAME','DATA_NAME'],
                           "COLL_NAME like '{}%'".format( coll_name ), AS_DICT):
            matched = 0
            coll_name = row1['COLL_NAME']
            data_name = row1['DATA_NAME']

            for row2 in Query (callback, ['RESC_NAME'],
                               "COLL_NAME = '{0}' AND DATA_NAME = '{1}'".format(coll_name,data_name), AS_DICT):

                resource_name = row2['RESC_NAME']
                # set modify for all collaborators - ?
                for r in split_resource_list:
                    # set write permission for collaborator - ?
                    if r == resource_name:
                        matched += 1
            # -- end -- for resources

            if matched < number_of_resources:
                violations.append("Object {coll_name}/{data_name} violates the replica placement policy".format(**locals()))
        # -- end -- for objects

    # -- end -- for collections
    rule_args [0] = join_text_lines( violations )
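
The core test inside the loops is independent of the catalog queries and can be checked in isolation. A minimal sketch: violates_placement is a hypothetical helper, and the resource names are made up.

```python
def violates_placement(resource_list, replica_resources):
    """True if fewer replicas sit on required resources than there are required resources."""
    required = set(r.strip() for r in resource_list.split(","))
    matched = 0
    for resource_name in replica_resources:  # one entry per replica of the object
        for r in required:
            if r == resource_name:
                matched += 1
    return matched < len(required)

missing_one = violates_placement("resc_a, resc_b", ["resc_a"])
compliant = violates_placement("resc_a, resc_b", ["resc_b", "resc_a", "resc_c"])
print(missing_one, compliant)
```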

Common utility functions for the Python rule base:

/etc/irods/irods_capability_integrity_utils.py

def get_error_value          ()   : return "ERROR_VALUE"
def RULE_ENGINE_CONTINUE     ()   : return 5000000
def SYS_INVALID_INPUT_PARAM  ()   : return -130000
def verify_replicas_attribute()   : return "irods::verification::replica_placement"
def verify_checksum_attribute()   : return "irods::verification::replica_checksum"
def verify_replica_number_attribute() : return "irods::verification::replica_number"

def TRUE(): return "true"
def FALSE(): return ""

def split_text_lines( string_in ) :
    # Return a real list (callers append to it); drops empty lines.
    return [line for line in string_in.split("\n") if line]

def join_text_lines( list_in ) :
    return "\n".join(filter(None, list_in)) + "\n"
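
These two helpers let a list of violations travel through a single string-valued rule argument. The sketch below shows the round trip; it uses a list comprehension and a generator in place of filter so that it also runs under Python 3, which is an adaptation and not the repository's code.

```python
def split_text_lines(string_in):
    # Drop empty entries so a trailing newline does not produce a blank item.
    return [line for line in string_in.split("\n") if line]

def join_text_lines(list_in):
    return "\n".join(line for line in list_in if line) + "\n"

violations = split_text_lines("")  # an empty rule argument gives an empty list
violations.append("first")
violations.append("second")
packed = join_text_lines(violations)
unpacked = split_text_lines(packed)
print(repr(packed))
```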

Setting up to test the Python rulebase

Check out the code from:

http://github.com/jasoncoposky/irods_capability_integrity

Stage the Python modules (and make sure we've removed the native rulesets from /etc/irods/server_config.json). From the repo directory:

sudo cp *.py /etc/irods/.

Place imports for the modules containing our Python rules in the core module (/etc/irods/core.py):

from verify_replica_number import *
from verify_replica_placement import *
from verify_checksum import *

Testing the replica placement policy

The Python version of the rule launch script is as follows:

def main (rule_args, callback, rei):
    from irods_capability_integrity_utils import (split_text_lines, )
    violations = ""
    retval = callback.verify_replica_placement(violations)
    violations = retval ['arguments'][0]
    for v in split_text_lines(violations):
        callback.writeLine("stdout", v)

INPUT null
OUTPUT ruleExecOut

Test the Python replica placement policy as shown:

irule -r irods_rule_engine_plugin-python-instance  \
  -F execute_replica_placement_policy_Py.r

Questions?