Policy Training
Native Rule Language
Jason Coposky
@jason_coposky
Executive Director, iRODS Consortium
August 3-6, 2020
KU Leuven Training
Webinar Presentation
The Native Rule Language
The iRODS Rule Language is a domain specific language (DSL) provided by iRODS to define policies and actions in the system.
The iRODS Rule Language provides a syntax similar to C
rule_name(*rule, *parameters)
{
# a comment in a rule
writeLine("stdout", "Hello, World!")
0 # return value
}
Documentation can be found here:
Boolean Values
Boolean literals include true and false
Boolean operations include
! # not
&& # and
|| # or
Numeric Values
Numeric literals include integers and doubles
Numeric operations include
- # Negation
^ # Power
* # Multiplication
/ # Division
% # Modulo
- # Subtraction
+ # Addition
> # Greater than
< # Less than
>= # Greater than or equal
<= # Less than or equal
Numeric functions include
exp(<num>)
log(<num>)
abs(<num>)
floor(<num>)
ceiling(<num>)
average(<num>,<num>,...)
max(<num>,<num>,...)
min(<num>,<num>,...)
String Values
String literals include 'I am a string' and "I am also a string"
String operations include
str() # converts other values to strings
int() # converts a string to an integer
double() # converts a string to a double
bool() # converts a string to a boolean
++ # concatenates two strings
like # wildcard comparison of two strings
like regex # regular expression matching
substr() # extract a substring from a string
strlen() # compute the length of a string
split() # split a string on a given character
triml() # trim left to a given character
trimr() # trim right to a given character
Variables may be expanded in a string 'I am a variable: *x'
The * must be escaped to be a literal 'I am not a variable: \*x'
Key Value Pairs
The rule language provides a dictionary style data structure:
*var.key = "value"
For example:
*A.a="A";
*A.b="B";
*A.c="C";
str(*A); # a=A++++b=B++++c=C
Currently only string values are supported
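The string form shown above (`a=A++++b=B++++c=C`) is flat text: `++++` separates the pairs and `=` separates a key from its value. The same form appears later in this deck as a microservice option string. As a rough illustration outside iRODS, the Python sketch below builds and parses that form. The function names are hypothetical, not part of any iRODS API, and the sketch assumes keys and values never contain either separator.

```python
KVP_SEPARATOR = "++++"  # separator seen in the str(*A) output above


def format_kvp(pairs):
    """Join a dict of string keys and values into the key=value++++key=value form."""
    return KVP_SEPARATOR.join(f"{key}={value}" for key, value in pairs.items())


def parse_kvp(text):
    """Split a key=value++++key=value string back into a dict (assumed format)."""
    result = {}
    for item in text.split(KVP_SEPARATOR):
        if not item:
            continue  # tolerate an empty string or a trailing separator
        key, _, value = item.partition("=")
        result[key] = value
    return result


print(format_kvp({"a": "A", "b": "B", "c": "C"}))
print(parse_kvp("forceChksum=++++replNum=1"))
```

A key with nothing after `=` parses to an empty string value.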
Lists
The rule language provides a list style data structure:
*x = list('this', 'is', 'a', 'list')
List operations include:
*x = list('this', 'is', 'a', 'list')
elem(*x, 1) # extracts an element, evaluates to 'is'
setelem(*x, 1,"isn't") # sets an element, replaces 'is' with 'isn't'
size(*x) # computes the size of a list, evaluates to 4
hd(*x) # head of the list, returns 'this'
tl(*x) # tail of the list, returns ('is', 'a', 'list')
cons('foo', *x) # prepends an element to a list,
# returns ('foo', 'this', 'is', 'a', 'list')
All entries in a list must be of the same type
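For readers more at home in a general-purpose language, the list operations above can be modelled in a few lines of Python. This is only a sketch of the semantics as described on the slide: it assumes each operation returns a new value and leaves its argument untouched, and `make_list` is a hypothetical helper that mimics the same-type rule.

```python
def make_list(*items):
    """Build a list, rejecting mixed element types (models the same-type rule)."""
    if items and any(type(item) is not type(items[0]) for item in items):
        raise TypeError("all entries in a list must be of the same type")
    return list(items)


def elem(items, index):
    """Return the element at a zero-based index."""
    return items[index]


def setelem(items, index, value):
    """Return a copy of the list with one element replaced (assumed non-mutating)."""
    copy = list(items)
    copy[index] = value
    return copy


def hd(items):
    """Head: the first element."""
    return items[0]


def tl(items):
    """Tail: everything after the first element."""
    return items[1:]


def cons(item, items):
    """Prepend an element, returning a new list."""
    return [item] + items


x = make_list("this", "is", "a", "list")
print(elem(x, 1), hd(x), tl(x), cons("foo", x))
```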
Flow Control
The rule language provides a standard if - then - else structure
For Example:
if(*x == 'one') {
# code for case one
}
else if(*x == 'two') {
# code for case two
}
else {
# code for default
}
Iteration
The rule language provides foreach and while constructs for iteration
For Example:
*x = list('this', 'is', 'a', 'list')
foreach(*e in *x) {
writeLine('stdout', 'element *e')
}
*y = 0
while(*y < 10) {
writeLine('stdout', 'Hello, World!')
*y = *y + 1
}
Error Handling
The rule language provides errormsg and errorcode constructs to capture and manage errors from microservices
For Example:
*logical_path = '/tempZone/home/rods/example.txt'
*err = errorcode(msiObjStat(*logical_path, *stat))
if(*err < 0) {
writeLine('serverLog', "msiObjStat failed for *logical_path")
}
*err = errormsg(msiObjStat(*logical_path, *stat), *msg)
if(*err < 0) {
writeLine('serverLog', "msiObjStat failed for *logical_path with message *msg")
}
If the error is not captured properly the rule will error out
Alternatively fail() and failmsg() allow a rule to report errors
For Example:
*logical_path = '/tempZone/home/rods/example.txt'
*err = errorcode(msiObjStat(*logical_path, *stat))
if(*err < 0) {
writeLine('serverLog', "msiObjStat failed for *logical_path")
fail(*err)
}
*err = errormsg(msiObjStat(*logical_path, *stat), *msg)
if(*err < 0) {
*msg = "msiObjStat failed for *logical_path with message *msg"
writeLine('serverLog', *msg)
failmsg(*err, *msg)
}
Language Integrated General Query
Provides a coupling between catalog queries and the rule language
Follows the General Query syntax, much like iquest
For Example:
*data_name = 'example.txt'
*coll_name = '/tempZone/home/rods'
*query = SELECT RESC_NAME, DATA_REPL_NUM WHERE COLL_NAME = '*coll_name' AND DATA_NAME = '*data_name'
foreach(*row in *query) {
*resc_name = *row.RESC_NAME
*repl_num = *row.DATA_REPL_NUM
writeLine('stdout', 'replicas found for *coll_name/*data_name on *resc_name, *repl_num')
}
The row returned is a key value structure whose keys are the column names
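The "row as key value structure" idea maps directly onto a list of dictionaries. The Python sketch below mirrors the loop above over in-memory rows; `describe_replicas` is a hypothetical name, and the rows are hand-written stand-ins for query results, with every value held as a string.

```python
def describe_replicas(rows, coll_name, data_name):
    """Build one message per result row, reading columns by name."""
    messages = []
    for row in rows:
        resc_name = row["RESC_NAME"]
        repl_num = row["DATA_REPL_NUM"]
        messages.append(
            f"replicas found for {coll_name}/{data_name} on {resc_name}, {repl_num}"
        )
    return messages


# Stand-in rows; a real query would return one row per replica.
rows = [
    {"RESC_NAME": "demoResc", "DATA_REPL_NUM": "0"},
    {"RESC_NAME": "ufs0", "DATA_REPL_NUM": "1"},
]
for message in describe_replicas(rows, "/tempZone/home/rods", "example.txt"):
    print(message)
```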
Example Policy Implementations
Consider 3 use cases:
- Do we have a sufficient number of replicas?
- Are the replicas in the correct locations?
- Is the data correct at rest?
How do we provide these assertions and guarantees?
When do we enforce this policy?
How do we know we should enforce this policy?
What do we do when policy is in violation?
Example code can be found here:
Clone the repository and stage the rule bases:
git clone https://github.com/jasoncoposky/irods_capability_integrity
sudo cp irods_capability_integrity/*.re /etc/irods
Data Integrity Policy - Replica Number
Provides checks around the number of replicas
Driven by collection metadata:
irods::verification::replica_number <positive integer>
For example:
imkdir placement_policy
imeta set -C placement_policy irods::verification::replica_number 3
Data Integrity Policy - Replica Number
# Single point of truth for an error value
get_error_value(*err) { *err = "ERROR_VALUE" }
# The code to return for the rule engine plugin framework to look for additional PEPs to fire.
RULE_ENGINE_CONTINUE { 5000000 }
# Error code if input is incorrect
SYS_INVALID_INPUT_PARAM { -130000 }
# metadata attribute driving policy for replica number
verify_replica_number_attribute { "irods::verification::replica_number" }
verify_replica_number(*violations)
{
*attr = verify_replica_number_attribute
# get a list of all matching collections given the metadata attribute
foreach(*row0 in SELECT COLL_NAME, META_COLL_ATTR_VALUE WHERE META_COLL_ATTR_NAME = "*attr") {
*number_of_replicas = int(*row0.META_COLL_ATTR_VALUE)
*coll_name = *row0.COLL_NAME
# get a list of all data objects in the given collection
foreach(*row1 in SELECT COLL_NAME, DATA_NAME WHERE COLL_NAME like "*coll_name%") {
*matched = 0
*coll_name = *row1.COLL_NAME
*data_name = *row1.DATA_NAME
# get all of the resource names where this object's replicas reside
foreach(*row2 in SELECT RESC_NAME WHERE COLL_NAME = "*coll_name" AND DATA_NAME = "*data_name") {
*matched = *matched + 1
} # for resources
if(*matched < *number_of_replicas) {
*violations = cons("*coll_name/*data_name violates the number policy " ++ str(*number_of_replicas), *violations)
}
} # for objects
} # for collections
} # verify_replica_number
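The rule above boils down to: count the replicas of every object under a policy collection and report any object with fewer than the required number. A minimal Python sketch of that logic over in-memory data follows. The input shapes are assumptions made for illustration (a dict of collection to required count, and a list of replica tuples); nothing here calls iRODS.

```python
from collections import Counter


def verify_replica_number(policies, replicas):
    """Report objects with fewer replicas than their collection's policy requires.

    policies: dict mapping a collection path to the required replica count.
    replicas: list of (coll_name, data_name, resc_name), one entry per replica.
    """
    violations = []
    # One row per replica, so counting rows per object gives the replica count.
    counts = Counter((coll, data) for coll, data, _resc in replicas)
    for policy_coll, required in policies.items():
        for (coll, data), found in counts.items():
            # Prefix match, mirroring COLL_NAME like "*coll_name%".
            if coll.startswith(policy_coll) and found < required:
                violations.append(f"{coll}/{data} violates the number policy {required}")
    return violations


policy_coll = "/tempZone/home/rods/placement_policy"
replicas = [
    (policy_coll, "a.txt", "demoResc"),
    (policy_coll, "a.txt", "ufs0"),
    (policy_coll, "a.txt", "ufs1"),
    (policy_coll, "b.txt", "demoResc"),
]
for violation in verify_replica_number({policy_coll: 3}, replicas):
    print(violation)
```

Because the match is a plain prefix match, a sibling collection whose path merely starts with the policy collection's path is also swept in; that holds for the sketch and for a `like "...%"` condition alike.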
Data Integrity Policy - Replica Number
execute_replica_number_policy {
*violations = list()
verify_replica_number(*violations)
foreach(*v in *violations) {
writeLine("stdout", "*v")
}
}
INPUT null
OUTPUT ruleExecOut
Executing the replica number policy
Test from the command line:
irule -r irods_rule_engine_plugin-irods_rule_language-instance -F execute_replica_number_policy.r
Data Integrity Policy - Replica Placement
Provides checks around the location of replicas on specific resources
Driven by collection metadata:
irods::verification::replica_placement <list of resource names>
For example:
imkdir placement_policy
imeta set -C placement_policy irods::verification::replica_placement "demoResc, ufs0"
Data Integrity Policy - Replica Placement
# Single point of truth for an error value
get_error_value(*err) { *err = "ERROR_VALUE" }
# The code to return for the rule engine plugin framework to look for additional PEPs to fire.
RULE_ENGINE_CONTINUE { 5000000 }
# Error code if input is incorrect
SYS_INVALID_INPUT_PARAM { -130000 }
# metadata attribute driving policy for replica placement
verify_replicas_attribute { "irods::verification::replica_placement" }
verify_replica_placement(*violations)
{
*attr = verify_replicas_attribute
# get a list of all matching collections given the metadata attribute
foreach(*row0 in SELECT COLL_NAME, META_COLL_ATTR_VALUE WHERE META_COLL_ATTR_NAME = "*attr") {
*resource_list = *row0.META_COLL_ATTR_VALUE
*number_of_resources = size(split(*resource_list, ","))
*coll_name = *row0.COLL_NAME
# get a list of all data objects in the given collection
foreach(*row1 in SELECT COLL_NAME, DATA_NAME WHERE COLL_NAME like "*coll_name%") {
*matched = 0
*coll_name = *row1.COLL_NAME
*data_name = *row1.DATA_NAME
# get all of the resource names where this object's replicas reside
foreach(*row2 in SELECT RESC_NAME WHERE COLL_NAME = "*coll_name" AND DATA_NAME = "*data_name") {
*resource_name = *row2.RESC_NAME
# compare this resource against each name in the policy's resource list
*split_list = split(*resource_list, ",")
while(size(*split_list) > 0) {
# pull head of list
*name = str(hd(*split_list))
# subset remainder of list
*split_list = tl(*split_list)
# chomp space
*name = triml(*name, ' ')
*name = trimr(*name, ' ')
# count a match when the replica resides on a listed resource
if(*name == *resource_name) {
*matched = *matched + 1
}
}
} # for resources
if(*matched < *number_of_resources) {
*violations = cons("*coll_name/*data_name violates the placement policy "
++ *resource_list, *violations)
}
} # for objects
} # for collections
} # verify_replica_placement
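The placement rule splits the comma-separated metadata value, trims each name, and counts how many of an object's replicas sit on a listed resource; fewer matches than listed resources is a violation. The Python sketch below restates that logic over in-memory data. The input shapes are assumptions for illustration, and `str.strip` is used as an approximation of the `triml`/`trimr` pair.

```python
def parse_resource_list(value):
    """Split "demoResc, ufs0" into clean resource names."""
    return [name.strip(" ") for name in value.split(",")]


def verify_replica_placement(policies, replicas):
    """Report objects that lack a replica on some resource named by the policy.

    policies: dict mapping a collection path to a comma-separated resource list.
    replicas: list of (coll_name, data_name, resc_name), one entry per replica.
    """
    by_object = {}
    for coll, data, resc in replicas:
        by_object.setdefault((coll, data), []).append(resc)

    violations = []
    for policy_coll, resource_list in policies.items():
        required = parse_resource_list(resource_list)
        for (coll, data), resources in by_object.items():
            if not coll.startswith(policy_coll):
                continue  # prefix match, mirroring COLL_NAME like "*coll_name%"
            matched = sum(1 for resc in resources if resc in required)
            if matched < len(required):
                violations.append(
                    f"{coll}/{data} violates the placement policy {resource_list}"
                )
    return violations


policy_coll = "/tempZone/home/rods/placement_policy"
replicas = [
    (policy_coll, "a.txt", "demoResc"),
    (policy_coll, "a.txt", "ufs0"),
    (policy_coll, "b.txt", "demoResc"),
]
for violation in verify_replica_placement({policy_coll: "demoResc, ufs0"}, replicas):
    print(violation)
```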
Data Integrity Policy - Replica Placement
execute_replica_placement_policy {
*violations = list()
verify_replica_placement(*violations)
foreach(*v in *violations) {
writeLine("stdout", "*v")
}
}
INPUT null
OUTPUT ruleExecOut
Executing the replica placement policy
Test from the command line:
irule -r irods_rule_engine_plugin-irods_rule_language-instance -F execute_replica_placement_policy.r
Data Integrity Policy - Replica Checksum
Provides checks around the integrity of replica data at rest
Driven by collection metadata:
irods::verification::replica_checksum <checksum type>
For example:
imkdir placement_policy
imeta set -C placement_policy irods::verification::replica_checksum sha256
Data Integrity Policy - Replica Checksum
# Single point of truth for an error value
get_error_value(*err) { *err = "ERROR_VALUE" }
# The code to return for the rule engine plugin framework to look for additional PEPs to fire.
RULE_ENGINE_CONTINUE { 5000000 }
# Error code if input is incorrect
SYS_INVALID_INPUT_PARAM { -130000 }
# metadata attribute driving policy for replica checksum
verify_checksum_attribute { "irods::verification::replica_checksum" }
verify_replica_checksum(*all_flag, *resource_name, *violations)
{
*attr = verify_checksum_attribute
# get a list of all matching collections given the metadata attribute
foreach(*row0 in SELECT COLL_NAME, META_COLL_ATTR_VALUE WHERE META_COLL_ATTR_NAME = "*attr") {
*coll_name = *row0.COLL_NAME
# get a list of all data objects in the given collection
foreach(*row1 in SELECT COLL_NAME, DATA_NAME WHERE COLL_NAME like "*coll_name%") {
*coll_name = *row1.COLL_NAME
*data_name = *row1.DATA_NAME
if(true == *all_flag) {
foreach(*row2 in SELECT DATA_REPL_NUM, DATA_CHECKSUM WHERE COLL_NAME = "*coll_name" AND DATA_NAME = "*data_name") {
*repl_num = *row2.DATA_REPL_NUM
*checksum = *row2.DATA_CHECKSUM
msiDataObjChksum("*coll_name/*data_name", "forceChksum=++++replNum=*repl_num", *out)
if(*checksum != *out) {
*violations = cons("*coll_name/*data_name violates the checksum policy *out vs *checksum", *violations)
}
} # for resources
}
else {
foreach(*row2 in SELECT DATA_REPL_NUM, DATA_CHECKSUM WHERE COLL_NAME = "*coll_name" AND DATA_NAME = "*data_name" AND RESC_NAME = "*resource_name") {
*repl_num = *row2.DATA_REPL_NUM
*checksum = *row2.DATA_CHECKSUM
msiDataObjChksum("*coll_name/*data_name", "forceChksum=++++replNum=*repl_num", *out)
if(*checksum != *out) {
*violations = cons("*coll_name/*data_name violates the checksum policy *out vs *checksum", *violations)
}
} # for resources
}
} # for objects
} # for collections
} # verify_replica_checksum
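The checksum rule recomputes each replica's checksum and compares it with the value stored in the catalog. The Python sketch below shows the same compare-and-report step on bytes held in memory. It assumes the sha256 scheme, where the stored value is the prefix `sha2:` followed by the base64-encoded digest; the function names are hypothetical and nothing here reads from iRODS.

```python
import base64
import hashlib


def sha2_checksum(data):
    """Return a sha2:-prefixed, base64-encoded SHA-256 digest (assumed format)."""
    digest = hashlib.sha256(data).digest()
    return "sha2:" + base64.b64encode(digest).decode("ascii")


def checksum_violation(logical_path, catalog_checksum, data):
    """Return a violation message if the data no longer matches the catalog value."""
    computed = sha2_checksum(data)
    if computed != catalog_checksum:
        return (
            f"{logical_path} violates the checksum policy "
            f"{computed} vs {catalog_checksum}"
        )
    return None


path = "/tempZone/home/rods/placement_policy/example.txt"
registered = sha2_checksum(b"original contents")
print(checksum_violation(path, registered, b"original contents"))
print(checksum_violation(path, registered, b"corrupted contents"))
```

The first call prints `None`, since the bytes are unchanged; the second prints a violation message carrying both checksums.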
Data Integrity Policy - Replica Checksum
execute_replica_checksum_policy {
*violations = list()
# all_flag : checksum all replicas
# resource_name : if all_flag is false, provide a resource name
# violations : list of violating objects
verify_replica_checksum(true, "", *violations)
foreach(*v in *violations) {
writeLine("stdout", "*v")
}
}
INPUT null
OUTPUT ruleExecOut
Executing the replica checksum policy
Test from the command line:
irule -r irods_rule_engine_plugin-irods_rule_language-instance -F execute_replica_checksum_policy.r
Configuring the Policies
Place the rule base files into /etc/irods
Add to /etc/irods/server_config.json for the rule language plugin:
"rule_engines": [
{
"instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
"shared_memory_instance": "irods_rule_language_rule_engine",
"plugin_specific_configuration": {
"re_rulebase_set": [
"verify_replica_placement",
"verify_replica_number",
"verify_checksum",
"core"
],
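Editing the rule base list by hand is easy to get wrong, so the step can also be scripted. The Python sketch below adds rule base names ahead of `core` for a named rule engine instance, matching the order shown on this slide. It works on an already-parsed dictionary and assumes the caller passes the object that directly holds the `rule_engines` list; `add_rulebases` is a hypothetical helper, not an iRODS tool.

```python
import json


def add_rulebases(config, instance_name, rulebases):
    """Insert missing rule base names before "core" for one rule engine instance.

    Returns True if the instance was found, False otherwise.
    """
    for engine in config["rule_engines"]:
        if engine["instance_name"] != instance_name:
            continue
        current = engine["plugin_specific_configuration"]["re_rulebase_set"]
        missing = [name for name in rulebases if name not in current]
        # Keep the new rule bases ahead of "core"; append if "core" is absent.
        position = current.index("core") if "core" in current else len(current)
        current[position:position] = missing
        return True
    return False


config = {
    "rule_engines": [
        {
            "instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
            "plugin_specific_configuration": {"re_rulebase_set": ["core"]},
        }
    ]
}
add_rulebases(
    config,
    "irods_rule_engine_plugin-irods_rule_language-instance",
    ["verify_replica_placement", "verify_replica_number", "verify_checksum"],
)
print(json.dumps(config, indent=4))
```

Running the helper twice leaves the list unchanged the second time, since names already present are skipped.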
Enforcing Policies
Synchronous Enforcement
process_violations(*v)
{
# auditing code?
# reporting code?
# recovery code?
}
pep_api_data_obj_put_post(*INSTANCE_NAME, *COMM, *DATAOBJINP, *BUFFER, *PORTAL_OPR_OUT)
{
*logical_path = *DATAOBJINP.obj_path
verify_replica_checksum_for_object(true, "", *violations)
verify_replica_number_for_object(*violations)
verify_replica_placement_for_object(*violations)
process_violations(*violations)
}
(our implementations were designed to be asynchronous)
Asynchronous Enforcement
execute_replica_checksum_policy
{
*violations = list()
delay("<EF>7d REPEAT FOR EVER</EF>") {
verify_replica_checksum(true, "", *violations)
verify_replica_number(*violations)
verify_replica_placement(*violations)
process_violations(*violations)
}
}
INPUT null
OUTPUT ruleExecOut
Questions?