Provenance
June 13-15, 2017
iRODS User Group Meeting 2017
Utrecht, Netherlands
Terrell Russell, Ph.D.
@terrellrussell
Interim Chief Technologist, iRODS Consortium
Provenance
Definition
Provenance is:
1) the place of origin or earliest known history of something.
2) the beginning of something's existence; something's origin.
3) a record of ownership of a valued object, work of art or an antique, used as a guide to authenticity or quality.
Provenance in iRODS
We are interested in the third definition:
3) a record of ownership of a valued object, work of art or an antique, used as a guide to authenticity or quality.
We care about what has happened to our system and to the objects contained within it.
Provenance in iRODS
iRODS allows an organization to decide how much effort it wishes to expend on questions of data provenance.
The iRODS Rule Engine Plugin Framework (REPF) provides Policy Enforcement Points (PEPs), or event hooks, at which an organization can execute arbitrary code. Every operation within the system (iput, iget, authentication, replication, database query, etc.) fires hundreds of PEPs.
Depending on an organization's goals, PEPs can be programmed to log information for later query, to feed a specifically crafted report, or both.
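As an illustration of a PEP programmed to log information, here is a minimal sketch in the style of the iRODS Python rule engine plugin, where a rule is a function taking `rule_args`, `callback`, and `rei`, and `callback.writeLine("serverLog", ...)` writes to the server log. The record fields, the `extract_context` helper, and the `RecordingCallback` stand-in are all hypothetical; how user and path are actually read from `rule_args` and `rei` differs per PEP and per iRODS version, so that part is left as an assumption.

```python
import json
import time


def extract_context(rule_args, rei):
    # Hypothetical helper: in a real deployment the user and object path
    # would be read from PEP-specific structures. Here rule_args is a
    # plain dict stand-in so the sketch can run outside iRODS.
    return {"user": rule_args.get("user"), "path": rule_args.get("path")}


def log_pep(pep_name, context, callback, now=None):
    # Build one provenance record and hand it to the server log as JSON,
    # so that a log shipper can index it for later query.
    record = {"timestamp": now if now is not None else time.time(),
              "pep": pep_name}
    record.update(context)
    callback.writeLine("serverLog", json.dumps(record, sort_keys=True))


def pep_api_data_obj_put_post(rule_args, callback, rei):
    # Intended to fire after a data object is put (for example via iput).
    log_pep("pep_api_data_obj_put_post",
            extract_context(rule_args, rei), callback)


class RecordingCallback:
    # Stand-in for the callback object iRODS passes to Python rules;
    # it only collects what would have been written.
    def __init__(self):
        self.lines = []

    def writeLine(self, target, message):
        self.lines.append((target, message))


if __name__ == "__main__":
    cb = RecordingCallback()
    pep_api_data_obj_put_post(
        {"user": "alice", "path": "/tempZone/home/alice/a.txt"}, cb, None)
    print(cb.lines[0][1])
```

Emitting one JSON object per line keeps the hook itself trivial and pushes searching and reporting to whatever tool consumes the log.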
An organization can start with a reactive provenance model in which every PEP is logged and the logs can be searched (e.g. with an Elastic Stack query). This could provide enough information to answer the occasional ad hoc inquiry into how iRODS is being used.
Alternatively, in a more proactive model, targeted log entries can be aggregated and included in regulatory or executive reports.
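The aggregation step of the proactive model might look like the sketch below. It assumes the targeted PEPs wrote one JSON object per log line with hypothetical `user` and `pep` fields; neither that layout nor the `summarize` function comes from iRODS itself.

```python
import json
from collections import Counter


def summarize(log_lines):
    # Count logged events per user and per PEP, skipping any line that
    # is not a JSON object (for example unrelated server log output).
    per_user = Counter()
    per_pep = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except ValueError:
            continue
        if not isinstance(record, dict):
            continue
        per_user[record.get("user", "unknown")] += 1
        per_pep[record.get("pep", "unknown")] += 1
    return {
        "total": sum(per_pep.values()),
        "by_user": dict(per_user),
        "by_pep": dict(per_pep),
    }


if __name__ == "__main__":
    sample = [
        '{"pep": "pep_api_data_obj_put_post", "user": "alice"}',
        '{"pep": "pep_api_data_obj_put_post", "user": "bob"}',
        '{"pep": "pep_api_data_obj_get_post", "user": "alice"}',
        'unrelated server log line',
    ]
    print(json.dumps(summarize(sample), indent=2, sort_keys=True))
```

A summary like this could then be formatted into whatever layout a regulatory or executive report requires.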
Engineering Tradeoffs