Provenance

June 5-7, 2018

iRODS User Group Meeting 2018

Durham, NC

Justin James

Applications Engineer, iRODS Consortium

Provenance

Definition

Provenance is:

1) the place of origin or earliest known history of something.

2) the beginning of something's existence; something's origin.

3) a record of ownership of a valued object, work of art or an antique, used as a guide to authenticity or quality.

Provenance in iRODS

We are interested in the third definition:

3) a record of ownership of a valued object, work of art or an antique, used as a guide to authenticity or quality.

We care about what has happened to our system and the objects contained within it:

  • data objects
  • collections
  • resources
  • users
  • groups
  • zones
  • metadata about all of these

Provenance in iRODS

iRODS allows an organization to decide how much effort it wishes to expend on questions of data provenance.

 

The iRODS Rule Engine Plugin Framework (REPF) provides Policy Enforcement Points (PEPs), or event hooks, at which an organization can execute arbitrary code. Every operation within the system (iput, iget, authentication, replication, database query, etc.) fires hundreds of PEPs.
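
As a rough illustration of what hooking a PEP can look like, the sketch below assumes the iRODS Python rule engine plugin, where a rule is a function taking `(rule_args, callback, rei)` and is named after the PEP it handles. The PEP name `pep_api_data_obj_put_post`, the message text, and the `StubCallback` class are assumptions for illustration. The stub exists only so the snippet can run outside an iRODS server.

```python
class StubCallback:
    """Hypothetical stand-in for the callback object an iRODS server passes in."""

    def __init__(self):
        self.lines = []

    def writeLine(self, target, message):
        # Record the message instead of writing to the real server log.
        self.lines.append((target, message))


def pep_api_data_obj_put_post(rule_args, callback, rei):
    """Assumed post-PEP for a data object upload: note that the event fired."""
    callback.writeLine("serverLog", "PEP fired: pep_api_data_obj_put_post")


if __name__ == "__main__":
    cb = StubCallback()
    pep_api_data_obj_put_post([], cb, None)
    print(cb.lines)
```

In a real deployment the function body is where an organization's own policy would go; logging is only the simplest case.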

 

Depending on an organization's goals, PEPs can be programmed to log information for later query, to feed a specifically crafted report, or both.

 

An organization can start with a reactive provenance model, where every PEP is logged and can be searched (e.g. with an Elastic Stack query). This could provide enough information to answer the occasional ad hoc inquiry into how iRODS is being used.
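
A minimal sketch of the reactive model, under stated assumptions: each PEP invocation is written as one JSON line, and an ad hoc question is answered by filtering those lines afterwards. The field names (`pep`, `user`, `path`, `timestamp`) and both helper functions are hypothetical; in practice the search would more likely be a query against a log indexer such as the Elastic Stack than a Python loop.

```python
import json


def make_record(pep, user, path, timestamp):
    """Serialize one PEP event as a JSON line (hypothetical field names)."""
    return json.dumps(
        {"pep": pep, "user": user, "path": path, "timestamp": timestamp},
        sort_keys=True,
    )


def search(lines, **criteria):
    """Return the logged events whose fields match every given criterion."""
    matches = []
    for line in lines:
        record = json.loads(line)
        if all(record.get(key) == value for key, value in criteria.items()):
            matches.append(record)
    return matches


if __name__ == "__main__":
    # Made-up sample events, for illustration only.
    log = [
        make_record("pep_api_data_obj_put_post", "alice",
                    "/tempZone/home/alice/a.txt", "2018-06-05T10:00:00Z"),
        make_record("pep_api_data_obj_get_post", "bob",
                    "/tempZone/home/alice/a.txt", "2018-06-05T11:00:00Z"),
    ]
    for hit in search(log, user="alice"):
        print(hit["pep"], hit["path"])
```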

 

Or, with a more proactive model, targeted logging could be aggregated and included in regulatory or executive reports.
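
The proactive model could be sketched as follows, assuming events have already been logged as dictionaries with `user` and `pep` fields (hypothetical names). Only the PEPs chosen in advance are counted, and the counts are rendered as a small CSV table that a report could include. The functions `summarize` and `format_report` are illustrative, not part of iRODS.

```python
from collections import Counter


def summarize(records, peps_of_interest):
    """Count events per (user, pep), keeping only the targeted PEPs."""
    counts = Counter()
    for record in records:
        if record["pep"] in peps_of_interest:
            counts[(record["user"], record["pep"])] += 1
    return counts


def format_report(counts):
    """Render the counts as CSV text, sorted by user and then PEP name."""
    lines = ["user,pep,count"]
    for (user, pep), n in sorted(counts.items()):
        lines.append(f"{user},{pep},{n}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample events, for illustration only.
    events = [
        {"user": "alice", "pep": "pep_api_data_obj_put_post"},
        {"user": "alice", "pep": "pep_api_data_obj_put_post"},
        {"user": "bob", "pep": "pep_api_data_obj_put_post"},
        {"user": "bob", "pep": "pep_api_data_obj_get_post"},
    ]
    print(format_report(summarize(events, {"pep_api_data_obj_put_post"})))
```

Deciding up front which PEPs feed the report keeps the aggregated output small, at the cost of not being able to answer questions nobody anticipated.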

Engineering Tradeoffs

UGM 2018 - Provenance

By iRODS Consortium

iRODS User Group Meeting 2018 - Advanced Training Module
