Metadata Templates

May 17, 2017

Renaissance Computing Institute

UNC-Chapel Hill

Rick Skarbez, Ph.D.

rskarbez@renci.org

Systems Programmer, iRODS Consortium

iRODS and Metadata

One of the primary functions of iRODS is to connect unstructured data with metadata. Metadata may be attached to data objects, users, groups, collections, resources, and zones

 

iRODS stores metadata in the form of attribute-value-unit "triples" in a relational database

 

Once metadata is applied, it can be used:

  • for data discovery
  • to trigger actions defined in the iRODS rule engine
  • to drive computation

Why do we need Metadata Templates in iRODS?

In a default deployment of iRODS, metadata is stored as unadorned attribute-value-unit (AVU) triples of strings

 

As such, it is not possible, for example, to ensure that an attribute latitude actually contains a latitude value
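To make the limitation concrete, here is a minimal Python sketch. The `AVU` tuple and the `is_valid_latitude` helper are illustrative names, not part of iRODS. The point is only that a triple of strings accepts any value, so a range check has to live somewhere outside the triple itself.

```python
from typing import NamedTuple


class AVU(NamedTuple):
    """An attribute-value-unit triple; all three fields are plain strings."""
    attribute: str
    value: str
    unit: str = ""


def is_valid_latitude(avu: AVU) -> bool:
    """Return True if the value parses as a number between -90 and 90."""
    try:
        latitude = float(avu.value)
    except ValueError:
        return False
    return -90.0 <= latitude <= 90.0


good = AVU("latitude", "35.9", "degrees")
# Nothing in a plain string triple prevents this value from being stored.
bad = AVU("latitude", "Chapel Hill", "degrees")

print(is_valid_latitude(good), is_valid_latitude(bad))
```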

 

Our goal is to provide users, curators, and grid administrators more control over metadata, without reimplementing how iRODS actually handles metadata

What are Metadata Templates?

In iRODS, Metadata Templates are JSON files that map metadata triples (AVUs) to more detailed information about their values

 

For example, a metadata template might indicate that a given AVU is part of Dublin Core metadata on an object, that it was added to the object by Rick Skarbez, or that it is automatically updated based on data in the iRODS catalog

What can you do with Metadata Templates?

Metadata Templates (MTs) give user interfaces the guidance they need to present AVUs in a user-friendly way

 

For example, a value can be an integer that must fall within a range, and thus can be represented in the UI as a slider
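A UI could derive such a hint from an element's type and validation style. The sketch below is an assumption about how a client might do this: `widget_for` and the widget names are hypothetical, and only the integer-in-a-range-to-slider mapping comes from the slide above. The list form of `validationOptions` is also assumed.

```python
def widget_for(element: dict) -> str:
    """Pick a (hypothetical) UI widget name for a metadata element."""
    element_type = element.get("type")
    style = element.get("validationStyle", "DEFAULT")
    if element_type in ("RAW_INT", "RAW_FLOAT") and style in (
        "IN_RANGE",
        "IN_RANGE_EXCLUSIVE",
    ):
        return "slider"
    if style == "IN_LIST":
        return "dropdown"  # assumption: a fixed list of choices suits a dropdown
    if element_type == "RAW_BOOLEAN":
        return "checkbox"  # assumption
    return "text_field"


rating = {
    "name": "rating",
    "type": "RAW_INT",
    "validationStyle": "IN_RANGE",
    "validationOptions": [1, 5],  # assumed format: [low, high]
}
print(widget_for(rating))
```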

MTs enable users and curators to standardize the metadata elements associated with a collection, and to quickly and uniformly apply common metadata elements to many iRODS objects

MTs enable curators to require certain metadata elements be populated on iRODS objects, and to validate those metadata elements with respect to their types and values
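As a rough illustration of the "required" check, the following Python sketch lists the required elements that have no populated AVU on an object. It assumes that an element's `name` is used directly as the AVU attribute, which may not match the actual implementation. `missing_required` and the `sample_site` template are hypothetical.

```python
from typing import List, Tuple


def missing_required(template: dict, avus: List[Tuple[str, str, str]]) -> List[str]:
    """Return names of required elements with no non-empty AVU value."""
    populated = {attribute for attribute, value, _unit in avus if value.strip()}
    return [
        element["name"]
        for element in template["elements"]
        if element.get("required") and element["name"] not in populated
    ]


template = {
    "name": "sample_site",  # hypothetical template
    "elements": [
        {"name": "latitude", "required": True},
        {"name": "longitude", "required": True},
        {"name": "notes", "required": False},
    ],
}
avus = [("latitude", "35.9", "degrees"), ("notes", "", "")]

print(missing_required(template, avus))
```

An upload could be rejected, or the user prompted, whenever the returned list is not empty.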

Metadata Template JSON Schema

Available on GitHub

    https://github.com/irods/irods_schema_metadata_templates

 

A Metadata Template consists of the following properties:

  • name
  • type
  • source
  • destination
  • description
  • author
  • version
  • required
  • elements
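For orientation, a template with these nine properties might look like the Python dictionary below, which is then round-tripped through JSON. The property names come from the list above, and `FORM_BASED`, `USER`, and `IRODS` are the currently supported enum values. Every other value, including the value types chosen for `version` and `required`, is a placeholder assumption; the schema on GitHub is the authority.

```python
import json

TEMPLATE_PROPERTIES = [
    "name", "type", "source", "destination", "description",
    "author", "version", "required", "elements",
]

template = {
    "name": "example_template",           # placeholder
    "type": "FORM_BASED",
    "source": "USER",
    "destination": "IRODS",
    "description": "Placeholder description",
    "author": "Placeholder author",
    "version": "0.1",                     # assumed to be a string
    "required": False,                    # assumed to be a boolean
    "elements": [],                       # metadata elements would go here
}

# Serialize to JSON text and read it back, as a template file would be.
text = json.dumps(template, indent=2)
restored = json.loads(text)

missing = [prop for prop in TEMPLATE_PROPERTIES if prop not in restored]
print(missing)
```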

Metadata Template type

An enum that indicates what type of Metadata Template it is

 

Currently, we only support FORM_BASED Metadata Templates

 

We have discussed adding, for example, support for Templates derived from schema.org schemas (possibly as a SCHEMA_REF type)

Metadata Template source

An enum that indicates where the data needed to populate the Metadata Template will come from

 

Currently, the only supported source is USER

 

We have discussed adding, for example, the ability to populate a Metadata Template with output from an iRODS rule (RULE) or a combination of rule output and user input (MIXED)

Metadata Template destination

An enum that indicates where the metadata will actually be stored

 

Currently, the only supported destination is IRODS, indicating that the metadata will be stored as AVU triples in the iRODS catalog

 

We have discussed adding, for example, the ability to store metadata in an external Postgres database, as is done in CyVerse

Metadata Element JSON Schema

Available on GitHub as part of the form_based_metadata_template schema

    https://github.com/irods/irods_schema_metadata_templates

 

A Metadata Element consists of the following properties:

  • name
  • i18nName
  • description
  • i18ndescription
  • type
  • source
  • defaultValue
  • validationStyle
  • validationOptions
  • required

Metadata Element type

Indicates what type of data is contained in the value

  • RAW_STRING
  • RAW_TEXT
  • RAW_URL
  • RAW_INT
  • RAW_FLOAT
  • RAW_BOOLEAN
  • RAW_DATE
  • RAW_TIME
  • RAW_DATETIME
  • REF_IRODS_QUERY
  • REF_IRODS_CATALOG
  • REF_URL
  • LIST_STRING
  • LIST_INT
  • LIST_FLOAT
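Because AVU values are stored as strings, a client needs some way to turn a stored value back into typed data. The Python sketch below shows one possible mapping for a subset of the types. The serialization choices are assumptions, not part of the schema: ISO 8601 text for dates and times, comma-separated text for lists, and `true`/`false` for booleans. `RAW_URL` and the `REF_*` types are left out, and `parse_value` is a hypothetical helper.

```python
from datetime import date, datetime, time


def parse_bool(text: str) -> bool:
    """Parse 'true'/'false' (case-insensitive); anything else is an error."""
    lowered = text.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"not a boolean: {text!r}")


def split_list(text: str) -> list:
    """Split comma-separated text into trimmed items (assumed list format)."""
    return [item.strip() for item in text.split(",")]


PARSERS = {
    "RAW_STRING": str,
    "RAW_TEXT": str,
    "RAW_INT": int,
    "RAW_FLOAT": float,
    "RAW_BOOLEAN": parse_bool,
    "RAW_DATE": date.fromisoformat,
    "RAW_TIME": time.fromisoformat,
    "RAW_DATETIME": datetime.fromisoformat,
    "LIST_STRING": split_list,
    "LIST_INT": lambda text: [int(item) for item in split_list(text)],
    "LIST_FLOAT": lambda text: [float(item) for item in split_list(text)],
}


def parse_value(element_type: str, text: str):
    """Convert a stored string into a Python value for the given element type."""
    if element_type not in PARSERS:
        raise ValueError(f"unsupported element type in this sketch: {element_type}")
    return PARSERS[element_type](text)


print(parse_value("RAW_INT", "42"))
print(parse_value("RAW_DATE", "2017-05-17"))
print(parse_value("LIST_FLOAT", "1.5, 2"))
```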

Metadata Element validationStyle

An enum that indicates how/if a metadata element will be validated

  • DEFAULT
  • IS
  • IN_LIST
  • IN_RANGE
  • IN_RANGE_EXCLUSIVE
  • REGEX
  • FOLLOW_REF
  • DO_NOT_VALIDATE
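The styles can be read as a small dispatch over the element's value and its validationOptions. The Python sketch below is an interpretation inferred from the style names, not the reference validator. It assumes that `IN_RANGE` includes its endpoints while `IN_RANGE_EXCLUSIVE` does not, that `DEFAULT` adds no check beyond the type, and that options arrive as a sequence. `FOLLOW_REF` is not attempted here.

```python
import re
from typing import Any, Sequence


def validate(value: Any, style: str, options: Sequence[Any] = ()) -> bool:
    """Return True if `value` passes the given validation style."""
    if style in ("DEFAULT", "DO_NOT_VALIDATE"):
        return True  # assumption: no extra check beyond the element type
    if style == "IS":
        return value == options[0]
    if style == "IN_LIST":
        return value in options
    if style == "IN_RANGE":
        low, high = options
        return low <= value <= high  # assumed inclusive endpoints
    if style == "IN_RANGE_EXCLUSIVE":
        low, high = options
        return low < value < high  # assumed exclusive endpoints
    if style == "REGEX":
        return re.fullmatch(options[0], str(value)) is not None
    raise NotImplementedError(f"style not covered by this sketch: {style}")


print(validate(5, "IN_RANGE", (1, 5)))
print(validate(5, "IN_RANGE_EXCLUSIVE", (1, 5)))
print(validate("NC-27599", "REGEX", (r"[A-Z]{2}-\d{5}",)))
```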

Components of the Metadata Template architecture

Parser

    Generates MetadataTemplate POJOs from JSON and vice versa

 

Validator

    Validates Metadata Templates and Metadata Elements

 

Resolver

    Handles find/list/CRUD operations on template files

 

Exporter

    Saves populated MetadataTemplates to permanent metadata store (i.e., the iRODS catalog)
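The Parser and Exporter roles can be sketched in a few lines. The components described above work with Java POJOs; the Python below is only a stand-in to show the data flow from JSON text to AVU triples. `parse_template` and `export_avus` are hypothetical names. The sketch assumes each element name becomes the AVU attribute and leaves the unit empty. Writing the triples to the catalog is out of scope.

```python
import json
from typing import Dict, List, Tuple


def parse_template(text: str) -> dict:
    """Parser role: turn JSON text into a template dictionary."""
    template = json.loads(text)
    for key in ("name", "elements"):
        if key not in template:
            raise ValueError(f"template is missing '{key}'")
    return template


def export_avus(template: dict, values: Dict[str, object]) -> List[Tuple[str, str, str]]:
    """Exporter role: flatten supplied values into (attribute, value, unit) triples."""
    avus = []
    for element in template["elements"]:
        name = element["name"]
        value = values.get(name, element.get("defaultValue"))
        if value is None:
            continue  # nothing supplied and no default: emit no AVU
        avus.append((name, str(value), ""))  # unit left empty in this sketch
    return avus


TEMPLATE_JSON = """
{
  "name": "sample_site",
  "elements": [
    {"name": "latitude", "type": "RAW_FLOAT", "required": true},
    {"name": "collector", "type": "RAW_STRING", "defaultValue": "unknown"},
    {"name": "notes", "type": "RAW_TEXT"}
  ]
}
"""

template = parse_template(TEMPLATE_JSON)
avus = export_avus(template, {"latitude": 35.9})
print(avus)
```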

Metadata Template REST API demo

An upload with required metadata templates