Data Management Design Patterns
Terrell Russell, Ph.D.
Chief Technologist, iRODS Consortium

November 14-16, 2017
Supercomputing 2017
Denver, CO

Why Data Management Matters

As data matures and reaches a broader community, data management policy must evolve to meet the additional requirements that the wider audience brings.

iRODS is
- Open Source
- Distributed
- Data Centric
- Metadata Driven
A flexible framework for the abstraction of infrastructure

iRODS as the Integration Layer

iRODS Build and Test - Today

Spring 2015 onward
- Jenkins → Python → Ansible → zone_bundles → vSphere dynamic VMs
Changes Since 2017
- CentOS 6 and Ubuntu 12 are no longer supported
- iRODS build logic moved out of Ansible
- Workflow to test all plugins
- run-script-on-irods-zone

20+ year legacy
- 10 years of federal funding for grid storage research
- 10 years of federal funding for policy engine research
- iRODS Consortium founded in 2013
Our Membership

Community Driven

Input from the Open Source Community
- Support Requests
- Community Feedback
- Working Groups
- Use Cases
- Proofs of Concept
All with the Expectation of Public Discourse and Disclosure
Discovered a common enabling practice...

(aka metadata)
Annotation with meaning

Annotation is both descriptive and prescriptive.
It is useful
- for discovery of the past and the present
- to direct the future
Metadata Everywhere

With the appropriate abstractions, everything in a system can be described with metadata, and therefore all actions within a system can be driven by that metadata.

Metadata Driven Patterns:
- Good Metadata (Templates)
- Landing Zone / Ingest
- Replication
- Tiering
- Archiving
- Auditing
- Data to Compute
- Compute to Data
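The idea that metadata can both describe data and drive actions on it can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the iRODS API: the `DataObject` class, the `find` and `apply_tiering` helpers, and the attribute names `last_access`, `tier.max_idle_days`, and `tier.target` are all hypothetical. The sketch uses the Tiering pattern as its example: each object carries its own policy as metadata, and the code only reads and obeys it.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class DataObject:
    """Hypothetical stand-in for a catalogued file and its annotations."""
    path: str
    resource: str                      # name of the storage tier holding the data
    metadata: dict = field(default_factory=dict)


def find(objects, attribute, value):
    """Descriptive use: discover objects by what their metadata says."""
    return [o for o in objects if o.metadata.get(attribute) == value]


# Attributes an object must carry before the tiering policy applies to it
# (names are assumptions made for this sketch).
REQUIRED = ("last_access", "tier.max_idle_days", "tier.target")


def apply_tiering(objects, today):
    """Prescriptive use: the metadata, not the code, decides what moves.

    An object moves to the tier named in 'tier.target' once it has gone
    unaccessed for more than 'tier.max_idle_days' days.
    Returns the paths of the objects that were moved.
    """
    moved = []
    for obj in objects:
        md = obj.metadata
        if any(key not in md for key in REQUIRED):
            continue                   # unannotated objects are left alone
        idle = today - date.fromisoformat(md["last_access"])
        limit = timedelta(days=int(md["tier.max_idle_days"]))
        if idle > limit and obj.resource != md["tier.target"]:
            obj.resource = md["tier.target"]
            moved.append(obj.path)
    return moved


if __name__ == "__main__":
    catalog = [
        DataObject("/zone/home/lab/run1.dat", "fast",
                   {"project": "genomics", "last_access": "2017-01-10",
                    "tier.max_idle_days": "30", "tier.target": "archive"}),
        DataObject("/zone/home/lab/notes.txt", "fast", {"project": "admin"}),
    ]
    print([o.path for o in find(catalog, "project", "genomics")])
    print(apply_tiering(catalog, today=date(2017, 11, 14)))
```

Changing the policy for an object here means changing its metadata rather than the code, which is the property the patterns in the list above rely on.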
Metadata Templates

iRODS Capabilities

From Prototype to Production

Provenance and Reporting

Data to Compute Pattern

Compute to Data Pattern

An open community-driven process
- is hard
- is slow
But, it also
- sets clear expectations
- generates a shared language
- produces a strong culture
- produces a better 'product'
- is worth it
Discovering Design Patterns

Thank you!
iRODS Consortium @iRODS
Booth #437
Terrell Russell, Ph.D.
Chief Technologist, iRODS Consortium