Together at Last:
Automated Ingest to Storage Tiering
December 5, 2018
TRiRODS
UNC, Chapel Hill
Alan King
Developer, iRODS Consortium

iRODS Capabilities

- Packaged and supported solutions
- Require configuration, not code
- Derived from the majority of use cases observed in the user community








Automated Ingest


- Built on top of the Python iRODS Client with Celery and Redis
- Walks the filesystem and populates/updates the catalog in parallel
- Extracts and annotates metadata via user-defined callbacks
- Provides a "front door" to the iRODS namespace for your data
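The walk-and-callback idea above can be sketched with the standard library alone. This is only an illustration of the pattern, not the ingest tool's code: a thread pool stands in for the Celery workers and Redis queue, a plain dict stands in for the iRODS catalog, and the names `scan` and `extract_metadata` are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def extract_metadata(path):
    """User-defined callback (hypothetical): derive metadata from one file."""
    info = os.stat(path)
    return {"size_bytes": info.st_size, "extension": os.path.splitext(path)[1]}


def scan(root, callback, workers=4):
    """Walk `root` and build a {relative_path: metadata} catalog in parallel.

    Assumption: a thread pool replaces the distributed task queue, and the
    returned dict replaces the real catalog.
    """
    paths = [
        os.path.join(dirpath, name)
        for dirpath, _dirs, files in os.walk(root)
        for name in files
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each file is handed to the callback by a worker thread.
        results = pool.map(callback, paths)
        return {os.path.relpath(p, root): md for p, md in zip(paths, results)}


if __name__ == "__main__":
    demo_root = tempfile.mkdtemp()
    with open(os.path.join(demo_root, "sample.txt"), "w") as handle:
        handle.write("hello")
    print(scan(demo_root, extract_metadata))
```

Running `scan` again after files change rebuilds the entries, which loosely mirrors the "populates/updates" behavior; the real tool tracks changes far more carefully.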


Storage Tiering


- C++ rule engine plugin
- Migrates data between storage resources in tier groups
- Tiers and tier groups defined by metadata on resources
- Configuration for violation criteria, data verification, etc.
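A rough model of how metadata on resources can define tiers and a time-based violation is sketched below. It is a pure-Python simulation, not the C++ plugin: the attribute names `irods::storage_tiering::group` and `irods::storage_tiering::time` follow the plugin's README as best understood here, so check them against your installed version. The `Resource` class, the `migrate_violations` function, and the rule that a migration resets the access time are assumptions made for illustration.

```python
from dataclasses import dataclass, field

GROUP_ATTR = "irods::storage_tiering::group"  # value = group name, units = tier index
TIME_ATTR = "irods::storage_tiering::time"    # value = max seconds since last access


@dataclass
class Resource:
    """Hypothetical in-memory stand-in for a storage resource."""
    name: str
    avus: dict                                   # attribute -> (value, units)
    objects: dict = field(default_factory=dict)  # logical path -> last access (epoch s)


def tiers_in_group(resources, group):
    """Return the group's resources ordered by tier index."""
    members = [r for r in resources if r.avus.get(GROUP_ATTR, ("", ""))[0] == group]
    return sorted(members, key=lambda r: int(r.avus[GROUP_ATTR][1]))


def migrate_violations(resources, group, now):
    """Move objects that exceed their tier's age limit to the next tier."""
    tiers = tiers_in_group(resources, group)
    moved = []
    for src, dst in zip(tiers, tiers[1:]):
        if TIME_ATTR not in src.avus:
            continue  # no time limit configured on this tier
        max_age = int(src.avus[TIME_ATTR][0])
        for path, last_access in list(src.objects.items()):
            if now - last_access > max_age:
                # Assumption: migrating counts as an access, so the clock resets.
                dst.objects[path] = now
                del src.objects[path]
                moved.append((path, src.name, dst.name))
    return moved
```

With three resources tagged as tiers 0, 1, and 2 of one group, a single call moves each stale object at most one tier down, which matches the idea of data aging gradually toward archive storage.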



Why it matters


- Existing solutions are costly and create vendor lock-in
- iRODS is free and compatible with many storage technologies
- Packaged solution for bringing data under management from ingest to archive
- Need a plan once data is ingested




Demo


- Ingest 1,010 files from the filesystem into the iRODS namespace
  - REGISTER_SYNC with to_resource implemented
- Execute migration policy via storage tiering
  - 1 tier group with 3 resources
  - Default violation query (time-based)
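An event handler for the ingest step in this demo might look roughly like the sketch below. So that it runs without the ingest package installed, `Operation` and `Core` are local stand-ins for what the tool itself provides, and the method signature `(session, meta, **options)` is an assumption that has differed between releases. The resource name `tier0` is hypothetical.

```python
from enum import Enum


class Operation(Enum):
    """Stand-in for the ingest tool's own Operation enum (assumption)."""
    REGISTER_SYNC = "register_sync"
    PUT_SYNC = "put_sync"


class Core:
    """Stand-in for the tool's event handler base class (assumption)."""


class event_handler(Core):
    @staticmethod
    def operation(session, meta, **options):
        # Register files in place and keep the catalog in sync on later scans.
        return Operation.REGISTER_SYNC

    @staticmethod
    def to_resource(session, meta, **options):
        # Hypothetical name: the first (fastest) resource in the tier group.
        return "tier0"
```

Pointing `to_resource` at the first tier is what connects the two capabilities: newly registered data lands where the tiering policy will later find it.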


Future work

Automated ingest
- Error resilience
- Additional configuration/expertise for Redis
- Containerization

Storage tiering
- Annotate data objects with tiering group
- Restage data objects based on metadata
- Externalize migration mechanism (Celery)
Questions?
