Together at Last:
Automated Ingest to Storage Tiering
December 5, 2018
TRiRODS
UNC, Chapel Hill
Alan King
Developer, iRODS Consortium
iRODS Capabilities
- Packaged and supported solutions
- Require configuration not code
- Derived from the majority of use cases observed in the user community
Automated Ingest
- Built on top of Python iRODS Client with Celery and Redis
- Walks filesystem and populates/updates catalog in parallel
- Extract and annotate metadata via user-defined callbacks
- Provides a "front door" to iRODS namespace for your data
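The walk-and-annotate pattern above can be sketched with the standard library alone. This is not the automated ingest tool's API: `scan_tree`, `annotate`, and the in-memory catalog dict are hypothetical stand-ins for the Celery workers, the user-defined callback, and the iRODS catalog, and a thread pool stands in for the distributed task queue.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def annotate(path):
    """Hypothetical user callback: derive metadata from one file."""
    stat = os.stat(path)
    return {
        "size": stat.st_size,
        "extension": os.path.splitext(path)[1],
    }


def scan_tree(root, callback, workers=4):
    """Walk `root` and build a {path: metadata} catalog in parallel."""
    paths = [
        os.path.join(dirpath, name)
        for dirpath, _dirs, files in os.walk(root)
        for name in files
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so paths and results line up.
        return dict(zip(paths, pool.map(callback, paths)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for i in range(5):
            with open(os.path.join(root, f"file{i}.dat"), "w") as f:
                f.write("x" * i)
        catalog = scan_tree(root, annotate)
        for path, meta in sorted(catalog.items()):
            print(os.path.basename(path), meta)
```

Re-running `scan_tree` on the same directory rebuilds the entries, which loosely mirrors the "populates/updates" behavior; the real tool tracks changes in the catalog rather than in a dict.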
Storage Tiering
- C++ rule engine plugin
- Migrates data between storage resources in tier groups
- Tiers and tier groups defined by metadata on resources
- Configuration for violation criteria, data verification, etc.
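The idea of tier groups described by metadata on resources, with a time-based violation check, can be modeled in a few lines. This is a hypothetical model only: the actual plugin is written in C++ and keeps these settings as metadata on iRODS resources. The `Resource` and `DataObject` classes, the field names, and the use of last-access time as the violation criterion are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    group: str          # tier group the resource belongs to
    position: int       # 0 is the first (fastest) tier
    max_age_s: float    # assumed time-based threshold; unused on the last tier


@dataclass
class DataObject:
    path: str
    resource: str       # name of the resource holding the replica
    last_access: float  # seconds since the epoch


def plan_migrations(resources, objects, group, now):
    """Return (path, source, destination) for each object violating its tier."""
    tiers = sorted((r for r in resources if r.group == group),
                   key=lambda r: r.position)
    # Each tier hands data to the next one; the last tier has no successor.
    next_tier = {a.name: b.name for a, b in zip(tiers, tiers[1:])}
    max_age = {r.name: r.max_age_s for r in tiers}
    plan = []
    for obj in objects:
        dest = next_tier.get(obj.resource)  # None: last tier or other group
        if dest is not None and now - obj.last_access > max_age[obj.resource]:
            plan.append((obj.path, obj.resource, dest))
    return plan


if __name__ == "__main__":
    resources = [
        Resource("fast", "example_group", 0, 60),
        Resource("medium", "example_group", 1, 3600),
        Resource("archive", "example_group", 2, float("inf")),
    ]
    objects = [
        DataObject("/zone/home/a", "fast", last_access=0),
        DataObject("/zone/home/b", "fast", last_access=950),
        DataObject("/zone/home/c", "medium", last_access=0),
    ]
    for step in plan_migrations(resources, objects, "example_group", now=1000):
        print(step)
```

Keeping the thresholds on the resources rather than in code is what lets the policy be changed by configuration alone.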
Why it matters
- Existing solutions are costly and create vendor lock-in
- iRODS is free and compatible with many storage technologies
- Packaged solution for bringing data under management from ingest to archive
- Need a plan once data is ingested
Demo
- Ingest 1,010 files from the filesystem into the iRODS namespace
- REGISTER_SYNC with to_resource implemented
- Execute migration policy via storage tiering
- 1 tier group with 3 resources
- Default violation query (time-based)
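An event handler for the ingest half of such a demo might look like the sketch below. It assumes the event-handler interface described in the automated ingest project's README (a class named `event_handler` deriving from `Core`, with static methods `operation` and `to_resource`); verify the names against the installed version. The resource name `tier0_resc` is hypothetical, and the `ImportError` fallback exists only so the sketch can be run without the package.

```python
try:
    from irods_capability_automated_ingest.core import Core
    from irods_capability_automated_ingest.utils import Operation
except ImportError:
    # Stand-ins so the sketch runs without the package; not part of the tool.
    import enum

    class Core:
        pass

    class Operation(enum.Enum):
        REGISTER_SYNC = enum.auto()


class event_handler(Core):
    @staticmethod
    def to_resource(session, meta, **options):
        # Hypothetical name for the first resource in the tier group.
        return "tier0_resc"

    @staticmethod
    def operation(session, meta, **options):
        # Register files where they sit and keep the catalog in sync on rescans.
        return Operation.REGISTER_SYNC
```

Pointing `to_resource` at the first tier is what hands newly ingested data to the tiering policy, since the tier group's violation query only considers objects on its member resources.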
Future work
Automated ingest
- Error resilience
- Additional configuration/expertise for Redis
- Containerization
Storage tiering
- Annotate data objects with tiering group
- Restage data objects based on metadata
- Externalize migration mechanism (Celery)
Questions?