Together at Last:

Automated Ingest to Storage Tiering

December 5, 2018

TRiRODS

UNC, Chapel Hill

Alan King

Developer, iRODS Consortium

iRODS Capabilities

  • Packaged and supported solutions
  • Require configuration, not code
  • Derived from the majority of use cases observed in the user community

Automated Ingest

  • Built on top of Python iRODS Client with Celery and Redis
  • Walks filesystem and populates/updates catalog in parallel
  • Extracts and annotates metadata via user-defined callbacks
  • Provides a "front door" to iRODS namespace for your data
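
The walk-and-sync flow above can be sketched with the standard library alone. This is a minimal illustration, not the tool's implementation: a thread pool stands in for the Celery workers, a plain dict stands in for the iRODS catalog, and `extract_metadata`, `sync_file`, `ingest`, and the collection path are hypothetical names chosen for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def extract_metadata(path):
    """User-defined callback (hypothetical): derive metadata pairs from a file."""
    stat = os.stat(path)
    return {"size_bytes": stat.st_size, "extension": os.path.splitext(path)[1]}


def sync_file(path, source_root, collection):
    """Map a filesystem path to a logical path and build its catalog entry."""
    relative = os.path.relpath(path, source_root).replace(os.sep, "/")
    logical_path = f"{collection}/{relative}"
    return logical_path, {"physical_path": path, "metadata": extract_metadata(path)}


def ingest(source_root, collection, workers=4):
    """Walk source_root and register every file into an in-memory catalog."""
    paths = [
        os.path.join(dirpath, name)
        for dirpath, _dirs, files in os.walk(source_root)
        for name in files
    ]
    # The thread pool stands in for Celery workers; the returned dict
    # stands in for the iRODS catalog.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        entries = pool.map(lambda p: sync_file(p, source_root, collection), paths)
        return dict(entries)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for i in range(10):
            with open(os.path.join(root, f"file_{i}.txt"), "w") as handle:
                handle.write("x" * i)
        # Hypothetical collection path used only for this example.
        catalog = ingest(root, "/tempZone/home/rods/ingested")
        print(len(catalog), "data objects registered")
```

Running `ingest` again over the same tree rebuilds the same logical paths, which is the property that lets a real sync update existing catalog entries rather than duplicate them.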

Storage Tiering

  • C++ rule engine plugin
  • Migrates data between storage resources in tier groups
  • Tiers and tier groups are defined by metadata on resources
  • Configuration for violation criteria, data verification, etc.
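
To make "tiers defined by metadata on resources" concrete, here is a small in-memory model of a tier group and a time-based violation check. It is a sketch under stated assumptions: the attribute names `irods::storage_tiering::group` (value = group name, units = tier position) and `irods::storage_tiering::time` (value = seconds) follow the plugin's documented scheme as best recalled and should be checked against its README. The resource names, the `Resource` class, and `plan_migrations` are hypothetical, and the real plugin evaluates violations with catalog queries rather than Python.

```python
from dataclasses import dataclass, field

# Attribute names assumed from the plugin documentation; verify before use.
GROUP_ATTR = "irods::storage_tiering::group"
TIME_ATTR = "irods::storage_tiering::time"


@dataclass
class Resource:
    name: str
    # AVUs stored as attribute -> (value, units)
    avus: dict = field(default_factory=dict)


def make_tier_group(group, tiers):
    """Build resources whose metadata places them in one tier group.

    tiers is an ordered list of (resource_name, max_idle_seconds or None).
    """
    resources = []
    for index, (name, max_idle) in enumerate(tiers):
        avus = {GROUP_ATTR: (group, str(index))}
        if max_idle is not None:
            avus[TIME_ATTR] = (str(max_idle), "")
        resources.append(Resource(name, avus))
    return resources


def tiers_in_order(resources, group):
    """Return the group's resources sorted by tier position."""
    members = [r for r in resources if r.avus.get(GROUP_ATTR, (None,))[0] == group]
    return sorted(members, key=lambda r: int(r.avus[GROUP_ATTR][1]))


def plan_migrations(resources, group, replicas, now):
    """Return (object, source, destination) for every time-based violation.

    replicas maps object name -> (resource name, last access timestamp).
    """
    ordered = tiers_in_order(resources, group)
    moves = []
    for source, destination in zip(ordered, ordered[1:]):
        if TIME_ATTR not in source.avus:
            continue
        max_idle = int(source.avus[TIME_ATTR][0])
        for obj, (resc, last_access) in sorted(replicas.items()):
            if resc == source.name and now - last_access > max_idle:
                moves.append((obj, source.name, destination.name))
    return moves


if __name__ == "__main__":
    # One tier group with three resources; names are hypothetical.
    resources = make_tier_group(
        "demo_group",
        [("fast_resc", 30), ("medium_resc", 3600), ("archive_resc", None)],
    )
    replicas = {
        "a.dat": ("fast_resc", 0),
        "b.dat": ("fast_resc", 90),
        "c.dat": ("medium_resc", 0),
    }
    print(plan_migrations(resources, "demo_group", replicas, now=100))
```

The last tier carries no time limit in this model because there is nowhere further for its data to migrate.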

Why it matters

  • Existing solutions are costly and create vendor lock-in
    • iRODS is free and compatible with many storage technologies
  • Packaged solution for bringing data under management from ingest to archive
    • Need a plan once data is ingested

Demo

  • Ingest 1,010 files from the filesystem into the iRODS namespace
    • REGISTER_SYNC operation, with to_resource implemented
  • Execute migration policy via storage tiering
    • 1 tier group with 3 resources
    • Default violation query (time-based)
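
An event handler along the lines of the one used for the ingest step might look like the sketch below. It assumes the automated ingest package exposes `Core` and `Operation` at the import paths shown and that handler methods take `(session, meta, **options)`; both should be checked against the installed version. The `except ImportError` branch defines stand-ins only so the sketch can be run without the package, and the resource name `tier0_resc` is hypothetical.

```python
try:
    # Assumed import paths for the automated ingest package.
    from irods_capability_automated_ingest.core import Core
    from irods_capability_automated_ingest.utils import Operation
except ImportError:
    # Stand-ins so the sketch runs without the package installed.
    from enum import Enum

    class Core:
        pass

    class Operation(Enum):
        REGISTER_SYNC = 0


class event_handler(Core):
    @staticmethod
    def to_resource(session, meta, **options):
        # Hypothetical name: the first (fastest) resource in the tier group,
        # so newly registered data starts at the top of the tiers.
        return "tier0_resc"

    @staticmethod
    def operation(session, meta, **options):
        # Register files in place and keep catalog entries in sync on re-scan.
        return Operation.REGISTER_SYNC
```

Pointing `to_resource` at the first tier is what connects the two capabilities: ingest lands data on a resource that the tiering policy already watches.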

Future work

Storage tiering

  • Annotate data objects with tiering group
  • Restage data objects based on metadata
  • Externalize migration mechanism (Celery)

Automated ingest

  • Error resilience
  • Additional configuration/expertise for Redis
  • Containerization

Questions?
