S3 Architecture

Jason Coposky, Chief Technologist

iRODS Overview

Distributed virtual file system

  • Single iCAT catalog server
  • Multiple resource servers

Four 'Pillars'

  • Workflow Automation - rule engine
  • Discovery Environment - metadata
  • Secure Collaboration - federation
  • Virtualization - plugin interfaces

iRODS Plugin Interfaces

Plugin Interfaces

  • Authentication
  • Network
  • Database
  • Resources
  • Microservices
  • RPC API

Resource Hierarchy

Follows a tree structure

  • coordinating resources - purely logical
  • storage resources - file systems or object stores

 

Coordinating Resources

  • replication, round robin, random, deferred, passthru, compound

Storage Resources

  • Unix File System, S3, HPSS, WOS, Ceph-Rados
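The tree structure described above can be sketched as a small data model. This is an illustration only, not iRODS code: the `Resource` class, its methods, and the resource names `replResc`, `fsA`, and `fsB` are hypothetical, and the type sets cover only some of the resource types listed on the slides.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative type sets based on the slides; not exhaustive.
COORDINATING = {"replication", "roundrobin", "random", "deferred", "passthru", "compound"}
STORAGE = {"unixfilesystem", "s3"}


@dataclass
class Resource:
    name: str
    rtype: str
    children: List["Resource"] = field(default_factory=list)

    def add_child(self, child: "Resource") -> "Resource":
        # Coordinating resources are purely logical and hold children;
        # storage resources are the leaves of the tree.
        if self.rtype not in COORDINATING:
            raise ValueError(f"{self.name} ({self.rtype}) cannot have children")
        self.children.append(child)
        return self

    def leaves(self) -> List["Resource"]:
        # Collect the storage resources that actually hold data.
        if self.rtype in STORAGE:
            return [self]
        found: List["Resource"] = []
        for child in self.children:
            found.extend(child.leaves())
        return found


# A replication node coordinating two file system leaves.
root = Resource("replResc", "replication")
root.add_child(Resource("fsA", "unixfilesystem"))
root.add_child(Resource("fsB", "unixfilesystem"))
print([r.name for r in root.leaves()])
```

Keeping the coordinating/storage split in the type check mirrors the slide's distinction: only logical nodes route, only leaves store.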

Compound Resources

Compound Resource

  • coordinating resource which manages two children
    • a file system cache
    • and an archive resource
  • Used to provide a POSIX interface to non-POSIX storage

S3 acts as an archive resource within a Compound Resource Composition
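A toy model of the cache/archive data flow may help here. It is a simplified sketch, not the plugin's implementation: two dictionaries stand in for the file system cache and the archive (for example S3), and the class and method names are hypothetical.

```python
class CompoundSketch:
    """Toy model of a compound resource with a cache child and an archive child."""

    def __init__(self):
        self.cache = {}    # stands in for the file system cache
        self.archive = {}  # stands in for the archive resource (e.g. S3)

    def put(self, path, data):
        # Writes land in the cache first and are then synced to the archive.
        self.cache[path] = data
        self.archive[path] = data

    def trim_cache(self, path):
        # The cache copy can be dropped; the archive copy remains.
        self.cache.pop(path, None)

    def get(self, path):
        # Reads are served from the cache; a miss stages the object
        # back from the archive before returning it.
        if path not in self.cache:
            self.cache[path] = self.archive[path]
        return self.cache[path]


resc = CompoundSketch()
resc.put("/zone/home/user/file.dat", b"payload")
resc.trim_cache("/zone/home/user/file.dat")
print(resc.get("/zone/home/user/file.dat"))
```

Because clients only ever touch the cache side, they see ordinary file semantics even though the long-term copy lives in an object store.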

iRODS S3 Resource Plugin

  • Can have any number of different instances
  • Configurable Regions
  • Key Pairs kept locally in a file
  • Configurable retry count and retry time-out
  • Does not currently support ranged gets
  • Does not currently support multi-part put
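The retry count and wait time can be pictured as a fixed-delay retry loop. The sketch below is a generic pattern, not the plugin's code: `call_with_retries` is a hypothetical helper, and it assumes the retry count means retries after the first attempt, which the slides do not specify.

```python
import time


def call_with_retries(operation, retry_count, wait_time_sec, sleep=time.sleep):
    """Call operation(); on failure wait wait_time_sec and retry.

    Assumption: retry_count is the number of retries after the first
    attempt, so at most retry_count + 1 calls are made.
    """
    retries = 0
    while True:
        try:
            return operation()
        except Exception:
            if retries >= retry_count:
                raise  # retries exhausted; surface the last error
            retries += 1
            sleep(wait_time_sec)
```

The `sleep` parameter is injectable only so the loop can be exercised without real delays.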

Possible next steps for S3

  • Multi-part put
  • S3 as a first class iRODS Resource
    • Lack of a POSIX interface removes some iRODS functionality: bundle operations
    • S3 does not support partial writes, which may be a challenge
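To illustrate why the lack of partial writes is a challenge: on a store that only accepts whole objects, changing a byte range means reading the object, patching it, and writing all of it back. The sketch below uses a dictionary as a stand-in for a bucket; `overwrite_range` is a hypothetical helper, not part of iRODS or any S3 client.

```python
def overwrite_range(store, key, offset, data):
    """Emulate a partial write on a whole-object store.

    The full object is fetched, patched in memory, and stored again,
    so the cost scales with the object size rather than len(data).
    """
    current = store.get(key, b"")
    # Pad with zero bytes if the write starts past the current end.
    if offset > len(current):
        current = current + b"\x00" * (offset - len(current))
    updated = current[:offset] + data + current[offset + len(data):]
    store[key] = updated  # the whole object is written again
    return len(updated)


bucket = {"obj": b"hello world"}
overwrite_range(bucket, "obj", 6, b"iRODS")
print(bucket["obj"])
```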

Configuring the S3 iRODS Plugin

iadmin mkresc compResc compound
iadmin mkresc cacheResc unixfilesystem <hostname>:</full/path/to/Vault>
iadmin mkresc archiveResc s3 <hostname>:/<s3BucketName>/irods/Vault "S3_DEFAULT_HOSTNAME=s3.amazonaws.com;S3_AUTH_FILE=</full/path/to/AWS.keypair>;S3_RETRY_COUNT=<num reconn tries>;S3_WAIT_TIME_SEC=<wait between retries>"
iadmin addchildtoresc compResc cacheResc cache
iadmin addchildtoresc compResc archiveResc archive
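The commands above can be generated from a settings dictionary, which makes the semicolon-separated context string easier to review. This is a sketch that only builds and prints the command lines; it does not run `iadmin`. The helper names and every value in the usage example (host name, vault path, bucket, key pair path, retry settings) are placeholders chosen for illustration.

```python
def context_string(settings):
    # Join KEY=VALUE pairs with ';', matching the mkresc context string format.
    return ";".join(f"{key}={value}" for key, value in settings.items())


def compound_s3_commands(host, vault_path, bucket, settings):
    # Resource names (compResc, cacheResc, archiveResc) follow the slide.
    ctx = context_string(settings)
    return [
        "iadmin mkresc compResc compound",
        f"iadmin mkresc cacheResc unixfilesystem {host}:{vault_path}",
        f'iadmin mkresc archiveResc s3 {host}:/{bucket}/irods/Vault "{ctx}"',
        "iadmin addchildtoresc compResc cacheResc cache",
        "iadmin addchildtoresc compResc archiveResc archive",
    ]


# Placeholder values; substitute your own.
commands = compound_s3_commands(
    host="irods.example.org",
    vault_path="/var/lib/irods/Vault",
    bucket="my-bucket",
    settings={
        "S3_DEFAULT_HOSTNAME": "s3.amazonaws.com",
        "S3_AUTH_FILE": "/etc/irods/AWS.keypair",
        "S3_RETRY_COUNT": 3,
        "S3_WAIT_TIME_SEC": 2,
    },
)
for line in commands:
    print(line)
```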