Cacheless S3 Resource

February 20, 2019

Renaissance Computing Institute

UNC-Chapel Hill

Justin James

iRODS Consortium

Introduction (Legacy Operation)

The legacy S3 plugin must be used in conjunction with a compound resource and a unixfilesystem cache resource.

The following is a sample resource hierarchy for the legacy S3 plugin.

s3compound:compound
├── s3archive:s3
└── s3cache:unixfilesystem

This required the iRODS administrator to create a cache cleanup rule.

The S3 plugin itself only implemented a few operations:

  • irods::RESOURCE_OP_UNLINK
  • irods::RESOURCE_OP_STAT
  • irods::RESOURCE_OP_RENAME
  • irods::RESOURCE_OP_STAGETOCACHE
  • irods::RESOURCE_OP_SYNCTOARCH

 

All of the other operations were handled by the cache resource.
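The sample hierarchy above can be assembled with `iadmin mkresc` and `iadmin addchildtoresc`. The following is a minimal sketch that only prints the commands (a dry run) so they can be reviewed before use. The function name, the host name, the cache vault path, the bucket path, and the shortened context string are illustrative assumptions and do not come from the slides.

```shell
#!/bin/sh
# Dry run: print the iadmin commands for the sample compound hierarchy
# instead of executing them. All four arguments are placeholders.
legacy_hierarchy_cmds() {
    host="$1"; cache_vault="$2"; bucket_path="$3"; context="$4"
    # Parent compound resource.
    printf 'iadmin mkresc s3compound compound\n'
    # Local cache that provides POSIX semantics.
    printf 'iadmin mkresc s3cache unixfilesystem %s:%s\n' "$host" "$cache_vault"
    # S3 archive behind the compound resource.
    printf 'iadmin mkresc s3archive s3 %s:%s "%s"\n' "$host" "$bucket_path" "$context"
    # Attach the children in their cache and archive roles.
    printf 'iadmin addchildtoresc s3compound s3cache cache\n'
    printf 'iadmin addchildtoresc s3compound s3archive archive\n'
}

legacy_hierarchy_cmds irods.example.org /var/lib/irods/s3cacheVault \
    /irods-bucket/irods/Vault \
    "S3_DEFAULT_HOSTNAME=s3.amazonaws.com;S3_AUTH_FILE=/var/lib/irods/s3.keypair"
```

The cache cleanup rule mentioned above is still needed with this layout and is not covered by the sketch.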

Introduction (New Modes of Operation)

The new plugin is organized around three operating modes, two of which are available now.

The mode is set using the HOST_MODE parameter in the resource context string.

If HOST_MODE is not set, the default is archive_attached, which operates the same way as the legacy S3 plugin.

            Archive                       Cacheless
Attached    archive_attached (default)    cacheless_attached (demonstrated today)
Detached    N/A                           coming soon

Note that "archive_detached" is not a valid entry.  
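Because HOST_MODE is just one key in a semicolon-separated context string, a typo is easy to miss. The following is a small pre-flight sketch that extracts the mode and checks it against the two values listed as available. The function names are hypothetical, and this is not how the plugin itself parses the context string.

```shell
#!/bin/sh
# Extract HOST_MODE from a semicolon-separated context string.
# Falls back to archive_attached when the key is absent, matching
# the default described above.
host_mode() {
    mode=$(printf '%s' "$1" | tr ';' '\n' | sed -n 's/^HOST_MODE=//p' | tail -n 1)
    printf '%s\n' "${mode:-archive_attached}"
}

# Accept only the two modes the slides list as currently available.
is_valid_host_mode() {
    case "$1" in
        archive_attached|cacheless_attached) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: archive_detached is rejected.
if is_valid_host_mode "$(host_mode 'S3_PROTO=HTTP;HOST_MODE=archive_detached')"; then
    echo "mode accepted"
else
    echo "mode rejected"
fi
```

The name of the detached cacheless mode is not given in the slides, so the check would need to be extended once that mode is released.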

Introduction (Archive vs Cacheless)

  • Archive - The S3 resource acts in the archive role behind a compound resource.
    • Requires a cache resource which provides POSIX semantics.
    • Must be attached to a specific iRODS server.
       
  • Cacheless - The S3 resource can be standalone.
    • May be detached from any specific iRODS server (see next slide).
    • The S3 plugin provides POSIX semantics with no cache resource and no cache management.

 

Introduction (Attached vs Detached)

  • Detached - Any iRODS server may serve a request for an object.  This is appropriate if all servers have connectivity to the S3 backend.
     
  • Attached - Only the server that is defined as the host in the resource configuration will serve the request.

 

 

Creating a Cacheless S3 Resource

iadmin mkresc s3resc s3 `hostname`:/irods-bucket/irods/Vault "S3_DEFAULT_HOSTNAME=s3.amazonaws.com;S3_AUTH_FILE=/var/lib/irods/s3.keypair;S3_REGIONNAME=us-east-1;S3_RETRY_COUNT=1;S3_WAIT_TIME_SEC=3;S3_PROTO=HTTP;ARCHIVE_NAMING_POLICY=consistent;HOST_MODE=cacheless_attached"
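The context string in the command above is long enough that individual settings are hard to review. One option, sketched here, is to list each KEY=VALUE pair on its own line and join them with semicolons. The helper name `join_context` is an assumption; the settings are the ones from the command above, and the final command is printed rather than executed.

```shell
#!/bin/sh
# Join KEY=VALUE arguments with ';' to form a resource context string.
# The subshell keeps the IFS change from leaking into the caller.
join_context() {
    ( IFS=';'; printf '%s\n' "$*" )
}

context=$(join_context \
    "S3_DEFAULT_HOSTNAME=s3.amazonaws.com" \
    "S3_AUTH_FILE=/var/lib/irods/s3.keypair" \
    "S3_REGIONNAME=us-east-1" \
    "S3_RETRY_COUNT=1" \
    "S3_WAIT_TIME_SEC=3" \
    "S3_PROTO=HTTP" \
    "ARCHIVE_NAMING_POLICY=consistent" \
    "HOST_MODE=cacheless_attached")

# Print the command for review; `hostname` stays literal inside single quotes.
printf 'iadmin mkresc s3resc s3 `hostname`:/irods-bucket/irods/Vault "%s"\n' "$context"
```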

Implementation Details

Demonstration of the S3 Plugin

iadmin mkresc news3resc s3 `hostname`:/justinkylejames-irods1/irods/Vault "S3_DEFAULT_HOSTNAME=s3.amazonaws.com;S3_AUTH_FILE=/var/lib/irods/news3resc.keypair;S3_REGIONNAME=us-east-1;S3_RETRY_COUNT=1;S3_WAIT_TIME_SEC=3;S3_PROTO=HTTP;ARCHIVE_NAMING_POLICY=consistent;HOST_MODE=cacheless_attached"

Demonstration of the S3 Plugin

$ echo 'this is a test file' > test.txt
$ iput -R news3resc test.txt
$ aws s3 ls s3://justinkylejames-irods1/irods/Vault/home/rods/
2019-02-18 14:55:44         20 test.txt
$ iget test.txt -
this is a test file

Demonstration of the S3 Plugin

$ imv test.txt newname.txt
$ ils -L
/tempZone/home/rods:
  rods              0 news3resc           20 2019-02-18.14:55 & newname.txt
        generic    /justinkylejames-irods1/irods/Vault/home/rods/newname.txt
$ aws s3 ls s3://justinkylejames-irods1/irods/Vault/home/rods/
2019-02-18 15:23:24         20 newname.txt
$ irm -f newname.txt
$ ils
/tempZone/home/rods:
$ aws s3 ls s3://justinkylejames-irods1/irods/Vault/home/rods/

Demonstration of the S3 Plugin

$ iput -R news3resc 64Mfile
$ iget 64Mfile 64Mfile2 -f

$ diff 64Mfile 64Mfile2

$ cksum 64Mfile 64Mfile2
1941261876 67108864 64Mfile
1941261876 67108864 64Mfile2
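The round-trip check above compares the cksum output by eye. A small helper can make the same comparison scriptable. This is a sketch only, and the function name `same_cksum` is hypothetical.

```shell
#!/bin/sh
# Succeed only if both files have the same CRC and byte count as reported
# by cksum. Reading from stdin keeps the file name out of the output.
# Returns 2 if either file cannot be read.
same_cksum() {
    a=$(cksum < "$1") || return 2
    b=$(cksum < "$2") || return 2
    [ "$a" = "$b" ]
}

# Usage after the iput/iget round trip shown above:
#   same_cksum 64Mfile 64Mfile2 && echo "round trip OK"
```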

Next Steps

TRiRODS February 2019 - Cacheless S3 Resource

By justinkylejames
