Justin James
Applications Engineer
iRODS Consortium
iRODS S3 API v0.2.0
with Multipart
May 28-31, 2024
iRODS User Group Meeting 2024
Amsterdam, Netherlands
iRODS S3 API - Goals
iRODS S3 API - History
iRODS S3 API - Architecture and Status
iRODS S3 API - Changes Since UGM 2023
iRODS S3 API - Status
iRODS S3 API - Configuration
A single configuration file defines two sections, which helps administrators understand the options and how they relate to each other.
Modeled after NFSRODS.
{
    // Defines S3 options that affect how the
    // client-facing component of the server behaves.
    "s3_server": {
        // ...
    },

    // Defines iRODS connection information.
    "irods_client": {
        // ...
    }
}
iRODS S3 API - Configuration - s3_server
"s3_server": {
    "host": "0.0.0.0",
    "port": 9000,
    "log_level": "info",
    "plugins": {
        "static_bucket_resolver": {
            "name": "static_bucket_resolver",
            "mappings": {
                "<bucket_name>": "/path/to/collection",
                "<another_bucket>": "/path/to/another/collection"
            }
        },
        "static_authentication_resolver": {
            "name": "static_authentication_resolver",
            "users": {
                "<s3_username>": {
                    "username": "<string>",
                    "secret_key": "<string>"
                }
            }
        }
    },
    "region": "us-east-1",
    "multipart_upload_part_files_directory": "/tmp",
    "authentication": {
        "eviction_check_interval_in_seconds": 60,
        "basic": { "timeout_in_seconds": 3600 }
    },
    "requests": {
        "threads": 3,
        "max_size_of_request_body_in_bytes": 8388608,
        "timeout_in_seconds": 30
    },
    "background_io": { "threads": 6 }
}
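To show how these settings surface on the client side, here is a minimal Python sketch. It assumes that the `<s3_username>` key doubles as the S3 access key ID, that the endpoint is served over plain HTTP, and that an object key is appended to the mapped collection path. The bucket name, user names, secret, zone path, and the helper functions `client_settings` and `logical_path` are all hypothetical and chosen only for illustration.

```python
# Hypothetical values mirroring the shape of the "s3_server" section above.
s3_server = {
    "host": "0.0.0.0",
    "port": 9000,
    "region": "us-east-1",
    "plugins": {
        "static_bucket_resolver": {
            "name": "static_bucket_resolver",
            "mappings": {"demo-bucket": "/tempZone/home/alice/demo"},
        },
        "static_authentication_resolver": {
            "name": "static_authentication_resolver",
            "users": {"s3_alice": {"username": "alice", "secret_key": "s3cr3t"}},
        },
    },
}


def client_settings(config, s3_username, endpoint_host="localhost"):
    """Build keyword arguments in the shape an S3 client constructor expects.

    Assumption: the S3 access key ID is the <s3_username> key itself.
    """
    users = config["plugins"]["static_authentication_resolver"]["users"]
    return {
        "endpoint_url": f"http://{endpoint_host}:{config['port']}",
        "region_name": config["region"],
        "aws_access_key_id": s3_username,
        "aws_secret_access_key": users[s3_username]["secret_key"],
    }


def logical_path(config, bucket, key):
    """Map a bucket and object key to an iRODS logical path (assumed layout)."""
    mappings = config["plugins"]["static_bucket_resolver"]["mappings"]
    return f"{mappings[bucket].rstrip('/')}/{key}"


settings = client_settings(s3_server, "s3_alice")
print(settings["endpoint_url"])
print(logical_path(s3_server, "demo-bucket", "dir/file.txt"))
```

With Boto installed, the returned dictionary could be passed as `boto3.client("s3", **settings)`.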
iRODS S3 API - Configuration - irods_client
"irods_client": {
    "host": "<string>",
    "port": 1247,
    "zone": "<string>",
    "tls": { /* ... options ... */ },
    "enable_4_2_compatibility": false,
    "proxy_admin_account": {
        "username": "<string>",
        "password": "<string>"
    },
    "connection_pool": {
        "size": 6,
        "refresh_timeout_in_seconds": 600,
        "max_retrievals_before_refresh": 16,
        "refresh_when_resource_changes_detected": true
    },
    "resource": "<string>",
    "max_number_of_bytes_per_read_operation": 8192,
    "buffer_size_in_bytes_for_write_operations": 8192
}
iRODS S3 API - Multipart Options Considered
A. Multiobject - Parts written as separate objects. On CompleteMultipartUpload, parts are concatenated on the iRODS server.
B. Store-and-Forward - Write each part to the mid-tier, then forward to iRODS on CompleteMultipartUpload.
C. Efficient Store-and-Forward - Write down / hold non-contiguous parts in the mid-tier, then send contiguous parts to iRODS when ready.
D. Store-and-Register - Write to a file accessible to iRODS and register when complete.
iRODS S3 API - Multipart Store-and-Forward
For now we have chosen the store-and-forward approach.
1. CreateMultipartUpload
2. UploadPart
3. CompleteMultipartUpload
4. AbortMultipartUpload (not yet implemented)
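The store-and-forward steps above can be sketched in a few lines of Python. This is an illustration of the general technique, not the server's actual implementation: the class `StoreAndForwardUpload` is hypothetical, the part files directory stands in for `multipart_upload_part_files_directory`, and a writable binary stream stands in for the iRODS data object. The `abort` method shows only what an abort would amount to in this sketch.

```python
import io
import os
import shutil
import tempfile
import uuid


class StoreAndForwardUpload:
    """Hypothetical mid-tier state for one multipart upload."""

    def __init__(self, part_files_directory):
        # 1. CreateMultipartUpload: allocate an upload ID and a directory for parts.
        self.upload_id = uuid.uuid4().hex
        self.part_dir = os.path.join(part_files_directory, self.upload_id)
        os.makedirs(self.part_dir)

    def upload_part(self, part_number, data):
        # 2. UploadPart: parts may arrive in any order; each is stored in its own file.
        with open(os.path.join(self.part_dir, str(part_number)), "wb") as part:
            part.write(data)

    def complete(self, destination):
        # 3. CompleteMultipartUpload: forward the stored parts in part-number
        #    order, then remove the part files.
        part_numbers = sorted(int(name) for name in os.listdir(self.part_dir))
        for number in part_numbers:
            with open(os.path.join(self.part_dir, str(number)), "rb") as part:
                shutil.copyfileobj(part, destination)
        shutil.rmtree(self.part_dir)

    def abort(self):
        # 4. AbortMultipartUpload: discard whatever parts were stored.
        shutil.rmtree(self.part_dir, ignore_errors=True)


upload = StoreAndForwardUpload(tempfile.mkdtemp())
upload.upload_part(2, b"world")   # parts arrive out of order
upload.upload_part(1, b"hello ")
target = io.BytesIO()             # stand-in for the iRODS data object
upload.complete(target)
print(target.getvalue())          # b'hello world'
```

The cost of this approach is visible in the sketch: nothing reaches the destination until every part has been stored, so the final step carries the whole transfer.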
iRODS S3 API - Performance Comparison
The following compares transfers to/from iRODS via the S3 API with transfers to/from a local MinIO server. The Boto S3 client was used for all cases.
Notes:
iRODS S3 API - Multipart - Enhancement - Efficient Store and Forward
In the future we may migrate to the efficient store-and-forward approach.
The design is not finalized but the following is a possible approach.
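As a rough illustration of the general idea (not the design under consideration), the Python sketch below holds parts that arrive early and forwards each part as soon as every earlier part has been sent. The class `ContiguousForwarder` is hypothetical. It assumes part numbers are consecutive and start at 1, holds waiting parts in memory rather than in part files, and does not handle a part being uploaded twice.

```python
import io


class ContiguousForwarder:
    """Hold out-of-order parts; forward a part once all earlier parts are sent."""

    def __init__(self, destination):
        self.destination = destination  # stand-in for a stream to iRODS
        self.held = {}                  # part_number -> bytes waiting on earlier parts
        self.next_part = 1              # assumption: parts are numbered 1, 2, 3, ...

    def upload_part(self, part_number, data):
        self.held[part_number] = data
        # Drain every held part that is now contiguous with what was already sent.
        while self.next_part in self.held:
            self.destination.write(self.held.pop(self.next_part))
            self.next_part += 1

    def complete(self):
        # Anything still held means an earlier part never arrived.
        if self.held:
            raise ValueError(f"missing part {self.next_part}")


stream = io.BytesIO()
forwarder = ContiguousForwarder(stream)
forwarder.upload_part(2, b"B")    # held: part 1 has not arrived yet
forwarder.upload_part(1, b"A")    # parts 1 and 2 are forwarded together
forwarder.upload_part(3, b"C")    # forwarded immediately
forwarder.complete()
print(stream.getvalue())          # b'ABC'
```

Compared with plain store-and-forward, most of the data has already been sent by the time CompleteMultipartUpload arrives, as long as parts arrive roughly in order.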
Open question:
iRODS S3 API - Multipart - Enhancement - Store and Register
Another possible enhancement is store-and-register, in which the part files are written to the iRODS server, combined into one file, and then registered.
This approach has some challenges:
One option is to offer both efficient store-and-forward and store-and-register, with caveats on how store-and-register can be used.
iRODS S3 API - Next Steps
iRODS S3 API
Release v0.2.0
Thank you!
Questions?