iRODS S3 API
Alan King
Senior Software Developer
iRODS Consortium
November 17-22, 2024
Supercomputing 2024
Atlanta, GA
Overview
- Motivation / Goals
- History
- Status / Implementation
- Supported features / clients
- Configuration
- Multipart
- Future Work
Motivation / Goals
- Present iRODS through the S3 protocol
- Multi-user / Multi-bucket
- Load Balancer friendly
- Maintainable
History
- iRODS S3 Working Group formed in Q3 2021
- 2023 - Violet White (intern) implemented many of the endpoints
- v0.1.0 released Nov. 2023
- v0.2.0 released March 2024
- v0.3.0 released October 2024
Status / Implementation - Architecture
- Single binary
- Single configuration file
- Multi-user
- Multi-bucket
- Requires rodsadmin credentials
- Tests passing with:
- AWS CLI Client
- Boto3 Python Library
- MinIO Python Client
- MinIO CLI Client
Status / Implementation - Endpoints
- Investigating
- GetObjectAcl
- ListObjects(V1)
- ListMultipartUploads
- PutObjectAcl
- PutObjectTagging
- UploadPartCopy
- Implemented
- AbortMultipartUpload
- CopyObject
- CompleteMultipartUpload
- CreateMultipartUpload
- DeleteObject
- DeleteObjects
- GetBucketLocation
- GetObject
- GetObjectLockConfiguration (stub)
- GetObjectTagging (stub)
- HeadBucket
- HeadObject
- ListBuckets
- ListObjectsV2
- PutObject
- UploadPart
Implementation - Configuration
A single file that defines two sections, which helps administrators understand the options and how they relate to each other.
The layout is modeled after NFSRODS.
{
// Defines S3 options that affect how the
// client-facing component of the server behaves.
"s3_server": {
// ...
},
// Defines iRODS connection information.
"irods_client": {
// ...
}
}
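A minimal sketch of a startup check for this two-section layout. It assumes the file on disk is plain JSON (the `//` comments above are illustrative only). The function name `load_config` is hypothetical and not part of the S3 API.

```python
import json

# The two top-level sections shown in the skeleton above.
REQUIRED_SECTIONS = ("s3_server", "irods_client")

def load_config(path):
    """Load the configuration file and confirm both top-level sections exist."""
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError(f"missing configuration section(s): {', '.join(missing)}")
    return config
```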
Implementation - Configuration
"s3_server": {
"host": "0.0.0.0",
"port": 9000,
"log_level": "info",
"plugins": {
"static_bucket_resolver": {
"name": "static_bucket_resolver",
"mappings": {
"<bucket_name>": "/path/to/collection",
"<another_bucket>": "/path/to/another/collection"
}
},
"static_authentication_resolver": {
"name": "static_authentication_resolver",
"users": {
"<s3_username>": {
"username": "<string>",
"secret_key": "<string>"
}
}
}
},
"region": "us-east-1",
"multipart_upload_part_files_directory": "/tmp",
"authentication": {
"eviction_check_interval_in_seconds": 60,
"basic": { "timeout_in_seconds": 3600 }
},
"requests": {
"threads": 3,
"max_size_of_request_body_in_bytes": 8388608,
"timeout_in_seconds": 30
},
"background_io": { "threads": 6 }
}
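To illustrate what the `static_bucket_resolver` mappings express, here is a minimal sketch that turns a bucket name and object key into an iRODS logical path. It assumes the key is simply appended to the mapped collection. The function `resolve_logical_path` and the bucket and collection names in the usage lines are hypothetical, not taken from the S3 API source.

```python
import posixpath

def resolve_logical_path(mappings, bucket, key):
    """Map an S3 bucket/key pair to an iRODS logical path.

    `mappings` has the same shape as the "mappings" object under
    static_bucket_resolver: bucket name -> collection path.
    Assumption: the object key is appended to the mapped collection.
    """
    if bucket not in mappings:
        raise KeyError(f"no collection mapped for bucket: {bucket}")
    collection = mappings[bucket].rstrip("/") or "/"
    return posixpath.join(collection, key.lstrip("/"))

# Hypothetical bucket and collection names, for illustration only.
example_mappings = {"my_bucket": "/tempZone/home/alice"}
example_path = resolve_logical_path(example_mappings, "my_bucket", "data/file.txt")
# example_path == "/tempZone/home/alice/data/file.txt"
```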
Implementation - Configuration
"irods_client": {
"host": "<string>",
"port": 1247,
"zone": "<string>",
"tls": { /* ... options ... */ },
"enable_4_2_compatibility": false,
"proxy_admin_account": {
"username": "<string>",
"password": "<string>"
},
"connection_pool": {
"size": 6,
"refresh_timeout_in_seconds": 600,
"max_retrievals_before_refresh": 16,
"refresh_when_resource_changes_detected": true
},
"resource": "<string>",
"put_object_buffer_size_in_bytes": 8192,
"get_object_buffer_size_in_bytes": 8192
}
Implementation - Multipart Implementations Considered
A. Multiobject - Parts written as separate objects. On CompleteMultipartUpload, parts are concatenated on the iRODS server.
- Efficient
- Unintentional execution of policy for each part
- Pollutes iRODS namespace
- Would require a concatenate API plugin
B. Store-and-Forward - Write each part to the mid-tier, then forward to iRODS on CompleteMultipartUpload.
- No extra policy triggered
- Requires a large amount of scratch space in the mid-tier
- Non-trivial CompleteMultipartUpload
C. Efficient Store-and-Forward - Write down / hold non-contiguous parts in the mid-tier, then send contiguous parts to iRODS when ready.
- Complicated - parts are not necessarily sent in order and can be resent
- Part offsets are not known in advance, so a part can only be forwarded once all preceding parts have been written
- In the worst case, almost the entire object would still need to be stored in the mid-tier
D. Store-and-Register - Write to a file accessible to iRODS and register when complete.
- Still requires writing individual part files since we do not know the part offsets
- Requires shared visibility between iRODS and S3 API
Implementation - B. Store-and-Forward (v0.2.0)
How does it work?
1. CreateMultipartUpload
- Generate upload_id (UUID) and return the upload_id in the response
2. UploadPart
- Write bytes to a local file (location determined by configuration)
3. CompleteMultipartUpload
- Reminder: In pure S3, CompleteMultipartUpload is trivial
- Create the object in iRODS
- Determine the offset for each part
- Iterate through the parts and create background I/O tasks to write parts to the iRODS object
- When all parts are written, remove part files and send response to the client
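The three steps above can be sketched as a toy model. This is not the server's implementation: a local file stands in for the iRODS object, the writes are sequential rather than background I/O tasks, and the class name and part file naming scheme (`<upload_id>.<part_number>`) are assumptions made for illustration.

```python
import os
import uuid

class StoreAndForwardUpload:
    """Toy model of store-and-forward; a local file stands in for the iRODS object."""

    def __init__(self, part_files_directory):
        self.part_files_directory = part_files_directory

    def create_multipart_upload(self):
        # Step 1: generate an upload_id (UUID) and hand it back to the client.
        return str(uuid.uuid4())

    def _part_path(self, upload_id, part_number):
        # Hypothetical naming scheme for part files.
        return os.path.join(self.part_files_directory, f"{upload_id}.{part_number}")

    def upload_part(self, upload_id, part_number, data):
        # Step 2: write the bytes to a local part file.
        with open(self._part_path(upload_id, part_number), "wb") as f:
            f.write(data)

    def complete_multipart_upload(self, upload_id, part_numbers, target_path):
        # Step 3: each part's offset is the total size of the parts before it.
        offsets, offset = {}, 0
        for n in sorted(part_numbers):
            offsets[n] = offset
            offset += os.path.getsize(self._part_path(upload_id, n))
        # Create the target, then write every part at its offset.
        with open(target_path, "wb") as target:
            for n, start in offsets.items():
                with open(self._part_path(upload_id, n), "rb") as part:
                    target.seek(start)
                    target.write(part.read())
        # Remove the part files once everything has been written.
        for n in part_numbers:
            os.remove(self._part_path(upload_id, n))
```

Parts may arrive in any order; the offsets are only computed at completion, which is why every part is written twice in this approach.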
Status / Implementation - v0.2.0 Performance Comparison
The following compares transfers to/from iRODS via the S3 API with transfers to/from a local MinIO server. The Boto S3 client was used for all cases.
Notes:
- The tests consisted of transfers of files from 200 MB to 1800 MB.
- The median of five runs is reported for each file size.
- Multipart uploads require two read/write cycles with store-and-forward.
- The S3 API was configured with 30 threads handling requests and 30 background threads.
- Performance degraded with large files when there was an insufficient number of background threads.
Status / Implementation - C. Efficient Store-and-Forward (v0.3.0)
- Reminder: The S3 protocol does not require parts to be sent in order or to be uniform in size
- Improvement: Track a map of part numbers to sizes for active upload IDs
- UploadPart can stream directly to iRODS object if we know all the preceding part sizes (offset can be calculated)
- CompleteMultipartUpload then only has to stream data for parts which did not stream directly to the iRODS object via UploadPart
- This improves the original implementation by reducing the number of intermediate part files (worst-case: all parts have part files)
- ~30% performance improvement for uploads versus v0.2.0
- Caveat: Parts should never be re-sent with a size different from what is in the part size map
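The offset calculation behind the part size map can be sketched as follows. This is an illustration under stated assumptions, not the server's code: the map is a plain dictionary and the function name `offset_if_known` is hypothetical.

```python
def offset_if_known(part_sizes, part_number):
    """Return the byte offset of `part_number`, or None if it cannot be computed yet.

    `part_sizes` maps part number -> size in bytes for the parts seen so far
    under one upload ID. S3 part numbers start at 1.
    """
    offset = 0
    for n in range(1, part_number):
        if n not in part_sizes:
            # A preceding part size is unknown, so this part cannot be
            # streamed directly and would need a part file instead.
            return None
        offset += part_sizes[n]
    return offset

# Sizes recorded so far for parts 1, 2, and 4 (part 3 has not been seen).
sizes = {1: 5, 2: 7, 4: 3}
print(offset_if_known(sizes, 3))  # 12
print(offset_if_known(sizes, 5))  # None
```

The caveat above follows directly: if a part is re-sent with a different size, every offset computed from the old size is wrong.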
Status / Implementation - Multipart Performance Improvement
- Average 27% improvement over original implementation
- Default configurations used for S3 clients and S3 API server
- Results may vary depending on the order in which parts are sent
- Worst-case performance: All parts had intermediate part files
- Best-case performance: All parts sent in order
Future Work - D. Store-and-Register
- Consider: Store-and-Forward transmits every part twice
- Improvement: Reduce to once
- Write part files to storage visible to the iRODS server
- Concatenate into a single file
- Register the combined file in the iRODS catalog
- Challenges:
- iRODS policy would only execute for registration
- CompleteMultipartUpload still has to wait until the part files are combined before registering
- Multipart upload "mode" could become a configuration option: store-and-forward or store-and-register
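The concatenation step could look roughly like the sketch below. It assumes the part files are already sorted by part number and sit on storage the iRODS server can see. The function name `concatenate_parts` is hypothetical, and the registration call is deliberately left out.

```python
import shutil

def concatenate_parts(part_paths_in_order, combined_path):
    """Concatenate part files, already sorted by part number, into one file."""
    with open(combined_path, "wb") as combined:
        for part_path in part_paths_in_order:
            with open(part_path, "rb") as part:
                # Copy in chunks so large parts are not read into memory at once.
                shutil.copyfileobj(part, combined)
    return combined_path
```

Registering the combined file in the iRODS catalog would follow once this returns, which is why CompleteMultipartUpload still has to wait for the copy to finish.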
Future Work
- Additional improvements for multipart
- Use SQLite for tracking upload information
- Implement the other approaches
- Optimize multipart downloads
- Additional endpoints
- Tagging
- ACLs
- Dynamic bucket mappings
- Dynamic user mappings
Thank you!