New iRODS APIs:

Presenting as HTTP and S3

Justin James

Applications Engineer

iRODS Consortium

March 11-13, 2024

CS3 2024

CERN, Geneva, Switzerland

Our Membership

[Logos of iRODS Consortium members]

What is iRODS

Open Source

  • C++ client-server architecture
  • BSD-3 licensed: install it today and try before you buy

 

Distributed

  • Runs on a laptop, a cluster, on premises or geographically distributed

 

Data Centric & Metadata Driven

  • Insulate both your users and your data from your infrastructure

iRODS as the Integration Layer

Why use iRODS?

People need a solution for:

  • Managing large amounts of data across various storage technologies
  • Controlling access to data
  • Searching their data quickly and efficiently
  • Automation

 

The larger the organization, the more they need software like iRODS.

Protocol Plumbing - Presenting iRODS as other Protocols

  • WebDAV
  • FUSE
  • HTTP
  • NFS
  • SFTP
  • K8s CSI
  • S3

Over the last few years, the ecosystem around the iRODS server has continued to expand.

 

Integration with other types of systems is a valuable way to increase accessibility without teaching existing tools about the iRODS protocol or introducing new tools to users.

 

With some plumbing, existing tools get the benefit of visibility into an iRODS deployment.

What is the iRODS HTTP API?

A redesign of the iRODS C++ REST API.

 

Goals of the project:

  • Present a cohesive representation of the iRODS API over the HTTP protocol
    • Simplify development of client-side iRODS applications for new developers
  • Maintain performance close to the iCommands
  • Remove behavioral differences between different client-side iRODS libraries
    • New language libraries will be built to wrap the HTTP API (C, C++, Java, Python, etc.)
  • To be absorbed into the iRODS server if adoption is significant

iRODS HTTP API - Endpoints

Based on concepts and entities defined in iRODS:

  • /authenticate
  • /collections
  • /data-objects
  • /info
  • /query
  • /resources
  • /rules
  • /tickets
  • /users-groups
  • /zones

Operations are specified via parameters

  • Keeps URLs simple (i.e., no nesting required)
  • Allows new and existing developers to easily find the endpoint of interest

For example

  • To modify a user, investigate /users-groups
  • To write data to a data object, investigate /data-objects
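
As a minimal sketch of the parameter-driven design, the snippet below builds a request URL for a single operation using only the Python standard library. The base URL and the `op` and `lpath` parameter names come from the curl example in this deck; the helper name `build_url` is hypothetical and not part of any iRODS library.

```python
from urllib.parse import urlencode

# Base URL as used in the curl example in this deck; adjust for your deployment.
BASE_URL = "http://localhost:9000/irods-http-api/0.2.0"


def build_url(endpoint: str, op: str, **params: str) -> str:
    """Return <base>/<endpoint>?op=<op>&<key>=<value>... (hypothetical helper)."""
    # The operation is just another query parameter, so the path never nests.
    query = urlencode({"op": op, **params})
    return f"{BASE_URL}/{endpoint.strip('/')}?{query}"


url = build_url("collections", "stat", lpath="/tempZone/home/rods")
print(url)
```

Because the operation is a parameter, switching to a different operation on the same entity only changes the query string, not the path.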

iRODS HTTP API - Configuration - Top Level

{
    // Defines HTTP options that affect how the
    // client-facing component of the server behaves.
    "http_server": {
        // ...
    },

    // Defines iRODS connection information.
    "irods_client": {
        // ...
    }
}

A single file with two sections, so administrators can see the options and how they relate to each other.

 

Modeled after NFSRODS.
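
The two-section layout can be checked with a few lines of Python. This is only a sketch: it assumes the file on disk is plain JSON (Python's `json` module does not accept the `//` annotations shown on the slide), and the function name `missing_sections` is hypothetical.

```python
import json

# The two top-level sections named on the slide.
REQUIRED_SECTIONS = ("http_server", "irods_client")


def missing_sections(config_text: str) -> list[str]:
    """Return the names of required top-level sections absent from the config."""
    config = json.loads(config_text)
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Tiny stand-in for a real configuration file.
example = '{"http_server": {"port": 9000}, "irods_client": {"port": 1247}}'
print(missing_sections(example))
```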

iRODS HTTP API - Configuration - http_server

"http_server": {
    "host": "0.0.0.0",
    "port": 9000,

    "log_level": "info",

    "authentication": {
        "eviction_check_interval_in_seconds": 60,

        "basic": {
            "timeout_in_seconds": 3600
        },

        "openid_connect": { /* ... options ... */ }
    },

    "requests": {
        "threads": 3,
        "max_size_of_request_body_in_bytes": 8388608,
        "timeout_in_seconds": 30
    },

    "background_io": {
        "threads": 6
    }
}

iRODS HTTP API - Configuration - irods_client

"irods_client": {
    "host": "<string>",
    "port": 1247,
    "zone": "<string>",

    "tls": { /* ... options ... */ },

    "enable_4_2_compatibility": false,

    "proxy_admin_account": {
        "username": "<string>",
        "password": "<string>"
    },
 
    "connection_pool": {
        "size": 6,
        "refresh_timeout_in_seconds": 600,
        "max_retrievals_before_refresh": 16,
        "refresh_when_resource_changes_detected": true
    },

    "max_number_of_parallel_write_streams": 3,
    "max_number_of_bytes_per_read_operation": 8192,
    "buffer_size_in_bytes_for_write_operations": 8192,
    "max_number_of_rows_per_catalog_query": 15
}
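
One plausible reading of the `connection_pool` options is sketched below: a pooled connection is replaced once it is too old or has been handed out too many times. This interpretation is inferred from the option names alone and is an assumption, not the documented behavior of the server; `PoolPolicy` and `needs_refresh` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PoolPolicy:
    # Defaults copied from the slide.
    refresh_timeout_in_seconds: int = 600
    max_retrievals_before_refresh: int = 16


def needs_refresh(policy: PoolPolicy, age_in_seconds: float, retrievals: int) -> bool:
    """Assumed rule: refresh when either limit has been reached."""
    too_old = age_in_seconds >= policy.refresh_timeout_in_seconds
    too_used = retrievals >= policy.max_retrievals_before_refresh
    return too_old or too_used


policy = PoolPolicy()
print(needs_refresh(policy, age_in_seconds=10, retrievals=3))
print(needs_refresh(policy, age_in_seconds=10, retrievals=16))
```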

iRODS HTTP API - Example - Stat'ing a collection

base_url="http://localhost:9000/irods-http-api/0.2.0"
bearer_token=$(curl -sX POST --user 'rods:rods' "$base_url/authenticate")

curl -s -G -H "Authorization: Bearer $bearer_token" \
  "$base_url/collections"                           \
  --data-urlencode 'op=stat'                        \
  --data-urlencode 'lpath=/tempZone/home/rods'      \
  | jq

Output:

{
  "inheritance_enabled": false,
  "irods_response": {
    "status_code": 0
  },
  "modified_at": 1699448576,
  "permissions": [
    {
      "name": "rods",
      "perm": "own",
      "type": "rodsadmin",
      "zone": "tempZone"
    }
  ],
  "registered": true,
  "type": "collection"
}
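
For readers who prefer Python to curl, here is a rough equivalent using only the standard library. It builds the authorized request without sending it and then interprets the response shown above. The function name `stat_request` is hypothetical; the URL, parameters, and response body are copied from this slide.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

base_url = "http://localhost:9000/irods-http-api/0.2.0"


def stat_request(bearer_token: str, lpath: str) -> Request:
    """Build (but do not send) the GET request from the curl example."""
    query = urlencode({"op": "stat", "lpath": lpath})
    return Request(
        f"{base_url}/collections?{query}",
        headers={"Authorization": f"Bearer {bearer_token}"},
        method="GET",
    )


# Response body copied from the slide.
response_text = """
{
  "inheritance_enabled": false,
  "irods_response": {"status_code": 0},
  "modified_at": 1699448576,
  "permissions": [
    {"name": "rods", "perm": "own", "type": "rodsadmin", "zone": "tempZone"}
  ],
  "registered": true,
  "type": "collection"
}
"""

result = json.loads(response_text)
# In this example, status_code 0 accompanies a successful stat.
ok = result["irods_response"]["status_code"] == 0
owners = [p["name"] for p in result["permissions"] if p["perm"] == "own"]
print(ok, owners)
```

To actually send the request against a running server, pass the object returned by `stat_request` to `urllib.request.urlopen`.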

iRODS HTTP API

Release v0.2.0

iRODS S3 API - Goals

  • Present iRODS as the S3 protocol
    • Multi-user
    • Multi-bucket
  • Load Balancer friendly
  • Maintainable

iRODS S3 API - History

iRODS S3 API - History - Options

1. Update and maintain

https://github.com/bioteam/minio-irods-gateway

Go, wrapping the iRODS C API.  Not maintainable.

 

2. Convert minio-irods-gateway to use

https://github.com/cyverse/go-irodsclient

Pure Go.  Limited by lack of multi-user support.

 

3. Add irods/gateway-irods.go to upstream

https://github.com/minio/minio/tree/master/cmd/gateway

Pure Go, upstream.  Limited by the above, and MinIO has since removed support for the gateway.

 

4. New C++ implementation

https://github.com/irods/irods_client_s3_api

And here we are.

iRODS S3 API - History - Research

  • Option 1 - MinIO with GoRODS (wrapping C)
    • Limited, not maintainable (Jul 2021)
  • Option 2 - MinIO with pure go-irodsclient
    • Needed to add (anonymous) ticket functionality
      • Implemented TicketBooth/BoxOffice (Oct 2021)
        • But would need admin credentials
        • Might as well just use C++ REST API (Nov 2021)
    • Lacks multi-user functionality (Feb 2022)
      • Auth code is in MinIO core - gateway code fires 'too late'
  • Option 3 - Get work into upstream MinIO
    • MinIO announced deprecation of gateway (May 2022)
      • Too hard / not worth supporting 'legacy' POSIX

iRODS S3 API - History - Research (continued)

  • Option 4 - New C++ Implementation
    • Removes dependency on other codebase(s)
    • 1 collection -> 1 bucket
    • Framework selection (Aug 2022)
      • Pistache
      • Oat++
      • Drogon
      • Boost.Beast (Nov 2022)
    • Initial endpoints working (Jan 2023)
      • User mapping
      • Bucket mapping

iRODS S3 API - History - Research - Alternate Universes

  • Add S3 protocol support to SFTPGo (Aug 2022)
  • Searching for existing S3 server in Go (Sep 2022)
  • Add iRODS backend to Zenko (Oct 2022)
  • Add JuiceFS frontend to iRODS (Nov 2022)
  • Add JuiceFS frontend to SFTPGo (Nov 2022)
  • GarageHQ frontend to iRODS (Jan 2023)
  • In-memory IBM s3mem-go as inspiration (Mar 2023)

iRODS S3 API - Architecture and Status

  • Released v0.2.0
  • Single binary
  • Single configuration file
  • Multi-user
  • Multi-bucket
  • Requires rodsadmin credentials
  • Tests passing with:
    • AWS CLI Client
    • Boto3 Python Library
    • MinIO Python Client
    • MinIO CLI Client
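
As an illustration of what a Boto3 client pointed at the iRODS S3 API might look like, consider the sketch below. The endpoint URL, port, and region are assumptions taken from the example `s3_server` configuration in this deck, and the credentials are placeholders. The helper names are hypothetical; `boto3` is a third-party package and is imported only when a client is actually created.

```python
def s3_client_kwargs(endpoint_url: str, access_key: str, secret_key: str,
                     region: str = "us-east-1") -> dict:
    """Collect the keyword arguments boto3.client("s3", ...) accepts."""
    return {
        "endpoint_url": endpoint_url,         # the iRODS S3 API, not AWS
        "aws_access_key_id": access_key,      # S3 username known to the API
        "aws_secret_access_key": secret_key,  # matching secret key
        "region_name": region,
    }


def list_bucket_names(**kwargs) -> list:
    """Call ListBuckets, one of the implemented endpoints."""
    import boto3  # third-party; requires `pip install boto3`

    client = boto3.client("s3", **kwargs)
    return [bucket["Name"] for bucket in client.list_buckets()["Buckets"]]


# Placeholder values; replace with those from your own configuration.
kwargs = s3_client_kwargs("http://localhost:9000", "<s3_username>", "<secret_key>")
print(sorted(kwargs))
```

Calling `list_bucket_names(**kwargs)` requires a running iRODS S3 API instance and valid credentials.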

iRODS S3 API - Status

  • Implemented Endpoints
    • CompleteMultipartUpload
    • CopyObject
    • CreateMultipartUpload
    • DeleteObject
    • DeleteObjects
    • GetBucketLocation
    • GetObject
    • GetObjectLockConfiguration (stub)
    • GetObjectTagging (stub)
    • HeadBucket
    • HeadObject
    • ListBuckets
    • ListObjectsV2
    • PutObject
    • UploadPart
  • Investigating
    • ListObjects
    • GetObjectAcl
    • PutObjectAcl
    • PutObjectTagging
    • UploadPartCopy

iRODS S3 API - Configuration

{
    // Defines S3 options that affect how the
    // client-facing component of the server behaves.
    "s3_server": {
        // ...
    },

    // Defines iRODS connection information.
    "irods_client": {
        // ...
    }
}

A single file with two sections, so administrators can see the options and how they relate to each other.

 

Modeled after NFSRODS.

iRODS S3 API - Configuration - s3_server

"s3_server": {
    "host": "0.0.0.0",
    "port": 9000,
    "log_level": "info",
    "plugins": {
        // Each key corresponds to a local shared object file
        "static_bucket_resolver": {
            "name": "static_bucket_resolver",
            "mappings": {
                "<bucket_name>": "/path/to/collection",
                "<another_bucket>": "/path/to/another/collection"
            }
        },
        "static_authentication_resolver": {
            "name": "static_authentication_resolver",
            "users": {
                "<s3_username>": {
                    "username": "<string>",
                    "secret_key": "<string>"
                }
            }
        }
    },
    "region": "us-east-1",
    "authentication": {
        "eviction_check_interval_in_seconds": 60,
        "basic": { "timeout_in_seconds": 3600 }
    },
    "requests": {
        "threads": 3,
        "max_size_of_request_body_in_bytes": 8388608,
        "timeout_in_seconds": 30
    },
    "background_io": { "threads": 6 }
}
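
The two static plugins amount to lookup tables. The sketch below shows one way such lookups could work: a bucket name resolves to a collection, and an S3 username resolves to an iRODS username. The sample bucket, user, and paths are invented for illustration, and joining the collection and object key with a slash is an assumption about how the mapping is applied, not a description of the plugin's actual code.

```python
from typing import Optional

# Invented sample values following the structure shown on the slide.
plugins = {
    "static_bucket_resolver": {
        "mappings": {"my-bucket": "/tempZone/home/alice/bucket"}
    },
    "static_authentication_resolver": {
        "users": {"s3_alice": {"username": "alice", "secret_key": "not-a-real-secret"}}
    },
}


def resolve_path(bucket: str, key: str) -> Optional[str]:
    """Map bucket + object key to an iRODS logical path (assumed join rule)."""
    collection = plugins["static_bucket_resolver"]["mappings"].get(bucket)
    if collection is None:
        return None  # unknown bucket
    return f"{collection.rstrip('/')}/{key.lstrip('/')}"


def resolve_user(s3_username: str) -> Optional[str]:
    """Map an S3 username to the configured iRODS username."""
    entry = plugins["static_authentication_resolver"]["users"].get(s3_username)
    return entry["username"] if entry else None


print(resolve_path("my-bucket", "data/file.txt"))
print(resolve_user("s3_alice"))
```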

iRODS S3 API - Configuration - irods_client

"irods_client": {
    "host": "<string>",
    "port": 1247,
    "zone": "<string>",

    "tls": { /* ... options ... */ },

    "enable_4_2_compatibility": false,

    "proxy_admin_account": {
        "username": "<string>",
        "password": "<string>"
    },
 
    "connection_pool": {
        "size": 6,
        "refresh_timeout_in_seconds": 600,
        "max_retrievals_before_refresh": 16,
        "refresh_when_resource_changes_detected": true
    },

    "resource": "<string>",
    "max_number_of_bytes_per_read_operation": 8192,
    "buffer_size_in_bytes_for_write_operations": 8192
}

iRODS S3 API - Next Steps

  • More Testing
  • Additional endpoints
    • Tagging
    • ACLs
  • Additional plugins
    • Other bucket mappings
    • Other user mappings

iRODS S3 API

Release v0.2.0

Questions?

Thank you.

https://irods.org/ugm2024

May 28-31, 2024

CS3 2024 - New iRODS APIs: Presenting as HTTP and S3

By iRODS Consortium
