Kory Draughn

Chief Technologist

iRODS Consortium

November 12-17, 2023

Supercomputing 2023

Denver, CO

iRODS HTTP API:

Presenting iRODS as HTTP

What is the iRODS HTTP API?

An experimental redesign of the iRODS C++ REST API.

 

Goals of the project ...

  • Present a cohesive representation of the iRODS API over the HTTP protocol, effectively simplifying development of client-side iRODS applications for new developers
  • Maintain performance close to the iCommands
  • Remove behavioral differences between client-side iRODS libraries by building new libraries on top of the HTTP API
    • C, C++, Java, Python, etc.: all languages produce identical behavior and results
  • Be absorbed into the iRODS server if adoption is significant

Why is this necessary?

The iRODS C++ REST API proves that presenting iRODS as HTTP is possible. However, usage of the project over time has uncovered some challenges.

 

Challenges ...

  • Too many open ports raise security concerns
  • Stability issues (e.g. crashing endpoints)
  • Separation of endpoints increases complexity due to multiple layers
    • e.g. Interns found it difficult to understand how things are composed
  • Pistache HTTP library lacks completeness/maturity/adoption
  • Names of existing endpoints are fairly general, which makes naming new endpoints difficult

 

The iRODS HTTP API is aimed at resolving these issues by taking a different approach based on what we've learned from the community and the iRODS S3 API.

 

The original document that kick-started this effort is linked from the slides.

Design - Early Decisions

  • Single binary exposing one port
  • Boost.Beast
    • A C++ header-only library providing networking facilities for building high-performance libraries and applications that need support for HTTP/1 and WebSockets
    • First used by the iRODS S3 API
  • Fixed set of URLs
    • Easy for users and developers to remember
  • Renamed from REST to HTTP
    • The rules of REST do not map well to the iRODS API
    • iRODS is stateful
    • Focus on designing the best API we can

Design - API URLs

URLs are named after concepts and entities defined in iRODS.

Operations are specified via parameters. This decision keeps URLs simple (i.e. no nesting required) and allows new/existing developers to guess which URL exposes the behavior they are interested in.

 

For example, to modify a user, look at /users-groups. To write data to a data object, use /data-objects.

  • /authenticate
  • /collections
  • /data-objects
  • /info
  • /query
  • /resources
  • /rules
  • /tickets
  • /users-groups
  • /zones
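
A minimal sketch of this naming scheme as a shell helper. The function name endpoint_url is hypothetical and not part of the API, and the base URL is the one used in this deck's examples. The endpoint names are the ten listed above.

```shell
# Assumed base URL, copied from the examples in this deck; adjust for your deployment.
base_url="http://localhost:9000/irods-http-api/0.1.0"

# endpoint_url ENTITY (hypothetical helper)
# Maps an iRODS concept to its URL. Every URL is flat: no nesting is involved,
# because the operation is chosen later through parameters.
endpoint_url() {
    case "$1" in
        authenticate|collections|data-objects|info|query|resources|rules|tickets|users-groups|zones)
            printf '%s/%s\n' "$base_url" "$1" ;;
        *)
            printf 'unknown endpoint: %s\n' "$1" >&2
            return 1 ;;
    esac
}

endpoint_url users-groups   # where to look when modifying a user
endpoint_url data-objects   # where to look when writing data
```

Because the set of URLs is fixed, a client can validate an endpoint name locally before sending any request.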

Design - API Parameters

All URLs, except /authenticate and /info, accept an op parameter.

  • Mapped to a function responsible for executing the requested operation
  • Shares common values where possible
    • e.g. stat, list, create, remove, etc

 

Common parameters used throughout the API ...

  • lpath
  • replica-number
  • src-resource
  • dst-resource
  • offset
  • count

 

Parameter names are not final and may change in the future.
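
A sketch of how op combines with the common parameters, assuming a read operation on /data-objects that accepts lpath, offset, and count. The op value read and the helper name read_bytes are assumptions for illustration; check the API documentation for the operations your version supports.

```shell
# Assumed base URL, copied from the examples in this deck.
base_url="http://localhost:9000/irods-http-api/0.1.0"

# read_bytes LPATH OFFSET COUNT (hypothetical helper)
# Requests COUNT bytes starting at OFFSET from the data object at LPATH.
# bearer_token is assumed to hold a token obtained from /authenticate.
# op=read is an assumption; the slides only name stat, list, create, and remove.
read_bytes() {
    curl -sG -H "Authorization: Bearer $bearer_token" \
        "$base_url/data-objects" \
        --data-urlencode 'op=read' \
        --data-urlencode "lpath=$1" \
        --data-urlencode "offset=$2" \
        --data-urlencode "count=$3"
}

# Usage (requires a running server):
#   read_bytes /tempZone/home/rods/file.txt 0 1024
```

Only the op value and its parameters change between operations; the URL stays the same.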

Configuration - Top Level

{
    // Defines HTTP options that affect how the
    // client-facing component of the server behaves.
    "http_server": {
        // ...
    },

    // Defines iRODS connection information.
    "irods_client": {
        // ...
    }
}

The configuration defines two sections to help administrators understand the options and how they relate to each other.

 

Modeled after NFSRODS.

Parallel Writes

Fully supported through the use of a Parallel Write Handle.

 

This ultimately means the iRODS HTTP API server maintains state on behalf of the client.

 

Performing a Parallel Write requires the use of two operations ...

  • parallel_write_init
    • Instructs the server to allocate memory for managing the state of the upload
  • parallel_write_shutdown
    • Instructs the server to deallocate memory obtained via a call to parallel_write_init

 

Large files must use multipart/form-data as the content type. Failing to honor this rule will result in an error or corrupt data.
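
A sketch of the two bracketing calls, assuming both operations are POSTed to /data-objects. The helper names and the parameter names stream-count and parallel-write-handle are assumptions for illustration and may not match the released API. The write requests that carry the data between the two calls are omitted.

```shell
# Assumed base URL, copied from the examples in this deck.
base_url="http://localhost:9000/irods-http-api/0.1.0"

# pw_init LPATH STREAM_COUNT (hypothetical helper)
# Asks the server to allocate state for the upload. The response is expected
# to identify a Parallel Write Handle for use in later requests.
# Without -G, --data-urlencode sends the parameters in a POST body.
pw_init() {
    curl -s -H "Authorization: Bearer $bearer_token" \
        "$base_url/data-objects" \
        --data-urlencode 'op=parallel_write_init' \
        --data-urlencode "lpath=$1" \
        --data-urlencode "stream-count=$2"
}

# pw_shutdown HANDLE (hypothetical helper)
# Tells the server to release the state allocated by pw_init.
pw_shutdown() {
    curl -s -H "Authorization: Bearer $bearer_token" \
        "$base_url/data-objects" \
        --data-urlencode 'op=parallel_write_shutdown' \
        --data-urlencode "parallel-write-handle=$1"
}

# Usage (requires a running server):
#   pw_init /tempZone/home/rods/big.bin 4
#   ... send the data using the handle from the response ...
#   pw_shutdown "$handle"
```

Since the server holds state for each handle, a client should call the shutdown operation even when an upload fails partway through.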

Examples - Stat'ing a collection

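# Base URL of the HTTP API server, including the API version.
base_url="http://localhost:9000/irods-http-api/0.1.0"

# Authenticate with HTTP Basic credentials; the response body is the bearer token.
bearer_token=$(curl -sX POST --user 'rods:rods' "$base_url/authenticate")

# Stat the collection. -G sends the URL-encoded parameters as a query string.
curl -sG -H "Authorization: Bearer $bearer_token" \
  "$base_url/collections"                         \
  --data-urlencode 'op=stat'                      \
  --data-urlencode 'lpath=/tempZone/home/rods'    \
  | jq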
{
  "inheritance_enabled": false,
  "irods_response": {
    "status_code": 0
  },
  "modified_at": 1686499669,
  "permissions": [
    {
      "name": "rods",
      "perm": "own",
      "type": "rodsadmin",
      "zone": "tempZone"
    }
  ],
  "registered": true,
  "type": "collection"
}

Examples - Listing available Rule Engine Plugins

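# Base URL of the HTTP API server, including the API version.
base_url="http://localhost:9000/irods-http-api/0.1.0"

# Authenticate with HTTP Basic credentials; the response body is the bearer token.
bearer_token=$(curl -sX POST --user 'rods:rods' "$base_url/authenticate")

# List the available rule engine plugin instances.
curl -sG -H "Authorization: Bearer $bearer_token" \
  "$base_url/rules"                               \
  --data-urlencode 'op=list_rule_engines'         \
  | jq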
{
  "irods_response": {
    "status_code": 0
  },
  "rule_engine_plugin_instances": [
    "irods_rule_engine_plugin-irods_rule_language-instance",
    "irods_rule_engine_plugin-cpp_default_policy-instance"
  ]
}

Future Work

  • Document the API using OpenAPI
  • Add support for remaining API endpoints provided by iRODS
    • Bulk/Batch operations
    • Archive file operations
  • Validate the configuration on server startup
  • Harden the implementation
  • Improve performance

v0.1.0 is available today!

 

https://irods.org/2023/11/initial-release-of-the-irods-http-api

 

Help us make this project better for everyone.

Thank you!

Questions?
