Alan King and Derek Dong

iRODS Consortium

Technology Update

iRODS 4.3.1

November 12-17, 2023

Supercomputing 2023

Denver, CO

iRODS 4.2 Series

4.2.12 is the final release of the 4.2 series.

 

Limited to security fixes, bug fixes, and trivial enhancements.

Contributors - 4.2.12

iRODS Release    Issues Closed
4.2.12           160
~/irods $ git shortlog --summary --numbered 4.2.11..4.2.12
    67  Alan King
    58  Kory Draughn
    13  Daniel Moore
     9  Justin James
     8  Markus Kitsinger (SwooshyCueb)
     6  Martin Jaime Flores Jr
     4  Felix A. Croes
     2  Alastair Smith
     1  Phillip Davis
     1  Terrell Russell

4.2.12 Core Server Improvements

  • Microservices for read-only access to JSON objects
    • Useful in iRODS Rule Language (NREP) with JSON-based inputs/outputs
  • Wider availability of admin keyword in various APIs and libraries
    • imeta
    • atomic ACLs/metadata endpoints
    • filesystem
    • msiDataObjChksum
  • Improved user/group/password management
  • Fixes and expansive tests for compound resource

What is iRODS?

Open source data management software used by research, commercial, and governmental organizations worldwide for over 20 years.

Virtualizes data storage resources, so users can take control of their data, regardless of where and on what device the data is stored.

Rule engines allow for composable policy implementation. Plugin architecture supports microservices, storage systems, authentication, networking, databases, rule engines, and an extensible API.

Contributors - 4.3.1

iRODS Release    Issues Closed
4.3.1            236
~/irods $ git shortlog --summary --numbered 4.3.0..4.3.1
   204  Kory Draughn
   101  Alan King
    24  Markus Kitsinger (SwooshyCueb)
    15  Nishant Dash
    14  Martin Jaime Flores Jr
    12  Justin James
    10  Daniel Moore
     7  Violet White
     4  Felix A. Croes
     3  Terrell Russell
     2  Derek Dong
     2  Phillip Davis
     1  Awab Masroor
     1  Peter Verraedt
     1  June Releford
     1  Leonardo Lenoci

4.3.1 User Experience Updates

  • Removed automatic setup of rsyslog/logrotate, which assumed a particular syslog implementation
  • Replaced log_facility with server_zone in log message output
  • Deprecated SimpleQuery
  • Exposed client connection information to acPreConnect()
  • ichmod honors the permission model
  • unixfilesystem resource plugin supports detached mode
  • Additional info added to izonereport; structure flattened for clarity
  • New configuration options: TCP keepalive, authentication
  • Newly packaged for Ubuntu 22, Debian 12, and Enterprise Linux 9

4.3.1 Core Server Enhancements

  • Approaching GCC compatibility
  • Added support for AddressSanitizer (ASan)
  • New API plugin: rc_switch_user
  • iRODS Project Templates for C++
  • Improved documentation
  • Library feature tests
  • New API plugin: rc_check_auth_credentials
  • New zone administration library for C++
  • New ticket administration library for C++
  • New C++ library: process_stash

Build and Packaging

We continue to move towards a more Normal and Boring approach to build and packaging:

  • Build against libstdc++
  • Reduction / Elimination of irods-externals
  • Unprivileged build and packaging in development environment containers

 

This effort will enable iRODS to exist on more platforms and architectures, increasing accessibility for users and developers.

AddressSanitizer (ASan)

A very fast memory error detector for C/C++.

 

It detects issues such as memory leaks, use-after-free bugs, and heap buffer overflows.

 

Used to track down several memory leaks in iRODS 4.3.0.

 

Enabled via CMake by setting IRODS_ENABLE_ADDRESS_SANITIZER to YES.

 

For example:

    user@sc2023:~ $ cmake ... -DIRODS_ENABLE_ADDRESS_SANITIZER=YES ...

New API Plugin - rc_switch_user

Allows the user associated with a connection to be switched to a different user.

 

Designed for client applications which

  • act as servers (e.g. NFSRODS) and
  • require a proxied connection

 

Benefits

  • Avoids TCP connection setup and teardown
  • Allows a single connection to be reused for multiple users
  • Gets us closer to true connection pooling

New API Plugin - rc_switch_user (cont.)

Performance Testing Details

 

Setup

  • Two custom client applications
  • App A connects to a server N times as the same user
  • App B makes one connection and calls rc_switch_user N times

 

Test results show a 98% performance improvement.
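
The slide does not say how the 98% figure was computed. One common reading, assumed here, is a reduction in elapsed time relative to App A. The sketch below only shows the arithmetic under that assumption; the function names are hypothetical and no measured timings are implied.

```python
# Assumption: "improvement" means relative reduction in elapsed time.
# Function names are illustrative and not part of iRODS.

def percent_improvement(baseline_seconds, improved_seconds):
    """Relative reduction in elapsed time, as a percentage of the baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return 100.0 * (baseline_seconds - improved_seconds) / baseline_seconds


def speedup_factor(improvement_percent):
    """How many times faster the improved run is, given a percent time reduction."""
    if not 0 <= improvement_percent < 100:
        raise ValueError("improvement_percent must be in [0, 100)")
    return 100.0 / (100.0 - improvement_percent)


if __name__ == "__main__":
    # Under this reading, a 98% reduction leaves 2% of the original time.
    print(speedup_factor(98.0))
```

Read this way, App B would finish the same N user sessions in about one fiftieth of the time App A needs.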

iRODS Project Templates for C++

Using the GitHub template repository feature, the iRODS Consortium now offers template repositories which allow C++ developers to jump directly into writing code for iRODS.

 

The Consortium supports five template repositories today.

Improved Documentation - Policy Cookbook

An online resource providing best practices and current techniques for the policy-based situations encountered in the iRODS ecosystem.

 

The cookbook covers topics such as ...

  • Synchronizing Delay Rules using Metadata
  • Naming Schemes and Conventions
  • Sharing data across PEPs
  • Simulating User Quotas
  • Implementing maintainable Policy through reusable rules

 

If you have suggestions on how to improve the cookbook, please reach out.

Improved Documentation - Data Objects

Information about data objects has been expanded.

 

Documentation for 4.3.1 includes details about ...

  • The meaning of each replica status (intermediate, write-locked, etc.)
  • Logical Locking
  • High-Level Operations (put, get, copy, replicate, etc.)
  • R_DATA_MAIN - The database table which holds all replica information

 

We'll continue to expand on these topics as improvements to the server are made.

Improved Documentation - Protocol Cookbook

An intern project that documents the iRODS protocol by walking through a basic client implementation of the iRODS control flow.

 

Meant to serve as a model for implementing new client libraries in various languages.

 

Implemented as a Jupyter Notebook.

UnixFileSystem Resource - Detached Mode

Allows multiple servers in an iRODS Zone to service requests made to a single UnixFileSystem resource. It only requires a mount point to a common backend filesystem that is accessible by all participating servers.

 

Useful for parallel and distributed filesystems.

 

Configure via context string:

  • Add "host_mode=detached" (any other value means attached)
  • Optionally add host_list, a comma-delimited list of the hosts that may service requests (if host_list is omitted, all servers service requests)

    iadmin mkresc detached_resc unixfilesystem hostname.example.org:/common/mount/point \
        "host_mode=detached;host_list=host2.example.org,host3.example.org"
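
As a rough illustration of the context string format shown above, the following Python sketch assembles and parses such a string. The helper names are hypothetical and not part of iRODS; the only details taken from the slide are the host_mode and host_list keys, the semicolon between settings, and the comma between hosts.

```python
# Hypothetical helpers for illustration only; iRODS parses the context
# string itself when the resource is created.

def build_context_string(host_list=None):
    """Build a detached-mode context string; host_list is optional."""
    parts = ["host_mode=detached"]
    if host_list:
        parts.append("host_list=" + ",".join(host_list))
    return ";".join(parts)


def parse_context_string(context):
    """Split 'key=value;key=value' into a dict, turning host_list into a list."""
    settings = {}
    for pair in context.split(";"):
        if not pair.strip():
            continue
        key, _, value = pair.partition("=")
        settings[key.strip()] = value.strip()
    if "host_list" in settings:
        settings["host_list"] = [h for h in settings["host_list"].split(",") if h]
    return settings


def is_detached(settings):
    # Per the slide: any host_mode value other than "detached" means attached.
    return settings.get("host_mode") == "detached"


if __name__ == "__main__":
    context = build_context_string(["host2.example.org", "host3.example.org"])
    print(context)
    print(parse_context_string(context))
```

The resulting string can be passed as the final argument to iadmin mkresc, as in the command above.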

Audit AMQP Rule Engine Plugin

  • Modernization
    • Refactored to use nlohmann-json instead of jansson
    • Refactored to use qpid-proton's C++ API
    • Migrated to new logging framework
    • Miscellaneous other modernization
  • Housekeeping
    • Repository reorganized and code reformatted
    • RPM package installation less fussy
    • Removed unused amqp_options configuration setting
    • Miscellaneous other housekeeping
  • Removed JSON wrapper tokens
  • Fixed JSON types for some fields
  • More AMQP message metadata set
  • Better handling of default configuration

Audit AMQP Rule Engine Plugin - ELK Stack

  • Modernization
    • New Dockerfile syntax
    • Updated entire software stack
      • Container base image
      • Elasticsearch, Kibana, RabbitMQ
      • Temurin JDK
  • Housekeeping
    • Reduced number and size of intermediate container images
    • Excluded more unneeded files from container image
  • Updated for use with new version of the rule engine plugin
    • Workarounds for use with older/current versions of the plugin are togglable
  • Replaced logstash with a Python daemon using qpid-proton's Python API
  • Moved as much setup as possible to container build-time
  • Added argument for specifying Java heap size

iRODS Clients

Protocols

  • HTTP API
  • S3 API
  • NFSRODS
  • irodsfs (FUSE)
  • SFTPGo
  • k8s CSI Driver
  • Davrods

Libraries

  • irods-dev (C/C++)
  • python-irodsclient (Python)
  • go-irodsclient (Go)
  • Jargon (Java)
  • rirods (R)

CLIs

  • iCommands
  • iRODS CLI
  • gocommands

GUIs

  • Metalnx
  • ZMT
  • iBridges

iRODS CLI

Restart of a previous effort to make a modernized iRODS CLI:

  • A brand-new client, not just an iCommands rewrite
    • The aim is to write everything in idiomatic C++
  • Single executable instead of 50+ iCommands
    • $ irods put testFile
      
  • Optionally modular, thanks to CMake magic (and a hint of preprocessor)
    • Reads .so plugins as sub-commands via boost::dll, or
    • Sub-commands compiled into main executable
  • Aims to make use of the newest libraries
    • irods::filesystem and friends for iRODS things
      • Avoiding iCommands-specific libraries, too
    • A lot of Boost, everywhere
  • Implementations for several of the simpler commands are in a working state
    • ls, mv, put, get, etc.
  • Still a WIP; the first goal is to reach feature parity with the iCommands
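
The single-executable idea above can be sketched as a small sub-command registry. This is written in Python for brevity, whereas the actual client is C++ and loads plugins via boost::dll. Every name here (SUBCOMMANDS, subcommand, run) is hypothetical, and the handlers only return a description instead of talking to a server.

```python
import argparse

# Hypothetical registry: maps a sub-command name to its handler function.
SUBCOMMANDS = {}


def subcommand(name):
    """Decorator that registers a handler under the given sub-command name."""
    def register(handler):
        SUBCOMMANDS[name] = handler
        return handler
    return register


@subcommand("put")
def put(args):
    return f"would upload {args.target}"


@subcommand("ls")
def ls(args):
    return f"would list {args.target}"


def run(argv):
    """Parse 'COMMAND TARGET' and dispatch to the registered handler."""
    parser = argparse.ArgumentParser(prog="irods")
    parser.add_argument("command", choices=sorted(SUBCOMMANDS))
    parser.add_argument("target")
    args = parser.parse_args(argv)
    return SUBCOMMANDS[args.command](args)


if __name__ == "__main__":
    print(run(["put", "testFile"]))
```

In this sketch, adding a sub-command means registering one more handler, which mirrors the choice between loading a plugin at runtime and compiling the sub-command into the main executable.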

Big Picture

Core

  • 4.3.x/4.4.x/5.0 - Satisfy Roadmap (Cloud-friendliness, Replace PackStruct, etc.)

 

https://irods.org/roadmap

 

Continue building out policy components (Capabilities).

 

We want installation and management of iRODS to become about policy design, composition, and configuration.

 

Please share your ...

  • Use cases

  • Pain points

  • Hopes and dreams

Open Source Community Engagement

Get Involved

  • Working Groups

  • GitHub Issues

  • Pull Requests

  • Chat List

  • Consortium Membership

 

Tell Others

  • Publish, Cite, Advocate, Refer

Technology Update - iRODS 4.3.1 (Booth Edition)

By Alan King
