April 2-4, 2025
BioIT World 2025
Boston, MA
Building an Approachable Cost-Effective Data Management Platform
Kory Draughn
Chief Technologist
iRODS Consortium
Our Membership
Consortium Members
Partners and Users: Past and Present
Mission
Long-term data management is best executed when policies are clear and infrastructure is abstracted and swappable.
iRODS aims to be normal and boring for the administrator, and approachable and powerful for the user.
This talk will cover recent advances and interfaces that allow companies to sustain FAIR data practices, enforce consistency and reproducibility, and realize cost savings through open source software.
What is iRODS?
Open Source
Distributed
Data Centric & Metadata Driven
iRODS as the Integration Layer
Why use iRODS?
People need a solution for:
The larger the organization, the more they need software like iRODS.
Where is iRODS used?
The Data Management Model
Our Approach
Flexible Authentication
iRODS 4.3 introduced a new authentication plugin framework which enables client-driven authentication flows.
This effort led to the development of the PAM Interactive Authentication plugin - a community-built plugin enabling dynamic authentication flows.
This plugin pushes the details of authentication from the server to the PAM stack, enabling various authentication schemes and flows.
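To make the idea of a client-driven flow concrete, here is a minimal sketch in which the client loops, answering whatever prompt the server side sends next until the exchange is finished. Everything in it is hypothetical: the `fake_pam_stack` function, the message keys, and the credentials are stand-ins invented for illustration and are not the iRODS authentication plugin interface or the PAM Interactive plugin's wire format.

```python
from typing import Callable, Dict


def fake_pam_stack(answers: Dict[str, str]) -> dict:
    """Hypothetical server side: ask for a password, then a one-time code."""
    if "password" not in answers:
        return {"done": False, "prompt": "password"}
    if "otp" not in answers:
        return {"done": False, "prompt": "otp"}
    ok = answers["password"] == "s3cret" and answers["otp"] == "123456"
    return {"done": True, "authenticated": ok}


def authenticate(server: Callable[[Dict[str, str]], dict],
                 respond: Callable[[str], str]) -> bool:
    """Client loop: keep answering prompts until the server reports it is done."""
    answers: Dict[str, str] = {}
    while True:
        reply = server(answers)
        if reply["done"]:
            return reply["authenticated"]
        # The client does not know the sequence of prompts in advance;
        # the server side decides what to ask for next.
        answers[reply["prompt"]] = respond(reply["prompt"])


good_user = {"password": "s3cret", "otp": "123456"}
bad_user = {"password": "s3cret", "otp": "000000"}

good_login = authenticate(fake_pam_stack, good_user.__getitem__)
bad_login = authenticate(fake_pam_stack, bad_user.__getitem__)
```

Because the prompts come from the server side, adding or reordering steps in this toy model changes only `fake_pam_stack`, not the client loop.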
Protocol Plumbing - Presenting iRODS as other Protocols
Over the last few years, the ecosystem around the iRODS server has continued to expand.
Integration with other types of systems is a valuable way to increase accessibility without teaching existing tools about the iRODS protocol or introducing new tools to users.
With some plumbing, existing tools get the benefit of visibility into an iRODS deployment.
iRODS + HTTP
The iRODS HTTP API is a client application designed to make iRODS approachable by presenting a cohesive representation of the iRODS API over the HTTP protocol.
It supports OpenID Connect, allowing administrators to leverage existing authentication services.
It's on track to be absorbed into the iRODS server.
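As a rough illustration of what HTTP access could look like from a script, the sketch below builds (but does not send) a request to list a collection using only the Python standard library. The host, port, version segment, `collections` endpoint, `op` and `lpath` parameters, and the token value are all assumptions made for this example; check the documentation of the deployed service for the actual values.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values: adjust to match a real deployment.
BASE_URL = "http://localhost:9000/irods-http-api/0.5.0"  # hypothetical host and version


def build_list_collection_request(base_url: str, token: str, logical_path: str) -> Request:
    """Build a GET request asking the service to list a collection.

    The endpoint name and query parameters are assumptions in this sketch.
    The bearer token would come from an earlier authentication step
    (for example, an OpenID Connect flow).
    """
    query = urlencode({"op": "list", "lpath": logical_path})
    return Request(
        url=f"{base_url}/collections?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_list_collection_request(BASE_URL, "example-token", "/tempZone/home/alice")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.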
Future
iRODS 5 is the next major release of the software and represents three years of significant effort towards our goal of being normal and boring.
Thank you!
Questions?
June 17-20, 2025
Ingest to Institutional Repository
As data matures and reaches a broader community, data management policy must also evolve to meet these additional requirements.
Data Virtualization
Combine various distributed storage technologies into a Unified Namespace
iRODS provides a logical view into the complex physical representation of your data, distributed geographically, and at scale.
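The separation between a logical view and its physical representation can be sketched with a toy catalog that maps one logical path to several physical replicas. This is a simplified model for illustration only, not the iRODS catalog or its API; the class names, resource names, and paths are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Replica:
    resource: str       # hypothetical storage resource name
    physical_path: str  # where the bytes actually live


@dataclass
class ToyCatalog:
    """Toy stand-in for a catalog mapping logical paths to physical replicas."""
    entries: Dict[str, List[Replica]] = field(default_factory=dict)

    def register(self, logical_path: str, replica: Replica) -> None:
        self.entries.setdefault(logical_path, []).append(replica)

    def resolve(self, logical_path: str) -> List[Replica]:
        # Users only ever name the logical path; the catalog knows the rest.
        return list(self.entries.get(logical_path, []))


catalog = ToyCatalog()
logical = "/tempZone/home/alice/sample.fastq"
catalog.register(logical, Replica("local_disk", "/var/lib/irods/Vault/home/alice/sample.fastq"))
catalog.register(logical, Replica("s3_archive", "/example-bucket/home/alice/sample.fastq"))

resources = [replica.resource for replica in catalog.resolve(logical)]
```

In this model, moving a replica to different storage changes only the catalog entry; the logical path that users work with stays the same.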
Data Discovery
Attach metadata to any first-class entity within the iRODS Zone
iRODS supports automated and user-provided metadata, which makes your data and infrastructure more discoverable, operational, and valuable.
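iRODS metadata takes the form of attribute-value-unit (AVU) triples. The sketch below models attaching AVUs to entities and then discovering entities by metadata. It is an in-memory toy, not the iRODS catalog or a client library; the class names, paths, and attribute names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, NamedTuple, Set


class AVU(NamedTuple):
    attribute: str
    value: str
    unit: str = ""


class ToyMetadataCatalog:
    """Toy in-memory store of AVUs keyed by entity (data object, collection, ...)."""

    def __init__(self) -> None:
        self._avus: DefaultDict[str, Set[AVU]] = defaultdict(set)

    def attach(self, entity: str, avu: AVU) -> None:
        self._avus[entity].add(avu)

    def find(self, attribute: str, value: str) -> List[str]:
        # Return every entity carrying a matching attribute/value pair.
        return sorted(
            entity
            for entity, avus in self._avus.items()
            if any(a.attribute == attribute and a.value == value for a in avus)
        )


catalog = ToyMetadataCatalog()
catalog.attach("/tempZone/home/alice/sample1.bam", AVU("organism", "human"))
catalog.attach("/tempZone/home/alice/sample1.bam", AVU("read_length", "150", "bp"))
catalog.attach("/tempZone/home/alice/sample2.bam", AVU("organism", "mouse"))
catalog.attach("/tempZone/home/alice", AVU("organism", "human"))  # a collection

matches = catalog.find("organism", "human")
```

The same `attach` call works for a data object and a collection here, which mirrors the idea that metadata can be attached to any first-class entity.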
Workflow Automation
Policy Enforcement Points (PEPs) are triggered by every operation within the framework
The iRODS rule engine framework provides the ability to capture real-world policy as computer-actionable rules that may allow, deny, or add context to operations within the system.
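The allow, deny, and add-context behaviors can be illustrated with a small sketch in which every operation is wrapped by a "pre" and a "post" hook. This is a toy model under stated assumptions, not the iRODS rule engine: the `ToyPolicyEngine` class, the hook names, and the example rules are hypothetical and do not match real PEP names.

```python
from typing import Callable, Dict, List


class PolicyDenied(Exception):
    """Raised by a rule to deny the operation."""


Context = Dict[str, str]
Rule = Callable[[Context], None]


class ToyPolicyEngine:
    def __init__(self) -> None:
        self._rules: Dict[str, List[Rule]] = {}

    def on(self, hook: str, rule: Rule) -> None:
        self._rules.setdefault(hook, []).append(rule)

    def _fire(self, hook: str, context: Context) -> None:
        for rule in self._rules.get(hook, []):
            rule(context)

    def run(self, operation: str, context: Context,
            action: Callable[[Context], str]) -> str:
        self._fire(f"pre_{operation}", context)   # a rule may raise to deny
        result = action(context)                  # no rule objected: allowed
        self._fire(f"post_{operation}", context)  # a rule may add context
        return result


def protect_raw_files(context: Context) -> None:
    if context["path"].endswith(".raw"):
        raise PolicyDenied(f"refusing to delete {context['path']}")


def record_audit(context: Context) -> None:
    context["audit"] = "put recorded"


engine = ToyPolicyEngine()
engine.on("pre_delete", protect_raw_files)
engine.on("post_put", record_audit)

put_context = {"path": "/tempZone/home/alice/run1.csv"}
put_result = engine.run("put", put_context, lambda ctx: "stored")

try:
    engine.run("delete", {"path": "/tempZone/home/alice/run1.raw"}, lambda ctx: "deleted")
    delete_denied = False
except PolicyDenied:
    delete_denied = True
```

Here the put is allowed and gains an audit note, while the delete of a `.raw` file is denied before the action ever runs.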
Dynamic Policy Enforcement
The iRODS rule may:
Dynamic Policy Enforcement
A single API call expands to many plugin operations, all of which may invoke policy enforcement.
Plugin Interfaces:
Secure Collaboration
iRODS allows for collaboration across administrative boundaries after deployment
iRODS provides the ability to federate namespaces across organizations without pre-coordinated funding or effort.
iRODS Clients
iRODS S3 Functionality
The iRODS S3 storage resource plugin allows iRODS to use any S3-compatible storage device or service to hold iRODS Data Objects, on-premises or in the cloud.
This plugin can work as a standalone "cacheless" resource or as an archive resource under the iRODS compound resource. Either configuration provides a POSIX interface to data held on an object storage device or service.
The following S3 services and appliances (in no particular order) have been tested:
Storage Tiering
Automated Ingest - Landing Zone
Automated Ingest - Filesystem Scanning
Core Competencies
Policy
Capabilities
Indexing
Core Competencies
Policy
Capabilities
Publishing
Deployment Patterns
Data to Compute
Compute to Data
Data Transfer Nodes
Filesystem Synchronization
Filesystem Synchronization
Data to Compute
Compute to Data
Data Transfer Nodes