iRODS - An Overview
Jason Coposky
@jason_coposky
Executive Director, iRODS Consortium
Annual Scientific Computing Workshop
Public Health England
Colindale, England
What is iRODS
iRODS is
- Distributed
- Open source
- Metadata Driven
- Data Centric
A flexible framework for the abstraction of infrastructure
iRODS as the Integration Layer
Data Virtualization
Combine various distributed storage technologies into a Unified Namespace
- Existing file systems
- Cloud storage
- On-premises object storage
- Archival storage systems
iRODS provides a logical view into the complex physical representation of your data, distributed geographically, and at scale.
Data Virtualization
Logical Path
Physical Path(s)
Data Virtualization
$ ils -L /tempZone/home/rods/thefile.txt
  rods              0 demoResc        29606 2016-10-05.09:05 & thefile.txt
        generic    /var/lib/irods/iRODS/Vault/home/rods/thefile.txt
  rods              1 repl;u2         29606 2016-10-05.09:06 & thefile.txt
        generic    /tmp/u2vault/home/rods/thefile.txt
  rods              2 repl;u1         29606 2016-10-05.09:06 & thefile.txt
        generic    /tmp/u1vault/home/rods/thefile.txt
Logical Path | /tempZone/home/rods/thefile.txt |
Physical Paths | /var/lib/irods/iRODS/Vault/home/rods/thefile.txt |
 | /tmp/u2vault/home/rods/thefile.txt |
 | /tmp/u1vault/home/rods/thefile.txt |
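The mapping from one logical path to several physical paths can be recovered from the listing itself. The following is a minimal Python sketch, assuming the two-lines-per-replica layout shown on the slide; the `Replica` class and `parse_replicas` function are hypothetical names for illustration and are not part of iRODS.

```python
from dataclasses import dataclass

# Listing copied from the slide; the layout (a summary line followed by a
# line holding the resource type and physical path) is assumed from it.
LISTING = """\
  rods              0 demoResc        29606 2016-10-05.09:05 & thefile.txt
        generic    /var/lib/irods/iRODS/Vault/home/rods/thefile.txt
  rods              1 repl;u2         29606 2016-10-05.09:06 & thefile.txt
        generic    /tmp/u2vault/home/rods/thefile.txt
  rods              2 repl;u1         29606 2016-10-05.09:06 & thefile.txt
        generic    /tmp/u1vault/home/rods/thefile.txt
"""


@dataclass
class Replica:
    owner: str
    number: int
    resource: str
    size: int
    physical_path: str


def parse_replicas(listing):
    """Pair each summary line with the location line that follows it."""
    lines = [ln.split() for ln in listing.splitlines() if ln.strip()]
    replicas = []
    for summary, location in zip(lines[0::2], lines[1::2]):
        owner, number, resource, size = summary[:4]
        # The physical path is the last field; paths with spaces are not handled.
        replicas.append(Replica(owner, int(number), resource, int(size), location[-1]))
    return replicas


for replica in parse_replicas(LISTING):
    print(replica.number, replica.resource, replica.physical_path)
```

Each replica keeps its own resource and physical path, while the user only ever addresses the single logical path.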
Data Discovery
Attach metadata to any first-class entity within the iRODS Zone
- Data Objects
- Collections
- Users
- Storage Resources
- The Namespace
iRODS provides automated and user-provided metadata which makes your data and infrastructure more discoverable, operational and valuable.
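As an illustration of metadata-driven discovery, here is a small Python sketch of a catalog that stores attribute, value, unit triples against any named entity and answers queries by attribute and value. `MetadataCatalog`, its methods, and the `study` and `medium` attributes are hypothetical and exist only for this example; this is an in-memory toy, not the iRODS API.

```python
from collections import defaultdict


class MetadataCatalog:
    """Toy in-memory catalog; a hypothetical stand-in, not the iRODS API."""

    def __init__(self):
        # entity name -> set of (attribute, value, unit) triples
        self._avus = defaultdict(set)

    def add(self, entity, attribute, value, unit=""):
        self._avus[entity].add((attribute, value, unit))

    def find(self, attribute, value):
        """Return, sorted, every entity carrying the given attribute/value pair."""
        return sorted(
            entity
            for entity, avus in self._avus.items()
            if any(a == attribute and v == value for a, v, _ in avus)
        )


catalog = MetadataCatalog()
catalog.add("/tempZone/home/rods/thefile.txt", "study", "demo")  # data object
catalog.add("/tempZone/home/rods", "study", "demo")              # collection
catalog.add("demoResc", "medium", "disk")                        # storage resource
print(catalog.find("study", "demo"))
```

The same query works across data objects, collections, and resources because the metadata is attached to the entity rather than to a particular storage system.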
Metadata Everywhere
Workflow Automation
An integrated scripting language that can be triggered by any operation within the framework
- Authentication
- Storage Access
- Database Interaction
- Network Activity
- Extensible RPC API
The iRODS rule engine captures real-world policy as computer-actionable rules, which may allow, deny, or add context to operations within the system.
Dynamic Policy Enforcement
The iRODS rule may:
- restrict access
- log for audit and reporting
- provide additional context
- send a notification
Dynamic Policy Enforcement
A single API call expands to many plugin operations, all of which may invoke policy enforcement
Plugin Interfaces:
- Authentication
- Database
- Storage
- Network
- Rule Engine
- Microservice
- RPC API
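The idea of one API call fanning out into plugin operations, each wrapped by policy, can be sketched in Python as follows. Everything here is hypothetical and chosen for illustration: the `Framework` class, the `pre_`/`post_` naming of enforcement points, and the `api_put` call do not reproduce iRODS names or signatures.

```python
class PolicyDenied(Exception):
    """Raised by a rule to stop the operation in progress."""


class Framework:
    def __init__(self):
        self.rules = {}      # enforcement point name -> list of rule callables
        self.audit_log = []  # filled by audit rules

    def on(self, point, rule):
        self.rules.setdefault(point, []).append(rule)

    def _fire(self, point, ctx):
        for rule in self.rules.get(point, []):
            rule(ctx)

    def run(self, operation, ctx, action):
        # Every plugin operation is bracketed by a pre and a post enforcement point.
        self._fire(f"pre_{operation}", ctx)
        result = action(ctx)
        self._fire(f"post_{operation}", ctx)
        return result


def api_put(fw, ctx):
    """One API call that expands into several plugin operations."""
    for operation in ("authentication", "database", "storage"):
        fw.run(operation, ctx, lambda c: None)  # the plugin work itself is omitted
    return "ok"


fw = Framework()


def deny_guests(ctx):  # restrict access
    if ctx["user"] == "guest":
        raise PolicyDenied("guests may not write")


def audit(ctx):  # log for audit and reporting
    fw.audit_log.append(f"{ctx['user']} stored {ctx['path']}")


fw.on("pre_authentication", deny_guests)
fw.on("post_storage", audit)
```

Calling `api_put(fw, {"user": "alice", "path": "/tempZone/home/alice/a.txt"})` passes all three operations and leaves one audit entry, while the same call for `guest` is rejected at the first enforcement point and nothing is logged.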
Secure Collaboration
iRODS allows for collaboration across administrative boundaries after deployment
- No need for common infrastructure
- No need for shared funding
- Affords temporary collaborations
iRODS provides the ability to federate namespaces across organizations without pre-coordinated funding or effort.
iRODS Service Interface
Federation - Shared Data and Services
Institutional repositories
As data matures and reaches a broader community, data management policy must also evolve to meet these additional requirements.
Use Cases
iRODS
On Premises to Any Cloud Infrastructure
Data to Compute Use Case
Compute to Data Use Case
The Wellcome Trust Sanger Institute
Sanger - Replication
- Data preferentially placed on resource servers in the green data center (fallback to red)
- Data replicated to the other room
- Checksums applied
- Green and red centers both used for read access.
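The placement policy in these bullets can be sketched in Python. This is an illustration under stated assumptions, not Sanger's actual rules: the `ingest` and `read` functions are hypothetical, storage is modelled as plain dictionaries, and MD5 is assumed as the checksum only because an `md5` attribute appears among the example metadata.

```python
import hashlib

PREFERENCE = ("green", "red")  # green first, red as the fallback


def md5(data):
    return hashlib.md5(data).hexdigest()


def ingest(name, data, rooms, online):
    """Write to the preferred online room, replicate to the other, verify checksums."""
    primary = next((room for room in PREFERENCE if room in online), None)
    if primary is None:
        raise RuntimeError("no data center online")
    rooms[primary][name] = data
    expected = md5(data)
    written = [primary]
    for room in PREFERENCE:
        if room != primary and room in online:
            rooms[room][name] = rooms[primary][name]
            # Compare the replica's checksum against the original's.
            if md5(rooms[room][name]) != expected:
                raise IOError(f"checksum mismatch in {room}")
            written.append(room)
    return expected, written


def read(name, rooms, online):
    """Serve a read from whichever online room holds a replica."""
    for room in PREFERENCE:
        if room in online and name in rooms[room]:
            return rooms[room][name]
    raise FileNotFoundError(name)
```

With both rooms online the object lands in green and is copied to red; with only red online it is written there, and a later replication pass (not shown) would be needed to restore the second copy.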
Sanger - Metadata
Example metadata attributes:
attribute: library
attribute: total_reads
attribute: type
attribute: lane
attribute: is_paired_read
attribute: study_accession_number
attribute: library_id
attribute: sample_accession_number
attribute: sample_public_name
attribute: manual_qc
attribute: tag
attribute: sample_common_name
attribute: md5
attribute: tag_index
attribute: study_title
attribute: study_id
attribute: reference
attribute: sample
attribute: target
attribute: sample_id
attribute: id_run
attribute: study
attribute: alignment
- Users query and access data from local compute clusters
- Users access iRODS locally via the command line interface
Sanger - Federation
University College London
- UK-sponsored research requirement: retain data until 10 years after the date of the last access request
- iRODS tiers data across storage technologies
- Enables federated access from other centers
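The retention requirement is simple date arithmetic, sketched below in Python. The function names are hypothetical, and the handling of 29 February (rolling forward to 1 March) is an assumption made for the example rather than anything stated in the talk.

```python
from datetime import date

RETENTION_YEARS = 10


def retention_expiry(last_access_request):
    """Date until which the data must be kept: last access request plus 10 years."""
    target_year = last_access_request.year + RETENTION_YEARS
    try:
        return last_access_request.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in the target year; roll forward (assumption).
        return date(target_year, 3, 1)


def may_delete(last_access_request, today):
    """True only once the retention period has fully elapsed."""
    return today > retention_expiry(last_access_request)


print(retention_expiry(date(2016, 10, 5)))
```

Because every new access request moves the expiry date forward, a policy like this needs the last access date recorded as metadata on the data object.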
Roadmap
iRODS Software
The Roadmap
- iRODS 4.3
- Packaged iRODS Capabilities
- Multipart Transfer
- Cacheless Object Storage
- Query Arrow
- Metadata Templates
- Filesystem Integration
The Roadmap - iRODS 4.3
- Hardening Release
- Logging
- iRODS Monitor
- Delegate Checksum to Storage Plugins
Packaged iRODS Capabilities
Multipart Transfer
Provide reliable transfer with restart - object parts tracked in the catalog
Later versions will provide fast, first-class access to object storage
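Restartable transfer with parts tracked in a catalog can be sketched in Python as follows. This is not the iRODS implementation: the catalog is modelled as a plain set of (object, part index) pairs, and `split_parts`, `transfer`, and the `fail_at` parameter (which simulates a dropped connection) are hypothetical.

```python
def split_parts(size, part_size):
    """Return (offset, length) pairs covering `size` bytes."""
    return [(off, min(part_size, size - off)) for off in range(0, size, part_size)]


def transfer(name, source, dest, catalog, part_size, fail_at=None):
    """Copy the parts not yet recorded in the catalog; return how many were sent.

    `dest` must be a bytearray already sized to len(source).
    """
    sent = 0
    for index, (offset, length) in enumerate(split_parts(len(source), part_size)):
        if (name, index) in catalog:
            continue  # already transferred in an earlier attempt
        if index == fail_at:
            raise ConnectionError(f"transfer of part {index} interrupted")
        dest[offset:offset + length] = source[offset:offset + length]
        catalog.add((name, index))  # record the part only after it is written
        sent += 1
    return sent


source = bytes(range(10))
dest = bytearray(len(source))
catalog = set()

try:
    transfer("obj", source, dest, catalog, part_size=3, fail_at=2)
except ConnectionError:
    pass  # parts 0 and 1 are recorded in the catalog

resent = transfer("obj", source, dest, catalog, part_size=3)  # resumes at part 2
```

Recording a part only after it has been written is what makes the restart safe: a part that was interrupted mid-copy is simply sent again.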
iRODS 4.2 and Beyond - The Scatter
Next Generation Query Interface
iRODS 4.3 and Beyond - The Gather
Shared Data - Shared Infrastructure
Metadata Templates
Business Model
iRODS Consortium
The iRODS Consortium
Our Mission
- Write Good Software
- Grow the Community
- Show Value to our Membership
Why Open Source
- Transparency
- Quality
- Persistence
- Vendor Neutrality
- Customization
- Community
- Try before you buy
Our Membership
Our Business Model
Consortium Membership
- Participate in roadmap development
- Participate in consortium governance
- Direct support from the team
- Tier 3 support agreements
- Discount for support agreements
Our Business Model
Service & Support Contracts
- Billed hourly
- Implement Proofs of Concept
- Custom rule and plugin development
- Expand to new use cases
- Discounted rate for consortium members
Membership Committees
Technology Working Group
- Monthly web conferences
- Build iRODS Roadmap
- Propose new technology direction
- Propose inclusion of new software
- Propose new working groups
Membership Committees
Planning Committee
- Monthly web conferences
- Discuss consortium policy and business practices
- Propose conferences and workshops
- Vote on inclusion of new software
- Vote on roadmap
Membership Committees
Executive Board
- Meets twice yearly
- Votes on consortium budget and bylaw changes
- Determines the thematic priorities of the consortium
Additional working groups are formed as required
Our Consortium Participation
An Introduction to iRODS at PHE
By Jason Coposky
An executive overview of iRODS, its use cases, and the roadmap