iRODS Overview
Jason Coposky
@jason_coposky
Executive Director, iRODS Consortium
iRODS is open source software for…
• Working with data distributed across storage technologies
• Annotating and searching data with rich metadata
• Implementing access control, auditing, preservation, organization, and data movement policies
• Providing a single interface to share data between organizations
Data Virtualization
- Standard file systems: Any mount point
- Archival storage: HPSS, TSM
- Object stores: Cleversafe, DDN WOS, Ceph/Rados
- Cloud-based storage: Amazon S3
Separates Logical and Physical
- Logical - entry in the catalog
- Physical - a single replica on a storage resource
iRODS presents multiple separate storage technologies in a unified namespace.
Data Virtualization
Logical Path: /tempZone/home/rods/thefile.txt
Physical Path(s) (replicas):
- /var/lib/irods/iRODS/Vault/home/rods/thefile.txt
- /tmp/u2vault/home/rods/thefile.txt
- /tmp/u1vault/home/rods/thefile.txt
$ ils -L /tempZone/home/rods/thefile.txt
  rods  0 demoResc  29606 2016-10-05.09:05 & thefile.txt
    generic  /var/lib/irods/iRODS/Vault/home/rods/thefile.txt
  rods  1 repl;u2  29606 2016-10-05.09:06 & thefile.txt
    generic  /tmp/u2vault/home/rods/thefile.txt
  rods  2 repl;u1  29606 2016-10-05.09:06 & thefile.txt
    generic  /tmp/u1vault/home/rods/thefile.txt
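The logical/physical split in the listing above can be pictured as a lookup from one catalog entry to its replicas. The following is a minimal Python sketch of that idea, not the iRODS API: the `Replica` class, the `CATALOG` dictionary, and the `physical_paths` helper are hypothetical names, and the values are copied from the `ils -L` output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    number: int          # replica number reported by ils -L
    resource: str        # resource (hierarchy) holding the replica
    size: int            # size in bytes
    physical_path: str   # location on the storage resource


# Hypothetical in-memory stand-in for the catalog: one logical path, many replicas.
CATALOG = {
    "/tempZone/home/rods/thefile.txt": [
        Replica(0, "demoResc", 29606, "/var/lib/irods/iRODS/Vault/home/rods/thefile.txt"),
        Replica(1, "repl;u2", 29606, "/tmp/u2vault/home/rods/thefile.txt"),
        Replica(2, "repl;u1", 29606, "/tmp/u1vault/home/rods/thefile.txt"),
    ]
}


def physical_paths(logical_path):
    """Return the physical path of every replica of a logical path."""
    return [r.physical_path for r in CATALOG.get(logical_path, [])]
```

Clients only ever name the logical path; which replica is read is a decision the system makes behind that name.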
Data Discovery
- Metadata can be system- or user-generated.
- Users can find data using features such as description, study ID, access date.
- Metadata can be used to link processed results to raw data (i.e., tracking provenance).
- Administrators can use metadata to control policy, such as archiving and access control policies.
iRODS provides a catalog, the iCAT, that links data and metadata.
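iRODS stores user metadata as attribute-value-unit (AVU) triples attached to catalog entries. The sketch below is a toy in-memory model of that linkage in Python, not the iCAT itself: the `catalog` dictionary, the helper functions, and the attribute names `study_id` and `derived_from` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory catalog: logical path -> set of (attribute, value, unit) triples.
catalog = defaultdict(set)


def add_metadata(logical_path, attribute, value, unit=""):
    """Attach one AVU triple to a logical path."""
    catalog[logical_path].add((attribute, value, unit))


def find(attribute, value):
    """Return the sorted logical paths carrying the given attribute/value pair."""
    return sorted(
        path
        for path, avus in catalog.items()
        if any(a == attribute and v == value for a, v, _ in avus)
    )


# Illustrative data only: a raw file, a result derived from it, and an unrelated file.
add_metadata("/tempZone/home/rods/raw.dat", "study_id", "S-001")
add_metadata("/tempZone/home/rods/result.csv", "study_id", "S-001")
add_metadata("/tempZone/home/rods/result.csv", "derived_from", "/tempZone/home/rods/raw.dat")
add_metadata("/tempZone/home/rods/other.dat", "study_id", "S-002")
```

Calling `find("study_id", "S-001")` returns both the raw file and the result, and the `derived_from` triple is one way a result could point back to its raw data for provenance.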
Workflow Automation
- Policy enforcement points (PEPs) wrap API calls and database, resource, and authentication operations
- iRODS rule engines execute PEP implementations
- PEP implementations can influence, deny, or provide additional context to each operation
iRODS lets you use any operation within the system to trigger a programmatic action.
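The pattern described above, where policy runs before and after an operation and may deny it, can be sketched in a few lines of Python. This is a toy dispatcher under stated assumptions, not the iRODS rule engine: the `pep` decorator, `run_operation`, the `put` operation, and the `/tempZone/archive` collection are hypothetical.

```python
# Hypothetical registry of policy functions, keyed by PEP name.
policies = {}


def pep(name):
    """Decorator registering a function as the implementation of a PEP."""
    def register(func):
        policies[name] = func
        return func
    return register


def run_operation(name, operation, context):
    """Run an operation surrounded by its optional pre and post PEPs."""
    pre = policies.get(f"pep_{name}_pre")
    if pre:
        pre(context)            # may raise to deny the operation
    result = operation(context)
    post = policies.get(f"pep_{name}_post")
    if post:
        post(context, result)   # may react to the completed operation
    return result


audit_log = []


@pep("pep_put_pre")
def deny_read_only(context):
    # Assumed policy: nothing may be written under the archive collection.
    if context["collection"].startswith("/tempZone/archive"):
        raise PermissionError("archive collection is read-only")


@pep("pep_put_post")
def record_put(context, result):
    # Assumed policy: record who wrote what, after the write succeeded.
    audit_log.append((context["user"], result))


def put(context):
    """Stand-in for a real operation; returns the new logical path."""
    return context["collection"] + "/" + context["name"]
```

A call such as `run_operation("put", put, {"user": "rods", "collection": "/tempZone/home/rods", "name": "thefile.txt"})` succeeds and is audited, while the same call against `/tempZone/archive` raises `PermissionError` before the operation runs.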
Secure Collaboration
- Described as a Federation of iRODS Zones
- Users may access data in resources in other Zones anywhere
- A user from a remote zone must be granted access after federation
- A remote zone's data management policy is enforced for data accessed within that zone
iRODS lets you share data across administrative units at any time after deployment.
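The rule that a remote user has no access until it is granted can be illustrated with a toy access list keyed by `user#zone` names, the form iRODS uses to qualify a user with a zone. The `acl` dictionary, the helper functions, and the user `alice` in `otherZone` are assumptions for this sketch, which is written in Python and does not call iRODS.

```python
# Hypothetical ACL: logical path -> set of "user#zone" names granted read access.
acl = {"/tempZone/home/rods/thefile.txt": {"rods#tempZone"}}


def grant_read(path, user, zone):
    """Grant read access on a path to a zone-qualified user."""
    acl.setdefault(path, set()).add(f"{user}#{zone}")


def can_read(path, user, zone):
    """Return True only if the zone-qualified user was explicitly granted access."""
    return f"{user}#{zone}" in acl.get(path, set())
```

In this model, `can_read("/tempZone/home/rods/thefile.txt", "alice", "otherZone")` is false until `grant_read` is called for that user, mirroring the explicit grant required after federation.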
HPC - Data to Compute
Questions?
iRODS Server Architecture
- Metadata Catalog
  - Where we write everything down
- Catalog Service Provider
  - Server which provides access to the metadata catalog
- Catalog Service Consumer
  - Distributed nodes to provide access to storage and other resources
Catalog Service Consumer
Servers which provide access to storage resources
- Connect to the Catalog Service Provider for
  - resource configuration
  - authentication
  - system metadata
  - user assigned metadata
- Provide scalable access to iRODS services
- May be geographically distributed
- May have an arbitrary number of resources attached
Catalog Service Provider
Same capabilities as the Consumer with the addition of a database plugin
- May serve storage capabilities
- Provides access to the metadata catalog
- May be placed in a High Availability configuration for failover and load balancing
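The division of labor between the two server roles can be sketched as two small Python classes. This is a toy model under stated assumptions, not iRODS server code: the class and method names are hypothetical, a dictionary stands in for the database, and the host names are invented.

```python
class CatalogServiceProvider:
    """Toy server holding the catalog (a dict standing in for the database)."""

    def __init__(self):
        self.catalog = {}  # logical path -> (consumer host name, physical path)

    def register(self, logical_path, host, physical_path):
        self.catalog[logical_path] = (host, physical_path)

    def lookup(self, logical_path):
        return self.catalog[logical_path]


class CatalogServiceConsumer:
    """Toy server with local storage that asks the provider for catalog data."""

    def __init__(self, host, provider):
        self.host = host
        self.provider = provider
        self.storage = {}  # physical path -> bytes

    def put(self, logical_path, physical_path, data):
        # Store the bytes locally, then record the location in the shared catalog.
        self.storage[physical_path] = data
        self.provider.register(logical_path, self.host, physical_path)

    def locate(self, logical_path):
        # Consumers have no catalog of their own; every lookup goes to the provider.
        return self.provider.lookup(logical_path)
```

Because every consumer consults the same provider, data written through one consumer can be located from any other, which is what lets consumers be added or distributed without fragmenting the namespace.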
The iRODS Metadata Catalog
- Relational Database
  - PostgreSQL, MySQL, or Oracle
- Single source of truth for the Zone
- Holds users, groups, resources, system metadata, user metadata
- Co-resident with the iRODS server or hosted on a clustered server farm
- Referenced by a database plugin implemented with ODBC
iRODS Data Flow
iRODS Clients
- Command Line
  - iCommands
- Web interfaces
  - Cloud Browser
  - Metalnx
- Desktop
  - Kanki
  - Cyberduck
- Services
  - NFS
  - WebDAV
Questions?
The iRODS Plugin Architecture
An overview of iRODS capabilities, architecture and the plugin interface