iRODS Meet and Greet (and Eat)

RENCI

 

January 20, 2016

First of All...

THANK YOU

Why are we here?

Why should I care?

• Curiosity

• Collegiality

• Collaboration

Agenda

• iRODS Consortium Overview

• Meet the Team

• Where We're Going

• Discussion

Overview

Policy-Based Data Management

The Integrated Rule-Oriented Data System:

• Developed for working with massive collections of files

• Finding, securing, organizing, analyzing, preserving, and sharing data

• Example applications:

   • Virtual collections - alternate presentations of stored data sets

   • Federated access to data stored in remote systems

   • Rich metadata for discoverability, access control, and integrity checking

   • Programmable distribution of data to file systems and object stores

   • Combining data from multiple processes

• Data Virtualization

• Data Discovery

• Workflow Automation

• Secure Collaboration

iRODS Clients

• Web-based and Standalone GUIs

  - iRODS Cloud Browser, MetaLnx, Kanki, Cyberduck

• Portals, External Systems

  - iPlant Discovery Environment, Islandora, Fedora Commons

• WebDAV for drag-and-drop access built into the OS

• APIs: Python, REST, Qt, Java, C++

• Command Line Interface

The iRODS Consortium and Sustainability

iRODS is free, open source software owned by a foundation called the iRODS Consortium.

  • The goal is to sustain iRODS as free, open source software by:

    ▹ Building good software.
    ▹ Growing the iRODS community.
    ▹ Demonstrating value.

  • The Consortium funds a team of 10+ developers, application engineers, and documentation and support staff

Contract Customers

and more ...

Building Community and Demonstrating Value

Initial Trial

  • Google Group
  • Blog posts, social media
  • Cloud images
  • Documentation
  • Training workshops
  • iRODS Hub: The iRODS App Store

Proof of Concept to Pilot

  • Occasional 1-on-1 Support
  • iRODS Consortium Members
  • iRODS Partners
  • iRODS Consortium Service Contracts

Production

  • iRODS Consortium Membership

Getting Plugged In with iRODS...

https://irods.org/documentation/

Meet the Team

Where We're Going

Other Goals

• Membership: retention, growth, verticals

• User Group Meeting 2016

• User-generated reference designs

• iRODS Partners

• iRODS Hub

• Certification

Discussion

Thank You!

Use Cases

User Profile: NASA Atmospheric Science Data Center

• 2 PB of archived satellite data

• Publicly available, subsetting on demand

• In-house ingest and archiving software: ANGe (Archive Next Generation)

Federation

Virtual Collections

 

ls -l

/CER_100100.2012053100
/CER_100100.2012053100.met
/CER_100100.2012053101
/CER_100100.2012053101.met
/CER_100100.2012053102
/CER_100100.2012053102.met

Visibility determined by "visibility attribute"

Logical collection of files spread across physical storage resources.
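The idea on this slide can be sketched in a few lines of Python. This is only an in-memory illustration, not the iRODS API or the data center's implementation: the `DataObject` class, the attribute name `visibility`, its values `public` and `hidden`, and the resource names and physical paths are all assumptions made for the example. It shows a listing that returns logical paths only, filtered by a metadata attribute, while the files themselves sit on different storage resources.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    logical_path: str    # path the client sees in the listing
    resource: str        # hypothetical name of the physical storage resource
    physical_path: str   # hypothetical location on that resource
    metadata: dict = field(default_factory=dict)


def list_virtual_collection(catalog, attribute="visibility", value="public"):
    """Return the logical paths whose metadata marks them as visible."""
    return sorted(
        obj.logical_path
        for obj in catalog
        if obj.metadata.get(attribute) == value
    )


# Hypothetical catalog: one logical namespace, two physical resources.
catalog = [
    DataObject("/CER_100100.2012053100", "disk_a", "/vault1/a/0001", {"visibility": "public"}),
    DataObject("/CER_100100.2012053100.met", "disk_a", "/vault1/a/0002", {"visibility": "public"}),
    DataObject("/CER_100100.2012053101", "tape_b", "/archive/b/0417", {"visibility": "public"}),
    DataObject("/CER_100100.2012053102", "tape_b", "/archive/b/0418", {"visibility": "hidden"}),
]

visible = list_virtual_collection(catalog)
for path in visible:
    print(path)
```

The client never sees `resource` or `physical_path`; changing where a file is stored would not change the listing.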

Single Interface to Multiple Clients

 

WebDAV, FUSE, Web UI, Cyberduck

REST, Python, R, Java, C++

(And more!)

User Profile: Wellcome Trust Sanger Institute

• Key genomics research centre

• 7 PB of storage managed by iRODS

Rich Metadata

 

attribute: library

attribute: total_reads

attribute: type

attribute: lane

attribute: is_paired_read

attribute: study_accession_number

attribute: library_id

attribute: sample_accession_number

attribute: sample_public_name

attribute: manual_qc

attribute: tag

attribute: sample_common_name

attribute: md5
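As a rough illustration of how attributes like these support discovery and integrity checking, the sketch below models metadata as plain Python dictionaries. It does not use the iRODS API; the object paths, the attribute values, and the helper names `find_objects` and `checksum_matches` are hypothetical. Only the attribute names (`type`, `lane`, `manual_qc`, `md5`) come from the list above.

```python
import hashlib


def find_objects(catalog, **criteria):
    """Return paths of objects whose metadata matches every attribute=value pair."""
    return [
        path
        for path, avus in catalog.items()
        if all(avus.get(attr) == value for attr, value in criteria.items())
    ]


def checksum_matches(data: bytes, avus: dict) -> bool:
    """Compare the MD5 of the data with the value stored in the md5 attribute."""
    return hashlib.md5(data).hexdigest() == avus.get("md5")


# Hypothetical catalog: path -> attribute/value metadata.
catalog = {
    "/seq/run1/lane1.bam": {"type": "bam", "lane": "1", "manual_qc": "1"},
    "/seq/run1/lane2.bam": {"type": "bam", "lane": "2", "manual_qc": "0"},
    "/seq/run1/lane1.cram": {"type": "cram", "lane": "1", "manual_qc": "1"},
}

# Discovery: all BAM files that passed manual QC.
passed_bams = find_objects(catalog, type="bam", manual_qc="1")
print(passed_bams)

# Integrity: record a checksum at ingest, verify it later.
payload = b"ACGT"
stored = {"md5": hashlib.md5(payload).hexdigest()}
print(checksum_matches(payload, stored))
```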

 

Replication and Federation

User Profile: University College London

• Repository for research data that spans social science, physics, and genomics

• Retention requirement for UK-sponsored research: keep data until 10 years after the date of the last access request

• iRODS spans storage technologies and enables federated access from other centres
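The retention rule above is simple date arithmetic, sketched here in Python. This is an illustration only, not UCL's policy code or an iRODS rule; the function names and the handling of 29 February are assumptions.

```python
from datetime import date

RETENTION_YEARS = 10  # from the slide: last access request plus 10 years


def retention_deadline(last_access_request: date, years: int = RETENTION_YEARS) -> date:
    """Return the date until which the data must be kept."""
    try:
        return last_access_request.replace(year=last_access_request.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; assume 28 February.
        return last_access_request.replace(year=last_access_request.year + years, day=28)


def may_expire(last_access_request: date, today: date) -> bool:
    """True once the retention period has fully elapsed."""
    return today > retention_deadline(last_access_request)


print(retention_deadline(date(2016, 1, 20)))
print(may_expire(date(2016, 1, 20), date(2020, 6, 1)))
```

Each new access request moves the deadline forward, so the last access date has to be tracked per data set for the rule to be enforceable.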

User Profile: National Institute of Environmental Health Sciences

• Viral Vector Core creates designer viruses:

    request ⟶ transfection and amplification ⟶ sample delivery ⟶ reports

 

• Uses iRODS to combine, organize, and analyze sets of requests and instrument results

   • Produces packaged results in response to researcher requests

   • Quarterly cost reports for chargeback and trend analysis for quality control
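A quarterly chargeback report of the kind mentioned above amounts to grouping completed requests by quarter and requester. The sketch below shows that aggregation in plain Python; the record layout, lab names, and cost figures are invented for illustration and do not describe the institute's actual data or workflow.

```python
from collections import defaultdict
from datetime import date


def quarterly_costs(requests):
    """Sum request costs per (year, quarter, lab)."""
    totals = defaultdict(float)
    for completed_on, lab, cost in requests:
        quarter = (completed_on.month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        totals[(completed_on.year, quarter, lab)] += cost
    return dict(totals)


# Hypothetical request records: (completion date, requesting lab, cost).
requests = [
    (date(2015, 10, 5), "lab_a", 400.0),
    (date(2015, 12, 18), "lab_a", 250.0),
    (date(2015, 11, 2), "lab_b", 300.0),
    (date(2016, 1, 11), "lab_a", 125.0),
]

report = quarterly_costs(requests)
for (year, quarter, lab), total in sorted(report.items()):
    print(f"{year} Q{quarter} {lab}: {total:.2f}")
```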