Training Overview
Jason M. Coposky
@jason_coposky
Executive Director, iRODS Consortium
January 14-16, 2020
CINES
Montpellier, France
Our Membership
Consortium Members
Following Along
Today's Training
https://slides.com/jasoncoposky
Other Resources
https://slides.com/irods
https://docs.irods.org
https://irods.org/documentation
What is Data Management?
A Definition of Data Management
"The development, execution and supervision of plans, policies, programs and practices that control, protect, deliver and enhance the value of data and information assets."
Organizations need a future-proof solution for managing data and its surrounding infrastructure
What is a Policy?
A Definition of Policy
A set of ideas or a plan of what to do in particular situations that has been agreed to officially by a group of people...
So how do we do it?
The iRODS Data Management Stack
Core Competencies
Policy
Capabilities
Patterns
Starting at the bottom :: Core Competencies
The underlying iRODS technology, categorized into four areas
Data Virtualization
Combine various distributed storage technologies into a Unified Namespace
iRODS provides a logical view into the complex physical representation of your data, distributed geographically, and at scale.
Projection of the Physical into the Logical
Logical Path
Physical Path(s)
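The projection of physical paths into a single logical path can be sketched in a few lines of Python. This is an illustration only, not the iRODS catalog implementation: the `Catalog` and `Replica` classes, the resource names, and every path below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    resource: str       # hypothetical storage resource name
    physical_path: str  # where the bytes actually live


@dataclass
class Catalog:
    # Maps one logical path to every physical replica behind it.
    entries: dict = field(default_factory=dict)

    def register(self, logical_path, resource, physical_path):
        replica = Replica(resource, physical_path)
        self.entries.setdefault(logical_path, []).append(replica)

    def resolve(self, logical_path):
        # Users only ever name the logical path; the catalog finds the rest.
        return [r.physical_path for r in self.entries.get(logical_path, [])]


catalog = Catalog()
logical = "/tempZone/home/alice/run1.dat"
catalog.register(logical, "diskResc", "/var/lib/irods/Vault/home/alice/run1.dat")
catalog.register(logical, "archiveResc", "/archive/vault/home/alice/run1.dat")

for path in catalog.resolve(logical):
    print(path)
```

One logical name stays stable while replicas are added, moved, or retired underneath it.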
Data Discovery
Attach metadata to any first class entity within the iRODS Zone
iRODS provides both automatically generated and user-provided metadata, which make your data and infrastructure more discoverable, operational and valuable.
Metadata Everywhere
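A minimal sketch of attaching metadata to different kinds of entities and then discovering them by query. It assumes metadata is stored as attribute, value, unit triples; the `MetadataStore` class and the entity names are hypothetical and do not reflect the iRODS API.

```python
from collections import defaultdict


class MetadataStore:
    def __init__(self):
        # entity name -> set of (attribute, value, unit) triples
        self._triples = defaultdict(set)

    def add(self, entity, attribute, value, unit=""):
        self._triples[entity].add((attribute, value, unit))

    def find(self, attribute, value):
        # Return every entity carrying the given attribute/value pair.
        return sorted(
            entity
            for entity, triples in self._triples.items()
            if any(a == attribute and v == value for a, v, _ in triples)
        )


store = MetadataStore()
# The same call works for a data object, a collection, a resource, or a user.
store.add("/tempZone/home/alice/run1.dat", "project", "climate")
store.add("/tempZone/home/alice", "project", "climate")
store.add("archiveResc", "tier", "cold")
store.add("alice", "department", "physics")

matches = store.find("project", "climate")
print(matches)
```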
Workflow Automation
An integrated scripting language that can be triggered by any operation within the framework
The iRODS rule engine provides the ability to capture real world policy as computer actionable rules which may allow, deny, or add context to operations within the system.
Dynamic Policy Enforcement
The iRODS rule may:
Dynamic Policy Enforcement
A single API call expands to many plugin operations, all of which may invoke policy enforcement
Plugin Interfaces:
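The expansion of one API call into several plugin operations, each wrapped in policy enforcement, can be sketched as follows. This is a simplified model rather than the iRODS rule engine: `RuleEngine`, `plugin_operation`, `api_put`, and the two example rules are hypothetical, and the operation names are chosen only to illustrate a `pep_<operation>_pre` / `pep_<operation>_post` naming pattern.

```python
class PolicyDenied(Exception):
    """Raised by a rule to deny the operation in flight."""


class RuleEngine:
    def __init__(self):
        self.rules = {}

    def rule(self, name):
        # Decorator that registers a function under a policy enforcement point.
        def register(fn):
            self.rules[name] = fn
            return fn
        return register

    def fire(self, name, context):
        fn = self.rules.get(name)
        if fn is not None:
            fn(context)


def plugin_operation(engine, name, context, action):
    engine.fire(f"pep_{name}_pre", context)   # a rule may deny here
    action(context)
    engine.fire(f"pep_{name}_post", context)  # a rule may add context here


def api_put(engine, logical_path):
    # One API call fans out into several plugin operations.
    context = {"logical_path": logical_path, "trace": [], "notes": []}
    for op in ("resource_create", "resource_write", "resource_close"):
        plugin_operation(engine, op, context,
                         lambda ctx, op=op: ctx["trace"].append(op))
    return context


engine = RuleEngine()


@engine.rule("pep_resource_create_pre")
def deny_temp_files(context):
    if context["logical_path"].endswith(".tmp"):
        raise PolicyDenied("temporary files are not accepted")


@engine.rule("pep_resource_close_post")
def add_context(context):
    context["notes"].append("ingested")


ok = api_put(engine, "/tempZone/home/alice/run1.dat")
print(ok["trace"], ok["notes"])
```

A put of a path ending in `.tmp` raises `PolicyDenied` before any plugin operation runs, which is the "deny" case; the post rule shows the "add context" case.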
Secure Collaboration
iRODS allows for collaboration across administrative boundaries after deployment
iRODS provides the ability to federate namespaces across organizations without pre-coordinated funding or effort.
iRODS as a Service Interface
Federation - Shared Data and Services
Ingest to Institutional Repository
As data matures and reaches a broader community, data management policy must also evolve to meet these additional requirements.
iRODS Policies
The reflection of real world data management decisions in computer actionable code.
(a plan of what to do in particular situations)
Possible Policies
Policy Composition
Consider policy as building blocks towards capabilities
Follow proven software engineering principles:
Favor composition over monolithic implementations
Rules and Dynamic Policy Enforcement Points can be overloaded and fall through
Implement or configure several rule bases or rule engine plugins to achieve complex use cases
Policy Composition across rule bases
For example: pep_data_obj_put_post(...)
Rather than one monolithic implementation, separate the implementations into individual rule bases, or plugins, and allow the rule(s) to fall through
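Fall-through across rule bases can be sketched as a chain in which each rule base implements the same policy enforcement point and signals whether the next one should also run. This is an assumption-laden model, not the iRODS implementation: the `CONTINUE` sentinel, `run_pep`, and the rule base names are all hypothetical.

```python
CONTINUE = object()  # sentinel: "my part is done, let the next rule base run"


def checksum_rule(ctx):
    ctx["actions"].append("checksum")
    return CONTINUE


def replicate_rule(ctx):
    ctx["actions"].append("replicate")
    return CONTINUE


def notify_rule(ctx):
    ctx["actions"].append("notify")
    return None  # no fall-through: the chain stops here


def legacy_rule(ctx):
    ctx["actions"].append("legacy")
    return CONTINUE


# Ordered rule bases, each with its own small implementation of the same PEP.
RULE_BASES = [
    ("checksum_base", {"pep_data_obj_put_post": checksum_rule}),
    ("replication_base", {"pep_data_obj_put_post": replicate_rule}),
    ("notification_base", {"pep_data_obj_put_post": notify_rule}),
    ("legacy_base", {"pep_data_obj_put_post": legacy_rule}),
]


def run_pep(rule_bases, pep_name, ctx):
    handled_by = []
    for base_name, rules in rule_bases:
        rule = rules.get(pep_name)
        if rule is None:
            continue  # this rule base does not implement the PEP
        handled_by.append(base_name)
        if rule(ctx) is not CONTINUE:
            break
    return handled_by


context = {"actions": []}
handled = run_pep(RULE_BASES, "pep_data_obj_put_post", context)
print(handled, context["actions"])
```

Each rule base stays small and independently replaceable; the order of the list decides which policy runs first.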
Policy Composition and Capabilities
For example - Storage Tiering
The storage tiering capability is implemented as a composite that delegates each requirement to a separate policy.
Policy Composition and Capabilities
Policies composed into a Capability framework delegate by naming convention:
Each policy may be overridden by another rule engine or rule base to accommodate future use cases or technologies
Each policy may now be reused and combined into new Capabilities
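A sketch of a capability that delegates to policies by name, with a later layer overriding an earlier one. The policy names (`tiering_find_violating_objects`, `tiering_move_object`, `tiering_verify_object`), the `PolicyRegistry` class, and the object records are hypothetical and are not the names used by the iRODS storage tiering capability.

```python
class PolicyRegistry:
    def __init__(self):
        self._layers = []  # later layers override earlier ones

    def add_layer(self, policies):
        self._layers.append(dict(policies))

    def lookup(self, name):
        for layer in reversed(self._layers):
            if name in layer:
                return layer[name]
        raise KeyError(name)


def find_by_age(objects, now, max_age):
    return [o for o in objects if now - o["last_access"] > max_age]


def move_to_archive(obj):
    obj["tier"] = "archive"


def verify_in_archive(obj):
    return obj["tier"] == "archive"


def storage_tiering(registry, objects, now, max_age):
    # The capability knows only the policy names, not their implementations.
    find = registry.lookup("tiering_find_violating_objects")
    move = registry.lookup("tiering_move_object")
    verify = registry.lookup("tiering_verify_object")
    moved = []
    for obj in find(objects, now, max_age):
        move(obj)
        if verify(obj):
            moved.append(obj["path"])
    return moved


def make_objects():
    return [
        {"path": "/tempZone/home/alice/old.dat", "last_access": 100, "tier": "fast"},
        {"path": "/tempZone/home/alice/new.dat", "last_access": 950, "tier": "fast"},
    ]


registry = PolicyRegistry()
registry.add_layer({
    "tiering_find_violating_objects": find_by_age,
    "tiering_move_object": move_to_archive,
    "tiering_verify_object": verify_in_archive,
})
default_moved = storage_tiering(registry, make_objects(), now=1000, max_age=500)

# Override a single policy; the capability and the other policies are untouched.
registry.add_layer({"tiering_find_violating_objects": lambda objs, now, max_age: list(objs)})
override_moved = storage_tiering(registry, make_objects(), now=1000, max_age=500)
print(default_moved, override_moved)
```

Because the capability binds to names only, the same move and verify policies could be reused by a different capability.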
iRODS Capabilities
Deployment Patterns
Data to Compute
Compute to Data
Data Transfer Nodes
Filesystem Synchronization
The Data Management Model
Questions?