Overview of the Trusted Workflow Run Crate profile

Stian Soiland-Reyes,
The University of Manchester

Approaching 5-safe with RO-Crate

RO-Crate

FAIR packaging of data, metadata and methods

https://www.researchobject.org/ro-crate/

https://w3id.org/ro/crate/1.1

https://www.researchobject.org/workflow-run-crate/

Trusted Workflow Run Crate

https://trefx.uk/trusted-wfrun-crate/

Safe Data

How to identify sensitive source data within the TRE?
TRE-specific identifiers not necessarily globally unique
Is each workflow TRE-customized for accessing a particular data source (e.g. database)?
Inputs may be implicit; workflow not portable across TREs
Do the users know in advance what file paths the source data will have within the TRE?
Identifiers may be mapped before execution
May need to inject database connection credentials
What restrictions apply before permitting user-provided input data and parameters?
TREs may review workflow and how it is being called, e.g. inspect "query" input
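
The point above that identifiers may be mapped before execution can be sketched as follows. This is a minimal illustration, not the profile's mechanism: the `LOCAL_PATHS` table, the `resolve_inputs` helper, the example identifier, and the file path are all hypothetical names introduced here.

```python
from typing import Dict

# Hypothetical table kept inside the TRE: global identifier -> TRE-local path.
LOCAL_PATHS: Dict[str, str] = {
    "https://example.org/dataset/cohort-2023": "/tre/data/project42/cohort.csv",
}


def resolve_inputs(inputs: Dict[str, str], local_paths: Dict[str, str]) -> Dict[str, str]:
    """Replace each input identifier with its TRE-local path.

    Raises KeyError for identifiers the TRE does not recognise, so an
    unmapped input is rejected before the workflow starts.
    """
    resolved = {}
    for name, identifier in inputs.items():
        if identifier not in local_paths:
            raise KeyError(f"No local mapping for input {name!r}: {identifier}")
        resolved[name] = local_paths[identifier]
    return resolved
```

With such a mapping the user does not need to know file paths in advance; the workflow request names a dataset and the TRE decides where it lives.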

Questions on Safe Data

Safe People


Is it unreasonable to expect a global ORCID identifier?
How can TREs link a global identifier to their local user identifiers?
Should the crate also include the local user identifier?
Are there GDPR concerns with disclosing the researcher names?
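
One way a TRE might link a global identifier to a local user is sketched below, assuming the submitter is identified by an ORCID URI. The check digit follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers; the `LOCAL_USERS` table and the `local_user_for` helper are hypothetical, and the ORCID shown is the well-known sample identifier, not a real researcher.

```python
import re
from typing import Dict, Optional

ORCID_PATTERN = re.compile(r"^https://orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(uri: str) -> bool:
    """True if the URI has the ORCID shape and a matching check character."""
    match = ORCID_PATTERN.match(uri)
    if not match:
        return False
    digits = match.group(1).replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


# Hypothetical TRE-side table: ORCID URI -> local user identifier.
LOCAL_USERS: Dict[str, str] = {
    "https://orcid.org/0000-0002-1825-0097": "tre-user-0042",
}


def local_user_for(uri: str, local_users: Dict[str, str]) -> Optional[str]:
    """Return the local user identifier, or None if invalid or unknown."""
    if not is_valid_orcid(uri):
        return None
    return local_users.get(uri)
```

Keeping the table inside the TRE means the crate itself only needs to carry the global identifier; whether it should also carry the local one remains the open question above.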

Questions on Safe People

Safe Projects

Is a TRE-specific identifier string sufficient to evaluate safe project?
TREs need to verify that the submitter is actually a member of said project
Should the Agreement Policy be injected/explicit?
Made public or on Intranet?
How can we map projects across multiple TREs?
Who provides the grant information?
A grant is likely larger than a single TRE project; need consistent grant identifiers.

Questions on Safe Projects

Safe Settings

How do reviewers analyse the workflow?
Can a workflow run in more than one TRE?
Can a workflow execute outside a TRE (e.g. using synthetic data)?
What workflow systems need to be supported?
Can the workflow execute without needing user interactions?
Can the tools of the workflow run as command line tools from containers?
What TRE restrictions may prevent workflow executions?

Questions on Safe Settings

Safe Outputs


How do reviewers view/analyse whether the output data can be disclosed?
Can some outputs be made sensitive and propagate to another TRE?
What file formats are used for current data outputs?
What are the file sizes involved in output data? Many files or large files?

Questions on Safe Outputs

Review process

RFC 8493

https://trefx.uk/trusted-wfrun-crate/0.3/example-hutch/data/ro-crate-preview.html
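
RFC 8493 (BagIt) lists payload checksums in `manifest-<algorithm>.txt` files, one `checksum path` pair per line. A reviewer-side integrity check might look like the sketch below. The `verify_bag_manifest` helper is a hypothetical name, the choice of SHA-512 as default is an assumption, and the sketch ignores the percent-encoding of special characters in paths that RFC 8493 also defines.

```python
import hashlib
from pathlib import Path
from typing import List


def verify_bag_manifest(bag_dir: Path, algorithm: str = "sha512") -> List[str]:
    """Return the payload paths that are missing or whose checksum differs."""
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Each line is "<hex checksum><whitespace><relative path>".
        expected, rel_path = line.split(maxsplit=1)
        target = bag_dir / rel_path
        if not target.is_file():
            failures.append(rel_path)
            continue
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty result means every listed payload file is present and unmodified; it does not detect extra files under `data/` that the manifest fails to mention.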

Questions on review process:

Which phases should be done manually?
Are there phases missing? Loops?
Which phases might become optional?
What UI is needed for the review process?
Where to store crates that are "in flight"?
The current prototype uses queues
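
As a discussion aid for the "in flight" question, the sketch below keeps one in-memory queue per review phase. It is not the prototype's implementation: the phase names in `PHASES` and the `InFlightCrates` class are hypothetical, and a real deployment would presumably need persistent storage.

```python
from collections import deque
from typing import Deque, Dict, List, Optional

# Hypothetical phase names, for illustration only.
PHASES = ["check", "review", "execute", "disclose"]


class InFlightCrates:
    """Track crate identifiers as they move through ordered review phases."""

    def __init__(self, phases: List[str]) -> None:
        self.phases = list(phases)
        self.queues: Dict[str, Deque[str]] = {p: deque() for p in self.phases}

    def submit(self, crate_id: str) -> None:
        """Place a new crate in the first phase's queue."""
        self.queues[self.phases[0]].append(crate_id)

    def advance(self, phase: str) -> Optional[str]:
        """Move the oldest crate in `phase` to the next phase.

        Returns the crate identifier, or None if the queue is empty.
        A crate leaving the last phase is no longer tracked.
        """
        queue = self.queues[phase]
        if not queue:
            return None
        crate_id = queue.popleft()
        index = self.phases.index(phase)
        if index + 1 < len(self.phases):
            self.queues[self.phases[index + 1]].append(crate_id)
        return crate_id

    def where(self, crate_id: str) -> Optional[str]:
        """Return the phase currently holding the crate, if any."""
        for phase, queue in self.queues.items():
            if crate_id in queue:
                return phase
        return None
```

A strictly linear list of phases cannot express loops or optional phases, which is one reason the questions above matter for the storage design.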