Dataverse Community Meeting
10 Years Sharing Data with Dataverse
#dataverse2017
Once there was the VDC
<2006: 2
And then came the Dataverse Network
2006: 2
2015: 14
Now we have the Dataverse
2006: 2
2015: 14
2017: 23
Researchers Are Sharing and Using Data
200 datasets/month
4,000 files/month
60,000 downloads/month
Harvard Dataverse
> 70,000 datasets
> 2.5 M downloads
> 340,000 files
< 2006
When we started, very few journals had data policies,
and funders had no data requirements
2006
2015
2017
weak = recommend
strong = require
Weak data sharing and strong data sharing vs. disciplines
Castro, Crosas, Garnett, Sheridan, Altman, 2017, Journal of Scholarly Publishing, Forthcoming
Now, journals across disciplines are starting to support data policies
Genetics Journals
Biomedical Journals
Computational Sciences
Economics
Open Access Journals
Ecology
33%
4%
2006
2015
2017
And funders require data sharing
PRIVATE RESEARCH FUNDERS
- Bill and Melinda Gates Foundation Information Sharing Approach
- Sloan Foundation Data Sharing Policy
- Wellcome Trust Data Sharing Policy
- Arnold Foundation
- Moore Foundation
- Robert Wood Johnson Foundation
- HHMI Policy on the Sharing of Publication-Related Materials, Data and Software
PUBLIC RESEARCH FUNDERS
- Department of Agriculture
- Department of Commerce
- Department of Defense
- Department of Education
- Department of Energy
- Department of Health and Human Services
  - Agency for Healthcare Research and Quality (AHRQ)
  - Assistant Secretary for Preparedness and Response (ASPR)
  - Centers for Disease Control and Prevention (CDC)
  - Food and Drug Administration (FDA)
  - National Institutes of Health (NIH)
- Department of Homeland Security
- Department of Housing and Urban Development
- Department of the Interior
- Department of Labor
- Department of Transportation
- Department of Veterans Affairs
- Environmental Protection Agency (EPA)
We are Experiencing a Cultural Change
We Are the Cultural Change!
King, 1995, Replication, Replication
Altman and King, 2007, A Proposed Standard for the Scholarly Citation of Quantitative Data
Altman et al, 2001, A Digital Library for the Dissemination and Replication of Quantitative Social Science Research
King, 2007, An Introduction to the Dataverse Network as an Infrastructure for Data Sharing
Crosas, Honaker, King, Sweeney, 2015, Automating Open Science for Big Data
Crosas, 2012, The Dataverse Network: an open source application for sharing, discovering, and preserving research data
Altman and Crosas, 2013, The Evolution to Data Citation: from principles to implementation
Crosas, 2013, A Data Sharing Story
2014, Joint Declaration of Data Citation Principles
Pepe et al, 2014, How Do Astronomers Share Data?
Goodman et al, 2014, Ten Simple Rules for the Care and Feeding of Scientific Data
Castro et al, 2015, Achieving Human and Machine Accessibility of Cited Data
Sweeney, Crosas, Bar-Sinai, 2015, Sharing Sensitive Data with Confidence: The DataTags System
Meyer et al, 2016, Data Publication with the Structural Biology Data Grid Supports Live Analysis
Wilkinson et al, 2016, The FAIR Guiding Principles for Scientific Data Management and Stewardship
Bierer, Crosas, Pierce, 2017, Data Authorship as an Incentive to Data Sharing
The Dataverse project and team are leading many aspects of data sharing
2017
Metrics from the Last Year, June 2016 to June 2017
An Active Team and Community
22 Community Calls
190 attendees
25 organizations/universities
10 countries
Community
975 Google Group messages
Community
7,114 IRC messages
Community
245 unique users
12 sprints (started in January 2017)
IQSS Dataverse Team
220 standup meetings
IQSS Dataverse Team
52,000 Slack Messages
IQSS Dataverse Team
43 GitHub contributors
Code
334 Pull Requests
Code
8,335 GitHub commits
Code
1,153 support tickets
Support
Dataverse Cup 2017
A Vision: Dataverse as a Key Part of the Full Research Data Lifecycle
Towards a Data-Centric Research Lifecycle
Data Collection
Lab
E-Notebooks
Instruments
Surveys
...
Assign DUA & metadata
Cloud Computing and Storage
Run data & code
Explore & Visualize data
Track Provenance
Journals & Funders
Data Citation
Work with Sensitive Data
From Data Collection to Computing and Sharing
Research Collaborations
Data Privacy
Big Data
Data Policies
Replication
...
Community
Standards and Best Practices
Institutional Requirements
Journal Requirements
Funder Requirements
Technology Advances
Dataverse Community Meeting 2017
By Mercè Crosas