Dataverse Community Meeting

10 Years Sharing Data with Dataverse

 

#dataverse2017

<2006

Once there was the VDC

2

2006

And then came the Dataverse Network

2015

14

2

2006

Now we have the Dataverse

2015

2017

2

14

23

Researchers Are Sharing and Using Data

200 datasets/month

4,000 files/month

60,000 downloads/month

Harvard Dataverse

> 70,000 datasets

> 2.5 M downloads


> 340,000 files

< 2006

When we started, there were very few journals with data policies and no data requirements from funders

2006

2015

2017

weak = recommend

strong = require

Weak data sharing and strong data sharing vs. disciplines

Castro, Crosas, Garnett, Sheridan, Altman, 2017, Journal of Scholarly Publishing, Forthcoming

Now, journals across disciplines are starting to support data policies

Genetics Journals

Biomedical Journals

Computational Sciences

Economics

Open Access Journals

Ecology

33%

4%

2006

2015

2017

And Funders require data sharing 

PRIVATE RESEARCH FUNDERS

  • Bill and Melinda Gates Foundation Information Sharing Approach
  • Sloan Foundation Data Sharing Policy
  • Wellcome Trust Data Sharing Policy
  • Arnold Foundation
  • Moore Foundation
  • Robert Wood Johnson Foundation
  • HHMI Policy on the Sharing of Publication-Related Materials, Data and Software

 

PUBLIC RESEARCH FUNDERS

  • Department of Agriculture
  • Department of Commerce
  • Department of Defense
  • Department of Education
  • Department of Energy
  • Department of Health and Human Services
    • Agency for Healthcare Research and Quality (AHRQ)
    • Assistant Secretary for Preparedness and Response (ASPR)
    • Centers for Disease Control and Prevention (CDC)
    • Food and Drug Administration (FDA)
    • National Institutes of Health (NIH)
  • Department of Homeland Security
  • Department of Housing and Urban Development
  • Department of the Interior
  • Department of Labor
  • Department of Transportation
  • Department of Veterans Affairs
  • Environmental Protection Agency (EPA)

We are Experiencing a Cultural Change

We Are the Cultural Change!

King, 1995, Replication, Replication

Altman and King, 2007, A Proposed Standard for the Scholarly Citation of Quantitative Data

Altman et al, 2001, A Digital Library for the Dissemination and Replication of Quantitative Social Science

King, 2007, An Introduction to the Dataverse Network as an Infrastructure for Data Sharing

Crosas, Honaker, King, Sweeney, 2015, Automating Open Science for Big Data

Crosas, 2012, The Dataverse Network: an open source application for sharing, discovering, and preserving research data

Altman and Crosas, 2013, The Evolution to Data Citation: from principles to implementation

Crosas, 2013, A Data Sharing Story

2014, Joint Declaration of Data Citation Principles

Pepe et al, 2014, How Do Astronomers Share Data?

Goodman et al, 2014, Ten Simple Rules for the Care and Feeding of Scientific Data

Castro et al, 2015, Achieving Human and Machine Accessibility of Cited Data

Sweeney, Crosas, Bar-Sinai, 2015, Sharing Sensitive Data with Confidence: The DataTags System

Meyer et al, 2016, Data Publication with the Structural Biology Data Grid Supports Live Analysis

Wilkinson et al, 2016, The FAIR Guiding Principles for Scientific Data Management and Stewardship

Bierer, Crosas, Pierce, 2017, Data Authorship as an Incentive to Data Sharing

The Dataverse project and team are leading many aspects of data sharing

2017

Metrics from the Last Year, June 2016 to June 2017

An Active Team and Community

 

22 Community Calls

 

190 Attendees
25 Organizations/Universities

10 countries

Community

975 Google Group Messages

Community

7,114 IRC Messages

Community

245 unique users

12 Sprints 

(started in January 2017)

IQSS Dataverse Team

220 Standup Meetings

IQSS Dataverse Team

52,000 Slack Messages

IQSS Dataverse Team

43 GitHub Contributors

Code

334 Pull Requests

Code

8,335 GitHub Commits

Code

1,153 support tickets

Support

Dataverse Cup 2017

A VISION:

Dataverse as a key part of the FULL Research Data Lifecycle 

Towards a Data-Centric Research Lifecycle

Data Collection

Lab

E-Notebooks

Instruments

Surveys

...

Assign DUA & metadata

 

Cloud Computing and Storage

 

Run data & code

Explore & Visualize data

Track Provenance

Journals & Funders

Data Citation

Work with Sensitive Data

From Data Collection to Computing and Sharing

Research Collaborations

Data Privacy 

Big Data

Data Policies

Replication

...

Community

Standards and Best Practices

Institutional Requirements

Journal Requirements

Funder Requirements

Technology Advances

Dataverse Community Meeting 2017

By Mercè Crosas
