
Welcome to the Dataverse Community Meeting
Dataverse Cup 2018

Dataverse,
PRESENT AND FUTURE
#dataverse2018
The Present

33 Dataverse installations since 2006


+10 new installations since last Community Meeting
Dataverse Google Groups Members

+ 128 members since last Community Meeting
https://groups.google.com/forum/#!forum/dataverse-community

Dataverse Google Group Topics
+ 265 topics since last Community Meeting

https://groups.google.com/forum/#!forum/dataverse-community

GitHub Dataverse Repo
65 contributors since 2013
+ 22 since last Community Meeting
632 pull requests since 2013
+ 298 since last Community Meeting
11,203 commits since 2013
+ 2,868 since last Community Meeting

Dataverse Twitter Followers
4,114 followers since 2012
+ 467 since last Community Meeting
@dataverseorg
#dataverse2018

DATAVERSE COMMUNITY CALLS
Since last Community Meeting:
21 Community Calls
180 participants
DATAVERSE IRC
Since last Community Meeting:
10,463 messages
358 unique users
Prior Year (June 2016-June 2017):
7,114 messages
245 unique users

IQSS DATAVERSE TEAM
Since last Community Meeting:
20 sprints
223 standup meetings
156,165 Slack messages
964 support tickets
The Future
A growing community needs to become self-organized and leverage economies of scale

What should we pay attention to?
Research data are becoming more complex: large-scale, streaming, sensitive
Local, national, and international data platforms are being built on the cloud
- NIH Data Commons, with AWS, Google Cloud, MS Azure:
NIH Data Commons pilot phase explores using the cloud to access and share FAIR biomedical data
- European Open Science Cloud, with open source clouds
The EOSC will allow for universal access to data and a new level playing field for EU researchers
- Massachusetts Open Cloud, built on OpenStack
It will serve as a marketplace for industry partners as well as a place for researchers and industry to innovate and expose innovation to real users.
Data citation, reuse, and replication are growing, but slowly
Snapshot of the current state of data citation
Garza, K., Fenner, M., DataCite Blog, June 2018

Out of the 22,000 links provided via Crossref DOIs, only 16% or 3,657 are links between literature and data.
But 40% increase in data citations (from 2,599 to 3,657) between March 2017 and March 2018.
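
The two percentages above can be rechecked from the counts quoted on the slide. The following is a minimal sketch, assuming the 22,000 total is a rounded figure; the function and constant names are hypothetical and exist only for this illustration.

```python
# Counts quoted on the slide from the DataCite blog snapshot (June 2018).
TOTAL_LINKS = 22_000       # links provided via Crossref DOIs (assumed to be rounded)
DATA_LINKS_2017 = 2_599    # literature-to-data links, March 2017
DATA_LINKS_2018 = 3_657    # literature-to-data links, March 2018


def share_pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def growth_pct(old: int, new: int) -> float:
    """Return the percentage increase from old to new."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    print(f"Share of links: {share_pct(DATA_LINKS_2018, TOTAL_LINKS):.1f}%")
    print(f"Growth 2017 to 2018: {growth_pct(DATA_LINKS_2017, DATA_LINKS_2018):.1f}%")
```

With these inputs the share comes to about 16.6% and the growth to about 40.7%, consistent with the "16%" and "40%" figures on the slide.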
Data Policies for Highly-Ranked Social Science Journals
Crosas M, Gautier J, Karcher S, Kirilova D, Otalora G, Schwartz A, SocArxiv, March 2018

[Chart legend: Does Not Have Data Policy, Has Data Policy]
More than half of the journals have a data policy (except in History)
Data Policies for Highly-Ranked Social Science Journals
Crosas M, Gautier J, Karcher S, Kirilova D, Otalora G, Schwartz A, SocArxiv, March 2018
[Chart legend: No Data Policy, Encourage Data Sharing, Require Data Sharing]
Economics, Political Science, and Psychology have a higher number of journals requiring data sharing
Personal data need to be protected

- Right of access, of rectification, to be forgotten, etc.
- Informed consent as the basis for use of personal data

- Facebook will provide privacy-preserving data and access (through Dataverse)
- Seven nonprofit foundations will fund the research
- An eighth will oversee the peer review process
What does this all mean?
Dataverse must be ready to:
- Provide more options for data deposit, storage, and access to support large, streaming, and sensitive data
- Integrate with data enclaves, cloud storage and computing, and local and global research clouds
- Be compliant with new data regulations
- Build incentives to integrate with journals and connect data to literature, via curation, exploration, and replication tools
- Ensure compliance with data citation recommendations to "make data count"
Thank You
Dataverse Community Meeting 2018
By Mercè Crosas
Introduction to the 2018 Dataverse Community Meeting. https://projects.iq.harvard.edu/dcm2018/agenda