Copyright versus open science: a story of data integration

November 15, 2015

Brussels, Belgium

dhimmel on:

#opencon

—Daniel Himmelstein

restrictions on data

1. copyright

automatically granted to "original works of authorship", giving the exclusive right to:

  • copy

  • distribute

  • create derivatives

limited by:

  • fair use

  • originality (excludes facts)

2. contract

agreement entered into to receive access to a resource

  • can impose restrictions beyond copyright

Chia-Jung Tsay

Samuel Mehr

Copyright prevents reproducible science

Sources: WaPo & Guardian

2013: Finds people use sight over sound for scoring music competitions

Study based on 6-second clips from 10 YouTube videos

Interested researchers cannot replicate her findings using different clips

3 of the original clips are no longer online

Time: 18 months

Result: Tsay claims she cannot provide the 3 removed videos due to copyright law 

Network for drug repurposing

  • 50k nodes
    10 types
  • 3M edges
    27 types
  • 28 public resources
  • thinklab.com
    DOI: 10.15363/thinklab.4
  • open for reuse & reproducibility
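
A network with typed nodes and typed edges can be sketched as below. This is only an illustration, not the project's actual code: the class names, the `resource` field, and the toy data are all hypothetical. Recording the source resource on every edge is one possible design that would let a resource be dropped later if its license turns out to forbid reuse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str        # node type, e.g. "compound" (hypothetical)
    identifier: str  # hypothetical identifier


@dataclass(frozen=True)
class Edge:
    source: Node
    target: Node
    kind: str      # edge type, e.g. "treats" (hypothetical)
    resource: str  # resource the edge was integrated from


def summarize(edges):
    """Count nodes, edges, their types, and the contributing resources."""
    nodes = {n for e in edges for n in (e.source, e.target)}
    return {
        "nodes": len(nodes),
        "node_types": len({n.kind for n in nodes}),
        "edges": len(edges),
        "edge_types": len({e.kind for e in edges}),
        "resources": len({e.resource for e in edges}),
    }


def without_resource(edges, resource):
    """Return the edges that do not come from the given resource."""
    return [e for e in edges if e.resource != resource]


# Toy data: every name below is made up for illustration.
compound = Node("compound", "compound-1")
disease = Node("disease", "disease-1")
gene = Node("gene", "gene-1")
edges = [
    Edge(compound, disease, "treats", "resource-A"),
    Edge(compound, gene, "binds", "resource-B"),
    Edge(disease, gene, "associates", "resource-B"),
]

summary = summarize(edges)
remaining = without_resource(edges, "resource-B")
print(summary)
print(len(remaining))
```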

complications

1. ∅ license

  • 9 resources
  • all rights reserved
  • upon contact:
    • 1 permission
    • 0 licenses added

2. unclear

  • 4 resources
  • clarification after laborious and slow permission requests

3. ∅ distribute

  • MSigDB: publicly funded project from the Broad Institute
  • publication data supplements

4. standard

  • 11 resources
  • incompatibilities

5. government

  • 4 resources
  • public domain
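
The resource counts on this slide can be tallied as a quick check. The pairing of counts with categories below is an assumption (9 with no license, 4 unclear, 11 with a standard license, 4 government). The "∅ distribute" category is left out because the slide gives examples for it rather than a count.

```python
# Assumed pairing of counts with licensing categories (see note above).
resource_counts = {
    "no license": 9,
    "unclear": 4,
    "standard license": 11,
    "government": 4,
}

total = sum(resource_counts.values())

# Categories where the slide mentions contacting the provider.
needed_contact = resource_counts["no license"] + resource_counts["unclear"]

print(f"total resources: {total}")
print(f"required contacting the provider: {needed_contact} of {total}")
```

Under this pairing the total is 28, the same as the number of public resources given for the network.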

Resolution after months & 5000+ word discussion: mixed approach

Recommendation:

release data as CC0

(public domain)