The REUSE Initiative

Peter Moser, Andrea Janes

 

Developer's Thursday

12.12.2019 @ NOI Techpark Südtirol 

Introduction to the REUSE initiative

Reusing existing software...

  • ...is productive: one does not need to develop everything from scratch
    • If it is Free Software you can even:
      • evaluate the quality of the reused component
      • judge how well it is maintained by evaluating the community behind it and how often it is updated
  • ...can become a nightmare because of the license!

Risk of license conflicts

  • It is not easy for a developer to understand the legal consequences of reusing an existing Free Software component
  • Possible reasons:
    • Many licenses exist, e.g., the Software Package Data Exchange (SPDX) considers 348 licenses with 32 possible exceptions.
    • Domain-specific language: the rights and obligations of each type of license are difficult to understand.
    • Complexity: it is complex to understand all licenses in use in a project and how they can be combined.
    • Uncertainty: the consequences of violating a license are unclear, e.g., whether a component has to be removed.

The REUSE Initiative

  • Initiated by the Free Software Foundation Europe
  • Aims to provide clear guidelines on how to publish Free Software so that license and copyright are clearly defined and machine readable.
  • The desired effects:
    • Make it easier to understand the license
    • Allow the creation of tools that find reusable components with compatible licenses
  • First release October 11/2017, second release 12/2017, third release 8/2019

The REUSE Initiative

  • The initiative defines 3 main practices:
    1. Choose and provide licenses
    2. Add copyright and licensing information to each file
    3. Confirm REUSE compliance
  • It also defines how these practices have to be implemented (pay attention to the next 3 slides)

Choose and provide licenses

  • Create a LICENSES directory in your project root
  • Download all used licenses and add them to the LICENSES directory:
    • Find the SPDX License Identifier of your license in the SPDX License List (e.g., "GPL-3.0-or-later")
    • Download the license from https://github.com/spdx/license-list-data/tree/master/text
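
The two steps above can be sketched in Python. This is only an illustration: the raw-file URL pattern (including the "master" branch, taken from the link above) and all function names are assumptions, not part of the REUSE material. The file name follows the convention LICENSES/<SPDX identifier>.txt.

```python
from pathlib import Path
from urllib.request import urlopen

# Assumption: raw-file URL pattern for the repository linked above.
RAW_BASE = "https://raw.githubusercontent.com/spdx/license-list-data/master/text"


def license_url(spdx_id: str) -> str:
    """Build the (assumed) download URL for one SPDX license text."""
    return f"{RAW_BASE}/{spdx_id}.txt"


def license_path(project_root: Path, spdx_id: str) -> Path:
    """Location of the license text in the project: LICENSES/<id>.txt."""
    return project_root / "LICENSES" / f"{spdx_id}.txt"


def store_license(project_root: Path, spdx_id: str, text: str) -> Path:
    """Write an already fetched license text into the LICENSES directory."""
    target = license_path(project_root, spdx_id)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


def download_license(project_root: Path, spdx_id: str) -> Path:
    """Fetch the license text over the network and store it."""
    with urlopen(license_url(spdx_id)) as response:
        text = response.read().decode("utf-8")
    return store_license(project_root, spdx_id, text)
```

Calling download_license(Path("."), "GPL-3.0-or-later") from the project root would then create LICENSES/GPL-3.0-or-later.txt, provided the assumed URL resolves.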

Add copyright and licensing information to each file

  • Add a header to each file containing these tags:
    • SPDX-FileCopyrightText, to record the publication year and copyright holder of the contents of the file.
    • SPDX-License-Identifier, to indicate an SPDX License Expression
  • Or: add the header in a separate file with the same name plus a “.license” extension (used for binary files).
  • Suggestions
    • Add build artifacts to .gitignore
    • License insignificant files with the SPDX Identifier CC0-1.0 
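
What such a header and a ".license" companion file could look like is sketched below. The copyright holder "Jane Doe", the year, and the helper names are placeholders chosen for illustration; the comment character depends on the file type.

```python
from pathlib import Path


def spdx_header(year: int, holder: str, license_expr: str, comment: str = "#") -> str:
    """Return the two REUSE tags as comment lines for a source file."""
    return (
        f"{comment} SPDX-FileCopyrightText: {year} {holder}\n"
        f"{comment} SPDX-License-Identifier: {license_expr}\n"
    )


def write_sidecar(binary_file: Path, year: int, holder: str, license_expr: str) -> Path:
    """Create '<name>.license' next to a file that cannot carry a header."""
    sidecar = binary_file.with_name(binary_file.name + ".license")
    sidecar.write_text(
        f"SPDX-FileCopyrightText: {year} {holder}\n"
        f"SPDX-License-Identifier: {license_expr}\n",
        encoding="utf-8",
    )
    return sidecar


# Placeholder values; prints the two header lines for a Python source file.
print(spdx_header(2019, "Jane Doe <jane@example.com>", "GPL-3.0-or-later"), end="")
```

For a binary such as logo.png, write_sidecar creates logo.png.license next to it, containing the same two tags without comment markers.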

Confirm REUSE compliance

  • Check REUSE compliance with a linter tool
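
To give an idea of what such a check involves, here is a heavily simplified sketch. It is not the linter referred to above: the function names are hypothetical, it only looks for the two tags in each file or in its ".license" companion, and it assumes that the .git and LICENSES directories need no headers.

```python
from pathlib import Path

REQUIRED_TAGS = ("SPDX-FileCopyrightText:", "SPDX-License-Identifier:")
SKIPPED_DIRS = {".git", "LICENSES"}  # assumption: these need no headers


def has_tags(path: Path) -> bool:
    """True if both tags occur in the file (undecodable bytes are ignored)."""
    text = path.read_text(encoding="utf-8", errors="ignore")
    return all(tag in text for tag in REQUIRED_TAGS)


def non_compliant_files(project_root: Path) -> list[Path]:
    """List files that carry the tags neither in a header nor in a sidecar."""
    missing = []
    for path in sorted(project_root.rglob("*")):
        relative = path.relative_to(project_root)
        if not path.is_file() or SKIPPED_DIRS & set(relative.parts):
            continue
        if path.name.endswith(".license"):
            continue  # sidecars describe other files
        sidecar = path.with_name(path.name + ".license")
        if has_tags(path) or (sidecar.is_file() and has_tags(sidecar)):
            continue
        missing.append(relative)
    return missing
```

An empty result from non_compliant_files(Path(".")) means every file passed this reduced check; a real linter verifies more, for example that every license in use is present in the LICENSES directory.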

The study

  • If we look at GitHub repositories, how well do they respect the REUSE initiative rules?

How we conducted it

  • GitHub hosts 85 million repositories
  • If we wrote a tool that scans 1 repo/second, scanning all of them would take us 2.7 years (SFSCon 2022)
  • We sampled 1000 repositories from all repositories that had a release in the month before the study (from July to August 2017)
    • 416,776 repos
    • We used random.org to generate true random numbers. The randomness comes from atmospheric noise.
    • 1000 repos with 282,232 files
    • 70 repos did not exist anymore (deleted between sampling and actual download)
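
The numbers above can be reproduced with a few lines of Python. Note the assumption: the study drew its random numbers from random.org, whereas this sketch uses Python's seeded pseudo-random generator as a stand-in, so it illustrates the procedure but not the actual sample. The function names and the seed are made up for the example.

```python
import random

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def scan_time_years(repo_count: int, repos_per_second: float = 1.0) -> float:
    """Time needed to scan every repository at a fixed rate, in years."""
    return repo_count / repos_per_second / SECONDS_PER_YEAR


def draw_sample(population_size: int, sample_size: int, seed: int) -> list[int]:
    """Draw distinct repository indices (stand-in for random.org numbers)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(population_size), sample_size))


print(round(scan_time_years(85_000_000), 1))  # 2.7

sample = draw_sample(416_776, 1000, seed=2017)
print(len(sample))  # 1000
print(len(sample) - 70)  # 930 repositories remain after 70 deletions
```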

Architecture

Reuse checker in action

Results (1/2)

  • Of 930 repositories:
    • a single license file exists: 571 (61.4%)
    • the SPDX license file exists: 0 (0%)
    • the license folder exists: 2 (0.2%)
    • the debian license folder exists: 2 (0.2%)
    • all used licenses are present: 568 (61.1%)
    • the readme file exists: 822 (88.4%)
    • the authors file exists: 20 (2.2%)

Results (2/2)

  • Of 282,232 files:
    • with copyright information: 89,328 (31.7%)
    • with license found in file: 264,790 (93.8%)
    • with license found in .license: 1 (0.0%)
    • with license found in debian format: 642 (0.2%)
    • with license found in .spdx: 0 (0%)
    • where an SPDX license expression exists: 14,661 (5.2%)
    • with a valid SPDX license expression: 14,019 (5.0%)
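
The difference between the last two rows (an expression exists vs. the expression is valid) can be illustrated with a simplified check. The slides do not describe how the study's checker validated expressions, so everything below is an assumption: the license and exception sets are a tiny hand-picked subset of the SPDX lists, parentheses are ignored rather than matched, and the tag is assumed to be the last thing on its line.

```python
import re
from typing import Optional

# Assumption: tiny subset of the SPDX lists, for illustration only.
KNOWN_LICENSES = {"MIT", "Apache-2.0", "GPL-3.0-or-later", "CC0-1.0"}
KNOWN_EXCEPTIONS = {"Classpath-exception-2.0"}
OPERATORS = {"AND", "OR", "WITH"}

TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def find_expression(text: str) -> Optional[str]:
    """Return the expression after the first tag, or None if absent."""
    match = TAG.search(text)
    return match.group(1).strip() if match else None


def is_valid(expression: str) -> bool:
    """Simplified check: operands and operators must alternate."""
    tokens = expression.replace("(", " ").replace(")", " ").split()
    if len(tokens) % 2 == 0:
        return False  # also rejects an empty expression
    for index, token in enumerate(tokens):
        if index % 2 == 1:
            if token not in OPERATORS:
                return False
        elif index > 0 and tokens[index - 1] == "WITH":
            if token not in KNOWN_EXCEPTIONS:
                return False
        elif token not in KNOWN_LICENSES:
            return False
    return True
```

With this sketch, a file containing "SPDX-License-Identifier: GPL" would count as having an expression but not a valid one, because "GPL" is not an identifier in the set.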

Conclusion (2017)

  • The REUSE initiative goes in the right direction
  • Unfortunately (in our opinion) too many ways to be compliant
  • Have a look at the "Flight Rules" of IDM-Südtirol at https://github.com/idm-suedtirol/reuse, which contain simple recipes for becoming REUSE compliant. They might be easier to digest.
  • The full specification is at https://reuse.software/

Conclusion (2019)

  • The REUSE initiative goes even further in the right direction
  • In 2017 there were (in our opinion) too many ways to be compliant; now it's easy to be compliant
  • Have a look at the "Flight Rules" of IDM-Südtirol at https://github.com/idm-suedtirol/reuse, which contain simple recipes for becoming REUSE compliant. They might be easier to digest. (No more inventory :))
  • The full specification is at https://reuse.software/
  • I think we will do another study... ;)

The REUSE Initiative

By Andrea Janes

Presentation of the REUSE initiative during the Software Developers' Thursday