Software in research: risks and reproducibility

By Simon Hettrick

 12 December 2019 - ReproducibiliTea, University of Southampton   @sjh5000  ORCID: 0000-0002-6809-5195

Software?

These slides: http://bit.ly/35nTYzX

Research software is any software that is used to generate, process or analyse results that you intend to appear in a publication.

 

It can be anything from a few lines of code written by yourself, to a professionally developed software package.

What about Excel?

Is software important to research?

Part 1

Use software: 92%

Fundamental to results: 69%

Develop own code: 56%

Percentage of cross-council funding in software-reliant research

Software in publications

65%

31 UK institutions ~ 600k papers

Is software important to research?

Part 2

[1]

Growth in a Time of Debt

"All I can hope is that future historians note that one of the core empirical points providing the intellectual foundation for the global move to austerity in the early 2010s was based on someone accidentally not updating a row formula in Excel”

Mike Konczal, Roosevelt Institute

Further info: [2, 3]
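
A minimal sketch of how this kind of spreadsheet slip changes a result. The growth figures and country labels below are hypothetical and are not the paper's data; the function names are made up for illustration. The point is only that an average taken over a range that was never extended silently ignores rows, and that a simple completeness check catches it.

```python
from statistics import mean

# Hypothetical growth figures (percent), for illustration only.
growth_by_country = {"A": 2.5, "B": -0.3, "C": 1.9, "D": 3.1, "E": 2.2}


def average_growth(data, countries=None):
    """Average growth over the named countries, or over all rows if none are named."""
    selected = data if countries is None else {c: data[c] for c in countries}
    return mean(list(selected.values()))


def checked_average(data, countries):
    """Same average, but refuse to run if any available row is left out."""
    missing = set(data) - set(countries)
    if missing:
        raise ValueError(f"rows left out of the average: {sorted(missing)}")
    return average_growth(data, countries)


# A range that stops at row C drops D and E without any warning.
truncated = average_growth(growth_by_country, ["A", "B", "C"])
full = average_growth(growth_by_country)

print(f"truncated range: {truncated:.2f}%  full range: {full:.2f}%")
```

Calling `checked_average(growth_by_country, ["A", "B", "C"])` raises a `ValueError` that names the omitted rows, which is the kind of guard a spreadsheet formula does not give you by default.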

[4]

Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent

the geographic extent of the genetic impact of this migration was overestimated... the Yoruba and Mbuti do not show higher levels of Western Eurasian ancestry

[5]

The error is in the code that converts a 64-bit floating-point number to a 16-bit signed integer... the 64-bit numbers [are] larger... than [the previous code], triggering an overflow condition...

 

...[the] backup computer crashes, followed 0.05 seconds later by a crash of the primary computer.

[6]
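
A minimal sketch of the range problem described in the quote above. It is not the flight software: the function names are hypothetical, and how a given language reacts to an out-of-range conversion (wrapping, saturating, or raising an error) varies. A 16-bit signed integer can only hold values from -32768 to 32767, so any larger floating-point value has nowhere to go.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def wrap_to_int16(value: float) -> int:
    """Unchecked conversion: truncate, keep the low 16 bits, reinterpret as signed."""
    low_bits = int(value) & 0xFFFF
    return low_bits - 0x10000 if low_bits > INT16_MAX else low_bits


def checked_to_int16(value: float) -> int:
    """Checked conversion: refuse values that do not fit in 16 signed bits."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return truncated


print(wrap_to_int16(123.9))    # fits: the fractional part is dropped
print(wrap_to_int16(40000.0))  # does not fit: the result is a meaningless negative number
```

With these definitions, `checked_to_int16(40000.0)` raises `OverflowError` instead of returning a wrong value, so the caller has to decide explicitly what should happen when the input is larger than expected.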

What does the future hold?

"Not at all"

"Vital"

Reproducibility

The ability to reproduce an experiment and/or analysis and generate the same result as previously found
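
One way to make that definition checkable, sketched under assumptions: the analysis below is a stand-in (a hypothetical `run_analysis` that summarises random numbers), and the seed and sample size are arbitrary. Fixing the random seed and fingerprinting the output lets a second run be compared with the first automatically.

```python
import hashlib
import json
import random


def run_analysis(seed: int) -> dict:
    """Stand-in analysis: summarise a random sample drawn with a fixed seed."""
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    return {"n": len(sample), "mean": sum(sample) / len(sample)}


def fingerprint(result: dict) -> str:
    """Hash a result so two runs can be compared without eyeballing numbers."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = fingerprint(run_analysis(seed=2019))
second = fingerprint(run_analysis(seed=2019))
print("reproduced" if first == second else "results differ")
```

This only checks the same code on the same machine. Matching results across machines and over time also depends on recording the software versions used.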

How can research be reproducible if a key contribution is hidden?

...it takes skill

Who writes the software?

No career path for software developers

Online courses and books

Attended a course

No training

29%

46%

50%

25%

0%

25%

"Professional"

"Beginner"

Training

Research Software Engineering

Researcher

Software Engineer

Researcher developer

Research Software Engineer

www.society-rse.org

RSE Conference

rse.ac.uk/conf/2019

RSE Groups at 28 organisations

bit.ly/RSEGroupsUK

Software sustainability

The software you use today is available for use in the future

  • Software engineering

  • Citation

  • Reward

  • Licensing

  • Community building

  • Training

  • Access to specialist skills

  • Funding

[7]

  • Software is vital to research

  • If you care about reproducibility, you must care about software

  • Ensuring that your software is available in the future is the goal of software sustainability 

Thank you!

@sjh5000

ORCID: 0000-0002-6809-5195

Licence

 © Simon Hettrick

These slides are licensed under a Creative Commons Attribution 4.0 International licence:

https://creativecommons.org/licenses/by/4.0/

Studies

  • National software survey:
    https://www.software.ac.uk/blog/2017-09-06-journey-reproducibility-excel-pandas

  • Southampton software survey:
    https://github.com/Southampton-RSG/soton_software_survey_analysis_2019/blob/master/report/Research%20software%20at%20the%20University%20of%20Southampton.pdf

  • Grants data analysis:
    https://github.com/softwaresaved/software_in_grants_GTR

  • Eprints publication analysis coming soon

 

Links

  1. Reinhart, Carmen M.; Rogoff, Kenneth S. (2010). "Growth in a Time of Debt". American Economic Review. 100 (2): 573–78. doi:10.1257/aer.100.2.573

  2. https://qz.com/75035/fixing-this-excel-error-transforms-high-debt-countries-from-recession-to-growth/

  3. http://www.nytimes.com/2013/04/26/opinion/debt-growth-and-the-austerity-debate.htm

  4. Llorente et al. Science, 350, 6262, doi:10.1126/science.aad2879

  5. Llorente et al. doi:10.1126/science.aaf3945

  6. http://www-users.math.umn.edu/~arnold/disasters/ariane5rep.html

  7. https://www.ukri.org/files/infrastructure/the-uks-research-and-innovation-infrastructure-opportunities-to-grow-our-capacity-final-low-res/

 

Software: risks, reproducibility and sustainability

Presented at the ReproducibiliTea workshop on 12 December 2019 at the University of Southampton.
