Intentional OSS in science:
because OSS ≠ “one size”

Sidney Bell - CZI Foundation, Science Initiative

@sidneymbell (everywhere)

October 2023

Agenda

  • Motivation & goals

  • Overview of framework components

  • Overview of draft archetypes

  • Next steps & discussion

Motivations

"There isn't a single model there. You have to sort of say, 'What's the right interaction mode with the community you're dealing with and what supports your objectives?'"

"I don't think [people] have a very good perspective on what it even means... Just because you go build in a public GitHub repo and slap a MIT or Apache license on it... it's not the same thing as when people talk about open source being a force in the world."

Quirks of developing scientific OSS

"So, they [Python, Linux] somehow built a movement appropriate for the size of their problem. And so to me, I don't look at a lot of the stuff in science, which is always going to be by its very nature, kind of very small scale and nichey and never get a lot of momentum, both because of the size of the ecosystem and the use cases, but also just because it obsoletes itself very quickly. I see that as very distinct from the big platform building open source world. It's just a completely different thing."

Framework target outcomes

  • Teams can articulate different models of OSS development in science and which one makes sense for their strategic goals

  • Teams have the strategic guidance they need to incorporate their OSS model(s) into their roadmaps & long-term sustainability plans

  • Teams start to integrate the most relevant OSS best practices for them into their workflows

How to use this framework

1: OSS Archetypes

For each component of your project, identify which OSS Archetype aligns with your strategic goals

2: Self Assessment

Build a holistic understanding of which OSS practices are key to setting your project up for success; take inventory of strengths & opportunities

3: Growth Plan

Make a concrete plan to bring your goals & your practices into alignment

1. OSS Archetypes

  • Based on the Mozilla archetypes, adapted for scientific software

  • See the full matrix here

2. Self Assessment

  • We use the OSS Needs Assessment because it is clear, actionable and pragmatic.

  • There are no wrong answers. A higher score is not always “better.”

  • Holistic evaluation of the alignment between your goals & your practices.

3. Growth Plan

SWOT analysis workshop
(strengths, weaknesses, opportunities, threats)

Deep dive:

Scientific OSS archetypes

Feedback requested! Especially:

- Clarity and differentiation

- Practices

- Examples

- Metrics

https://tinyurl.com/oss-archetypes

Archetypes overview: institutionally supported projects

Scientific platform

Hosted applications with a single canonical instance

Controlled ecosystem

Core + plug-in model

Multi-institutional collaboration

Projects with multiple formal host institutions; often larger in scale

Archetypes overview: more flexible project types

Scientific library / pipeline

More mature specialized projects where any expert can contribute

Rocket ship to Alpha Centauri

More mature projects still owned by the founder

Rocket ship to the moon

Prototypes with rapid iteration by the founder

Wide Open

Democratic projects open to anyone

Bathwater

"Might as well" make it open (abandoned rocket ships)

Scientific Platforms

Hosted applications with a single canonical instance

  • Strategic goals:

    Democratizing access to analysis tools / data (often via a centralized GUI), which may have a standard-setting effect for a field.

  • Key practices:
    • Code accessibility (trust)
    • Licenses & copyright
    • Quality
    • User friendliness
  • Risks: Sustainability, alignment with community
  • Decision making: Host institution
  • Execution: Host institution
  • Markers of success: Adoption, quality

Nextstrain (web)

UCSC Genome Browser
GenBank

Rocket ship to the moon

Prototypes with rapid iteration by the founder

  • Strategic goals:

    Fast and focused development to test a scientific or technological hypothesis.

  • Key practices:
    • Code accessibility
    • Licenses & copyright
    • Transparency & expectation management
  • Risks:
    • Everything depends on success of original vision
    • Missing out on partnerships
  • Decision making: Founder(s)
  • Execution: Founder(s)
  • Markers of success:
    • Speed of development
    • Achieving original technical goals and testing hypothesis

Seurat (early days)

Rocket ship to Alpha Centauri

More mature projects still owned by the founder

  • Strategic goals:

    Fast and focused development of a high-confidence solution to a high-priority problem.
    Founders retain strong influence.

  • Key practices:
    • + User friendliness
    • + Quality
    • (and all from Rocket Ship to the Moon)
  • Risks:
    • + Low “bus factor” is a concern for long-term sustainability
    • (and all from Rocket Ship to the Moon)
  • Markers of success:
    • + Adoption
    • + "Standard setting"

Seurat (now)

Scientific library / pipeline

More mature specialized project where any expert may contribute

  • Strategic goals:

    Pool development efforts to solve shared core problems & standardize analysis.

  • Key practices:
    • Transparency & Consensus building
    • Governance
    • Contributor community management
    • Quality & Releases
  • Risks:
    • Small developer pool & high maintenance costs in rapidly evolving scientific fields
  • Decision making: Core developer-driven
  • Execution: Open to anyone with domain expertise
  • Markers of success:
    • User adoption, contribution quality
    • Code quality

VisPy

Wide Open

Democratic projects open to anyone

  • Strategic goals:

    Large-scale collaboration; community can become self-sustaining

  • Key practices:
    • Transparency & Consensus building
    • Governance
    • DEI & Contributor community management
    • Quality & Releases
  • Risks: Founders' willingness to share decision making; high maintenance of onboarding pathways
  • Decision making: Group-based, consensus / democratic
  • Execution: Open to anyone
  • Markers of success:
    • Contributor onboarding efficiency & community growth
    • Diversity of voices in decision making

pandas, NumPy

Acknowledgements

Questions?

@sidneymbell

https://sidneymbell.science
