How to make SPDX industry standard for AI/ML

Grab the slides:
 slides.com/cheukting_ho/spdx-for-ai-ml/

Hello, I am Cheuk

  • Open-source contributor
     
  • Organiser of community events
     
  • PSF director and fellow
     
  • Community manager at OpenSSF

Who has looked at the ingredient list?

Think of the last time you opened a pack of snacks

There is a need to know what your food is made of

The same goes for other consumer goods we use every day

including the software that we use every day

Software Bill of Materials (SBOMs)

  • list of all the open source and third-party components present in a codebase
     
  • lists the licenses that govern those components
     
  • the versions of the components used in the codebase and patch status
     
  • like the ingredients list of a food product

Did you know there is a standard format for listing the ingredients?

According to "Food labelling and packaging" on gov.uk

  • If your food or drink product has 2 or more ingredients (including any additives), you must list them all
     
  • Ingredients must be listed in order of weight, with the main ingredient first
     
  • You must highlight allergens on the label using a different font, style or background colour.

So there should be some standard format of how to list the SBOMs, right?

Software Package Data Exchange (SPDX)

  • open standard describing SBOMs
     
  • common format that reduces redundant work when sharing important release data
     
  • freely available international open standard
    (ISO/IEC 5962:2021)
     
  • formats that are both machine- and human-readable
     
  • efficient exchange of metadata in the supply chain
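
To make "machine- and human-readable" concrete, here is a minimal sketch of an SPDX 2.3 document in its JSON form, built with the Python standard library only. The helper name `build_minimal_sbom`, the tool name in `creators` and the `example.org` namespace are assumptions made for illustration; a real SBOM would normally come from a dedicated generator and carry many more fields.

```python
import json
from datetime import datetime, timezone


def build_minimal_sbom(name, version, license_id, namespace):
    """Return a minimal SPDX 2.3 document (as a dict) describing one package."""
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": f"{name}-sbom",
        "documentNamespace": namespace,  # hypothetical namespace URI
        "creationInfo": {
            "created": created,
            "creators": ["Tool: example-sbom-sketch"],  # hypothetical tool name
        },
        "packages": [
            {
                "name": name,
                "SPDXID": "SPDXRef-Package",
                "versionInfo": version,
                "downloadLocation": "NOASSERTION",
                "licenseConcluded": license_id,
                "licenseDeclared": license_id,
                "filesAnalyzed": False,
            }
        ],
        # The document DESCRIBES the package: the simplest relationship.
        "relationships": [
            {
                "spdxElementId": "SPDXRef-DOCUMENT",
                "relationshipType": "DESCRIBES",
                "relatedSpdxElement": "SPDXRef-Package",
            }
        ],
    }


sbom = build_minimal_sbom(
    "requests", "2.31.0", "Apache-2.0",
    "https://example.org/spdx/requests-2.31.0",
)
print(json.dumps(sbom, indent=2))
```

The printed JSON is the "ingredients list": one component, its version and its license, in a form that both people and tools can read.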

SPDX 2.3 (current release)

  • external security information reference

  • reference to a Common Vulnerabilities and Exposures (CVE) advisory

  • satisfying US Executive Order 14028 Minimum Elements for an SBOM

  • verify the provenance and integrity of the software

  • an ISO Standard: ISO/IEC 5962:2021

  • Signing an SPDX SBOM with Sigstore’s Cosign
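
As a rough illustration of the first three points, the sketch below attaches a CVE advisory to a package as an SPDX 2.3 external reference and then checks a few package-level fields. The helper names and the `MINIMUM_PACKAGE_FIELDS` tuple are assumptions: the tuple is a simplified stand-in for the Minimum Elements, not the official mapping onto SPDX fields.

```python
def add_cve_advisory(package, cve_id):
    """Attach a security advisory reference for cve_id to an SPDX 2.3 package dict."""
    package.setdefault("externalRefs", []).append({
        "referenceCategory": "SECURITY",
        "referenceType": "advisory",
        "referenceLocator": f"https://nvd.nist.gov/vuln/detail/{cve_id}",
    })
    return package


# Simplified, assumed subset of package-level fields to check.
MINIMUM_PACKAGE_FIELDS = ("name", "versionInfo", "supplier", "SPDXID")


def missing_minimum_fields(package):
    """Return the checked fields that are absent or empty in the package."""
    return [f for f in MINIMUM_PACKAGE_FIELDS if not package.get(f)]


package = {
    "name": "log4j-core",
    "SPDXID": "SPDXRef-Package-log4j-core",
    "versionInfo": "2.14.1",
    "downloadLocation": "NOASSERTION",
}
add_cve_advisory(package, "CVE-2021-44228")
print(package["externalRefs"][0]["referenceLocator"])
print(missing_minimum_fields(package))
```

Here the check reports that `supplier` is missing, which is the kind of gap a tool could flag before an SBOM is shared.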

SPDX 2.3 is pretty good for software

But we can make it better

Hardware

Software

System

🎯

SPDX 3.0
(release candidate)

  • new Security, Build, Data and AI profiles
     
  • better support for databases
     
  • capture domain-specific information
     
  • capture AI/ML models and dataset provenance
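
To give a feel for what "model and dataset provenance" could mean in practice, here is a small sketch using plain Python dataclasses. Everything in it is hypothetical: the class names, field names and example records are illustrative only and are not the SPDX 3.0 AI or Data profile vocabulary or its serialization.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical record of a training dataset's provenance."""
    name: str
    source: str
    license_id: str
    contains_personal_data: bool = False


@dataclass
class ModelRecord:
    """Hypothetical record of a model and the datasets it was trained on."""
    name: str
    version: str
    model_type: str
    limitations: str = ""
    trained_on: List[DatasetRecord] = field(default_factory=list)


def privacy_flags(model: ModelRecord) -> List[str]:
    """Return the names of training datasets that contain personal data."""
    return [d.name for d in model.trained_on if d.contains_personal_data]


# Made-up example records; the URLs are placeholders.
reviews = DatasetRecord(
    "reviews-corpus", "https://example.org/data/reviews", "CC-BY-4.0",
    contains_personal_data=True,
)
wiki = DatasetRecord("wiki-snapshot", "https://example.org/data/wiki", "CC-BY-SA-4.0")
model = ModelRecord(
    "sentiment-classifier", "0.1.0", "transformer",
    limitations="English text only",
    trained_on=[reviews, wiki],
)
print(privacy_flags(model))
```

Once the link from a model to its datasets is recorded, questions such as "which training data carried personal information?" can be answered by a tool rather than by guesswork.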

How to get the AI/ML and data community to adopt SPDX 3.0

This is great! But...

AI/ML risks

  • data breach and privacy risks - rely on data
     
  • the system is more complex - lots of black boxes
     
  • AI boom - people are less careful
     
  • new vulnerabilities - prompt injection

Quick adoption

  • Thorough profile
     
  • Less burden to start
     
  • Universal standard
     
  • Satisfying policies

Good tool & Outreach

  • Examples
     
  • Use cases
     
  • Education

Where to start

  • SPDX 3.0 profile is quite thorough
  • Communicate with policy makers

Where to start

  • To create a universal standard
     
  • More outreach
     
  • Go to where the AI/ML community is
     
  • Understand their needs

Good tool & Outreach

Call to action

  • Adopt SPDX 2.3 right now
     
  • Contribute to SPDX 3.0 model - try the 3.0-rc
     
  • Engage in outreach activities
     
  • Keep communicating with policy makers and users

Let's make SPDX the industry standard for AI/ML

Thank you!

Grab the slides:
 slides.com/cheukting_ho/spdx-for-ai-ml/

How to make SPDX industry standard for AI/ML

By Cheuk Ting Ho
