The state of OBO Foundry and Wikidata integration

Introduction

Small detour: On licensing of ontologies and knowledge graphs

  • Creative Commons is huge progress, but it was not designed for data

  • Wikidata is released under CC0 - no legal restrictions

  • Other seemingly open licenses (such as CC-BY) are unclear about what data reuse they permit

  • OBO ontologies are community efforts → complicated to change the licenses

Licenses most used by OBO Foundry ontologies

37 OBO Foundry ontologies are in the public domain

These ontologies could be fully imported into Wikidata

136 OBO Foundry Ontologies are CC-BY-licensed

  • Cannot be imported in full into Wikidata

  • Very unclear what parts can be reused

    • Labels? Descriptions? XREFs? Subclasses?

Wikidata properties for OBO Foundry ontologies

Mapping entities is uncontroversial ground, where no IP rights can be infringed


  • OBO identifiers were present in 3 different formats

    • Just the numeric part

    • Prefix + numeric, separated by "_"

    • Prefix + numeric, separated by ":"

      • Recently fixed by TiagoLubianaBot 3

  • IDs are encoded as strings, which makes federated queries much harder

  • Some workarounds are available, as Wikidata generates IRIs in the backend

    • E.g. the "wdtn:" and "psn:" namespaces, generated using the "formatter URI for RDF resource (P1921)"

see https://w.wiki/7mCv, thanks Andrawaag
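
The three identifier formats listed above can be brought into one shape with a few lines of code. The following is a minimal sketch, not the logic used by the bot: the function names, the choice of the ":"-separated form as the target, and the `default_prefix` parameter are assumptions made for illustration. The only external fact relied on is the standard OBO PURL pattern, `http://purl.obolibrary.org/obo/PREFIX_NUMBER`.

```python
import re

# Standard base for OBO PURLs.
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"

# Matches "CL_0000540" or "CL:0000540" (prefix, separator, numeric part).
_PREFIXED = re.compile(r"([A-Za-z][A-Za-z0-9]*)[:_]([0-9]+)")
_NUMERIC = re.compile(r"[0-9]+")


def normalize_obo_id(raw, default_prefix=None):
    """Return a 'PREFIX:NUMBER' identifier from any of the three formats.

    `default_prefix` (hypothetical parameter) is needed only when the
    stored value is just the numeric part.
    """
    raw = raw.strip()
    match = _PREFIXED.fullmatch(raw)
    if match:
        prefix, number = match.groups()
        return f"{prefix}:{number}"
    if _NUMERIC.fullmatch(raw):
        if default_prefix is None:
            raise ValueError(f"numeric-only ID {raw!r} needs a prefix")
        return f"{default_prefix}:{raw}"
    raise ValueError(f"unrecognized OBO identifier: {raw!r}")


def curie_to_iri(curie):
    """Turn 'PREFIX:NUMBER' into the corresponding OBO PURL."""
    prefix, number = curie.split(":", 1)
    return f"{OBO_PURL_BASE}{prefix}_{number}"


if __name__ == "__main__":
    for value in ("0000540", "CL_0000540", "CL:0000540"):
        curie = normalize_obo_id(value, default_prefix="CL")
        print(value, "->", curie, "->", curie_to_iri(curie))
```

Having the identifier as an IRI rather than a bare string is what lets it be joined directly against the IRIs used in an ontology's own RDF.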

How are those identifiers mapped?

  • Manually in the Wikidata interface

  • Manually via crowdcuration in the Mix'n'Match platform

  • Automatically or semi-automatically through scripts and Wikidata bots.

https://github.com/lubianat/obo_to_mixnmatch
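
As an illustration of the semi-automatic route, one simple strategy a script might use is exact label matching, proposing a mapping only when the match is unambiguous and leaving the rest for manual curation. This is a sketch under stated assumptions and is not taken from the repository linked above: the function name, the input shapes, and the placeholder item IDs are all hypothetical.

```python
def propose_matches(obo_terms, wikidata_items):
    """Propose OBO-to-Wikidata mappings by case-insensitive exact label match.

    obo_terms:      dict of OBO ID -> label (hypothetical input shape)
    wikidata_items: dict of item ID -> label (hypothetical input shape)

    Only unambiguous matches are returned; a label shared by several
    items is skipped so that a human can decide.
    """
    by_label = {}
    for item_id, label in wikidata_items.items():
        by_label.setdefault(label.strip().casefold(), []).append(item_id)

    proposals = {}
    for obo_id, label in obo_terms.items():
        candidates = by_label.get(label.strip().casefold(), [])
        if len(candidates) == 1:
            proposals[obo_id] = candidates[0]
    return proposals


if __name__ == "__main__":
    # Placeholder data; the item IDs are not real Wikidata QIDs.
    obo = {"CL:0000540": "neuron", "CL:0000000": "cell"}
    items = {"ITEM_A": "Neuron", "ITEM_B": "cell", "ITEM_C": "Cell"}
    print(propose_matches(obo, items))
```

Proposals produced this way are candidates only; they would still need review before being written to Wikidata by a bot.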


Many OBO ontologies are (almost) fully mapped on Wikidata


Many OBO ontologies link to Wikipedia

Example: CL on Wikidata

Identification of gaps

Modelling comparison

SSSOM Wikipedia maps

OBO Foundry and Wikidata connections

By Tiago Lubiana
