The state of OBO Foundry and Wikidata integration
Introduction
Small detour: On licensing of ontologies and knowledge graphs
Creative Commons was a huge step forward, but it was not designed for data
Wikidata is released under CC0 - no legal restrictions
Other seemingly open licenses (like CC-BY) are unclear about what data reuse they allow
OBO ontologies are community efforts → complicated to change the licenses
Most-used licenses among OBO Foundry ontologies
37 OBO Foundry ontologies are in the public domain
These ontologies could be fully imported into Wikidata
136 OBO Foundry ontologies are CC-BY-licensed
These cannot be imported in full into Wikidata
It is unclear which parts can be reused
Labels? Descriptions? XREFs? Subclasses?
Wikidata properties for OBO Foundry ontologies
Mapping entities is uncontroversial ground: no IP rights can be infringed
OBO identifiers were present in 3 different formats
Just the numeric part
Prefix + numeric, separated by "_"
Prefix + numeric, separated by ":"
Recently fixed by TiagoLubianaBot 3
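A minimal sketch of how the three formats listed above could be brought into one shape. The target shape ("PREFIX:NUMBER"), the function name normalize_obo_id and the default_prefix parameter are assumptions made for illustration; this is not the code the bot actually runs.

```python
import re

# Matches the three shapes named above:
#   "0008150"      (just the numeric part)
#   "GO_0008150"   (prefix + numeric, separated by "_")
#   "GO:0008150"   (prefix + numeric, separated by ":")
_OBO_ID = re.compile(r"(?:(?P<prefix>[A-Za-z][A-Za-z0-9]*)[_:])?(?P<local>\d+)")


def normalize_obo_id(raw, default_prefix=None):
    """Return 'PREFIX:NUMBER' for any of the three input shapes.

    default_prefix (hypothetical parameter) is used only when the input
    is the bare numeric part, because the prefix cannot be guessed.
    """
    match = _OBO_ID.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a recognised OBO identifier: {raw!r}")
    prefix = match.group("prefix") or default_prefix
    if prefix is None:
        raise ValueError(f"no prefix in {raw!r} and no default_prefix given")
    # Keep the numeric part as a string so leading zeros survive.
    return f"{prefix}:{match.group('local')}"


if __name__ == "__main__":
    for value in ("0008150", "GO_0008150", "GO:0008150"):
        print(value, "->", normalize_obo_id(value, default_prefix="GO"))
```

The numeric part is never converted to an integer, since OBO local identifiers are zero-padded and the padding is significant.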
IDs are encoded as strings, which makes federated queries much harder
Some tricks are available, as Wikidata generates IRIs in the backend
E.g. the "wdtn:" and "psn:" prefixes, generated using the "formatter URI for RDF resource (P1921)"
see https://w.wiki/7mCv, thanks Andrawaag
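A rough sketch of the "wdtn:" trick, written as a small helper that only builds the SPARQL text and does not contact any endpoint. The helper name build_normalized_id_query is hypothetical, and P686 (Gene Ontology ID) is used purely as an example; the query only returns rows for properties that actually carry a P1921 formatter, which is assumed here rather than checked.

```python
import re

# Namespace behind the "wdtn:" prefix on the Wikidata Query Service.
WDTN = "http://www.wikidata.org/prop/direct-normalized/"


def build_normalized_id_query(property_id, limit=10):
    """Build a SPARQL query that returns external IDs as IRIs, not strings.

    property_id is a Wikidata property such as "P686"; which property to
    pass is up to the caller (assumption: it has a P1921 formatter).
    """
    if not re.fullmatch(r"P[1-9]\d*", property_id):
        raise ValueError(f"not a Wikidata property ID: {property_id!r}")
    return (
        f"PREFIX wdtn: <{WDTN}>\n"
        "SELECT ?item ?iri WHERE {\n"
        # ?iri is bound to a real IRI, so it can be joined against
        # another endpoint in a federated SERVICE clause.
        f"  ?item wdtn:{property_id} ?iri .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_normalized_id_query("P686", limit=5))
```

Because ?iri is an IRI rather than a string literal, it can be matched directly against subject IRIs in an external triple store without string manipulation inside the query.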
Manually in the Wikidata interface
Manually via crowdcuration in the Mix'n'Match platform
Automatically or semi-automatically through scripts and Wikidata bots
https://github.com/lubianat/obo_to_mixnmatch
Many OBO ontologies are (almost) fully mapped on Wikidata
Many OBO ontologies link to Wikipedia