Semantic MediaWiki as a platform for lab management and biological annotation
Context
Work in laboratories or core facilities


ProteoWiki
LIMS: Lab Information Management System
Proteomics Unit, CRG

ProteoWiki
Form input
Mail communication
- Based on Semantic Tasks extension
- Asking user for action (bring samples to the lab)
- Informing user about request status
- Users can opt out of verbose communication
User satisfaction tracking
- Triggered when a request is closed
- Email sent; user directed to a Special Page form
- Form valid for a limited time (e.g., 2 weeks max)
- Only editable a few times (or only once)
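
The validity rules above (limited time window, limited number of edits) can be sketched as a small check. This is only an illustration, not ProteoWiki's actual code: the `FeedbackForm` class, the `can_edit` function and the default limits are hypothetical names and values chosen to mirror the bullets.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FeedbackForm:
    """Hypothetical record of one satisfaction form sent to a user."""
    sent_at: datetime      # when the email was sent (request closed)
    edits_used: int = 0    # how many times the user has saved the form


def can_edit(form, now, max_age=timedelta(weeks=2), max_edits=1):
    """Return True if the form is still inside its time window
    and the user has not used up the allowed number of edits."""
    within_window = now - form.sent_at <= max_age
    has_edits_left = form.edits_used < max_edits
    return within_window and has_edits_left
```

Raising `max_edits` gives the "only editable a few times" variant; the default of 1 gives the "only once" variant.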

Lab operators' extra input
- Wiki-way. Flexible. Some info structured, some not
- Documentation
- Standard Operating Procedures (SOPs)
- Informal instrument queue

Biocore Wiki
Task management system
Bioinformatics Unit, CRG

Biocore Wiki
Task input

Biocore Wiki
Task view

Biocore Wiki
Hours & costs list

Example of biological data Content Management System (CMS)
VastDB, Manuel Irimia's lab (CRG)
Biological data CMS
VastDB

VastDB overview

Different ways of handling data in MediaWiki as a CMS
- User import via specific extensions
- Using a modified External Data extension
- Extensions accessing the file system
- Mirror of PDB structures
Semantic Data Import
Data from CSV input
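
As a rough illustration of the idea of turning CSV input into structured wiki content, the sketch below converts each CSV row into a template call in wikitext. It does not reproduce how the Semantic Data Import extension works internally; the function name `rows_to_template_calls` and the template name used in the usage note are assumptions made for the example.

```python
import csv
import io


def rows_to_template_calls(csv_text, template):
    """Turn each CSV row into a wikitext template call.

    The header row supplies the parameter names; every later row
    becomes one {{template|name=value ...}} block.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    calls = []
    for row in reader:
        params = "".join(f"|{name}={value}\n" for name, value in row.items())
        calls.append("{{" + template + "\n" + params + "}}")
    return calls
```

For example, `rows_to_template_calls("Gene,Species\nEDEN,Hs\n", "Gene entry")` yields a single call to a hypothetical `Gene entry` template with `Gene` and `Species` parameters.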

Semantic Data Import
Output view handled with Rickshaw (D3.js)
CouchDB + Lucene
Making search faster
- CouchDB: NoSQL Document DBMS
- Lucene: information retrieval library; Elasticsearch and Solr are built on it
- Mapping SMW Templates to JSON documents
- Indexing for coordinates and full-text search
- It might be ported to Elasticsearch
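
The mapping from SMW templates to JSON documents could look roughly like the sketch below. It is a simplified illustration, not the code used in VastDB: it assumes a single template call with no nested templates or pipes inside values, and the function name and the example field names are hypothetical. The `_id` key is the field CouchDB uses as the document identifier.

```python
import json


def template_to_document(wikitext, page_title):
    """Map one simple template call to a JSON-ready dict.

    Assumes no nested templates and no '|' inside parameter values.
    """
    body = wikitext.strip()
    if not (body.startswith("{{") and body.endswith("}}")):
        raise ValueError("expected a single template call")
    parts = [part.strip() for part in body[2:-2].split("|")]
    doc = {"_id": page_title, "template": parts[0]}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:  # skip unnamed (positional) parameters
            doc[key.strip()] = value.strip()
    return doc


if __name__ == "__main__":
    text = "{{Event\n|chr=chr1\n|start=1000\n|end=9000\n}}"
    print(json.dumps(template_to_document(text, "EV1"), indent=2))
```

Values are kept as strings here; an indexer for coordinate search would likely need numeric fields, which is left out of this sketch.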
CouchDB + Lucene
Coordinate search

CouchDB + Lucene
Full-text search

Genome Annotation
Wiki framework
AnnoWiki

Import and export formats
- FASTA files (sequences)
- GFF or GTF (feature, relationship, location)
- Others: chromosome sizes, etc.
- Raw text files
- External tools, when convenient:
  - NCBI BLAST
  - SAMtools
  - etc.

Import and export formats
FASTA

Import and export formats
GFF
##gff-version 3
##sequence-region ctg123 1 1497228
ctg123 . gene 1000 9000 . + . ID=gene00001;Name=EDEN
ctg123 . TF_binding_site 1000 1012 . + . ID=tfbs00001;Parent=gene00001
ctg123 . mRNA 1050 9000 . + . ID=mRNA00001;Parent=gene00001;Name=EDEN.1
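
To show how the feature, location and relationship information in such a file can be read, here is a minimal parsing sketch. It assumes standard GFF3 with nine tab-separated columns (the example lines on the slide appear with spaces, presumably from extraction) and is not AnnoWiki's importer; the function name `parse_gff3_line` is hypothetical.

```python
def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict.

    Expects the nine tab-separated GFF3 columns; the last column holds
    semicolon-separated key=value attributes such as ID and Parent.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = dict(
        pair.split("=", 1) for pair in attrs.split(";") if "=" in pair
    )
    return {
        "seqid": seqid,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }
```

The `Parent` attribute is what ties an mRNA or binding site back to its gene, which is the relationship a wiki page hierarchy can then reuse.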
Integrating a genome browser

Linking pages,
conceptual hierarchies
- By using specific properties
- SMWParent extension
  - Quick retrieval of linked elements
    - Parent, ancestors
    - Children, descendants
  - Number of hops
  - Filter by another property value
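
The kind of retrieval listed above (ancestors, descendants, number of hops) can be illustrated with a plain dictionary of child-to-parent links. This is only a sketch of the traversal idea, not the SMWParent extension's API; the function names and the `parent_of` mapping are assumptions made for the example.

```python
from collections import deque


def ancestors(parent_of, page):
    """Walk up the parent links, returning (ancestor, hops) pairs."""
    result = []
    seen = {page}
    hops = 0
    current = parent_of.get(page)
    while current is not None and current not in seen:
        hops += 1
        result.append((current, hops))
        seen.add(current)
        current = parent_of.get(current)
    return result


def descendants(parent_of, page, max_hops=None):
    """Breadth-first walk down, returning (descendant, hops) pairs.

    max_hops limits how far below the starting page to go.
    """
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)
    result = []
    seen = {page}
    queue = deque([(page, 0)])
    while queue:
        node, hops = queue.popleft()
        if max_hops is not None and hops >= max_hops:
            continue
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                result.append((child, hops + 1))
                queue.append((child, hops + 1))
    return result
```

Filtering by another property value would then be a matter of keeping only the returned pages whose property matches, for instance only descendants of type mRNA.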

Acknowledgements
Biocore Wiki
Carlos Company
Julia Ponomarenko
Luca Cozzuto
Sarah Bonnin
Guglielmo Roma
et al.
ProteoWiki
Eduard Sabidó
Francesco Mancuso
Cristina Chiva
Eva Borràs
Guadalupe Espadas
et al.
VastDB
Manuel Irimia
Javier Tapial
Luca Cozzuto
AnnoWiki
Luca Cozzuto
Carlos Company

... and the whole open-source community involved
Questions?
By Similis.cc
Presentation about several uses of the Semantic MediaWiki framework in different lab management environments and biological annotation projects