Example Bioinformatics application in Neo4j
Toni Hermoso Pulido
isnobig, BCN - 2015
@toniher
Objectives
- Store
- Relate
- Retrieve
- Calculate
- ... make decisions!
Directed Acyclic Graphs (DAGs)
NCBI Taxonomy - Gene Ontology (3 DAGs)
Simple Graph concepts
- Node
- Relationship
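As a rough illustration of the two concepts above, a node and a relationship can be modelled with two small classes. This is only a sketch: the class names, the `key` field and the `has_parent` relationship type are assumptions made for the example, not the names used in the actual application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str  # kind of node, e.g. "TAXID" or "GO_TERM"
    key: str    # identifier within that label (hypothetical field name)


@dataclass(frozen=True)
class Relationship:
    start: Node
    type: str   # relationship type (hypothetical: "has_parent")
    end: Node


# Homo sapiens (taxon 9606) points to its parent, the genus Homo (taxon 9605).
human = Node("TAXID", "9606")
homo = Node("TAXID", "9605")
rel = Relationship(human, "has_parent", homo)
```

A relationship is directed (it has a start and an end node), which is what makes a DAG representable.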
Neo4j
- Easy Installation
- Java based
- REST API
Neo4j organisation concepts
- 1 server instance -> 1 database
- Labels allow several unconnected graphs to coexist in one database
- In our case we used: GO_TERM, TAXID
- Handling multiple instances: Neobox
Data import with py2neo
- Base Files: NCBI Taxonomy
- Base Files: Gene Ontology
- Code
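Before anything reaches py2neo, the base files have to be turned into child/parent pairs. The sketch below covers only that parsing step for the NCBI Taxonomy `nodes.dmp` file, whose fields are separated by `\t|\t` and whose lines end with `\t|`. The function name and the abbreviated sample lines are assumptions for illustration (real lines carry more columns), and the py2neo calls that would create the nodes are deliberately left out.

```python
def parse_nodes_dmp(lines):
    """Yield (tax_id, parent_tax_id, rank) from NCBI Taxonomy nodes.dmp lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Drop the trailing "\t|" terminator, then split on the "\t|\t" delimiter.
        if line.endswith("\t|"):
            line = line[:-2]
        fields = line.split("\t|\t")
        yield fields[0], fields[1], fields[2]


# Abbreviated sample lines: only the first five columns are kept here.
sample = [
    "1\t|\t1\t|\tno rank\t|\t\t|\t8\t|\n",
    "9605\t|\t207598\t|\tgenus\t|\t\t|\t5\t|\n",
    "9606\t|\t9605\t|\tspecies\t|\tHS\t|\t5\t|\n",
]

# The root lists itself as its own parent, so that self-loop is skipped.
edges = [
    (child, parent)
    for child, parent, _rank in parse_nodes_dmp(sample)
    if child != parent
]
```

Each `(child, parent)` pair would then become one relationship between two `TAXID` nodes.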
Querying the graphs
Problems:
Shortest path between 2 nodes and Lowest Common Ancestor
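To make the two problems concrete, here is a minimal in-memory sketch that assumes each node has a single parent, as in the taxonomy; Gene Ontology terms can have several parents and would need a more general traversal. The function names and the toy excerpt of the taxonomy are illustrative only, and this is not the approach used inside Neo4j.

```python
def ancestors(node, parent):
    """Return the chain from node up to the root, node first."""
    chain = [node]
    while node in parent and parent[node] != node:
        node = parent[node]
        chain.append(node)
    return chain


def lowest_common_ancestor(a, b, parent):
    """Return the first ancestor of b that is also an ancestor of a."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return node
    return None


def path_between(a, b, parent):
    """Return the path a -> ... -> LCA -> ... -> b, or None if unconnected."""
    lca = lowest_common_ancestor(a, b, parent)
    if lca is None:
        return None
    up = ancestors(a, parent)
    down = ancestors(b, parent)
    up = up[: up.index(lca) + 1]    # a up to and including the LCA
    down = down[: down.index(lca)]  # b up to, but excluding, the LCA
    return up + down[::-1]


# Toy excerpt of the taxonomy (child -> parent), written by hand.
parent = {
    "9606": "9605",    # Homo sapiens -> Homo
    "9605": "207598",  # Homo -> Homininae
    "9598": "9596",    # Pan troglodytes -> Pan
    "9596": "207598",  # Pan -> Homininae
    "207598": "9604",  # Homininae -> Hominidae
}

lca = lowest_common_ancestor("9606", "9598", parent)
path = path_between("9606", "9598", parent)
```

In a tree the path through the lowest common ancestor is the only path between two nodes, so it is also the shortest one.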
Solving the problem with Cypher
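One possible shape for such a query uses Cypher's built-in `shortestPath` function. The sketch below only builds the query text and its parameters in Python; it does not contact a server. The `id` property name and the helper function are assumptions, while the two labels come from the slides above. Parameters are written as `$a` and `$b`; older Neo4j releases used `{a}` and `{b}` instead.

```python
ALLOWED_LABELS = {"GO_TERM", "TAXID"}


def shortest_path_query(label, a, b):
    """Build a Cypher shortestPath query and its parameters for two node ids."""
    # Labels cannot be passed as query parameters, so they are checked
    # against a fixed set before being placed into the query text.
    if label not in ALLOWED_LABELS:
        raise ValueError("unknown label: " + label)
    query = (
        "MATCH (a:{0} {{id: $a}}), (b:{0} {{id: $b}}) "
        "MATCH p = shortestPath((a)-[*]-(b)) "
        "RETURN p"
    ).format(label)
    return query, {"a": a, "b": b}


query, params = shortest_path_query("TAXID", "9606", "9598")
```

The resulting pair can be handed to whichever client is used to run Cypher statements.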
Solving the problem with Java libraries
Putting everything in an unmanaged extension
- Server Configuration in conf
- Plugin creation
- Maven (Java build and dependency manager)
- pom.xml (definition and dependencies)
- To be placed in plugins directory
- Extra libraries
- e.g. minimal-json (placed in system/lib)
Extension as a REST service
- REST API Landing path (conf/neo4j-server.properties)
org.neo4j.server.thirdparty_jaxrs_classes=cat.cau.neo4j.biorelation.rest=/biodb
- Using Jersey (JAX-RS, javax.ws.rs) for the REST API
- Similar to Node.js Express, Python Flask or Perl Mojolicious
- Documentation