Example Bioinformatics application in Neo4j

Toni Hermoso Pulido

isnobig, BCN - 2015

@toniher

Objectives

  • Store
  • Relate
  • Retrieve
  • Calculate
  • ... make decisions!

Directed Acyclic Graphs (DAGs)

NCBI Taxonomy - Gene Ontology (3 DAGs)

Simple Graph concepts

  • Node
  • Relationship

Neo4j

 

Neo4j organisation concepts

 

  • 1 server instance -> 1 database
  • Labels make it possible to keep multiple unlinked graphs in the same database
    • In our case we used: GO_TERM, TAXID
  • Handling multiple instances: Neobox

Data import with py2neo

 

  • py2neo: Python client library for Neo4j
  • Based on the Neo4j REST API
  • Offers batch import
  • Recommendation:

Data import with py2neo

 

Querying the graphs

 

Problems:

Shortest path between 2 nodes and Lowest Common Ancestor
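To make the two problems concrete, here is a minimal plain-Java sketch that does not use Neo4j at all. It assumes each term only stores links to its direct parents, and it takes "lowest common ancestor" to mean the shared ancestor with the smallest combined number of edges to both terms. The class name, method names and the toy terms (root, A to E) are hypothetical and are not taken from the talk. The distance it reports is the path through that ancestor, which in a DAG is not always the shortest undirected path.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the original project.
final class DagDistance {
    // child -> direct parents
    private final Map<String, List<String>> parents = new HashMap<>();

    void addTerm(String child, String... directParents) {
        parents.put(child, Arrays.asList(directParents));
    }

    // Breadth-first walk upwards: edges from start to each ancestor (start itself is 0).
    Map<String, Integer> ancestorDepths(String start) {
        Map<String, Integer> depth = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depth.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            for (String p : parents.getOrDefault(node, Collections.<String>emptyList())) {
                if (!depth.containsKey(p)) {
                    depth.put(p, depth.get(node) + 1);
                    queue.add(p);
                }
            }
        }
        return depth;
    }

    // Shared ancestor with the smallest combined distance, or null if there is none.
    String lowestCommonAncestor(String a, String b) {
        Map<String, Integer> da = ancestorDepths(a);
        Map<String, Integer> db = ancestorDepths(b);
        String best = null;
        int bestDist = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> e : da.entrySet()) {
            Integer other = db.get(e.getKey());
            if (other != null && e.getValue() + other < bestDist) {
                bestDist = e.getValue() + other;
                best = e.getKey();
            }
        }
        return best;
    }

    // Number of edges on the path a -> common ancestor -> b, or -1 if unrelated.
    int distanceViaAncestor(String a, String b) {
        String lca = lowestCommonAncestor(a, b);
        if (lca == null) {
            return -1;
        }
        return ancestorDepths(a).get(lca) + ancestorDepths(b).get(lca);
    }

    // Toy ontology: E -> C -> A -> root, D -> A and B, B -> root.
    static DagDistance toyOntology() {
        DagDistance g = new DagDistance();
        g.addTerm("root");
        g.addTerm("A", "root");
        g.addTerm("B", "root");
        g.addTerm("C", "A");
        g.addTerm("D", "A", "B");
        g.addTerm("E", "C");
        return g;
    }
}
```

With the toy data, `lowestCommonAncestor("E", "D")` is `A` (two edges up from E, one up from D). When two ancestors tie on combined distance, this sketch returns either one, so a real implementation would need an explicit tie-breaking rule.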

Solving the problem with Cypher
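As a rough idea of what such a query can look like, the sketch below builds a Cypher statement around the built-in `shortestPath` function and returns it as a Java string, the form in which a server extension would typically submit it. The label `GO_TERM` comes from the slides. The property name `acc`, the parameter names, the hop limit and the class itself are assumptions made for illustration, not the query used in the talk. Parameters are written in the `{name}` style of Neo4j 2.x; later versions use `$name` instead.

```java
// Hypothetical helper; only the Cypher keywords and shortestPath() are Neo4j's own.
final class CypherSketch {

    // Builds a query returning the number of relationships on the shortest
    // undirected path between two nodes that carry the given label.
    static String shortestPathQuery(String label, int maxHops) {
        if (!label.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("unexpected label: " + label);
        }
        if (maxHops < 1) {
            throw new IllegalArgumentException("maxHops must be positive");
        }
        // 'acc' is an assumed property holding the accession of each node.
        return "MATCH (a:" + label + " {acc: {acc1}}), "
             + "(b:" + label + " {acc: {acc2}}), "
             + "p = shortestPath((a)-[*.." + maxHops + "]-(b)) "
             + "RETURN length(p) AS distance";
    }
}
```

Calling `CypherSketch.shortestPathQuery("GO_TERM", 15)` yields a single statement in which `acc1` and `acc2` are supplied as query parameters at execution time, so accessions never need to be concatenated into the query text.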

Solving the problem with Java libraries

Putting everything in an unmanaged extension

  • Server Configuration in conf
  • Plugin creation
    • Maven (Java build and dependency management tool)
      • pom.xml (definition and dependencies)
    • To be placed in plugins directory
  • Extra libraries
    • e.g. minimal-json (placed in system/lib)

Extension as a REST service

  • REST API Landing path (conf/neo4j-server.properties)

 

org.neo4j.server.thirdparty_jaxrs_classes=cat.cau.neo4j.biorelation.rest=/biodb

  • Using Jersey (JAX-RS) for the REST API: javax.ws.rs
    • Similar to Node.js Express, Python Flask or Perl Mojolicious
    • Documentation

 

Extension as a REST service

@Path("/parent")

 

        @GET
        @Path("/helloworld")

        @GET
        @Path("/distance/go/{acc1}/{acc2}")
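The `{acc1}` and `{acc2}` segments in the last path are template parameters: the framework fills them from the request URL and hands them to the method. The sketch below imitates that matching step in plain Java, with no Jersey dependency, purely to show what the template means. It is not how Jersey implements it, and the class name and the example accessions are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of path-template matching; not Jersey code.
final class PathTemplate {
    private static final Pattern PARAM = Pattern.compile("\\{(\\w+)\\}");

    // Returns parameter name -> value, or null if the path does not fit the template.
    static Map<String, String> match(String template, String path) {
        Matcher m = PARAM.matcher(template);
        StringBuilder regex = new StringBuilder();
        List<String> names = new ArrayList<>();
        int last = 0;
        while (m.find()) {
            // Literal text is quoted; each {name} becomes one non-slash segment.
            regex.append(Pattern.quote(template.substring(last, m.start())));
            regex.append("([^/]+)");
            names.add(m.group(1));
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));

        Matcher pm = Pattern.compile(regex.toString()).matcher(path);
        if (!pm.matches()) {
            return null;
        }
        Map<String, String> values = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            values.put(names.get(i), pm.group(i + 1));
        }
        return values;
    }
}
```

For example, matching the template `/distance/go/{acc1}/{acc2}` against `/distance/go/GO:0008150/GO:0003674` gives `acc1` = `GO:0008150` and `acc2` = `GO:0003674`, while a path with a missing segment gives no match.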

 

Code

 

Questions?