Example Bioinformatics application in Neo4j

Toni Hermoso Pulido

isnobig, BCN - 2015



  • Store
  • Relate
  • Retrieve
  • Calculate
  • ... take decisions!

Directed Acyclic Graphs (DAGs)

NCBI Taxonomy - Gene Ontology (3 DAGs)

Simple Graph concepts

  • Node
  • Relationship



Neo4j organisation concepts


  • 1 server instance -> 1 database
  • Labels allow several unconnected graphs to coexist in one database
    • In our case we used: GO_TERM, TAXID
  • Handling multiple instances: Neobox

Data import with py2neo


  • py2neo: Python client library for Neo4j
  • Built on the Neo4j REST API
  • Offers batch import
  • Recommendation:

Data import with py2neo
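
A minimal sketch of batched import, not the code used in this project. It assumes a graph object exposing run(cypher, **parameters), as py2neo.Graph does in v3 and later (2015-era releases used dedicated batch classes instead), and a server that accepts the $rows parameter syntax. The TAXID label comes from the slides; the id and name properties and the helper names are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, List

# TAXID is the label named in the slides; "id" and "name" are assumed properties.
MERGE_TAXA = (
    "UNWIND $rows AS row "
    "MERGE (n:TAXID {id: row.id}) "
    "SET n.name = row.name"
)


def chunks(rows: Iterable[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield successive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Dict[str, Any]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Last, possibly shorter, batch.
        yield batch


def import_taxa(graph: Any, rows: Iterable[Dict[str, Any]], batch_size: int = 1000) -> int:
    """Send rows to Neo4j one batch per query; return the number of batches sent.

    `graph` only needs a run(cypher, **parameters) method, so a py2neo
    Graph (v3+) or a test double both work.
    """
    sent = 0
    for batch in chunks(rows, batch_size):
        graph.run(MERGE_TAXA, rows=batch)
        sent += 1
    return sent
```

Sending one parameterised query per batch keeps the number of round trips low, which matters when every call goes over HTTP.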


Querying the graphs



Shortest path between 2 nodes and Lowest Common Ancestor
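
To make the two problems concrete, here is a small in-memory sketch in plain Python, independent of Neo4j. The toy hierarchy and all function names are made up for illustration. The lowest common ancestor is taken here as the shared ancestor with the smallest summed distance to both nodes, which is one possible definition for a DAG. In a tree such as a taxonomy, the path through that ancestor is also the shortest path. With multiple parents, a shorter route through a shared descendant may exist, and this sketch ignores it.

```python
from collections import deque
from typing import Dict, List, Optional

# Toy child -> parents map; identifiers are hypothetical.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "A1x": ["A1"],
}


def ancestor_distances(node: str, parents: Dict[str, List[str]]) -> Dict[str, int]:
    """Breadth-first walk upwards: ancestor (node included) -> fewest hops."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in dist:
                dist[parent] = dist[current] + 1
                queue.append(parent)
    return dist


def lowest_common_ancestor(a: str, b: str, parents: Dict[str, List[str]]) -> Optional[str]:
    """Shared ancestor with the smallest summed distance to a and b."""
    up_a = ancestor_distances(a, parents)
    up_b = ancestor_distances(b, parents)
    common = set(up_a) & set(up_b)
    if not common:
        return None
    # Ties are broken by name so the result is deterministic.
    return min(common, key=lambda n: (up_a[n] + up_b[n], n))


def path_length_via_lca(a: str, b: str, parents: Dict[str, List[str]]) -> Optional[int]:
    """Number of hops from a up to the common ancestor and down to b."""
    lca = lowest_common_ancestor(a, b, parents)
    if lca is None:
        return None
    return ancestor_distances(a, parents)[lca] + ancestor_distances(b, parents)[lca]
```

For example, lowest_common_ancestor("A1x", "A2", PARENTS) gives "A", and the path between the two nodes through "A" has 3 hops.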

Solving the problem with Cypher
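
The queries below are a sketch of how both problems might be phrased in Cypher, wrapped in Python helpers so they can be passed to a client. They are not the queries from the original slides. The TAXID label comes from the slides. The id property, the PARENT relationship type (pointing from child to parent), the hop limit and the helper names are all assumptions.

```python
from typing import Any, Dict, Tuple


def _check_label(label: str) -> str:
    """Labels cannot be query parameters, so validate before interpolating."""
    if not label.isidentifier():
        raise ValueError("unexpected label: %r" % label)
    return label


def shortest_path_query(a: Any, b: Any, label: str = "TAXID",
                        max_hops: int = 30) -> Tuple[str, Dict[str, Any]]:
    """Shortest undirected path between two nodes, using Cypher's shortestPath()."""
    label = _check_label(label)
    query = (
        f"MATCH (a:{label} {{id: $a}}), (b:{label} {{id: $b}}), "
        f"p = shortestPath((a)-[*..{int(max_hops)}]-(b)) "
        "RETURN p"
    )
    return query, {"a": a, "b": b}


def lca_query(a: Any, b: Any, label: str = "TAXID",
              rel: str = "PARENT") -> Tuple[str, Dict[str, Any]]:
    """Common ancestor with the smallest summed distance to both nodes."""
    label = _check_label(label)
    rel = _check_label(rel)
    query = (
        f"MATCH pa = (a:{label} {{id: $a}})-[:{rel}*0..]->(c), "
        f"pb = (b:{label} {{id: $b}})-[:{rel}*0..]->(c) "
        "RETURN c ORDER BY length(pa) + length(pb) LIMIT 1"
    )
    return query, {"a": a, "b": b}
```

Each helper returns the query text together with its parameter dictionary. Node identifiers travel as parameters and are never spliced into the query string.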

Solving the problem with Java libraries

Putting everything in an unmanaged extension

  • Server Configuration in conf
  • Plugin creation
    • Maven (Java build and dependency manager)
      • pom.xml (definition and dependencies)
    • To be placed in plugins directory
  • Extra libraries
    • e.g. minimal-json (placed in system/lib)

Extension as a REST service

  • REST API Landing path (conf/neo4j-server.properties)



  • Using Jersey for the REST API: javax.ws.rs
    • Similar to Node.js Express, Python Flask or Perl Mojolicious
    • Documentation
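
As an illustration of the landing path mentioned above, a Neo4j 2.x server maps a Java package containing JAX-RS resource classes to a URL mount point with one line in conf/neo4j-server.properties. The package name and the mount point below are hypothetical placeholders, not the values used in this project.

```
# conf/neo4j-server.properties (Neo4j 2.x)
# <Java package with the JAX-RS resource classes>=<URL mount point>
org.neo4j.server.thirdparty_jaxrs_classes=org.example.bioinfo=/bio
```

With such a setting, a resource class annotated with @Path("/lca") in that package would be served under /bio/lca on the server.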


Extension as a REST service