Python in Life Sciences

@juliocesar_io

https://juliocesar.io

Federating Data for Science

Agenda

  • About federation
  • Architecture
  • Tech Stack
    • Ariadne GraphQL
    • Apollo Gateway
    • Nvidia Clara
  • Hands-on demo

About Federation

Federation has long been a challenge in bioinformatics.

Federation is the ability of multiple independent resources to act like a single resource. Cloud computing itself is a federation of resources, so the many assets, identities, configurations, and other details of a cloud computing solution must be federated to make cloud computing practical.

Architecture

High Level Components

The Stack

  • Gateway API
    • Implementing Services Ariadne/GraphQL
    • Federated Server Apollo
    • DynamoDB Managed Schemas
  • Aggregation Service
    • HDFS
    • Async tasks
    • Merge results
  • Computing Service
    • Nvidia Clara SDK (Models, TensorRT Server, DICOM/PACS Interface)
    • Kubernetes / Docker / Helm
    • Management API
    • Nvidia Render Server

Implementing Services

Ariadne GraphQL

  • Schema First SDL
  • Federated Schema
  • Apollo Support
  • The Python you know and love :)

Services Definition

Ariadne GraphQL

The Project structure

Ariadne GraphQL

type Query {
    models(id: ID, name: String, platform: String): [Model]
}

type Model @key(fields: "id"){
    id: ID!
    name: String!
    platform: String!
    data_type: String!
    kind: String!
    url: String!
    version: String!
}
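A minimal sketch of how a resolver for the models query might filter on its optional arguments. The in-memory MODELS list and the resolve_models name are assumptions for illustration; in the service, the function would be bound with Ariadne's @query.field("models") and would read from a real data source.

```python
# Hypothetical in-memory records; field names follow the Model type above.
MODELS = [
    {"id": "100", "name": "segmentation_mri_brain_tumors_br16_t1c2tc_v1",
     "platform": "tensorflow_graphdef"},
    {"id": "101", "name": "example_model", "platform": "example_platform"},
]


def resolve_models(_, info, id=None, name=None, platform=None):
    """Return the models matching every argument that was supplied."""
    filters = {"id": id, "name": name, "platform": platform}
    # Arguments left out of the query arrive as None and are ignored.
    active = {key: value for key, value in filters.items() if value is not None}
    return [
        model for model in MODELS
        if all(model[key] == value for key, value in active.items())
    ]
```

Calling resolve_models with no arguments returns every model, which matches the schema's fully optional argument list.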

Schema Definition

Ariadne GraphQL

type Query {
    medicalImage(id: ID): [MedicalImage]
}

type Model @key(fields: "id") @extends {
    id: ID! @external
    medicalImages: [MedicalImage]
}

type MedicalImage @key(fields: "id") {
    id: ID!
    type: String
    format: String
    url: String!
    model_id: Int!
}

AI Model Service

Medical Imaging Service

With the @extends and @external directives, the Medical Imaging Service references the Model type owned by the AI Model Service and adds the medicalImages field to it.
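To make the relationship concrete: when a client asks for a model's medicalImages, the gateway sends the Medical Imaging Service a representation of the model (its type name plus the @key field), and the service resolves the extended field from it. The sketch below imitates that step in plain Python; the IMAGES records and the function bodies are assumptions, not the service's code.

```python
# Hypothetical image records; field names follow the MedicalImage type.
IMAGES = [
    {"id": "666", "type": "MRI_T1c", "format": "DICOM", "model_id": 100},
    {"id": "667", "type": "MRI_T1c", "format": "DICOM", "model_id": 100},
    {"id": "700", "type": "CT", "format": "DICOM", "model_id": 200},
]


def resolve_model_reference(representation):
    """Build a stub Model from the key fields the gateway sends."""
    return {"id": representation["id"]}


def resolve_model_images(model):
    """Resolve Model.medicalImages: every image whose model_id matches."""
    # model_id is declared Int! while Model.id is an ID (a string on the wire),
    # so both sides are compared as strings.
    return [img for img in IMAGES if str(img["model_id"]) == str(model["id"])]


# The shape of one entry in the gateway's `_entities` representations list.
representation = {"__typename": "Model", "id": "100"}
images = resolve_model_images(resolve_model_reference(representation))
print([img["id"] for img in images])  # prints ['666', '667']
```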

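from ariadne import QueryType, load_schema_from_path
from ariadne.contrib.federation import FederatedObjectType, make_federated_schema

# make_federated_schema expects the SDL text, so load the file first.
type_defs = load_schema_from_path("schema_file.graphql")

model = FederatedObjectType("Model")
query = QueryType()


@model.reference_resolver
def resolve_model_reference(_, _info, representation):
    # get_model_by_id is the service's own data-access helper (not shown here).
    return get_model_by_id(representation.get("id"))


schema = make_federated_schema(type_defs, [query, model])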

Resolvers Reference

Ariadne GraphQL

AI Model Service

Ariadne GraphQL

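from ariadne import QueryType, load_schema_from_path
from ariadne.contrib.federation import FederatedObjectType, make_federated_schema

# make_federated_schema expects the SDL text, so load the file first.
type_defs = load_schema_from_path("schema_file.graphql")

query = QueryType()
model = FederatedObjectType("Model")
medicalImage = FederatedObjectType("MedicalImage")


@model.reference_resolver
def resolve_model(_, _info, representation):
    # Build a Model stub from the key the gateway sends.
    # Model is the service's own class (not shown here).
    model_id = representation.get("id")
    return Model(id=model_id)


@model.field("medicalImages")
def resolve_model_images(obj, info):
    # ds is assumed to be the service's data source object (not shown here).
    return ds.get_medical_image(model_id=obj.id)


schema = make_federated_schema(type_defs, [query, model, medicalImage])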

Medical Imaging Service

Resolvers Reference

Making Queries on each service

Ariadne GraphQL

Federated Gateway on Apollo

Apollo Server 

A single data graph provides a unified interface for querying all of your backing data sources. Clients can fetch data from any number of sources at once, without needing to know which data comes from which source.

Project Structure

Apollo Server 

Registering Services

Apollo Server 

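const { ApolloServer } = require('apollo-server');
const { ApolloGateway } = require('@apollo/gateway');

// Each entry points at one implementing service.
const gateway = new ApolloGateway({
  serviceList: [
    { name: 'ai_model_service', url: 'http://localhost:4001' },
    { name: 'medical_imaging_service', url: 'http://localhost:4002' },
    { name: 'metrics_service', url: 'http://localhost:4003' }
  ]
});

// The gateway does not support subscriptions, so they are disabled.
const server = new ApolloServer({ gateway, subscriptions: false });

server.listen().then(({ url }) => {
  console.log(`Gateway ready at ${url}`);
});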

Schemas and resolvers live in the implementing services. The gateway serves only to plan and execute GraphQL operations across those implementing services.
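The plan-and-execute role can be pictured as a two-step fetch followed by a merge. The sketch below is a plain-Python imitation under stated assumptions: the two fetch_* functions stand in for HTTP calls to the implementing services, their data is made up, and nothing here is Apollo's actual query planner.

```python
def fetch_models(name):
    """Stand-in for a call to the AI Model Service (hypothetical data)."""
    catalog = [{"id": "100", "name": "segmentation_mri_brain_tumors_br16_t1c2tc_v1"}]
    return [dict(m) for m in catalog if m["name"] == name]


def fetch_images(representations):
    """Stand-in for an `_entities` call to the Medical Imaging Service."""
    images_by_model = {"100": [{"id": "666"}, {"id": "667"}]}
    return [
        {"medicalImages": images_by_model.get(rep["id"], [])}
        for rep in representations
    ]


def execute_plan(name):
    """Step 1: fetch models. Step 2: fetch their images. Step 3: merge."""
    models = fetch_models(name)
    representations = [{"__typename": "Model", "id": m["id"]} for m in models]
    extensions = fetch_images(representations)
    merged = [{**m, **ext} for m, ext in zip(models, extensions)]
    return {"data": {"models": merged}}


result = execute_plan("segmentation_mri_brain_tumors_br16_t1c2tc_v1")
```

The client only ever sees the merged result; which service supplied which field stays hidden behind the gateway.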

A managed way to register services with DynamoDB

Apollo Server 

Making Federated Queries

Apollo Server 
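A federated query is sent to the gateway like any other GraphQL request: a JSON body with a query field, posted over HTTP. A minimal sketch using only the standard library; the gateway URL is an assumption (adjust it to wherever your gateway listens), and the selected fields are taken from the two service schemas.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:4000/"  # assumption: adjust to your gateway

FEDERATED_QUERY = """
query {
  models(name: "segmentation_mri_brain_tumors_br16_t1c2tc_v1") {
    id
    name
    platform
    medicalImages {
      id
      type
      format
      url
    }
  }
}
"""


def build_request(query, url=GATEWAY_URL):
    """Wrap a GraphQL query in a JSON POST request (does not send it)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, url=GATEWAY_URL):
    """Send the query to the gateway and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query, url)) as response:
        return json.load(response)
```

With the gateway and services running, run_query(FEDERATED_QUERY) would return the decoded response as a Python dict.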

The Response

Apollo Server 

{
    "data": {
        "models": [
            {
                "id": "100",
                "name": "segmentation_mri_brain_tumors_br16_t1c2tc_v1",
                "platform": "tensorflow_graphdef",
                "data_type": "TYPE_FP32",
                "kind": "KIND_GPU",
                "url": "https://api.ngc.nvidia.com/v2/resources/nvidia/clara/clara_ai_brain_tumor_pipeline/versions/0.6.0-2006.4/zip",
                "medicalImages": [
                    {
                        "id": "666",
                        "type": "MRI_T1c",
                        "format": "DICOM",
                        "url": "https://medical-imaging-service.s3.amazonaws.com/mock-sources/source-01/IMG0000.dcm"
                    },
                    {
                        "id": "667",
                        "type": "MRI_T1c",
                        "format": "DICOM",
                        "url": "https://medical-imaging-service.s3.amazonaws.com/mock-sources/source-01/IMG0001.dcm"
                    },
                    {
                        "id": "668",
                        "type": "MRI_T1c",
                        "format": "DICOM",
                        "url": "https://medical-imaging-service.s3.amazonaws.com/mock-sources/source-01/IMG0002.dcm"
                    }
                    ...
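Once the response is decoded, collecting the image URLs for each model is a matter of walking the nested lists. A small sketch: image_urls is a hypothetical helper name, and the sample is a trimmed copy of the response shape with placeholder URLs rather than real data.

```python
def image_urls(response, fmt="DICOM"):
    """Map each model name to the URLs of its images in the given format."""
    urls = {}
    for model in response.get("data", {}).get("models") or []:
        images = model.get("medicalImages") or []
        urls[model["name"]] = [
            img["url"] for img in images if img.get("format") == fmt
        ]
    return urls


# Trimmed, illustrative sample with placeholder URLs.
sample = {
    "data": {
        "models": [
            {
                "name": "segmentation_mri_brain_tumors_br16_t1c2tc_v1",
                "medicalImages": [
                    {"id": "666", "format": "DICOM",
                     "url": "https://example.org/IMG0000.dcm"},
                    {"id": "667", "format": "DICOM",
                     "url": "https://example.org/IMG0001.dcm"},
                ],
            }
        ]
    }
}

print(image_urls(sample))
```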

Computing Service

Nvidia Clara 

Nvidia Clara manages and scales imaging, genomics, and video processing workloads. It uses Kubernetes under the hood to define multi-stage, container-based pipelines.

Architecture

Nvidia Clara 

Service Interface Definition

Nvidia Clara 

Project Deployment in AWS

Nvidia Clara 

Using the services to run a model with Federated data

Nvidia Clara 

{
    "pipeline_tag": "brain-tumor-pipeline",
    "input_tag": "dcm",
    "use_cache": true,
    "query": "query {\r\n  models(name:\"segmentation_mri_brain_tumors_br16_t1c2tc_v1\"){\r\n    id\r\n    name\r\n    platform\r\n    data_type\r\n    kind\r\n    url\r\n    medicalImages{\r\n      id\r\n      type\r\n      format\r\n      url\r\n    }\r\n  }\r\n}\r\n"
}

{
    "status": "RUNNING",
    "jobs": "http://ec2-18-220-78-85.us-east-2.compute.amazonaws.com:32002/jobs/",
    "render": "http://ec2-18-220-78-85.us-east-2.compute.amazonaws.com:8080/renderserver"
}

Response
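The request body shown above can be assembled programmatically so the GraphQL query does not have to be escaped by hand. A sketch, assuming the four fields in the example request are all the management API needs; build_job_request is a hypothetical helper, and since the endpoint is not shown in the slides, nothing is sent.

```python
import json


def build_job_request(query, pipeline_tag="brain-tumor-pipeline",
                      input_tag="dcm", use_cache=True):
    """Serialize a job request with the same fields as the example request."""
    payload = {
        "pipeline_tag": pipeline_tag,
        "input_tag": input_tag,
        "use_cache": use_cache,
        "query": query,
    }
    # json.dumps escapes the quotes and newlines inside the query string.
    return json.dumps(payload, indent=4)


query = """query {
  models(name: "segmentation_mri_brain_tumors_br16_t1c2tc_v1") {
    id
    url
    medicalImages { id type format url }
  }
}"""

body = build_job_request(query)
print(body)
```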

The Clara Process 

Nvidia Clara 

The Render Server

Nvidia Clara 

Hands-on Demo

  • Showcase several services of the architecture
  • Use the gateway to get federated data
  • Run a model on an Nvidia Tesla Tensor Core GPU and visualize the results

What's next?

  • Federated Learning
  • More models
  • Add Business Logic
  • Distributed Computing
  • Build a scientific collaboration platform

on GitHub

@juliocesar_io

https://juliocesar.io

Python in life sciences, federating data for science

By Julio César

A federated data architecture for training AI models using Ariadne, Apollo, and the Nvidia Clara computing platform on Kubernetes, for life sciences applications.