HPC Day 2022

Processing of MASSIVE Biological Datasets at UMass Boston

MACHINE

PSYCHOLOGY

DANIELHAEHN.COM

Biomedical Imaging

Chart Question Answering

We

Computers

GONZO

MONSTER

VAMPIRE

ZOMBIE

GARGOYLE

2x NVIDIA DGX A100

The Oregon-Massachusetts Mammography Database

OMAMA-DB

Daniel Haehn

Nurit Haspel

Marc Pomplun

Dan Simovici

Bill Lotter

Greg Sorensen

Ryan Zurrin

Pablo Bendiksen

Neha Goyal

Kendrick Khaev

Muskaan Manocha

Jeff Dusenberry

World's Largest Mammography Dataset

publicly available!

with labels!

we started with ~1 million images!

2000 x 2000 pixels

100 x 2000 x 2000 voxels

~8 MB

~500 MB

~8 TB
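
The per-image and dataset sizes above can be checked with simple arithmetic. The sketch below assumes 2 bytes per pixel (16-bit, uncompressed), which the slides do not state; the function name is illustrative only.

```python
# Back-of-the-envelope storage estimate for the figures above.
# Assumption (not stated in the slides): 2 bytes per pixel, uncompressed.
BYTES_PER_PIXEL = 2


def size_in_bytes(*dims, bytes_per_element=BYTES_PER_PIXEL):
    """Return the uncompressed size of an array with the given dimensions."""
    total = bytes_per_element
    for d in dims:
        total *= d
    return total


image_2d = size_in_bytes(2000, 2000)        # one 2D mammogram
volume_3d = size_in_bytes(100, 2000, 2000)  # one 3D volume
dataset = 1_000_000 * image_2d              # ~1 million 2D images

print(f"2D image:  {image_2d / 1e6:.0f} MB")
print(f"3D volume: {volume_3d / 1e6:.0f} MB")
print(f"Dataset:   {dataset / 1e12:.0f} TB")
```

Under this assumption the 2D image and the dataset total match the ~8 MB and ~8 TB quoted. The 3D volume comes out at 800 MB uncompressed, above the ~500 MB quoted; the slides do not say whether the gap comes from compression, fewer slices, or something else.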

Intelligent Annotation Framework

Two Artificial Neural Networks...

...work together for Quality Control

The Discriminator finds error patterns of the Classifier

Augmented Intelligence

Human + AI

import omama as O

pred = O.DeepSight.run(imgs)

Drained nodes...

...but it worked in the end!

Browser

Editor

Jupyter Notebook

SSHFS mounted Home Folder

TMUX on head node

SSH Tunnels (hop via head node to compute node)
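
The hop via the head node can be sketched as a single OpenSSH command. The example below only builds and prints that command. The user name, host names, and port 8888 are hypothetical placeholders, and it assumes an OpenSSH client recent enough to support `-J` (ProxyJump) and a Jupyter server already listening on that port on the compute node.

```python
import shlex


def tunnel_command(user, head_node, compute_node, port=8888):
    """Build an ssh command that forwards a local port to a compute node,
    jumping through the head node. All names here are placeholders."""
    return [
        "ssh",
        "-N",                              # tunnel only, no remote command
        "-J", f"{user}@{head_node}",       # hop via the head node
        "-L", f"{port}:localhost:{port}",  # local port -> same port on the compute node
        f"{user}@{compute_node}",
    ]


cmd = tunnel_command("alice", "head.example.edu", "compute01")
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run` would open the tunnel; the notebook would then be reachable in the local browser at `http://localhost:8888`.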

Fly Retina Project

Jens Rister

Barry Stein

Daniel Haehn

Kunal Jain

Looking at the color-sensing photoreceptors

using an Electron Microscope

Kiran Balivada

Jeff Dusenberry

6144 x 4096 x 862 voxels

10 x 10 x 40 nm^3
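
The volume dimensions and voxel size above translate into a physical extent. The sketch below assumes the three voxel sizes map to the three dimensions in the order given (10 nm, 10 nm, 40 nm); the slides do not label the axes.

```python
# Physical extent of the EM volume from voxel counts and voxel size.
# Assumption: axis order is the same in both tuples (not stated in the slides).
shape_voxels = (6144, 4096, 862)
voxel_size_nm = (10, 10, 40)

# Convert nanometers to micrometers (1 um = 1000 nm).
extent_um = tuple(n * s / 1000 for n, s in zip(shape_voxels, voxel_size_nm))
n_voxels = shape_voxels[0] * shape_voxels[1] * shape_voxels[2]

print("Extent (um):", extent_um)
print(f"Voxels: {n_voxels:,}")
```

Under that assumption the volume spans roughly 61 x 41 x 34 micrometers and holds about 21.7 billion voxels.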

How to transfer large datasets across institutions?

260+ GB

curl 'https://eastus1-mediap.svc.ms/transform/zip?cs=fFNQTw' \
  -H 'authority: eastus1-mediap.svc.ms' \
  -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'accept-language: en-US,en;q=0.9,de;q=0.8' \
  -H 'cache-control: max-age=0' \
  -H 'content-type: application/x-www-form-urlencoded' \
  -H 'origin: https://liveumb-my.sharepoint.com' \
  -H 'sec-ch-ua: "Chromium";v="104", " Not A;Brand";v="99", "Google Chrome";v="104"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Linux"' \
  -H 'sec-fetch-dest: iframe' \
  -H 'sec-fetch-mode: navigate' \
  -H 'sec-fetch-site: cross-site' \
  -H 'upgrade-insecure-requests: 1' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36' \
  --data-raw 'zipFileName=VSDataTileSet2.zip&guid=00c22792-c697-4def-b45e-ec0560018fe4&provider=spo&files=%7B%22items%22%3A%5B%7B%22name%22%3A%22VSDataTileSet2%22%2C%22size%22%3A0%2C%22docId%22%3A%22https%3A%2F%2Fliveumb-my.sharepoint.com%3A443%2F_api%2Fv2.0%2Fdrives%2Fb%21ZHO7Y3eH0EOqfeg_5VdUmSUuEmzqastElTPzN9v8GRKXPTMoP_jwQ44q0Km6lnis%2Fitems%2F01OSC7C5I2RBH7FQHPWFCY7B2FKLTC2YFF%3Fversion%3DPublished%26access_token%3DeyJ0eXAiOiJKV1QiLCJhbGciOiJub25lIn0.eyJhdWQiOiIwMDAwMDAwMy0wMDAwLTBmZjEtY2UwMC0wMDAwMDAwMDAwMDAvbGl2ZXVtYi1teS5zaGFyZXBvaW50LmNvbUBiOTcxODg3MS0xZWU5LTQ0MjUtOTUzYy0xYWNlMTM3M2ViMzgiLCJpc3MiOiIwMDAwMDAwMy0wMDAwLTBmZjEtY2UwMC0wMDAwMDAwMDAwMDAiLCJuYmYiOiIxNjYyNjgxNjAwIiwiZXhwIjoiMTY2MjcwMzIwMCIsImVuZHBvaW50dXJsIjoiV1lwT2cvdC9xbFllYXByYlVEa1JVNURBSFVQZGszYXRxZ0l5SDhveE5EQT0iLCJlbmRwb2ludHVybExlbmd0aCI6IjExNyIsImlzbG9vcGJhY2siOiJUcnVlIiwidmVyIjoiaGFzaGVkcHJvb2Z0b2tlbiIsInNpdGVpZCI6Ik5qTmlZamN6TmpRdE9EYzNOeTAwTTJRd0xXRmhOMlF0WlRnelptVTFOVGMxTkRrNSIsInNpZ25pbl9zdGF0ZSI6IltcImttc2lcIl0iLCJuYW1laWQiOiIwIy5mfG1lbWJlcnNoaXB8ZGFuaWVsLmhhZWhuQHVtYi5lZHUiLCJuaWkiOiJtaWNyb3NvZnQuc2hhcmVwb2ludCIsImlzdXNlciI6InRydWUiLCJjYWNoZWtleSI6IjBoLmZ8bWVtYmVyc2hpcHwxMDAzMjAwMDU1OGJkNmQ5QGxpdmUuY29tIiwic2lkIjoiNzA5NzEwZGYtOTViOC00YWVkLThkZWEtYTUwNDJhZjMwMGRkIiwidHQiOiIwIiwidXNlUGVyc2lzdGVudENvb2tpZSI6IjMiLCJpcGFkZHIiOiI3My4xNDkuMjMuMjUifQ.bTFzRnRSV3RaNE5MU2l2bUI3d3dOdW9tUm5yNmRJUTh3WUQvWUsybjhPaz0%22%2C%22isFolder%22%3Atrue%7D%5D%7D&oAuthToken=' \
  --compressed

20 GB Transfer limit
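
A transfer limit like this means a 260+ GB dataset has to be moved in many pieces. The sketch below shows one way to group files into batches that each stay under a limit. The file names and sizes are invented for illustration, and the slides do not describe this approach being used.

```python
def batch_by_size(files, limit_gb=20):
    """Group (name, size_gb) pairs, in order, into batches whose total
    size does not exceed limit_gb."""
    batches, current, total = [], [], 0
    for name, size in files:
        if size > limit_gb:
            raise ValueError(f"{name} alone exceeds the {limit_gb} GB limit")
        # Start a new batch when adding this file would exceed the limit.
        if current and total + size > limit_gb:
            batches.append(current)
            current, total = [], 0
        current.append(name)
        total += size
    if current:
        batches.append(current)
    return batches


# Hypothetical example: 26 tiles of 10 GB each (260 GB in total).
tiles = [(f"tile_{i:02d}", 10) for i in range(26)]
batches = batch_by_size(tiles)
print(len(batches), "batches; first batch:", batches[0])
```

Even in this best case, 260 GB under a 20 GB limit needs at least 13 separate transfers.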

Jeff Dusenberry

Director of Research Computing

at UMass Boston

SFTP

no shell access

Thank you!