The Oregon-Massachusetts Mammography Database

Bits to Bytes

Daniel Haehn

Nurit Haspel

Alexey Tonyushkin

Marc Pomplun

Dan Simovici

Bill Lotter

Greg Sorensen

5/5/2020

Haehn et al.: The Oregon-Massachusetts Mammography Database (OMAMA-DB)

World's Largest Publicly-Available Annotated Mammography Dataset

70,000 imaging studies with ground truth labels

data acquired over 9 years

GE (63%), Hologic (37%)

50,000 mammograms

20,000 tomosynthesis

no tomosynthesis

> 95% Hologic

proprietary

manual annotation of 70,000 studies is not feasible

Intelligent Annotation Framework

Two Artificial Neural Networks work together for quality control

Discriminator finds error pattern of Classifier
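
The quality-control idea on this slide (a second model learns where the first model tends to be wrong, so that only doubtful cases go to a human annotator) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the framework's actual implementation: both models here are simple stand-ins rather than neural networks, and the names `classifier`, `fit_discriminator`, `flag_for_review`, the score bins, and the `max_error` cutoff are all hypothetical.

```python
from collections import defaultdict

N_BINS = 5  # hypothetical number of score bins


def classifier(score, threshold=0.5):
    """Stand-in classifier: label 1 (suspicious) if score >= threshold."""
    return 1 if score >= threshold else 0


def score_bin(score):
    """Map a score in [0, 1] to a bin index in 0..N_BINS-1."""
    return min(int(score * N_BINS), N_BINS - 1)


def fit_discriminator(validated):
    """Stand-in discriminator: learn the classifier's error rate per score bin.

    validated: list of (score, expert_label) pairs from expert-validated cases.
    Returns a dict mapping bin index -> observed error rate of the classifier.
    """
    errors = defaultdict(int)
    counts = defaultdict(int)
    for score, truth in validated:
        b = score_bin(score)
        counts[b] += 1
        if classifier(score) != truth:
            errors[b] += 1
    return {b: errors[b] / counts[b] for b in counts}


def flag_for_review(scores, error_rates, max_error=0.25):
    """Return indices of cases that fall into bins where the classifier errs often."""
    flagged = []
    for i, score in enumerate(scores):
        # Bins never seen during validation are treated as unreliable.
        if error_rates.get(score_bin(score), 1.0) > max_error:
            flagged.append(i)
    return flagged


# Toy data: (classifier score, expert label) for a small validated subset.
validated = [(0.05, 0), (0.15, 0), (0.45, 1), (0.48, 0),
             (0.55, 0), (0.52, 1), (0.85, 1), (0.95, 1)]
error_rates = fit_discriminator(validated)

# New, unvalidated cases: only those near the decision boundary are flagged.
new_scores = [0.10, 0.47, 0.58, 0.90]
flagged = flag_for_review(new_scores, error_rates)
print(flagged)
```

In this toy setup the classifier's mistakes cluster around the decision threshold, so only those cases are routed to manual review while the rest keep their automatic labels.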

Intelligent Annotation Framework

can tune classifiers beyond breast cancer

open source from the start

algorithmic contribution

Breast Cancer Database

will spur automatic detection advances

challenge datasets

raw data available right now

70,000 annotations in the first 2 years

Newest GPUs in UMass Boston's GPU Cluster are 6 years old

Excellent training opportunities for the most diverse student population in New England

3 PhD Students

1 Data Scientist

Undergraduate Students

DeepHealth successfully works with a team of 4 students

Dr. Marco Nolden

Dr. Ron Kikinis

Dr. Regina Barzilay

Dr. Jill Macoska

Dr. Gordon Harris

Dr. Mansi Saksena

Genomics Data

Breast MRI Data

Lymphoma Data

Knowledge Transfer (Year 3)

Algorithmic Research at UMass Boston

AI Implementation at UMass Memorial

More Data

Intelligent Annotations

Questions?

More Annotations...

Dr. Steve Pieper

paired with expert validations

Haehn et al., CVPR 2018

post-training optimization of GP accuracy: from 0.5982 to 0.9087

Preliminary Results

Not-for-profit sharing after project completion!

Item                             Cost
NVIDIA DGX-2 AI Supercomputer    $303,851.00
5x Annotation Workstations        $32,527.95
5x Wacom Tablets                   $2,049.80
18 Terabyte Dedicated Storage     $28,244.74
DeepSight Software               $374,000.00
Total Requested                  $740,673.00
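
As a quick arithmetic check on the budget above, the line items can be summed with exact decimal arithmetic. The dictionary keys are just labels copied from the slide, and the variable names are illustrative. The items add up to $740,673.49, so the requested total appears to be rounded down to whole dollars.

```python
from decimal import Decimal

# Line items as listed on the budget slide (amounts in US dollars).
line_items = {
    "NVIDIA DGX-2 AI Supercomputer": Decimal("303851.00"),
    "5x Annotation Workstations": Decimal("32527.95"),
    "5x Wacom Tablets": Decimal("2049.80"),
    "18 Terabyte Dedicated Storage": Decimal("28244.74"),
    "DeepSight Software": Decimal("374000.00"),
}

# Exact sum of the line items and the total stated on the slide.
total = sum(line_items.values(), Decimal("0"))
requested = Decimal("740673.00")
difference = total - requested

print(f"Sum of line items: ${total:,}")
print(f"Total requested:   ${requested:,}")
print(f"Difference:        ${difference:,}")
```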
