Hyperparameter optimization using
Bayesian optimization on transfer learning for medical image classification
Rune Johan Borgli
Introduction
Motivation
Methodology
Experiments
Conclusion
Introduction
Hyperparameter optimization
Bayesian optimization
Transfer learning
Medical image classification
Deep convolutional neural networks (CNNs)
Transfer Learning
Solves the problem of poor generalization on small datasets
Transfer learning has many hyperparameters.
We want to explore automatic optimization of these hyperparameters.
Our use case is datasets of gastroenterological observations.
Can hyperparameters for transfer learning be optimized automatically? If so, how does it influence the performance and what should be optimized?
How should a system be built that can automatically perform hyperparameter optimization for transfer learning?
Motivation
The gastrointestinal (GI) tract is prone to many different diseases
Doctors use sensor and image data to diagnose
The procedure's success is heavily dependent on the doctor
Image of a polyp from a colonoscopy
Datasets for medical images are small
This makes it hard for models to generalize; they often overfit on the dataset
Bayesian optimization
Sequential
Used on black-box functions
Uses a standard Gaussian process to create the surrogate model
Uses expected improvement as the acquisition function
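For illustration, here is a minimal sketch of one such optimization loop using the scikit-optimize library (an assumed implementation choice, not necessarily the one used in the thesis). gp_minimize fits a Gaussian process surrogate to past evaluations and picks the next point by expected improvement:

```python
from skopt import gp_minimize

# Stand-in black-box objective. In our setting this would be a full
# training run returning the negated validation accuracy.
def objective(params):
    learning_rate = params[0]
    return (learning_rate - 0.01) ** 2  # toy optimum at 0.01

result = gp_minimize(
    objective,
    dimensions=[(1e-4, 1.0, "log-uniform")],  # search space: learning rate
    acq_func="EI",   # expected improvement acquisition function
    n_calls=25,      # number of sequential evaluations
    random_state=0,
)
print(result.x, result.fun)  # best parameters found and their objective value
```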
In the medical field of gastroenterology, equipment for visualizing the gastrointestinal tract is used for diagnosis.
A doctor's detection rate during examinations is heavily dependent on their experience and state of mind.
The generated data can be used with deep convolutional neural network models for assisting doctors by automatically detecting abnormalities and diseases.
Annotated data for training is difficult to obtain, leading to small datasets.
CNNs are hard to train on small datasets from scratch as they tend to overfit.
Transfer learning is a solution. The technique revolves around transferring knowledge from a pre-trained model to a new task on a different domain.
When transfer learning is used, hyperparameter optimization is often ignored. We want to perform automatic hyperparameter optimization.
Bayesian optimization is the method of optimization we will use.
Methodology
Dataset classes: dyed lifted polyps, dyed resection margins, polyps, ulcerative colitis, normal cecum, esophagitis, normal z-line, normal pylorus
Bowel-cleanliness classes: BBPS 0, BBPS 1, BBPS 2, BBPS 3 (Boston Bowel Preparation Scale)
Learning rate: continuous from 1 to 10^-4
Delimiting layer: any layer from the first to the last layer of the model
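Written as a search space, again assuming scikit-optimize (the layer bound is illustrative and depends on the chosen pre-trained model):

```python
from skopt.space import Integer, Real

NUM_LAYERS = 175  # illustrative: depends on the chosen pre-trained model

search_space = [
    Real(1e-4, 1.0, prior="log-uniform", name="learning_rate"),
    Integer(0, NUM_LAYERS - 1, name="delimiting_layer"),
]
```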
1. We remove the classification block of the pre-trained model and replace it with a classification block set up for the number of classes in our dataset.
The new block is then trained.
2. We select a default delimiting layer at 2/3 of the model's depth.
All layers after the selected layer are then trained.
3. We repeat step 2, but this time optimize only the delimiting layer.
All layers after the selected layer are then trained (steps 1 and 2 are sketched below).
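A minimal tf.keras sketch of steps 1 and 2 (the ResNet50 base, class count, and optimizer settings are illustrative assumptions, not the thesis defaults):

```python
import tensorflow as tf

NUM_CLASSES = 8  # illustrative class count for the dataset

# Step 1: drop the pre-trained classification block and add our own.
base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False)
pooled = tf.keras.layers.GlobalAveragePooling2D()(base.output)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(pooled)
model = tf.keras.Model(base.input, outputs)

def set_delimiting_layer(model, delimiting_layer):
    # Freeze every layer before the delimiting layer; train the rest.
    for i, layer in enumerate(model.layers):
        layer.trainable = i >= delimiting_layer

# Step 2: default delimiting layer at 2/3 of the model's depth.
set_delimiting_layer(model, 2 * len(model.layers) // 3)
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(...) then trains only the unfrozen layers.
```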
Built on Keras with TensorFlow backend
Runs experiments on given datasets automatically
Results are written after each optimization run
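As a hypothetical sketch of such a driver (the real system's interfaces may differ; run_optimization is an assumed callable returning the best hyperparameters and score for a dataset):

```python
import csv
from pathlib import Path

def run_experiments(datasets, run_optimization, out_path="results.csv"):
    # Run the optimizer on each dataset and write a result row after
    # each run, so partial results survive an interrupted experiment.
    with Path(out_path).open("a", newline="") as f:
        writer = csv.writer(f)
        for name, dataset in datasets.items():
            best_params, best_score = run_optimization(dataset)
            writer.writerow([name, best_params, best_score])
            f.flush()
```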
Experiments
We split the dataset into 70% training data and 30% validation data
Part 1: We remove the classification block of a CNN model pre-trained on ImageNet, available from Keras, and replace it with a pooling layer and a softmax classification layer for the number of classes in our dataset
We train the new classification block on the dataset
Hyperparameters are chosen by the optimizer
Part 2: We tune all layers after the delimiting layer on the dataset
The delimiting layer defaults to 2/3 of all layers and is not chosen by the optimizer
Hyperparameters are chosen by the optimizer
Part 3: We train again with the best hyperparameters from part 2
This time the optimizer chooses the best delimiting layer (sketched below)
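A hedged sketch of part 3, again assuming scikit-optimize; train_and_evaluate stands in for a full fine-tuning run with the best hyperparameters from part 2 (its body is a fake score purely so the sketch runs):

```python
from skopt import gp_minimize
from skopt.space import Integer

NUM_LAYERS = 175  # illustrative layer count of the pre-trained model

def train_and_evaluate(delimiting_layer):
    # Stand-in: fine-tune all layers after `delimiting_layer` with the
    # best hyperparameters from part 2 and return validation accuracy.
    # The formula below is fake; it only keeps the sketch runnable.
    return 1.0 - abs(delimiting_layer - 120) / NUM_LAYERS

def objective(params):
    (delimiting_layer,) = params
    return -train_and_evaluate(delimiting_layer)  # minimize negated accuracy

result = gp_minimize(objective,
                     [Integer(0, NUM_LAYERS - 1, name="delimiting_layer")],
                     acq_func="EI", n_calls=30, random_state=0)
best_layer = result.x[0]
```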
Shared hyperparameters, separate hyperparameters, and separate optimizations: the three strategies optimize different combinations of the model, the delimiting layer, and the tuning block.
Figure: results after model optimization and layer optimization, with nonconvergence filtering applied, compared against competitive results.
Table: in our results, scores for the easily confused esophagitis and normal z-line classes range from 0.87 to 0.98.
Figure: the same three strategies (shared hyperparameters, separate hyperparameters, separate optimizations) with runs failing to converge filtered out, compared against competitive results.
Table: our results reach scores of 1.00.
Conclusion
Can hyperparameters for transfer learning be optimized automatically? If so, how does it influence the performance and what should be optimized?
Yes! Performance increases by about 10 percent in both cases. The hyperparameters we chose are important, but the flaw we uncovered should be avoided in future work.
How should a system be built that can automatically perform hyperparameter optimization for transfer learning?
Using Bayesian optimization, we proposed a system for automatic hyperparameter optimization on given datasets using three different optimization strategies.
Automatic hyperparameter optimization is an effective strategy for increasing performance in transfer learning use cases
Adjusting the delimiting layer reveals layers that are nontrivial to select manually
We show the usefulness of automatic hyperparameter optimization and present a system capable of running optimizations on existing transfer learning solutions
A 3-minute lightning talk was held and a poster was presented at Autonomy Day 2018 about the work from this thesis
The poster won the best poster award at the event
Try other optimization methods
Optimize the Bayesian optimization
Remove the dependency between the pre-trained model and the delimiting layer
Try other hyperparameters