Detecting masked faces in the pandemic world
Packaging your Machine Learning message
Vladimir Iglovikov
Sr. Software Engineer at Lyft, Level5
Ph.D. in Physics
Kaggle Grandmaster
Career ladder
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635011/Screenshot_from_2020-08-13_09-36-51.png)
Fresh grad without PhD
Fresh grad with PhD
(could have 100500 papers at NeurIPS, CVPR, ECCV, etc.)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635048/glass_ceiling.jpeg)
Glass ceiling
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635011/Screenshot_from_2020-08-13_09-36-51.png)
- You work harder than everyone else.
- You train state-of-the-art models.
- You publish papers at top conferences.
Most likely (90% chance) you will never get to L6!
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635048/glass_ceiling.jpeg)
Data Scientist skill tree
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635085/rpg_tree.jpg)
Ownership
Technical
Communication
Your level
Not your level
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635234/glasses-removebg-preview.png)
Models are not everything!
You need to train
ownership
and
communication!
What SMALL improvements can you add to your modeling process to train ownership and communication?
Example:
Face Mask Detector
Target audience:
- Kagglers
- Academics
- Junior Data Scientists
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635614/trump.png)
Three ways
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635602/trump.jpeg)
Detector with 2 classes:
mask / no mask
Detector with two heads:
face and face attributes
Detector + classifier
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635614/trump.png)
Slowest but the most accurate.
Our choice.
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635636/happy.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635602/trump.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636120/trump1.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636121/trump2.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636122/trump3.jpg)
Face Detector
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636120/trump1.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636121/trump2.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636122/trump3.jpg)
Face classifier
0.99
0.99
0.01
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635614/trump.png)
Stage I
Stage II
Face mask detection in two stages
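The two stages can be sketched as a small pipeline: Stage I returns face boxes, Stage II scores each cropped face. This is a minimal sketch, not the code from the talk. The names `detect_masks`, `crop`, `fake_detector`, `fake_classifier`, the 0.5 threshold, and the toy list-of-lists image are all assumptions made for illustration; in practice the two callables would wrap the trained face detector and mask classifier.

```python
from typing import Callable, Dict, List, Sequence

Image = List[List[int]]  # toy grayscale image: rows of pixel values (assumption)
Box = Sequence[int]      # (x_min, y_min, x_max, y_max) in pixels (assumption)


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of the image."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


def detect_masks(
    image: Image,
    detect_faces: Callable[[Image], List[Box]],
    mask_probability: Callable[[Image], float],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> List[Dict]:
    """Stage I finds face boxes, Stage II scores every cropped face."""
    results = []
    for box in detect_faces(image):
        probability = mask_probability(crop(image, box))
        results.append(
            {
                "bbox": list(box),
                "mask_probability": probability,
                "has_mask": probability >= threshold,
            }
        )
    return results


# Hypothetical stand-ins so the sketch runs without trained models.
def fake_detector(image: Image) -> List[Box]:
    """Pretend there are two faces: left half and right half."""
    return [(0, 0, 2, 2), (2, 0, 4, 2)]


def fake_classifier(face: Image) -> float:
    """Pretend that bright crops are masked faces."""
    pixels = [p for row in face for p in row]
    return sum(pixels) / (255 * len(pixels))


toy_image = [[255, 255, 0, 0], [255, 255, 0, 0]]
predictions = detect_masks(toy_image, fake_detector, fake_classifier)
```

Keeping the detector and the classifier behind plain callables means either stage can be retrained or swapped without touching the other, which is the practical appeal of the slower two-stage design.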
Face detector: simplified RetinaFace
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635646/retuinanet.png)
class + bbox + landmarks
class + bbox + landmarks
class + bbox + landmarks
RetinaFace: Single-stage Dense Face Localisation in the wild arXiv:1905.00641
Face Detection: WIDER FACE dataset
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635774/wider.jpg)
- http://shuoyang1213.me/WIDERFACE/
- 32,203 images
- 393,703 faces
Training
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636097/Screenshot_from_2020-08-13_16-42-01.png)
Predictions: boxes
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635769/retinaface_preds.jpeg)
You have a model. What is next?
- +5 min: Publish your code on GitHub as is.
- +10 min: Add code formatters and style checkers.
- +20 min: Create a clear README.
- +20 min: Create a Colab notebook with an example.
- +20 min: Make a library and upload it to PyPI.
- +20 min: Build a web app.
- +4 hours: Write a blog post.
- +2 hours: Create a video with a demo.
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635941/Screenshot_from_2020-08-13_15-32-50.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635953/Screenshot_from_2020-08-13_15-35-42.png)
pip install retinaface-pytorch
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7635997/Screenshot_from_2020-08-13_15-43-49.png)
GitHub
PyPI
Colab
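Once the detector is a pip-installable library, calling it from a script or notebook takes a few lines. The sketch below is a hedged illustration: `detect_faces`, `keep_detected_faces`, and `StubModel` are hypothetical helpers, and it assumes the model returns a list of dictionaries with a `"bbox"` key that is empty when no face is found, which is how the web app in this talk checks the result. The real library calls are shown only as comments and are copied from that web app code.

```python
from typing import Any, Dict, List


def keep_detected_faces(annotations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop placeholder entries whose bounding box is empty."""
    return [annotation for annotation in annotations if annotation.get("bbox")]


def detect_faces(model: Any, image: Any) -> List[Dict[str, Any]]:
    """Run the detector and return only the entries that contain a face."""
    return keep_detected_faces(model.predict_jsons(image))


class StubModel:
    """Hypothetical stand-in for the real detector, used to run the sketch offline."""

    def __init__(self, annotations: List[Dict[str, Any]]) -> None:
        self._annotations = annotations

    def predict_jsons(self, image: Any) -> List[Dict[str, Any]]:
        return self._annotations


# With the real package (calls taken from the web app code in this talk):
#   from retinaface.pre_trained_models import get_model
#   model = get_model("resnet50_2020-07-20", max_size=1024, device="cpu")
#   model.eval()
#   faces = detect_faces(model, image)  # image: numpy array of an RGB picture

no_faces = detect_faces(StubModel([{"bbox": []}]), image=None)
one_face = detect_faces(StubModel([{"bbox": [10, 20, 110, 140]}]), image=None)
```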
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636015/Screenshot_from_2020-08-13_15-58-05.png)
import numpy as np
import streamlit as st
from PIL import Image
from retinaface.pre_trained_models import get_model
from retinaface.utils import vis_annotations
import torch

st.set_option("deprecation.showfileUploaderEncoding", False)


@st.cache
def cached_model():
    m = get_model("resnet50_2020-07-20", max_size=1024, device="cpu")
    m.eval()
    return m


model = cached_model()

st.title("Detect faces and key points")

uploaded_file = st.file_uploader("Choose an image...", type="jpg")

if uploaded_file is not None:
    image = np.array(Image.open(uploaded_file))
    st.image(image, caption="Before", use_column_width=True)
    st.write("")
    st.write("Detecting faces...")
    with torch.no_grad():
        annotations = model.predict_jsons(image)

    if not annotations[0]["bbox"]:
        st.write("No faces detected")
    else:
        visualized_image = vis_annotations(image, annotations)
        st.image(visualized_image, caption="After", use_column_width=True)
Web app. 35 lines of code.
Mask classifier
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636072/taylor_swift.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636073/klimova.jpg)
Network
0.9999
0.0001
Task
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636140/Screenshot_from_2020-08-13_17-08-32.png)
Training mask classifier
- Mask classification Colab: https://colab.research.google.com/drive/1VkSK5MKIuGPIA31KJpGiFe_FafYC4xfD
- Detection + classification Colab: https://colab.research.google.com/drive/13Ktsrx164eQHfDmYLyMCoI-Kq0gC5Kg1
- Web app: https://facemaskd.herokuapp.com/
The same deliverables
Summary
There are small incremental steps that make your work:
- better
- more visible
They will also boost your ownership and communication.
- Code to GitHub
- Add a README
- Clean code
- Make a demo notebook
- Build a library
- Build a web app
- Give a talk
- Write a blog post
Thank you!
![](https://s3.amazonaws.com/media-p.slid.es/uploads/647006/images/7636413/KLP_9393.jpg)
Blog: http://ternaus.blog
Twitter: @viglovikov
Kaggle: https://www.kaggle.com/iglovikov
LinkedIn: https://www.linkedin.com/in/iglovikov/