Detecting masked faces in the pandemic world

Packaging your Machine Learning message

Vladimir Iglovikov

Sr. Software Engineer at Lyft, Level5

Ph.D. in Physics

Kaggle Grandmaster

Career ladder

Fresh grad without PhD

Fresh grad with PhD

(could have 100500 papers at NeurIPS, CVPR, ECCV, etc.)

Glass ceiling

You work harder than everyone else.
You train State Of The Art models.
You publish papers at top conferences.

Most likely (90% chance) you will never get to L6!

Data Scientist skill tree

Ownership

Technical

Communication

Your level

Not your level

Models are not everything!

You need to train

ownership

and

communication!

What SMALL improvements can you add to your modeling process to train ownership and communication?

Example:

Face Mask Detector

Target audience:

Kagglers
Academics
Junior Data Scientists

Three ways

Detector with 2 classes:

mask / no mask

Detector with two heads:

face and face attributes

Detector + classifier

Slowest but the most accurate.

Our choice.

Face Detector

Face classifier

0.99

0.01

Stage I

Stage II

Face mask detection in two stages
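The two-stage flow above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code from the talk: `detector` and `classifier` are hypothetical callables, the detector is assumed to return a list of dicts with a `"bbox"` key holding `[x_min, y_min, x_max, y_max]` in pixels (empty when nothing is found), and the classifier is assumed to return the probability that a face crop wears a mask.

```python
from typing import Callable, Dict, List, Sequence, Tuple

import numpy as np


def clip_bbox(bbox: Sequence[float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Round a box to integer pixels and clip it to the image bounds."""
    x_min, y_min, x_max, y_max = (int(round(v)) for v in bbox)
    return max(0, x_min), max(0, y_min), min(width, x_max), min(height, y_max)


def detect_masks(
    image: np.ndarray,
    detector: Callable[[np.ndarray], List[Dict]],  # hypothetical Stage I model
    classifier: Callable[[np.ndarray], float],     # hypothetical Stage II model
) -> List[Dict]:
    """Stage I finds faces; Stage II scores each face crop for a mask."""
    height, width = image.shape[:2]
    results = []
    for annotation in detector(image):
        if not annotation["bbox"]:
            continue  # assumed convention: empty bbox means "no face"
        x_min, y_min, x_max, y_max = clip_bbox(annotation["bbox"], width, height)
        if x_max <= x_min or y_max <= y_min:
            continue  # box collapsed after clipping, nothing to classify
        crop = image[y_min:y_max, x_min:x_max]
        results.append(
            {
                "bbox": [x_min, y_min, x_max, y_max],
                "mask_probability": float(classifier(crop)),
            }
        )
    return results
```

Running the classifier once per face is what makes this the slowest of the three options: the cost of Stage II grows with the number of faces in the image.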

Face detector: simplified RetinaFace

class + bbox + landmarks

RetinaFace: Single-stage Dense Face Localisation in the Wild, arXiv:1905.00641
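One way to picture the three outputs (class + bbox + landmarks) is as a small record per detected face. The `FaceDetection` class below is a hypothetical container written for illustration, not a type from the retinaface package. It assumes the five facial landmarks used in the RetinaFace paper and boxes given as pixel corner coordinates.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FaceDetection:
    """Hypothetical record holding the three outputs for one face."""

    score: float                              # class: face confidence in [0, 1]
    bbox: Tuple[float, float, float, float]   # x_min, y_min, x_max, y_max
    landmarks: List[Tuple[float, float]]      # five (x, y) key points

    def __post_init__(self) -> None:
        if not 0.0 <= self.score <= 1.0:
            raise ValueError("score must be in [0, 1]")
        x_min, y_min, x_max, y_max = self.bbox
        if x_max <= x_min or y_max <= y_min:
            raise ValueError("bbox must have positive width and height")
        if len(self.landmarks) != 5:
            raise ValueError("expected five landmarks")

    @property
    def area(self) -> float:
        """Box area in square pixels."""
        x_min, y_min, x_max, y_max = self.bbox
        return (x_max - x_min) * (y_max - y_min)
```

A record like this makes it easy to validate predictions before passing them on, for example dropping faces whose box area is too small to classify reliably.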

Face Detection: WIDER FACE dataset

http://shuoyang1213.me/WIDERFACE/
32,203 images
393,703 faces

Training

https://github.com/ternaus/retinaface

Predictions: boxes

You have a model. What is next?

+5 min: Publish your code on GitHub as is.
+10 min: Add code formatters and style checkers.
+20 min: Create a clear README.
+20 min: Create a Colab notebook with an example.
+20 min: Make a library and upload it to PyPI.
+20 min: Build a web app.
+4 hours: Write a blog post.
+2 hours: Create a video with a demo.

https://github.com/ternaus/retinaface

pip install retinaface-pytorch

GitHub

PyPI

Colab

https://github.com/ternaus/retinaface_demo

http://retinaface.herokuapp.com/

import numpy as np
import streamlit as st
from PIL import Image
from retinaface.pre_trained_models import get_model
from retinaface.utils import vis_annotations
import torch

st.set_option("deprecation.showfileUploaderEncoding", False)

# Cache the model so it is loaded once, not on every Streamlit rerun.
@st.cache
def cached_model():
    m = get_model("resnet50_2020-07-20", max_size=1024, device="cpu")
    m.eval()  # switch to inference mode
    return m

model = cached_model()

st.title("Detect faces and key points")

uploaded_file = st.file_uploader("Choose an image...", type="jpg")

if uploaded_file is not None:
    # Force three RGB channels so grayscale or CMYK JPEGs do not break the model input.
    image = np.array(Image.open(uploaded_file).convert("RGB"))
    st.image(image, caption="Before", use_column_width=True)
    st.write("")
    st.write("Detecting faces...")
    with torch.no_grad():
        annotations = model.predict_jsons(image)

    # An empty bbox in the first annotation means no face was found.
    if not annotations[0]["bbox"]:
        st.write("No faces detected")
    else:
        visualized_image = vis_annotations(image, annotations)

        st.image(visualized_image, caption="After", use_column_width=True)