Train. Serve. Deploy!
Story of a NLP Model ft. PyTorch, Docker, Uwsgi and Nginx
whoami
- Data Scientist at GoDaddy
- Work with deep learning based language modeling
- Dancing, hiking, (more recently) talking!
- Meme curator, huge The Office fan
ML Modeling
Training
Testing
ML Modeling
THEN WHAT?
ML Modeling
DEPLOYMENT
What This Talk IS
Machine Translation Systems: seq2seq model
Data
$ head orig/de-en/train.tags.de-en.de
<url>http://www.ted.com/talks/lang/de/stephen_palumbi_following_the_mercury_trail.html</url>
Das Meer kann ziemlich kompliziert sein.
Und was menschliche Gesundheit ist, kann auch ziemlich kompliziert sein.
.
.
$ head orig/de-en/train.tags.de-en.en
<url>http://www.ted.com/talks/stephen_palumbi_following_the_mercury_trail.html</url>
It can be a very complicated thing, the ocean.
And it can be a very complicated thing, what human health is.
de
en
Fairseq
Quick prototyping of seq2seq models
Fairseq: Preprocessing
TRAIN=$tmp/train.en-de
BPE_CODE=$prep/code
rm -f $TRAIN
for l in $src $tgt; do
cat $tmp/train.$l >> $TRAIN
done
echo "learn_bpe.py on ${TRAIN}..."
python $BPEROOT/learn_bpe.py -s $BPE_TOKENS < $TRAIN > $BPE_CODE
for L in $src $tgt; do
for f in train.$L valid.$L test.$L; do
echo "apply_bpe.py to ${f}..."
python $BPEROOT/apply_bpe.py -c $BPE_CODE < $tmp/$f > $prep/$f
done
done
Fairseq: Training
CUDA_VISIBLE_DEVICES=0 fairseq-train \
data-bin/iwslt14.tokenized.de-en \
--arch transformer \
--share-decoder-input-output-embed \
--optimizer adam \
--adam-betas '(0.9, 0.98)' \
--clip-norm 0.0 \
--lr 5e-4 --lr-scheduler inverse_sqrt \
--warmup-updates 4000 \
--dropout 0.3 --weight-decay 0.0001 \
--criterion label_smoothed_cross_entropy \
--label-smoothing 0.1 \
--max-tokens 4096 --max-epoch 50 \
--encoder-embed-dim 128 \
--decoder-embed-dim 128
Model Serving
$ fairseq-interactive data-bin/iwslt14.tokenized.de-en \
> --path checkpoints/checkpoint_best.pt \
> --bpe subword_nmt \
> --remove-bpe \
> --bpe-codes iwslt14.tokenized.de-en/code \
> --beam 10
2020-07-15 00:23:09 | INFO | fairseq_cli.interactive | Namespace(all_gather_list_size=16384, beam=10, bf16=False, bpe='subword_nmt', bpe_codes='iwslt14.tokenized.de-en/code', bpe_separator='@@',
.
.
2020-07-15 00:23:09 | INFO | fairseq.tasks.translation | [de] dictionary: 8848 types
2020-07-15 00:23:09 | INFO | fairseq.tasks.translation | [en] dictionary: 6640 types
2020-07-15 00:23:09 | INFO | fairseq_cli.interactive | loading model(s) from checkpoints/checkpoint_best.pt
2020-07-15 00:23:10 | INFO | fairseq_cli.interactive | NOTE: hypothesis and token scores are output in base 2
2020-07-15 00:23:10 | INFO | fairseq_cli.interactive | Type the input sentence and press return:
danke
S-3 danke
H-3 -0.1772392988204956 thank you .
D-3 -0.1772392988204956 thank you .
P-3 -0.2396 -0.1043 -0.2216 -0.1435
Flask
Python code without an API
Python code with a Flask API
Flask App
from flask import Flask
from flask import Response
from flask import jsonify
from flask import make_response
from flask import request
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'checkpoints',
'checkpoint_best.pt',
data_name_or_path='data-bin/iwslt14.tokenized.de-en',
bpe='subword_nmt',
bpe_codes='iwslt14.tokenized.de-en/code',
remove_bpe=True,
source_lang='de',
target_lang='en',
beam=30,
nbest=1,
cpu=True
)
Flask App
@app.route('/translate', methods=['GET'])
def translate():
timers = {'total': -time.time()}
logger.info(str(datetime.now()) + ' received parameters: ' + str(request.args))
query = request.args.get('q', '').strip().lower()
if not query:
raise BadRequest('Query is empty.')
res = model.translate(query, verbose=True)
rv = {'result': res}
timers['total'] += time.time()
resp = make_response(jsonify(rv))
return resp
if __name__ == "__main__":
app.run(debug=False, host='0.0.0.0', port=5002, use_reloader=True, threaded=False)
uwsgi
Assistant Regional Manager
to the
Flask Server
uwsgi + flask
from flask_app import app as application
if __name__ == "__main__":
application.run()
wsgi.py:
[uwsgi]
module = wsgi:application
listen = 128
disable-logging = false
logto = /var/log/uwsgi/api.log
lazy-apps = true
master = true
processes = 3
threads = 1
buffer-size = 65535
socket = /app/wsgi.sock
chmod-socket = 666
enable-threads = true
threaded-logger = true
uwsgi.ini:
nginx
http requests
Route http requests to the uwsgi server
nginx
error_log /var/log/nginx/error.log warn;
events {
worker_connections 8192;
}
http {
.
.
access_log /var/log/nginx/access.log apm;
server {
listen 8002;
server_name 127.0.0.1;
location /translate {
include uwsgi_params;
uwsgi_pass unix:/app/wsgi.sock;
}
location / {
include uwsgi_params;
uwsgi_pass unix:/app/wsgi.sock;
}
}
}
supervisord
Me getting happy after setting nginx + uwsgi config files
Who will coordinate them?
Supervisord
supervisord
[supervisord]
nodaemon=true
logfile=/var/log/supervisor/supervisord.log
[program:uwsgi]
command=uwsgi --ini uwsgi.ini # --pyargv "inference/32k/"
directory=/app
stopsignal=TERM
stopwaitsecs=10
autorestart=true
startsecs=10
priority=3
stdout_logfile=/var/log/uwsgi/out.log
stdout_logfile_maxbytes=0
stderr_logfile=/var/log/uwsgi/err.log
stderr_logfile_maxbytes=0
[program:nginx]
command=nginx -c /etc/nginx/nginx.conf
autorestart=true
stopsignal=QUIT
stopwaitsecs=10
stdout_logfile=/var/log/nginx/out.log
stdout_logfile_maxbytes=0
stderr_logfile=/var/log/nginx/err.log
stderr_logfile_maxbytes=0
Docker
World's best containerization software
I'm ready to start coding again
No doubt about it
Dockerfile
FROM ubuntu:18.04
RUN apt-get update && \
apt-get dist-upgrade -y && \
apt-get install -y build-essential curl g++ gettext-base git libfreetype6-dev libpng-dev libsnappy-dev \
libssl-dev pkg-config python3-dev python3-pip software-properties-common supervisor nginx \
unzip zip zlib1g-dev openjdk-8-jdk tmux wget bzip2 libpcre3 libpcre3-dev vim systemd ca-certificates && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
ENV LANG C.UTF-8
RUN pip3 install torch==1.4.0+cpu torchvision==0.5.0+cpu -f https://download.pytorch.org/whl/torch_stable.html && \
rm -rf /root/.cache /tmp/*
COPY ./requirements.txt /tmp/
RUN pip3 install -r /tmp/requirements.txt && \
rm -rf /root/.cache /tmp/*
COPY . /app
COPY nginx.conf /etc/nginx/
RUN mkdir -p /var/log/supervisor
RUN mkdir -p /var/log/uwsgi
# ========= setup uwsgi
RUN touch /app/wsgi.sock
RUN chmod 666 /app/wsgi.sock
WORKDIR /app
CMD ["bash", "entrypoint.sh"]
EXPOSE 8002
Docker
$ docker build -t mock_app:latest .
$ docker run -d -p 8002:8002 --name mock mock_app:latest
.
.
2020-07-19 09:16:59,130 CRIT Supervisor running as root (no user in config file)
2020-07-19 09:16:59,133 INFO supervisord started with pid 1
2020-07-19 09:17:00 | INFO | fairseq.file_utils | loading archive file checkpoints
2020-07-19 09:17:00 | INFO | fairseq.file_utils | loading archive file data-bin/iwslt14.tokenized.de-en
2020-07-19 09:17:00,138 INFO spawned: 'uwsgi' with pid 19
2020-07-19 09:17:00,145 INFO spawned: 'nginx' with pid 20
Does it work?
$ curl http://localhost:8002/health
OK
$ curl http://localhost:8002/translate?q=Mein+name+ist+shreya
{"result":"my name is shreya ."}
Some Good Practices
- Check logs frequently:
- Caching
- Unittests
When you think the talk is over, but the presenter keeps babbling something
$ docker logs -f <container-name>
$ docker exec -it <container-id> bash
Questions?
Discord channel: talk-nlp-model
LinkedIn:
GitHub: ShreyaKhurana
Code: https://github.com/ShreyaKhurana/europython2020
deck
By Shreya Khurana
deck
- 1,390