Docker Multi-stage Builds and Other Best Practices.
Josh Finnie
What is Docker?
- Docker is a bundle of software that allows one to run OS-level virtualization easily.
- This is handled through packages of software called "containers."
- Containers are mostly isolated from one another and from the host, but they can still interact with you, your code and each other through well-defined channels.
Why Docker?
There are a few great reasons to use Docker:
- It's a great way to isolate code from other code.
- It makes transfer of state between developers easy.
- The use of containers allows for some interesting deployment strategies (e.g. ECS, Kubernetes, Fargate).
Example: Padding Hello World
Below is a simple Python script that we want to encapsulate in a Docker container. We're going to create a Dockerfile for it and see how we can improve it.
#!/usr/bin/env python3
from left_pad import left_pad
print(left_pad("Hello World", 5, "♥"))
#requirements.txt
left-pad==0.0.3
Let's build a Dockerfile (Part 1)
Below is our first attempt at a Dockerfile. It is about as basic as you can get while still producing a working image.
FROM python:3
ADD requirements.txt /
ADD script.py /
RUN pip install -r requirements.txt
CMD python ./script.py
$ docker build -t pyscript .
$ docker run pyscript
♥♥♥♥♥Hello World
Let's build a Dockerfile (Part 2)
There are already some improvements we can make right away to the Dockerfile. Here is the updated version.
FROM python:3-alpine
COPY requirements.txt /requirements.txt
COPY script.py /script.py
RUN pip install -r requirements.txt
CMD ["python", "./script.py"]
$ docker build -t pyscript2 .
$ docker run pyscript2
♥♥♥♥♥Hello World
Let's build a Dockerfile (Part 2)
What did we do above?
- We updated the image to use the `alpine` version of Python.
- `python:3-alpine` is based on the Alpine image, a minimal Docker image built on Alpine Linux with a complete package index and only about 5 MB in size!
- We changed `ADD` to `COPY`.
- `COPY` is a safer, though similar, alternative to `ADD`: it only copies files from the local build context, whereas `ADD` can also fetch remote URLs and unpack local tar archives.
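- We updated the `CMD` to use the JSON (exec) form instead of the string (shell) form.
- The JSON form of `CMD` runs the command directly rather than through `/bin/sh -c`, so signals such as the SIGTERM sent by `docker stop` reach your process instead of an intermediate shell.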
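The changes above can be read side by side in the annotated sketch below. It is the Part 2 Dockerfile again, with the Part 1 alternatives left in as comments. The comments describe general Docker behaviour and are not part of the original slides.

```dockerfile
# Alpine-based variant of the official Python image (much smaller than python:3)
FROM python:3-alpine
# COPY only copies files from the build context.
# ADD would also accept a remote URL or unpack a local tar archive:
# ADD requirements.txt /
COPY requirements.txt /requirements.txt
COPY script.py /script.py
RUN pip install -r requirements.txt
# Shell form: Docker starts `/bin/sh -c "python ./script.py"`, so the shell is PID 1:
# CMD python ./script.py
# Exec (JSON) form: python itself is PID 1 and receives signals such as SIGTERM directly
CMD ["python", "./script.py"]
```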
But can we do better?
Let's build a Dockerfile (Part 3)
Even though the Dockerfile in Part 2 is an excellent upgrade from the initial Dockerfile we wrote, there are still some improvements we can make. The next Dockerfile is the final form, and it reflects what I consider best practice for writing Dockerfiles.
$ docker build -t pyscript3 .
$ docker run pyscript3
♥♥♥♥♥Hello World
Let's build a Dockerfile (Part 3)
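# Shared base: settings that every stage needs
FROM python:3-slim as base
ENV PYTHONUNBUFFERED=1
WORKDIR /code
# Builder: install Poetry and build the virtual environment
FROM base as builder
ENV PIP_DEFAULT_TIMEOUT=100 \
    POETRY_VERSION=1.0.9
RUN pip install "poetry==$POETRY_VERSION"
RUN python -m venv /venv
COPY pyproject.toml poetry.lock ./
# Use bash with pipefail so that a failing `poetry export` fails the build
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN poetry export -f requirements.txt | /venv/bin/pip install -r /dev/stdin
# Final: copy in only the finished virtual environment and the application code
FROM base as final
COPY --from=builder /venv /venv
ENV PATH="/venv/bin:$PATH"
COPY . /code
RUN chmod +x /code/docker-entrypoint.sh
CMD ["./docker-entrypoint.sh"]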
Let's build a Dockerfile (Part 3)
What did we do above?
- We updated the image to use the `slim` version of Python.
- For Python specifically, there's actually a pretty severe cost when using the `alpine` version of an image: Alpine Linux uses musl instead of glibc, so the standard `manylinux` wheels on PyPI do not install there and many packages have to be compiled from source.
- We introduced multi-stage builds
- Multi-stage builds were introduced in Docker 17.05 and they allow you to leverage the built-in Docker caching of layers.
- We introduced a "start-up script"
- Using a "start-up script" is seen as a best practice because it keeps a lot of logic out of your Dockerfile. For instance, we do not need to build the activation of our virtual environment into our Dockerfile.
- We also updated to Poetry, but that's more of a quality-of-life improvement and not directly related to Docker. 😁
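For readers who are not using Poetry, the same multi-stage layout can be sketched with the `requirements.txt` and `script.py` from the earlier example. This is an illustrative variant under those assumptions, not the Dockerfile from the talk: it drops the start-up script and calls Python directly.

```dockerfile
# Shared base: settings that every stage needs
FROM python:3-slim as base
ENV PYTHONUNBUFFERED=1
WORKDIR /code

# Builder: create the virtual environment and install dependencies into it
FROM base as builder
RUN python -m venv /venv
# Copy only the dependency manifest first, so this layer stays cached until it changes
COPY requirements.txt ./
RUN /venv/bin/pip install -r requirements.txt

# Final: take the finished virtual environment, then add the application code
FROM base as final
COPY --from=builder /venv /venv
ENV PATH="/venv/bin:$PATH"
COPY script.py ./
CMD ["python", "./script.py"]
```

Because `/venv/bin` is first on `PATH`, the `python` in the final `CMD` is the one from the virtual environment, so no activation step is needed.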
What is a Multi-stage Build?
- A multi-stage build is a Dockerfile that breaks up concerns into multiple stages of a single Dockerfile.
- A multi-stage build improves build times.
- A multi-stage build improves image sizes.
- A multi-stage build can improve your workflow by allowing testing and deployment to use the same image.
- A multi-stage build can set your team up for success with regard to company-wide base images.
A multi-stage build is a Dockerfile that breaks up concerns into multiple stages of a single Dockerfile.
# Copies in our code and runs NPM Install
FROM node:latest as builder
WORKDIR /app
COPY package* ./
COPY src/ src/
RUN ["npm", "install"]
# Lints Code
FROM node:latest as linting
WORKDIR /app
COPY --from=builder /app/ .
RUN ["npm", "run", "lint"]
# Runs Unit Tests
FROM node:latest as unit-tests
WORKDIR /app
COPY --from=builder /app/ .
RUN ["npm", "test"]
# Starts and Serves Web Page
FROM node:latest as serve
WORKDIR /usr/src/app
COPY --from=builder /app/dest ./
COPY --from=builder /app/package* ./
# CMD (not RUN) so the server starts when the container runs, not during the build
CMD ["npm", "start"]
A multi-stage build improves build times and image sizes.
FROM golang:1.14 as base
# Install dependencies
RUN apt-get update && \
    apt-get install -y wget
# Install gorson
RUN wget https://github.com/pbs/gorson/releases/download/4.2.0/gorson-4.2.0-linux-amd64 && \
    mv gorson-4.2.0-linux-amd64 /bin/gorson && \
    chmod +x /bin/gorson
FROM base as builder
WORKDIR /src/go
COPY hello.go ./
RUN CGO_ENABLED=0 go build -a -ldflags '-s' -o hello
FROM scratch
COPY --from=builder /src/go/hello /hello
CMD ["/hello"]
A multi-stage build improves build times and image sizes.
$ docker run goscript
hello world
$ docker images
goscript | 1.46MB
A multi-stage build can improve your workflow by allowing testing and deployment to use the same image.
# Copies in our code and runs NPM Install
FROM node:latest as builder
WORKDIR /app
COPY package* ./
COPY src/ src/
RUN ["npm", "install"]
# Runs Unit Tests
FROM node:latest as unit-tests
WORKDIR /app
COPY --from=builder /app/ .
RUN ["npm", "test"]
# Starts and Serves Web Page for Development
FROM node:latest as devserve
WORKDIR /usr/src/app
COPY --from=builder /app/dest ./
COPY --from=builder /app/package* ./
RUN npm install nodemon
# CMD (not RUN) so the server starts when the container runs, not during the build
CMD ["npm", "run", "start:dev"]
# Starts and Serves Web Page in Production
FROM node:latest as prodserve
WORKDIR /usr/src/app
COPY --from=builder /app/dest ./
COPY --from=builder /app/package* ./
RUN npm install -g pm2
# pm2-runtime keeps PM2 in the foreground so the container stays alive
CMD ["pm2-runtime", "index.js"]
$ docker build --target prodserve -t pbs-server:v1 .
A multi-stage build can set your team up for success with regard to company-wide base images.
FROM golang:1.14 as base
# Install dependencies
RUN apt-get update && \
    apt-get install -y wget
# Install gorson
RUN wget https://github.com/pbs/gorson/releases/download/4.2.0/gorson-4.2.0-linux-amd64 && \
    mv gorson-4.2.0-linux-amd64 /bin/gorson && \
    chmod +x /bin/gorson
FROM base as builder
WORKDIR /src/go
COPY hello.go ./
RUN CGO_ENABLED=0 go build -a -ldflags '-s' -o hello
FROM scratch
COPY --from=builder /src/go/hello /hello
CMD ["/hello"]
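One way to read the example above: the `base` stage is the part that could be shared across a company. As a hedged sketch, assume that stage has been built and pushed once as its own image. The image name `registry.example.com/platform/go-base:1.14` is hypothetical and stands in for wherever your team publishes it. A service's Dockerfile then shrinks to the following.

```dockerfile
# Assumption: the shared base image (Go toolchain plus gorson) already exists.
# registry.example.com/platform/go-base:1.14 is a hypothetical name, not a real image.
FROM registry.example.com/platform/go-base:1.14 as builder
WORKDIR /src/go
COPY hello.go ./
RUN CGO_ENABLED=0 go build -a -ldflags '-s' -o hello

# The final image still contains only the compiled binary
FROM scratch
COPY --from=builder /src/go/hello /hello
CMD ["/hello"]
```

Each team keeps its own build and final stages, while updates to the shared tooling happen in one place.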
Things that help write better Dockerfiles
- Hadolint (https://github.com/hadolint/hadolint)
- Dockerfile Best Practices by Docker (https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
- Only install what you need!
FROM python:3.9-slim-buster
# Refresh the package index, install only the pinned packages we need, then remove the apt lists
RUN apt-get update && \
    apt-get install --no-install-recommends -y \
    build-essential=12.6 \
    python3-dev=3.7.3-1 \
    wget=1.20.1-1.1 && \
    rm -fr /var/lib/apt/lists/*
Thank You
Questions?