Docker Multi-stage Builds and Other Best Practices.

Josh Finnie

What is Docker?

  • Docker is a bundle of software that makes OS-level virtualization easy to run.
  • It does this through packages of software called "containers."
  • Containers are mostly isolated from one another, yet they can still interact with you, your code, and each other.

Why Docker?

There are a few great reasons to use Docker:

  1. It's a great way to isolate code from other code.
  2. It makes it easy to share a working environment between developers.
  3. The use of containers allows for some interesting deployment strategies (e.g. ECS, Kubernetes, Fargate).

Example: Padding Hello World

Below is a simple Python script that we want to encapsulate in a Docker container. We're going to create a Dockerfile and see how we can improve it.

#!/usr/bin/env python3

from left_pad import left_pad

print(left_pad("Hello World", 5, "♥"))

# requirements.txt
left-pad==0.0.3

Let's build a Dockerfile (Part 1)

Below is our first attempt at a Dockerfile. It is about as basic as you can get while still producing a working image.

FROM python:3

ADD requirements.txt /
ADD script.py /

RUN pip install -r requirements.txt

CMD python ./script.py

$ docker build -t pyscript .
$ docker run pyscript
♥♥♥♥♥Hello World

Let's build a Dockerfile (Part 2)

There are already some improvements we can make to the Dockerfile right away. Here is the updated version.

FROM python:3-alpine

COPY requirements.txt /requirements.txt
COPY script.py /script.py

RUN pip install -r requirements.txt

CMD ["python", "./script.py"]

$ docker build -t pyscript2 .
$ docker run pyscript2
♥♥♥♥♥Hello World

What did we do above?

  • We updated the image to use the `alpine` version of Python.
    • `python:3-alpine` is based on the Alpine image, a minimal Docker image built on Alpine Linux with a complete package index that is only about 5 MB in size!

  • We changed `ADD` to `COPY`
    • `COPY` is a safer, though similar, alternative to `ADD`: it only copies files from the local build context, whereas `ADD` can also fetch remote URLs and unpack local tar archives.

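  • We updated the `CMD` to use the JSON (exec) form instead of the string (shell) form.
    • The JSON form runs the command directly instead of wrapping it in `/bin/sh -c`, so signals such as the `SIGTERM` sent by `docker stop` reach your process rather than a shell sitting in front of it.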
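
To make the signal point concrete, here is a minimal sketch of how a script could shut down cleanly on `SIGTERM`. With `CMD ["python", "./script.py"]` the Python process is the container's main process and the handler gets a chance to run; with the string form, the shell in front of it may not pass the signal on. `StopFlag`, `serve`, and `install_sigterm_handler` are names made up for this illustration and are not part of the original script.

```python
import signal
import time


class StopFlag:
    """Remembers whether a termination signal has been received."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # Signal handlers should do as little as possible: just set a flag.
        self.requested = True


def install_sigterm_handler(stop):
    """Route SIGTERM (the signal `docker stop` sends) to the flag."""
    signal.signal(signal.SIGTERM, stop.handle)


def serve(stop, poll_interval=0.01, max_polls=None):
    """Poll until `stop.requested` is set; return the number of polls done.

    `max_polls` exists only so this sketch can be exercised without a
    real signal; a long-running service would leave it as None.
    """
    polls = 0
    while not stop.requested:
        if max_polls is not None and polls >= max_polls:
            break
        time.sleep(poll_interval)  # stand-in for real work
        polls += 1
    return polls
```

A real service would call `install_sigterm_handler(stop)` once at start-up, then `serve(stop)`, and do its cleanup after `serve` returns.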

But can we do better?

Let's build a Dockerfile (Part 3)

Even though the Dockerfile in Part 2 is an excellent upgrade from the one we first wrote, there are still some improvements we can make. The next Dockerfile is its final form and what I feel is the best practice for building your Dockerfiles.

# Shared base: common settings inherited by every later stage
FROM python:3-slim as base

ENV PYTHONUNBUFFERED=1

WORKDIR /code

# Builder: installs Poetry and the dependencies into a virtual environment
FROM base as builder

ENV PIP_DEFAULT_TIMEOUT=100 \
    POETRY_VERSION=1.0.9

RUN pip install "poetry==$POETRY_VERSION"
RUN python -m venv /venv

COPY pyproject.toml poetry.lock ./
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN poetry export -f requirements.txt | /venv/bin/pip install -r /dev/stdin

# Final: only the virtual environment and our code, no Poetry
FROM base as final

COPY --from=builder /venv /venv
ENV PATH="/venv/bin:$PATH"

COPY . /code
RUN chmod +x /code/docker-entrypoint.sh
CMD ["./docker-entrypoint.sh"]

$ docker build -t pyscript3 .
$ docker run pyscript3
♥♥♥♥♥Hello World

What did we do above?

  • We updated the image to use the `slim` version of Python.
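    • For Python specifically, there's actually a pretty severe cost to using the `alpine` version of an image: the standard `manylinux` wheels on PyPI target glibc and cannot be installed on musl-based Alpine Linux, so unless a package publishes musl-specific wheels, it has to be compiled from source.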

  • We introduced multi-stage builds
    • Multi-stage builds were introduced in Docker 17.05. They let you keep build tooling in earlier stages, copy only the results into the final image, and still leverage Docker's built-in caching of layers.

  • We introduced a "start-up script"
    • Using a "start-up script" is seen as a best practice because it removes a lot of logic from your Dockerfile. For instance, we do not need to build the activation of our virtual environment into our Dockerfile.

  • We also updated to Poetry, but that's more of a quality-of-life improvement and not directly related to Docker. 😁
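
As a companion to the start-up script point, an application can confirm at start-up that it really is running on the virtual environment that the final stage puts first on `PATH`. This is only a sketch: the `/venv` location matches the Dockerfile above, while the function names are assumptions made for illustration.

```python
import sys
from pathlib import Path


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment.

    Inside an environment created with `python -m venv`, sys.prefix points
    at the environment while sys.base_prefix still points at the base
    Python installation.
    """
    return sys.prefix != sys.base_prefix


def uses_environment(expected, prefix=None):
    """Return True when the interpreter prefix resolves to `expected`."""
    actual = Path(prefix if prefix is not None else sys.prefix)
    return actual.resolve() == Path(expected).resolve()


def describe_interpreter():
    """Build a one-line summary suitable for a start-up log message."""
    kind = "virtualenv" if in_virtualenv() else "base install"
    return f"python={sys.executable} prefix={sys.prefix} ({kind})"
```

Inside the final image, `uses_environment("/venv")` would be expected to return True; logging `describe_interpreter()` once at start-up makes a misconfigured `PATH` easy to spot.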

What is a Multi-stage Build?

  1. A multi-stage build is a Dockerfile that breaks up concerns into multiple stages of a single Dockerfile.
  2. A multi-stage build improves build times.
  3. A multi-stage build improves image sizes.
  4. A multi-stage build can improve your workflow by letting testing and deployment build from the same Dockerfile.
  5. A multi-stage build can set your team up for success with regard to company-wide base images.

A multi-stage build is a Dockerfile that breaks up concerns into multiple stages of a single Dockerfile.

# Copies in our code and runs npm install
FROM node:latest as builder
WORKDIR /app
COPY package* ./
COPY src/ src/
RUN ["npm", "install"]

# Lints Code (assumes a "lint" script in package.json)
FROM node:latest as linting
WORKDIR /app
COPY --from=builder /app/ .
RUN ["npm", "run", "lint"]

# Runs Unit Tests
FROM node:latest as unit-tests
WORKDIR /app
COPY --from=builder /app/ .
RUN ["npm", "test"]

# Starts and Serves Web Page
FROM node:latest as serve
WORKDIR /usr/src/app
COPY --from=builder /app/dest ./
COPY --from=builder /app/package* ./
CMD ["npm", "start"]

A multi-stage build improves build times.

&&

A multi-stage build improves image sizes.

FROM golang:1.14 as base
# Install dependencies
RUN apt-get update && \
    apt-get install -y wget

# Install gorson
RUN wget https://github.com/pbs/gorson/releases/download/4.2.0/gorson-4.2.0-linux-amd64 && \
    mv gorson-4.2.0-linux-amd64 /bin/gorson && \
    chmod +x /bin/gorson

FROM base as builder
WORKDIR /src/go
COPY hello.go ./
RUN CGO_ENABLED=0 go build -a -ldflags '-s' -o hello

FROM scratch
COPY --from=builder /src/go/hello /hello
CMD ["/hello"]

$ docker run goscript
hello world

$ docker images
goscript | 1.46MB

A multi-stage build can improve your workflow by letting testing and deployment build from the same Dockerfile.

# Copies in our code and runs npm install
FROM node:latest as builder
WORKDIR /app
COPY package* ./
COPY src/ src/
RUN ["npm", "install"]

# Runs Unit Tests
FROM node:latest as unit-tests
WORKDIR /app
COPY --from=builder /app/ .
RUN ["npm", "test"]

# Starts and Serves Web Page for Development
FROM node:latest as devserve
WORKDIR /usr/src/app
COPY --from=builder /app/dest ./
COPY --from=builder /app/package* ./
RUN npm install nodemon
CMD ["npm", "run", "start:dev"]

# Starts and Serves Web Page in Production
FROM node:latest as prodserve
WORKDIR /usr/src/app
COPY --from=builder /app/dest ./
COPY --from=builder /app/package* ./
RUN npm install -g pm2
# pm2-runtime keeps pm2 in the foreground so the container stays up
CMD ["pm2-runtime", "index.js"]

$ docker build --target prodserve -t pbs-server:v1 .

A multi-stage build can set your team up for success with regard to company-wide base images.

FROM golang:1.14 as base
# Install dependencies
RUN apt-get update && \
    apt-get install -y wget

# Install gorson
RUN wget https://github.com/pbs/gorson/releases/download/4.2.0/gorson-4.2.0-linux-amd64 && \
    mv gorson-4.2.0-linux-amd64 /bin/gorson && \
    chmod +x /bin/gorson

FROM base as builder
WORKDIR /src/go
COPY hello.go ./
RUN CGO_ENABLED=0 go build -a -ldflags '-s' -o hello

FROM scratch
COPY --from=builder /src/go/hello /hello
CMD ["/hello"]

Things that help write better Dockerfiles

  1. Hadolint (https://github.com/hadolint/hadolint)
  2. Dockerfile Best Practices by Docker (https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
  3. Only install what you need!

FROM python:3.9-slim-buster

# Install only the packages we need, pinned to exact versions, then remove the apt package lists
RUN apt-get update && \
    apt-get install --no-install-recommends -y \
            build-essential=12.6 \
            python3-dev=3.7.3-1 \
            wget=1.20.1-1.1 && \
    rm -fr /var/lib/apt/lists/*

Thank You

Questions?
