Build Small Docker Images with multi-stage builds

One of the hardest things to do when working with containers is keeping the image size small.
Adding resources and commands to a Dockerfile is easy and fast, but cleaning up the artifacts those commands produce is not.
To keep the size small, we can try to remove all unused packages and files and clean every cache we can think of with some scripts, but Docker offers a better and simpler way to achieve the same result: multi-stage builds.

What are multi-stage builds?

Multi-stage builds are a Docker feature that offers a better and cleaner organization of the Dockerfile. This is achieved by using multiple sections, called stages, inside a single Dockerfile.
Each stage begins with its own FROM statement, so it can use a different base image, and artifacts can be selectively copied from one stage to another.
When you build a Dockerfile containing multiple stages, Docker builds the stages separately, and by default only the last stage becomes the final image. Before multi-stage builds, the only way to achieve this was to use and maintain separate Dockerfiles.
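
As a minimal sketch of the mechanics, the Dockerfile below has two stages. The stage name builder, the alpine:3.10 base image, and the /artifact.txt file are placeholders chosen for illustration, not part of any real project.

```dockerfile
# Stage 0: named "builder" with AS. Without a name, it could only be
# referenced by its index, for example COPY --from=0.
FROM alpine:3.10 AS builder
RUN echo "built artifact" > /artifact.txt

# Stage 1: the final image. Only what is explicitly copied into this
# stage ends up in it; everything else from "builder" is left behind.
FROM alpine:3.10
COPY --from=builder /artifact.txt /artifact.txt
CMD ["cat", "/artifact.txt"]
```

To stop at a specific stage, for example to debug it, you can pass its name to the --target flag: docker build --target builder -t demo:builder .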

How can we use a multi-stage build to reduce image size?

This feature lets you copy artifacts between stages, so you can easily build your application in a different stage from the one that runs it.

Why is this an advantage?

When building an application, we need to install a lot of development dependencies that may not be required to run the resulting artifacts. With a separate build stage, we can easily discard all of them.
Sure, you can try to remove those dependencies with inverse commands, but I will show you how much simpler it is with multi-stage builds. An example is packages installed with a package manager like apk or apt, which usually bring a lot of dependencies with them.
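
For comparison, the "inverse commands" approach in a single stage might look roughly like the sketch below. It is an untested illustration, not the Dockerfile used for the measurements in this article: the .build-deps name, the /src directory, the cache paths, and the output path are assumptions, and it builds with the Go package from the Alpine repositories rather than the official Go image.

```dockerfile
FROM alpine:3.10
WORKDIR /src
COPY . .
# Install the build-only packages under a virtual name, build, and then
# remove the packages, the sources, and the Go caches. All of this has to
# happen in one RUN instruction: files deleted in a later instruction
# still take up space in the earlier layer.
RUN apk add --no-cache --virtual .build-deps go git \
    && CGO_ENABLED=0 go build -o /usr/local/bin/test ./main.go \
    && apk del .build-deps \
    && rm -rf /src /root/.cache /root/go
WORKDIR /
CMD ["/usr/local/bin/test"]
```

Every new build dependency or temporary file needs a matching cleanup step here, and forgetting one silently grows the image. The multi-stage version avoids this bookkeeping.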

Dockerfile
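# Build stage: holds the Go toolchain and git, which never reach the final image
FROM golang:1.13-alpine3.10 AS build
RUN apk --no-cache add git
WORKDIR /app
COPY . .
ENV GO111MODULE=on
# Assumes a go.mod file exists in the build context next to main.go
RUN go mod download
# CGO_ENABLED=0 forces a statically linked binary; scratch has no libc to link against
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o ./test ./main.go
# Only used when this stage is run on its own, for example for debugging
CMD ["/app/test"]

# Final stage: an empty base image plus the CA certificates and the binary
FROM scratch
WORKDIR /
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /app/test /app
CMD ["/app"]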

main.go
package main

import "log"

func main() {
    log.Println("Test")
}

How it works

In this multi-stage Dockerfile, we build our Go package in a stage called build and copy the resulting executable into a clean stage based on the scratch image.
So the image produced contains only the base image (scratch, which is empty), the CA certificates, and the executable, which in this case is under 2 MB.

Results

  • Image produced with multi-stage build: 1.74MB

  • Image produced with a classic single-stage build from the Go image: 79MB

With some optimization, the classic build size can also be reduced; examples of such optimizations are cache cleaning and file removal.
Even with those optimizations, the size difference is huge and definitely notable.