Isolated Docker Multistage Images

Often when building applications, I will use a multistage docker build for output container size and efficiency, but will run the build in two halves, to make use of the extra assets in the builder container, something like this:

docker build \
  --target builder \
  -t builder:$GIT_COMMIT \
  .

docker run --rm \
  -v "$PWD/artefacts/tests:/artefacts/tests" \
  builder:$GIT_COMMIT \
  yarn ci:test

docker run --rm \
  -v "$PWD/artefacts/lint:/artefacts/lint" \
  builder:$GIT_COMMIT \
  yarn ci:lint

docker build \
  --cache-from builder:$GIT_COMMIT \
  --target output \
  -t app:$GIT_COMMIT \
  .

This usually works fine, but sometimes the .dockerignore file won’t have everything set correctly, and docker will decide that when it runs the last build command, that it needs to rebuild the builder container too, which is pretty irritating.

The first solution is to try and figure out what you need to add to your .dockerignore file, which depending on your repository structure and container usage, might be more hassle than it’s worth.

The second solution is to prevent docker invalidating the first layers at all, by splitting the build into separate files.

Splitting the Dockerfile

Let’s start with an example docker file, which is a generic yarn based application with multistage build configured:

FROM node:15.0.1-alpine3.12 as builder
WORKDIR /app

COPY . ./
RUN yarn install --frozen-lockfile && yarn cache clean

RUN yarn ci:build


FROM node:15.0.1-alpine3.12 as output
WORKDIR /app

COPY package.json yarn.lock /app
RUN yarn install --frozen-lockfile --production && yarn cache clean

COPY --from builder /app/dist /app

The first file will be our Docker.builder, which is a direct copy paste:

FROM node:15.0.1-alpine3.12 as builder
WORKDIR /app

COPY . ./
RUN yarn install --frozen-lockfile && yarn cache clean

RUN yarn ci:build

The second file can also be a direct copy paste, saved as Dockerfile.output, but it has a problem:

FROM node:15.0.1-alpine3.12 as output
WORKDIR /app

COPY package.json yarn.lock /app
RUN yarn install --frozen-lockfile --production && yarn cache clean

COPY --from builder /app/dist /app

We want to copy from a different container, not a different stage, and while the COPY command does let you specify another container in the --from parameter, but we really want to specify which container it is at build time. The first attempt at solving this was using a buildarg:

ARG builder_image
COPY --from ${builder_image} /app/dist /app

But alas, this doesn’t work either, as the --from parameter doesn’t support variables. The solution turns out to be that FROM command does support parameterisation, so we can (ab)use that:

ARG builder_image
FROM ${builder_image} as builder

FROM node:15.0.1-alpine3.12 as output
WORKDIR /app

COPY package.json yarn.lock /app
RUN yarn install --frozen-lockfile --production && yarn cache clean

COPY --from builder /app/dist /app

Now our build script can use the --build-arg parameter to force the right container:

docker build \
-  --target builder \
+  --file Dockerfile.builder \
  -t builder:$GIT_COMMIT \
  .

docker run --rm \
  -v "$PWD/artefacts/tests:/artefacts/tests" \
  builder:$GIT_COMMIT \
  yarn ci:test

docker run --rm \
  -v "$PWD/artefacts/lint:/artefacts/lint" \
  builder:$GIT_COMMIT \
  yarn ci:lint

docker build \
-  --cache-from builder:$GIT_COMMIT \
+  --build-arg "builder_image=builder:$GIT_COMMIT" \
+  --file Dockerfile.output \
  -t app:$GIT_COMMIT \
  .

We can now safely modfiy the working directory to our heart’s content without worring about invalidating the layer caches.

Splitting the Dockerfile#

Splitting the Dockerfile