The Minuscule Docker Images That Could


Why Size Matters

By reducing the number of things we put inside a Docker image, we reduce the number of possible security vulnerabilities lingering in the resulting container. It also keeps the images much cleaner: they contain only what the application needs to run.

There is also the minor advantage of the images downloading a bit quicker, but in my opinion, this is not too important.

Note: Alpine images are already very small and will probably be sufficient if you care about size.

Distroless Images

The Distroless project offers a collection of “distroless” base images which do not contain any package managers, shells, or other utilities you’d normally expect to have on the command line. As a result, we cannot use package managers like pip and apt:

A Dockerfile using the Python 3 distroless image
Pip is not present in the image

Typically this problem would be solved through multi-stage builds:

Using a multi-stage build

The size of the resulting image is 130MB. Not too bad! For comparison, the default Python image is 929MB; the slimmed-down variant, 3.7-slim, is 179MB; the Alpine image, 3.7-alpine, is 98.6MB; and the distroless base image used in the example is 50.9MB.

Now, one may correctly point out that in the previous example we copy the entire /usr/local/lib/python3.7/site-packages directory, which may contain dependencies we do not need. Even so, it is clear that the existing Python base images differ considerably in size.

As of writing, the Google distroless project does not offer many images: Java and Python are experimental, and Python only exists for 2.7 and 3.5.

Minuscule Images

Back to my obsession with creating small images.

Originally, I wanted to see how these distroless images were built. The distroless project makes use of Google’s Bazel build tool. However, setting up Bazel and writing my own images required a bit of work (and let’s be honest, re-inventing the wheel can be extremely fun and educational). I wanted to be able to create smaller images more easily. The act of making an image should be trivial. No configuration files, just one line in the terminal: just build an image for <application>.

Now, if you want to build your own images, you should be aware of a special Docker image: scratch. Scratch is an ‘empty’ image; it does not contain any files (though it is a whopping 77 bytes by default).

A scratch image

The idea of a scratch image is that we can copy any dependencies in from our host machine and either use these dependencies inside the Dockerfile (like copying in apt and installing dependencies from scratch) or later when the Docker image is materialized. This gives us full control of what we put inside our Docker container, and thus, also full control of the size of the image.

Now, we need some way to gather these dependencies. Existing tools like apt allow you to download packages, but they are tied to your current machine and, after all, do not support Windows or macOS.

So, I set out to build my own tool that would automagically build the smallest possible base image to run any application. It would use Ubuntu/Debian packages, fetch them by accessing the package servers directly, and recursively find their dependencies. The tool should always download the latest stable version of a package, mitigating as many security risks as possible.
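The recursive dependency lookup described above can be sketched in a few lines of Python. This is only an illustration, not fetchy's actual implementation: the `INDEX` dictionary is a hypothetical stand-in for the metadata a real tool would download from the Debian or Ubuntu package servers, and the package names in it are made up for the example.

```python
from collections import deque

# Hypothetical package index: package name -> direct dependencies.
# A real tool would build this from the package servers' metadata.
INDEX = {
    "app": ["libfoo", "libbar"],
    "libfoo": ["libc"],
    "libbar": ["libc", "libbaz"],
    "libbaz": [],
    "libc": [],
}


def resolve(package, index):
    """Return the package plus all of its transitive dependencies."""
    seen = {package}
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dep in index.get(current, []):
            if dep not in seen:  # guards against cycles and duplicates
                seen.add(dep)
                queue.append(dep)
    return sorted(seen)


print(resolve("app", INDEX))
```

The `seen` set matters: shared dependencies (here the made-up `libc`) are collected once, and a circular dependency cannot send the loop around forever.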

I called the tool fetchy, because… it fetches… things… The tool works through a command line interface, though it also offers an API.

In order to build an image with fetchy (let’s take Python here), all you have to do is use the CLI as follows: fetchy dockerize python. You may be prompted for your target operating system and codename, as fetchy can only use Debian and Ubuntu-based packages for now.

Optionally, some dependencies may not be needed at all in our context, and we can exclude them. For example, Python depends on Perl, yet Python will run fine without Perl installed.
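Excluding a dependency can be pictured as pruning the dependency graph before walking it. The sketch below is an assumption about how such pruning might work, not a description of fetchy's internals; the `DEPENDS` table and the `resolve` helper are hypothetical, and the dependency lists are simplified for illustration rather than taken from real package metadata.

```python
# Hypothetical, simplified dependency table (not real package metadata).
DEPENDS = {
    "python": ["libpython", "perl"],
    "libpython": ["libc"],
    "perl": ["perl-base", "libc"],
    "perl-base": [],
    "libc": [],
}


def resolve(package, depends, exclude=frozenset()):
    """Collect transitive dependencies, skipping excluded packages."""
    collected = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in collected or current in exclude:
            continue  # an excluded package's own dependencies are never visited
        collected.add(current)
        stack.extend(depends.get(current, []))
    return sorted(collected)


full = resolve("python", DEPENDS)
pruned = resolve("python", DEPENDS, exclude={"perl"})
print(set(full) - set(pruned))
```

In this toy table, excluding `perl` also drops `perl-base`, which nothing else needs, while `libc` stays because `libpython` still requires it.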