The ultimate test of your Docker Image: Running in GitHub Actions


I thought it would be simple…

Peter Flook


I would like to think I know Docker by now. Credit to Docker.

I sometimes think I know almost everything I need for running and deploying Docker images. I’ve run them via docker run and docker compose, inside Minikube, with Helm, and in Kubernetes clusters with strict security requirements, as both jobs and applications. You name it. But then along came GitHub Actions, and it made me rethink my Docker knowledge. I ran into a number of problems, which I’ll go through below along with their solutions.

Background

I have an existing Docker image that can run as a job or an application, and I wanted to run it as a job within GitHub Actions. To make it easier for others to use my image, I created a composite GitHub Action that first checks out a repository, then runs some JavaScript code that executes the docker run command for my image. When running my action locally, everything worked as expected. But once it ran in GitHub Actions, the errors started rolling in.
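For readers unfamiliar with composite actions, a minimal sketch looks something like the below. The file names, step names, and the script path are illustrative assumptions, not the author’s actual action (see the linked repository for the real code):

```yaml
# action.yml — minimal composite action sketch: check out the repo,
# then run a Node.js script that builds and executes `docker run`
name: 'Run my Docker job'
runs:
  using: 'composite'
  steps:
    - uses: actions/checkout@v4
    - run: node ${{ github.action_path }}/run-docker.js
      shell: bash
```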

For reference, you can view all the code in this repository.

GitHub Actions are a great way to run all sorts of things, especially builds and releases. Credit GitHub.

Errors I Faced

Access Denied

This is where my confusion began. I was getting an access denied/permission denied error when writing to a folder that was part of a Docker volume mount.

docker run -d \
...
-v ${sharedFolder}:/opt/app/shared \
...
datacatering/data-caterer-basic:0.11.7

Okay, we should be able to solve this by ensuring we have write permission on the folder when we create it. Since we create the folder via Node.js, we can set the permissions like so:

fs.mkdirSync(sharedFolder, { recursive: true, mode: 0o755 })

I then ran it and tried again, but still got the same error! This prompted me to check the Node.js documentation, where I found that fs.mkdirSync uses a mode of 0o777 by default, so an even more permissive mode had already been failing before I set one. That got me thinking the problem was a mismatch between the user on the host and the user in the Docker image.

Alter user

We can alter the user inside the Docker container via the docker run command’s --user argument. Essentially, you can pass any user to it and Docker will run the container as that user (you can check here for the documentation). A snippet of the code is below:

const uid = process.getuid()
const gid = process.getgid()
const user = `--user ${uid}:${gid}`
...
docker run -d ${user} \
...
datacatering/data-caterer-basic:0.11.7

No more permission denied error! But now a new error.

Name is null

Now we are getting an exception from Hadoop stating that name is null (same as this error).

2020-07-05T12:19:40.863653699Z Caused by: \
javax.security.auth.login.LoginException: java.lang.NullPointerException: \
invalid null input: name

After a bit of investigation, I found that Hadoop tries to get the username of the user running the process. It must be getting null because we set a user ID without an associated username inside our Docker image: running the image with --user doesn’t add a new entry to the /etc/passwd file, so there is no username associated with the GitHub Actions user ID.


I found a useful resource about using volume mounts with docker run in GitHub Actions. My Docker image by default used a non-root user with ID 1000, whilst GitHub Actions effectively enforces user ID 1001 on its runners.

To give that user ID a username, I altered my Dockerfile so that my non-root user has user ID 1001.

USER root
# busybox/Alpine adduser takes -u (not --uid) for the user ID
RUN addgroup -S app \
&& adduser -S app -G app -u 1001
...

USER app

Once done, all was right with the world again: my Docker image runs smoothly in GitHub Actions.

How do others deal with this problem?

Spark has its own way of dealing with the problem: it inserts the user into the /etc/passwd file directly through its ENTRYPOINT script. I tried replicating this but ran into a permission denied error when writing to /etc/passwd, since I was running as non-root (as I expected). I wasn’t sure how this works in the official Spark image, but I assumed the script must run as root before changing to a non-root user (EDIT: it does run as root, as seen in their Dockerfile here).
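The pattern can be sketched roughly as below. This is paraphrased, not Spark’s verbatim entrypoint; the comment field and home directory are placeholders, and the passwd file path is parameterised purely so the sketch can be exercised without root:

```shell
# Append a passwd entry for the current UID if none exists, so that
# username lookups (like the JVM's) succeed. Against the real
# /etc/passwd this needs root, which is why it must run before the
# container drops privileges. $1 lets the sketch target another file.
ensure_passwd_entry() {
  passwd_file="${1:-/etc/passwd}"
  myuid="$(id -u)"
  mygid="$(id -g)"
  if ! getent passwd "$myuid" > /dev/null 2>&1; then
    echo "${myuid}:x:${myuid}:${mygid}:anonymous:/opt/app:/bin/false" >> "$passwd_file"
  fi
}
```

After this runs, the UID set via --user has a username, and lookups that previously returned null succeed.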

Conclusion

This takes us back to the promise of Docker: an image should run anywhere. If your image uses a volume mount, though, it may not run everywhere. If the user on your host differs from the user inside the Docker image, it may not work.

It also brings me to another point: knowing Linux. I’m nowhere close to knowing how Linux works or what the best practices are, but I have a working knowledge of how to get things going. I would love to hear opinions from those more familiar with Linux on how their knowledge could be applied in this situation.

This led me to change insta-infra to not persist data by default. I suspect this was the reason a user was getting permission denied when trying to run Kafka with persisted data.