A Practice of Sharing Data Between Docker Containers and the Host

DONG Yuxuan @ May 19, 2020 Asia/Shanghai

Docker makes deploying programs very handy. It builds an isolated container for your application and the environment it needs. However, things can become painful when you want to relax some of that isolation. Two such cases are sharing files with the host, especially handling the permissions of shared files, and using large static system data such as fonts. I struggled for a few days to containerize my project with Docker and ended up with some useful solutions.

Sharing Files

When a program inside the container needs data on the host, we mount host directories into the container.

docker run -v ~/local/var/app:/usr/local/var/app appimg

Suppose ~/local/var/app is owned by the user smith with UID 501 on a macOS host and has the permissions rwxr-xr-x, while the container runs as the user www-data with UID 33. Your program will then not be permitted to write to /usr/local/var/app, because permission checks compare numeric UIDs and UID 33 is not the owner.
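
The mismatch is easy to confirm by comparing numeric IDs on both sides. The following is a minimal sketch, not part of my project: owner_ids is a hypothetical helper name, and it assumes only the POSIX ls -n listing and awk.

```shell
#!/bin/sh
# owner_ids (hypothetical helper): print the numeric UID, numeric GID and
# mode string of a path, taken from the POSIX "ls -ldn" long listing.
owner_ids() {
	ls -ldn "$1" | awk '{ print $3, $4, $1 }'
}

# On the host:
#   owner_ids ~/local/var/app
# Inside the container, the ID to compare against is whatever "id -u"
# reports for the user the program runs as.
```

For the directory described above, the first field would be 501 while the program in the container runs with UID 33, so only the "other" permission bits (r-x) apply to it.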

Running the containerized program as root is not a good idea. Even if you don’t care about security, running as root is sometimes impossible or difficult. For example, the image of my project inherits from the ubuntu:18.04 image and installs apache2 with apt at build time. The apache2 package from apt forbids running workers as root by default. If I really wanted my program to run as root, I would have to compile Apache httpd from source during the build. That’s painful.

Besides the web server, my project also includes a daemon program. The daemon and the web server must both have full permissions on /usr/local/var/app. Since apache2 runs its workers as www-data by default, I decided to run the daemon as www-data too. However, docker run --user www-data appimg fails, because www-data has no permission to launch Apache httpd even though the workers run as that user.

My final solution is to run the container as root and use sudo to run the app daemon as www-data in the entrypoint. Before all this, I use usermod to align the UID and GID of www-data with those of the owner of ~/local/var/app on the host.

CMD usermod -u $UID www-data \
	&& usermod -G $GID www-data \
	&& sudo -Ebu www-data appd \
	&& apache2ctl -D FOREGROUND

To run the container on my machine, just use docker run -e to set $UID and $GID.

docker run \
	-e UID=501 \
	-e GID=20 \
	-v ~/local/var/app:/usr/local/var/app \
	appimg

During development, the mounted directory is usually owned by your current account, so you don’t need to explicitly specify the numeric UID and GID. You can pass your shell’s variables instead, as follows.

docker run \
	-e UID=$UID \
	-e GID=$GID \
	-v ~/local/var/app:/usr/local/var/app \
	appimg
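
One caveat: bash defines $UID but not $GID (zsh defines both), so in bash the second -e above passes an empty value. A more portable sketch asks id for the numbers instead. run_app is a hypothetical wrapper name; the image and paths are the ones used above.

```shell
#!/bin/sh
# run_app (hypothetical wrapper): start the container with the caller's
# numeric UID and primary GID, obtained from id(1) rather than from
# shell-specific variables.
run_app() {
	docker run \
		-e UID="$(id -u)" \
		-e GID="$(id -g)" \
		-v "$HOME/local/var/app:/usr/local/var/app" \
		appimg
}
```

Calling run_app then passes the same two variables as the command above, whichever shell is in use.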

The advantage of this solution is portability: UID and GID are not hardcoded and can be set at runtime. The disadvantage is that the app can’t run as root, because usermod fails with a conflict when asked to give www-data a UID that is already in use, such as 0.
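
Because a bad ID only surfaces as a usermod error at startup, the entrypoint can check the values first. This is a minimal sketch, assuming the IDs arrive as plain decimal strings; valid_id is a hypothetical helper, not something my project ships.

```shell
#!/bin/sh
# valid_id (hypothetical helper): succeed only for a non-empty, purely
# numeric ID other than 0.
valid_id() {
	case "$1" in
		''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit
	esac
	[ "$1" -ne 0 ]                # 0 belongs to root and would conflict
}

# Possible use at the top of the entrypoint:
#   valid_id "$UID" && valid_id "$GID" || {
#       echo "UID and GID must be non-zero numbers" >&2
#       exit 1
#   }
```

It only catches the root case. Any other ID already taken inside the image would still make usermod fail.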

Static Data

My project processes PDF files in a container that inherits from ubuntu:18.04. The base image ships with very few fonts, but my program needs more.

Putting fonts in the code repository is a bad idea. First, the repository would become too large. Second, there may be legal issues.

The first idea that came to my mind was to download fonts at build time. But the shortcoming is apparent: it solves the repository size issue but not the legal one. We can’t download copyrighted fonts.

How do regular programs handle fonts? To avoid legal issues, applications such as web apps usually don’t ship with fonts but use whatever is already installed on the system. Containerized programs can do the same.

Thus the second idea was to COPY system fonts into the container. But it fails too, because the fonts are not in the build context. If we put fonts in the context, we put them in the repository.

My final solution is to mount the directory of system fonts into the container and run fc-cache -vf to update the font cache at the beginning of the entrypoint script.

Combined with the way I handle permissions, the whole CMD directive becomes the following.

CMD usermod -u $UID www-data \
	&& usermod -G $GID www-data \
	&& fc-cache -vf \
	&& sudo -Ebu www-data appd \
	&& apache2ctl -D FOREGROUND

A typical command to run the container on Linux could be:

docker run \
	-e UID=$UID \
	-e GID=$GID \
	-v ~/local/var/app:/usr/local/var/app \
	-v /usr/share/fonts:/usr/share/fonts \
	appimg
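
The font mount only helps if the host directory exists, which the command above assumes. A small hedged sketch that checks first: font_mount_arg is a hypothetical helper, and its default path is simply the one used above.

```shell
#!/bin/sh
# font_mount_arg (hypothetical helper): print a -v argument for the host
# font directory, or fail if that directory does not exist.
font_mount_arg() {
	dir=${1:-/usr/share/fonts}
	if [ -d "$dir" ]; then
		printf '%s:/usr/share/fonts\n' "$dir"
	else
		echo "font directory not found: $dir" >&2
		return 1
	fi
}

# Possible use:
#   fonts=$(font_mount_arg) && docker run \
#       -e UID="$(id -u)" -e GID="$(id -g)" \
#       -v ~/local/var/app:/usr/local/var/app \
#       -v "$fonts" appimg
```

On a host that keeps its fonts elsewhere, the directory can be passed as the first argument.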

The advantage of this solution is that it neither bloats the repository nor causes legal issues. The disadvantage is that fc-cache -vf is slow, so the container starts slowly.