An operating system process is a single execution of a task. That execution depends on an environment containing the resources it needs to run successfully. Isolating a process takes several steps: isolating the process's view of the outside world, then its dependencies, then the resources it needs to run, then its access to different features of the underlying system, then communication between isolated processes. The list gets out of hand quickly.
We will start with the first step: isolating the view of a process.
Containerization at its core is process isolation. At any given point, a process consists of the program that is running, the memory allocated to it, the CPU state, a list of open files, and other resources such as I/O devices. To isolate a process we can use tools provided by the operating system kernel.
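To make the pieces of a process a little more concrete, here is a minimal sketch that reads a few of them back from procfs. It assumes a Linux system with /proc mounted; `describe_process` is a hypothetical helper name used only for illustration.

```shell
# Print a few pieces of per-process state that the kernel exposes under /proc.
# Assumes Linux with procfs mounted at /proc; describe_process is a
# hypothetical helper, not a standard tool.
describe_process() {
    local pid="${1:-$$}"
    # The program being run: a symlink to the executable on disk.
    echo "program: $(readlink "/proc/$pid/exe")"
    # Resident memory, taken from the VmRSS line of the status file.
    echo "memory: $(awk '/^VmRSS:/ {print $2, $3}' "/proc/$pid/status")"
    # Scheduler state (running, sleeping, ...), also from the status file.
    echo "state: $(awk '/^State:/ {print $2, $3}' "/proc/$pid/status")"
    # Open file descriptors: each entry links to a file, pipe, or device.
    echo "open fds: $(ls "/proc/$pid/fd" | wc -l)"
}

# Describe the current shell.
describe_process "$$"
```

Everything printed here is what isolation later has to account for: the program and its files on disk, and the view of other processes that /proc exposes.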
15th November 2024
Process isolation in Linux relies on certain Linux kernel features to keep the views of different processes isolated from one another.
16th November 2024
Let's create a new root directory for our isolated process. From now on we will refer to this process as a container for the sake of brevity. Because we are starting from scratch in a new environment, we will make sure we have at least some of the basics. After creating the directory structure for the container, we copy in the bash and ls commands for some initial navigation through the environment.
mkdir new_root
mkdir -p new_root/{bin,lib,lib64}
# copy the commands you want in your container
cp /bin/{bash,ls} new_root/bin/
Let's cut off access to all the resources in our current root directory and jump into the new root, where we have a view of nothing except the commands ls and bash.
Let's use chroot to change our root to the new directory, new_root.
sudo chroot ./new_root /bin/bash
chroot: failed to run command ‘/bin/bash’: No such file or directory
Here /bin/bash fails to run because its dependencies are missing from the new root. The error does not say which dependencies those are, and it would be helpful if it did. So we first copy in the shared object for the dynamic linker, since that is what resolves a program's dependencies at run time.
cp /lib/ld-linux-aarch64.so.1 new_root/lib/
sudo chroot ./new_root /bin/bash
/bin/bash: error while loading shared libraries: libtinfo.so.6: cannot open shared object file: No such file or directory
Running /bin/bash in the new environment again now produces an error that names a missing dependency. Let's check which dependencies our two programs, bash and ls, have. The ldd command helps us do that.
ldd /bin/{bash,ls}
/bin/bash:
linux-vdso.so.1 (0x0000ffff8f630000)
libtinfo.so.6 => /lib/aarch64-linux-gnu/libtinfo.so.6 (0x0000ffff8f430000)
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff8f280000)
/lib/ld-linux-aarch64.so.1 (0x0000ffff8f5f7000)
/bin/ls:
linux-vdso.so.1 (0x0000ffff7ff02000)
libselinux.so.1 => /lib/aarch64-linux-gnu/libselinux.so.1 (0x0000ffff7fe50000)
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff7fca0000)
/lib/ld-linux-aarch64.so.1 (0x0000ffff7fec9000)
libpcre2-8.so.0 => /lib/aarch64-linux-gnu/libpcre2-8.so.0 (0x0000ffff7fc00000)
Copy these dependencies to the to-be-isolated root.
cp /lib/aarch64-linux-gnu/libtinfo.so.6 /lib/aarch64-linux-gnu/libc.so.6 /lib/ld-linux-aarch64.so.1 /lib/aarch64-linux-gnu/libselinux.so.1 /lib/aarch64-linux-gnu/libpcre2-8.so.0 new_root/lib/
Now that the necessary dependencies are copied, let's change root into the new root again.
sudo chroot ./new_root /bin/bash
bash-5.1# ls
bin lib lib64
We can run the commands we copied in, but not the ones we didn't, such as ps. So let's try ps and see which processes are visible.
bash-5.1# ps -aux
bash: ps: command not found
Let's exit the container, copy the command and its dependencies, and try again.
bash-5.1# exit
Copy the command.
cp /bin/ps ./new_root/bin/
Use the following command to print only the locations of the dependencies, so that we can loop over them.
ldd /bin/ps | awk '{print $3}' | grep -v '^$'
/lib/aarch64-linux-gnu/libprocps.so.8
/lib/aarch64-linux-gnu/libc.so.6
/lib/aarch64-linux-gnu/libsystemd.so.0
/lib/aarch64-linux-gnu/liblzma.so.5
/lib/aarch64-linux-gnu/libzstd.so.1
/lib/aarch64-linux-gnu/liblz4.so.1
/lib/aarch64-linux-gnu/libcap.so.2
/lib/aarch64-linux-gnu/libgcrypt.so.20
/lib/aarch64-linux-gnu/libgpg-error.so.0
Now, let's copy the above to our new root using the following command.
for dep in $(ldd /bin/ps | awk '{print $3}' | grep -v '^$'); do cp --parents "$dep" ./new_root; done
Now, let's run our ps command to see what happens next.
sudo chroot ./new_root /bin/bash
bash-5.1# ps
Error, do this: mount -t proc proc /proc
bash-5.1# exit
exit
OK, it seems the `ps` command knows how to help us solve this problem. Let's copy the mount command into our container along with its dependencies.
cp /bin/mount ./new_root/bin/
for dep in $(ldd /bin/mount | awk '{print $3}' | grep -v '^$'); do cp --parents "$dep" ./new_root; done
Now, let's run the mount command in the container, since we copied it in order to do what ps told us to do.
sudo chroot ./new_root /bin/bash
bash-5.1# mount
mount: failed to read mtab: No such file or directory
bash-5.1# exit
Now it looks like we need an mtab. /etc/mtab is a symlink, and we can follow the chain below.
ll /etc/mtab
lrwxrwxrwx 1 root root 19 Oct 2 07:43 /etc/mtab -> ../proc/self/mounts
Even /proc/mounts is a symlink.
ll /proc/mounts
lrwxrwxrwx 1 root root 11 Nov 15 21:04 /proc/mounts -> self/mounts
If we mount a procfs at the proc directory of new_root, both the ps command and the mount command work, which finally gets ps running.
# the mount point has to exist before mounting
mkdir -p new_root/proc
sudo mount -t proc wavey ./new_root/proc
Run the mount command inside the chroot environment. It works properly once /proc is mounted.
bash-5.1# mount
wavey on /proc type proc (rw,relatime)
Run the ps command inside the chroot environment. The command works, but we still have a view of all the other processes running on the system. When isolating a process, it is important to ensure that the process has no lens into the environment outside it.
ps -aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
0 1 0.0 0.2 167900 11144 ? Ss Nov16 0:11 /sbin/init
...
To ensure that a process has access only to certain files, and a view only of its own processes, we are going to use namespaces.
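Before creating new namespaces, it can help to see the ones a process already belongs to. The sketch below assumes a Linux system with /proc mounted; `list_namespaces` is a hypothetical helper name. Each entry under /proc/<pid>/ns is a symlink whose target looks like `pid:[...]`, and two processes share a namespace when those identifiers match.

```shell
# List the namespaces a process belongs to, one per line.
# Assumes Linux with procfs at /proc; list_namespaces is a hypothetical helper.
list_namespaces() {
    local pid="${1:-$$}"
    local ns
    for ns in /proc/"$pid"/ns/*; do
        # Left column: namespace type. Right column: type plus identifier.
        printf '%-20s %s\n' "$(basename "$ns")" "$(readlink "$ns")"
    done
}

# Show the namespaces of the current shell.
list_namespaces "$$"
```

Running the same function on the host and inside the container is one way to check whether a given namespace is actually different between the two.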
(At this point I was having a hard time figuring out the usage of the unshare command. Eric Chiang's blog post on containers from scratch really helped me get a clearer idea.)
In the above run of ps, we get back information about all the other processes running on the underlying system. Let's put this process in a namespace so that this doesn't happen. The command below lets us run a command in a new namespace.
sudo unshare --pid --fork --mount-proc=$PWD/new_root/proc chroot ./new_root /bin/bash
As you can see above, we are using the --fork, --pid, and --mount-proc flags.
With --fork, unshare forks a child process and runs the command in it, so that the command becomes PID 1 in its namespace. This matters because creating a PID namespace with --pid does not move the calling process into it, and the identifiers of existing processes do not change; only children created afterwards are placed in the new namespace. Forking ensures that the command itself starts inside the new namespace instead of carrying over an identifier from the old one. When we create a PID namespace with --pid and leave out the --fork flag, we run into the following error.
The child process is not able to fork further children.
bash-5.1# ps
PID TTY TIME CMD
2060 ? 00:00:00 sudo
2061 ? 00:00:00 bash
2062 ? 00:00:00 ps
bash-5.1# ps
bash: fork: Cannot allocate memory
bash-5.1# ls
bash: fork: Cannot allocate memory
The shell is able to run only one command; every command after that fails because the shell can no longer fork. The first child (here, the first ps) became PID 1 of the new namespace, and once it exited the kernel refused to create any more processes in that namespace.
Being able to fork new processes, each with its own process identifier (PID), is essential, which makes --fork an important part of creating a PID namespace.
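The effect of --fork can be checked with a small sketch like the one below. It is not the command used in this post: to avoid sudo it adds `--user --map-root-user`, which is an assumption that unprivileged user namespaces are enabled on the machine. `pid_inside_new_namespace` is a hypothetical helper name, and it prints "unavailable" when unshare cannot create the namespaces.

```shell
# Report the PID a shell sees for itself inside a fresh PID namespace.
# Assumption: unprivileged user namespaces are enabled; --user and
# --map-root-user are used here only so the sketch runs without sudo.
pid_inside_new_namespace() {
    # --fork makes unshare run sh as a child, so sh is the first process
    # in the new PID namespace and reports its own PID via $$.
    unshare --user --map-root-user --pid --fork sh -c 'echo $$' 2>/dev/null \
        || echo "unavailable"
}

pid_inside_new_namespace
```

Where the namespaces can be created, the shell should report itself as PID 1, which is the property the --fork flag is there to provide.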
Now that we have isolated the PIDs, it's time to isolate the view of the processes.
Sadly, the view of the other processes comes along when we bring in the proc mount to get a view of our own namespace. Let's mount a new proc virtual filesystem, called wavey, onto /proc of our new root.
sudo mount -t proc wavey ./new_root/proc
With that, we have a procfs mount point in the new root. It does not yet hold a view specific to our namespace, but we will use it as the base for our new procfs, which needs to be a virtual filesystem.
After doing the above, entering the process jail, and running mount to list the mounts in the container, we get the output below.
bash-5.1# mount
wavey on /proc type proc (rw,relatime)
Let's create a procfs specifically for the namespace we are creating with unshare, using the --mount-proc option. The full command is below.
sudo unshare --pid --fork --mount-proc=$PWD/new_root/proc chroot ./new_root /bin/bash
After giving the procfs a new location that has no information about the other processes, let's run the mount command again.
bash-5.1# mount
wavey on /proc type proc (rw,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
The second entry above is the procfs mount we just made. When we run the ps command in the container, we see the following.
bash-5.1# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
0 1 0.0 0.0 3984 3200 ? S 19:18 0:00 /bin/bash
0 2 0.0 0.0 6408 2380 ? R+ 19:21 0:00 ps aux
No processes can be seen apart from the ones running in the context of this container.
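The manual copy steps used earlier for bash, ls, ps, and mount can be folded into one helper. This is a sketch under a few assumptions: GNU cp (for `--parents`) and ldd are available, and `copy_with_deps` is a hypothetical name rather than an existing tool. Unlike the `awk '{print $3}'` loop, it also picks up the dynamic linker line, which ldd prints without a `=>`.

```shell
# Copy a binary and the shared libraries reported by ldd into a new root,
# preserving the original directory layout under that root.
# copy_with_deps is a hypothetical helper; it assumes GNU cp and ldd.
copy_with_deps() {
    local binary="$1" root="$2" dep
    mkdir -p "$root"
    cp --parents "$binary" "$root"
    # ldd prints "name => /path (addr)" for most libraries and
    # "/path (addr)" for the dynamic linker. Take the first field on each
    # line that is an absolute path; lines without one (vdso) are skipped.
    ldd "$binary" |
        awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }' |
        while read -r dep; do
            cp --parents "$dep" "$root"
        done
}
```

With this in place, something like `copy_with_deps /bin/ps ./new_root` would stand in for the separate cp and for-loop steps, with libraries landing under the same paths they have on the host.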
25th November 2024
With that, our process does not have any lens into the underlying system and is isolated from the other running processes. We can isolate it further with other kinds of namespaces.
Now that the PID namespace is in place, we can namespace other aspects of this process.
We would also like to have a virtualized view of time for our process.
If we go into the container and run uptime, we get the following output.
bash-5.1# uptime -p
up 3 hours, 30 minutes
If we would like to start our process about 9 years into the future, we can run the following command.
sudo unshare --fork --pid --time --boottime 300000000 --mount-proc=$PWD/new_root/proc chroot ./new_root /bin/bash
Running uptime afterwards gives us the following output.
bash-5.1# uptime -p
up 9 years, 28 weeks, 8 hours, 56 minutes
If you would like to learn more about time namespaces, check out the article below.
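As a rough check on the `--boottime` value used above, the offset is given in seconds, and a quick conversion shows why 300000000 lands a little over nine years out. The sketch uses 365-day years, which is an approximation, and `offset_to_years_days` is a hypothetical helper name.

```shell
# Convert a boot-time offset in seconds into whole years and leftover days.
# Uses 365-day years, so leap days are ignored (an approximation).
offset_to_years_days() {
    local seconds="$1"
    local days=$(( seconds / 86400 ))   # 86400 seconds per day
    echo "$(( days / 365 )) years, $(( days % 365 )) days"
}

offset_to_years_days 300000000   # prints "9 years, 187 days"
```

The uptime reported inside the container is this offset added to the real uptime, formatted by uptime in its own units.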
With that we have gotten a glimpse into creating a new root and a namespaced view for the process. In the next iteration, we are going to explore namespaces in Linux further and see what it means to isolate a process in terms of resource usage.
