As part of our ongoing transition from Jenkins to OpenShift, we’re currently porting our testing infrastructure to OpenShift.
Our tests involve installing and running our product, NetEye, in a container. The installation requires a working systemd environment inside the container, and systemd needs to run as PID 1 and as the root user (UID 0). Until now we’ve been able to do this by running the testing container as privileged.
Now, with OpenShift, we would like to run the container unprivileged and drop as many capabilities as possible while still being able to run NetEye correctly.
We are currently running OpenShift Container Platform 4.11. For the work described in this blog post we followed the instructions provided by Marco Caimi, solution architect at Red Hat. We also took inspiration from previous work on the subject described in a blog post by Fraser Tweedale.
In this first blog post I’ll show you the config changes needed on the OpenShift side. In the next one I’ll show you how to test my proposed solution.
By default, OpenShift uses cgroups v1. When we contacted Red Hat support, they suggested that we enable cgroups v2 and user namespaces in CRI-O (OpenShift’s implementation of the Kubernetes Container Runtime Interface). Together, these allow us to have root privileges inside the container, but not outside it. To do this, we’ll exploit a feature of cgroups v2 and CRI-O: namely, we’ll map an unprivileged uid outside the container to the root uid inside the container, and we won’t use the privileged SCC (security context constraint). The only SCC required will be anyuid.
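To make the uid mapping more concrete, here is a minimal sketch of how it can be inspected from inside a user namespace. It relies only on the kernel's /proc/<pid>/uid_map format, where each line reads "first ID inside the namespace, first ID outside the namespace, range length". The function name host_uid_for_root and the sample range starting at 100000 are hypothetical, chosen purely for illustration; the range actually assigned by CRI-O will differ.

```shell
# Print the host UID that UID 0 inside the namespace maps to.
# Reads uid_map content on stdin; each line is:
#   <first ID inside ns> <first ID outside ns> <range length>
host_uid_for_root() {
  awk '$1 == 0 { print $2; exit }'
}

# Hypothetical mapping: container UIDs 0-65535 map to host UIDs 100000-165535.
printf '0 100000 65536\n' | host_uid_for_root
```

Inside a running container the same function could be fed the real mapping with host_uid_for_root < /proc/self/uid_map. A result other than 0 indicates that root in the container is an unprivileged user on the host.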
Since user namespaces are not enabled by default in CRI-O (see tutorials/userns.md in the cri-o/cri-o repository), we need to enable them with a machine config:
cat /etc/crio/crio.conf.d/99-workload-userns.conf
[crio.runtime.workloads.userns]
activation_annotation = "io.kubernetes.cri-o.userns-mode"
allowed_annotations = ["io.kubernetes.cri-o.userns-mode"]
Furthermore, by default the kernel is booted with cgroups v1, so we want to add the following to the kernel boot parameters:
cgroup_no_v1=all psi=1 systemd.unified_cgroup_hierarchy=1
We’ll use the following MachineConfig to apply the necessary changes on the master nodes:
---
# enable_user_ns.yml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-enable-cgroupv2-on-masters
  labels:
    machineconfiguration.openshift.io/role: master
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        # see https://github.com/cri-o/cri-o/blob/main/tutorials/userns.md
        - path: /etc/crio/crio.conf.d/99-workload-userns.conf
          overwrite: true
          contents:
            source: data:text/plain;charset=utf-8;base64,W2NyaW8ucnVudGltZS53b3JrbG9hZHMudXNlcm5zXQphY3RpdmF0aW9uX2Fubm90YXRpb24gPSAiaW8ua3ViZXJuZXRlcy5jcmktby51c2VybnMtbW9kZSIKYWxsb3dlZF9hbm5vdGF0aW9ucyA9IFsiaW8ua3ViZXJuZXRlcy5jcmktby51c2VybnMtbW9kZSJdCg==
          mode: 420
  kernelArguments:
    - systemd.unified_cgroup_hierarchy=1
    - cgroup_no_v1="all"
    - psi=1
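The source field of the MachineConfig carries the CRI-O drop-in file as a base64 data URL, which is hard to review by eye. As a quick sanity check, the payload can be decoded locally before deploying. The sketch below assumes a base64 tool that accepts -d (as GNU coreutils does); the function name decode_crio_payload is only illustrative.

```shell
# Decode the base64 payload from the MachineConfig's "source" data URL
# (everything after "base64,") and print the resulting CRI-O drop-in file.
decode_crio_payload() {
  payload='W2NyaW8ucnVudGltZS53b3JrbG9hZHMudXNlcm5zXQphY3RpdmF0aW9uX2Fubm90YXRpb24gPSAiaW8ua3ViZXJuZXRlcy5jcmktby51c2VybnMtbW9kZSIKYWxsb3dlZF9hbm5vdGF0aW9ucyA9IFsiaW8ua3ViZXJuZXRlcy5jcmktby51c2VybnMtbW9kZSJdCg=='
  printf '%s' "$payload" | base64 -d
}

decode_crio_payload
```

The decoded text should match the three-line userns workload configuration shown earlier. Going the other way, a local copy of the drop-in file could be encoded with base64 -w0 (a GNU coreutils option) to produce the payload.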
And then deploy the changes:
oc create -f ~/enable_user_ns.yml
We can check that the config has been created:
oc get machineconfig | grep 99-enable-cgroupv2-on-masters
99-enable-cgroupv2-on-masters 3.2.0 8d
and that the changes have been deployed by monitoring the status of the MachineConfigPools and waiting until the master nodes are updated:
oc get mcp
NAME    CONFIG                                            UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
master  rendered-master-c2ac180ae7398bc7cec20106a2e41cbe  True     False     False     3             3                  3                    0                     107d
worker  rendered-worker-aedd42003621dc4c437a98e3c157a1fd  True     False     False     2             1                  2                    0                     107d
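When this check runs in a script rather than by eye, the pool status can be read from the same table. The following is a rough sketch, not an official tool: the helper name mcp_is_updated is made up, and it assumes the column order shown above (UPDATED, UPDATING and DEGRADED as the third, fourth and fifth columns) and a non-empty CONFIG column.

```shell
# Succeed only if the named pool reports UPDATED=True, UPDATING=False and
# DEGRADED=False. Expects the text output of "oc get mcp" on stdin.
mcp_is_updated() {
  awk -v pool="$1" '
    $1 == pool { found = 1; ok = ($3 == "True" && $4 == "False" && $5 == "False") }
    END { exit !(found && ok) }
  '
}

# Example with captured output; on a cluster this would be:
#   oc get mcp | mcp_is_updated master
printf '%s\n' \
  'NAME CONFIG UPDATED UPDATING DEGRADED' \
  'master rendered-master-c2ac180ae7398bc7cec20106a2e41cbe True False False' \
  | mcp_is_updated master && echo "master pool is up to date"
```

A polling loop around this function with a sleep between attempts would cover the waiting step, since the nodes reboot one at a time while the pool is updating.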
We can also check the boot parameters of the master nodes so we can be sure that the configuration has been deployed and the machine has been rebooted:
cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/ostree/ [...] systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=1
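The same check can be scripted so that a missing parameter is reported explicitly. This is a small sketch under the assumption that the three arguments appear on the command line exactly as listed in this post; the function name has_cgroupv2_args is hypothetical.

```shell
# Verify that a kernel command line contains all three cgroups v2 arguments.
# Takes the command line as a single string argument.
has_cgroupv2_args() {
  cmdline=" $1 "   # pad with spaces so every argument is space-delimited
  for arg in systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=1; do
    case "$cmdline" in
      *" $arg "*) ;;                                   # argument present
      *) echo "missing: $arg" >&2; return 1 ;;         # report the first gap
    esac
  done
}

# Example using the (abbreviated) command line shown above; on a node this
# would be: has_cgroupv2_args "$(cat /proc/cmdline)"
has_cgroupv2_args 'BOOT_IMAGE=(hd0,gpt3)/ostree/ [...] systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=1' \
  && echo "cgroups v2 boot arguments present"
```

As an additional cross-check on the node itself, stat -fc %T /sys/fs/cgroup prints cgroup2fs when the unified cgroups v2 hierarchy is mounted.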
The only SCC permission required is anyuid. For testing purposes, let’s create a service account, systemd-test, in the default namespace with the correct permissions:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: systemd-test
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: 'system:openshift:scc:anyuid'
  namespace: default
subjects:
  - kind: ServiceAccount
    name: systemd-test
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: 'system:openshift:scc:anyuid'
In my next blog post I’ll show you how to run a test container with the newly modified configuration and account, and how to verify that the modification was successful.