10 Kubernetes Security Context

Yehor Salo

DevOps | DevSecOps

Published: March 30, 2021

1. runAsNonRoot [P/C]

Even though a container uses namespaces and cgroups to restrict its processes, a single misconfigured deployment setting can give those processes access to resources on the host. If such a process runs as root, it has the same access to those resources as the host's root account. In addition, if other pod or container settings are used to relax constraints (such as procMount or capabilities), running with a root UID increases the risk that they will be exploited. Unless you have a very good reason, you should never run a container as root.

So what do you do if the image you need to deploy runs as root?

Often, base images already provide a suitable user, but its use is left to the discretion of the development or deployment teams. For example, the official Node.js image comes with a user named node with UID 1000 that you can run as, but the image's Dockerfile does not explicitly set it as the current user. We need to either configure it at runtime using the runAsUser setting, or change the current user in the image using a derived Dockerfile. The first approach assumes that UID 1000 can read the files mounted in the app volume, and running an application from a separate volume is not a very common setup anyway. Instead, let's look at an example of using a derived Dockerfile to build our own image.

Without going too deep into creating images, let's say we have an npm package ready. Here is a minimal Dockerfile that builds an image based on node:slim and runs as the provided node user.

FROM node:slim
# Copy the app into the home directory with the right ownership
COPY --chown=node . /home/node/app/
# Switch the active user to "node"
USER node
# Switch the current directory to the app
WORKDIR /home/node/app
# This will now exec as the "node" user instead of root
ENTRYPOINT ["npm", "start"]

The USER command makes node the default user inside any container launched from this image.

Option 2: The user is not defined in the base image

So what do we do if the node base image does not define a user? In most cases, we simply create one in a new Dockerfile and use it. Let's extend the previous example to do this:

FROM node:slim
# Create a user
RUN useradd somebody -u 10001 --create-home --user-group
COPY --chown=somebody . /home/somebody/app/
USER somebody
WORKDIR /home/somebody/app
ENTRYPOINT ["npm", "start"]

As you can see, the only change is the RUN command, which creates a new user - the syntax can vary depending on the distribution of the base image.

NOTE: this works fine for node.js and npm, but other tools may need to change the owner of other filesystem objects. If you run into any problems, refer to the documentation for your tool.
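
Once the image runs as a non-root user, the pod specification can ask Kubernetes to enforce that. The following is a minimal sketch rather than a definitive manifest: the pod name and the image name mycorp/node-app:1.0.0 are hypothetical, and the UID assumes the somebody user created with -u 10001 above. One detail worth knowing: when an image declares its user by name rather than by numeric UID, the kubelet cannot verify that the user is non-root and refuses to start the container, which is why the sketch also sets a numeric runAsUser.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: node-app                    # hypothetical pod name
spec:
  containers:
  - name: app
    image: mycorp/node-app:1.0.0    # hypothetical image built from the Dockerfile above
    securityContext:
      runAsNonRoot: true            # the container is not started if it would run as UID 0
      runAsUser: 10001              # assumed to match the UID passed to useradd
```

Alternatively, declaring the user numerically in the Dockerfile (for example USER 10001) should let runAsNonRoot be verified without setting runAsUser in the manifest.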

2. runAsUser / runAsGroup [P/C]

Container images may have a specific user and/or group configured to run the process. This can be overridden with the runAsUser and runAsGroup settings. They are often set together with a volume mount containing files with the same ownership IDs.

...
spec:
  containers:
  - name: web
    image: mycorp/webapp:1.2.3
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
...

Using these settings carries a risk: because you are changing parameters at container startup, they may be incompatible with the original image. For example, the official Jenkins CI server image, jenkins/jenkins, runs as the user:group jenkins:jenkins, and all of its application files are owned by that user. If we configure a different user, the container will not start because that user does not exist in the image's /etc/passwd file. Even if it did exist, it would most likely have problems reading and writing the files owned by jenkins:jenkins. You can verify this by running a simple Docker command:

$ docker run --rm -it -u eric:eric jenkins/jenkins
docker: Error response from daemon: unable to find user eric: no matching entries in passwd file.

As mentioned above, it is a very good idea not to run container processes as the root user, but you should not rely on the runAsUser or runAsGroup settings alone to guarantee this. What if someone removes these settings? Remember to also set runAsNonRoot to true.

3. seLinuxOptions [P/C]

SELinux is a policy-driven access control system for Linux applications, processes, and files. It is implemented on top of the Linux Security Modules framework in the Linux kernel. SELinux is based on the concept of labels: it applies labels to all elements in the system and uses them to group elements together. These labels are known as the security context (not to be confused with the Kubernetes securityContext) and are composed of a user, a role, a type, and an optional level field, in the format user:role:type:level.

SELinux then uses policies to define which processes in a particular context can access other labeled objects in the system. SELinux can enforce its policies, in which case access is denied, or it can run in permissive mode, in which case it only logs the access. In containers, SELinux typically labels the container process and the container image in such a way as to restrict the process to accessing only the files within the image.

The default SELinux labels will be used by the container runtime when the container is instantiated. The seLinuxOptions parameter in securityContext allows custom SELinux labels to be applied. Be aware that changing the SELinux labeling for a container could potentially allow the container process to exit the container image and access the host's filesystem.

Note that this feature will only apply if the host operating system supports SELinux.
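
As a rough illustration of how a custom label is attached, the sketch below sets only the level part of the SELinux context at the pod level. The pod name and the level value are assumptions chosen for the example; the user, role, and type fields can be set the same way, and appropriate values depend entirely on the SELinux policy loaded on your nodes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: selinux-demo            # hypothetical pod name
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456"     # example MCS level; must make sense for the node's policy
  containers:
  - name: web
    image: mycorp/webapp:1.2.3
```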

4. seccompProfile [P/C]

Seccomp stands for secure computing mode and is a feature of the Linux kernel that restricts the system calls a given process can make from user space into the kernel. A seccomp profile is a JSON definition, usually consisting of a list of system calls with the action to take when one of them occurs, plus a default action for all other calls.

{
    "defaultAction": "SCMP_ACT_ERRNO",
    "architectures": [
        "SCMP_ARCH_X86_64",
        "SCMP_ARCH_X86",
        "SCMP_ARCH_X32"
    ],
    "syscalls": [
        {
            "name": "accept",
            "action": "SCMP_ACT_ALLOW",
            "args": []
        },
        {
            "name": "accept4",
            "action": "SCMP_ACT_ALLOW",
            "args": []
        },
        ...
    ]
}

This example is taken from https://training.play-with-docker.com/security-seccomp/

Kubernetes provides a mechanism for using custom profiles through the seccompProfile setting in securityContext.

seccompProfile:
      type: Localhost
      localhostProfile: profiles/myprofile.json

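There are three values available for the type field:

Localhost - the additional localhostProfile setting specifies the path to the seccomp profile, relative to the kubelet's seccomp profile directory on the node
Unconfined - no profile is applied
RuntimeDefault - the container runtime's default profile is used

Note that pods which do not specify a type get RuntimeDefault only if the kubelet is configured to apply it by default; otherwise they run Unconfined, so it is safest to set the type explicitly.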

You can apply these settings in either the PodSecurityContext or the container's securityContext. If both are set, the container-level settings in securityContext take precedence. Note that these options apply to Kubernetes v1.19 and later; earlier versions use a different syntax. For details and examples, refer to the documentation on the official Kubernetes website.

As with most security-related settings, the principle of least privilege applies here too. Give your container access only to the privileges it needs and nothing more. Start by creating a profile that simply logs which system calls are made, then test your application to build up the set of system calls it needs to be allowed. You can find more information on this process in the Kubernetes tutorials.
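
The logging-first approach described above might look like the following sketch. It assumes a profile file named profiles/audit.json has already been placed in the kubelet's seccomp directory on every node, and that its content is just a default action of SCMP_ACT_LOG; both the file name and the pod name are hypothetical. The pod-level setting provides a baseline, and the container-level setting overrides it for the container being profiled.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-demo                  # hypothetical pod name
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault            # baseline for every container in the pod
  containers:
  - name: web
    image: mycorp/webapp:1.2.3
    securityContext:
      seccompProfile:
        type: Localhost               # container-level setting wins over the pod-level one
        localhostProfile: profiles/audit.json   # assumed log-only profile present on the node
```

After exercising the application, the logged system calls can be collected into an allow list, and the container can be switched to a profile whose default action denies everything else.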

5. Avoid Privileged Containers / Escalations [C]

Granting privileged status to a container is dangerous, and it is usually done as a shortcut to obtain specific permissions. The container runtime decides how the privileged flag is implemented, but in effect it grants all capabilities to the container and lifts the restrictions enforced by cgroups. It may also modify the Linux Security Module configuration and allow processes inside the container to escape it.

Containers provide process isolation on the host, so even if a container runs as the root user, there are capabilities that the container runtime does not grant to it. When the privileged flag is set, the container runtime provides full access to the host file system, which makes this option extremely dangerous from a security point of view.

Avoid using the privileged flag and, if your container needs additional capabilities, add only the ones you need. Unless your container needs to manage system-level settings in the host's kernel, such as hardware access or network configuration, and needs access to the host's filesystem, it does not need the privileged flag.

For a deeper dive into Privileged Containers, we recommend the article Privileged Docker containers — do you really need them?
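
In manifest form, the recommendation above could be sketched as follows. The pod name is hypothetical. The allowPrivilegeEscalation setting is not described in the text above; it is included here on the assumption that it is the "escalations" half of this section's title, and it prevents a process from gaining more privileges than its parent.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: unprivileged-demo             # hypothetical pod name
spec:
  containers:
  - name: web
    image: mycorp/webapp:1.2.3
    securityContext:
      privileged: false               # explicit, although false is already the default
      allowPrivilegeEscalation: false # block setuid binaries and similar escalation paths
```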

6. Linux kernel capabilities [C]

Capabilities are kernel-level permissions that allow granular control over kernel call permissions, instead of running everything as root. Capabilities include the ability to change file permissions, manage the networking subsystem, and perform system-wide administration functions. You can manage capabilities through the Kubernetes securityContext. Individual capabilities, or a list of them, are expressed as an array of strings. Alternatively, you can use the shorthand all to add or drop every capability. This configuration is passed to the container runtime, which sets up the capabilities when the container is created. If securityContext has no capabilities section, the container is created with the default set of capabilities that the container runtime provides.

securityContext:
      capabilities:
        drop:
          - all
        add: ["MKNOD"]

It is recommended to drop all capabilities, and then add back only those that your application really needs. In many cases, applications do not actually require any capabilities to function normally. Disable all capabilities first and track the failures in the audit log to see which ones were blocked.

Note that when you list capabilities in securityContext, you omit the CAP_ prefix that the kernel uses in capability names. You can use the capsh utility, which displays the capabilities enabled in the container in a convenient format, but do not leave this utility in your final containers, because it makes it easy for an attacker to determine which capabilities are enabled. You can also check the enabled capabilities in the /proc/1/status file.

7. Running containers with read-only filesystem [C]

If your container is compromised and its file system is writable, an attacker can change its configuration, install software, and potentially launch other exploits. A read-only file system helps prevent such incidents by limiting the actions an attacker can take. In general, containers should not need to write to the container's file system. If your application has stateful data, you should use an external persistence method such as a database, a volume, or some other service. Also, make sure that all logs are written to standard output and/or sent to a log collection server, where they are processed centrally.
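
The setting behind this recommendation is readOnlyRootFilesystem in the container-level securityContext. The sketch below is one possible arrangement, assuming the application only needs a scratch directory at /tmp; the pod name, volume name, and mount path are illustrative, not something the application is known to require.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readonly-demo                 # hypothetical pod name
spec:
  containers:
  - name: web
    image: mycorp/webapp:1.2.3
    securityContext:
      readOnlyRootFilesystem: true    # the image's file system is mounted read-only
    volumeMounts:
    - name: tmp
      mountPath: /tmp                 # assumed scratch path; the only writable location
  volumes:
  - name: tmp
    emptyDir: {}                      # ephemeral storage, discarded with the pod
```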

8. procMount [C]

By default, the container runtime masks certain parts of the /proc file system from within the container to prevent potential security issues. However, there are times when you need access to them, especially when using nested containers, which are often used as part of a build process inside the cluster. There are only two valid values for this setting: Default, which keeps the standard container runtime behavior, and Unmasked, which removes all masking of the /proc file system.

Obviously, you should only use this setting if you really know what you are doing. If you are using it to build images, check the latest version of your build tool, since many no longer need it. If that applies to your tool, upgrade and revert to the Default procMount.

Finally, if you still need access to the masked parts of /proc, do so only for the nested container; never expose your host's /proc file system to a container.

9. fsGroup / fsGroupChangePolicy [P]

The fsGroup setting defines the group to which Kubernetes changes the ownership of all files in a volume when the volume is mounted by a Pod. This behavior is also controlled by fsGroupChangePolicy, which can be set to OnRootMismatch or Always. With OnRootMismatch, ownership and permissions are changed only if those of the volume's root directory do not already match the expected values.

Be careful when using fsGroup. Changing the group ownership of an entire volume can delay Pod startup on slow and/or large file systems. It can also harm other processes that share the same volume if they do not have permission to access the new GID. For this reason, some providers of shared file systems, such as NFS, do not implement this functionality. These settings also do not affect ephemeral volumes.
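
A pod-level configuration along these lines might look like the sketch below. The IDs, the mount path, and the PersistentVolumeClaim name webapp-data are all assumptions made for the example; whether the ownership change actually happens also depends on the volume type, as noted above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-demo                        # hypothetical pod name
spec:
  securityContext:
    runAsUser: 10001                        # example UID
    runAsGroup: 10001                       # example GID
    fsGroup: 10001                          # files in supported volumes become owned by this GID
    fsGroupChangePolicy: "OnRootMismatch"   # skip the recursive change when the root already matches
  containers:
  - name: web
    image: mycorp/webapp:1.2.3
    volumeMounts:
    - name: data
      mountPath: /data                      # assumed mount path
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: webapp-data                # hypothetical claim
```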

10. sysctls [P]

Sysctls are a Linux kernel feature that allows administrators to change kernel configuration. On a Linux host operating system, they are defined in /etc/sysctl.conf and can also be changed with the sysctl utility.

The sysctls setting in securityContext lets you modify specific sysctls in the container. Only a small subset of the operating system's sysctls can be modified per container: those that are namespaced in the kernel. Of these, some are considered safe, but most are considered unsafe, depending on their potential to affect other pods. Unsafe sysctls are usually disabled in the cluster and must be specifically enabled by the cluster administrator.

Given the potential for destabilizing the underlying operating system, you should avoid modifying kernel parameters using sysctls unless you have specific requirements. Such changes must be coordinated with the cluster administrator.
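
For completeness, this is roughly what a sysctl entry looks like in a pod specification. The sketch uses kernel.shm_rmid_forced, which the Kubernetes documentation lists among the safe sysctls; the pod name and the chosen value are only illustrative, and an unsafe sysctl in the same position would be rejected unless the cluster administrator had explicitly allowed it on the node.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sysctl-demo                   # hypothetical pod name
spec:
  securityContext:
    sysctls:
    - name: kernel.shm_rmid_forced    # a namespaced sysctl from the safe set
      value: "1"                      # values are always given as strings
  containers:
  - name: web
    image: mycorp/webapp:1.2.3
```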

A note about securityContext during startup

In many cases, the security settings described here are combined with policy-based admission control to ensure that the required settings are actually configured before containers are launched in the cluster. By combining securityContext settings with a PodSecurityPolicy, you can enforce specific securityContext settings and so ensure that only pods that comply with the policy are launched. securityContext settings can also be added to a container's configuration at launch time by using Dynamic Admission Control with mutating webhooks.

my telegram: https://t.me/egoriwe999

@egoriwe999
