Connect your local Kubernetes cluster to the cloud utilizing KinD, Kyverno, and Ngrok

What is the use case?

Deploying to and maintaining Kubernetes clusters can seem complex and challenging, especially for novice engineers, because of the many intricate concepts involved. Solutions like Minikube or KinD were developed to let users run Kubernetes clusters locally in a relatively simple manner, and KinD in particular has gained considerable popularity recently since it can run multi-node clusters on local systems. While KinD is excellent for regular testing purposes, it becomes increasingly complicated and troublesome as soon as you need to interact with cloud resources. This often requires manually managing credentials to authenticate workloads against the cloud infrastructure, which can be a cumbersome process. For this reason, I have been wondering whether an easier solution to these challenges exists.

As this article shows, a more streamlined solution is possible. By combining tools such as Kyverno and Ngrok, I was able to deploy pods with direct access to cloud providers.

The authentication process will leverage OIDC (OpenID Connect), a JWT-based protocol built on top of OAuth 2.0 that allows applications to verify identities and validate access permissions. Most contemporary IAM solutions natively support external OIDC providers for identity federation. In the following sections, we will dive into how you can harness this capability to authenticate workloads from your local cluster against these platforms.
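To make this more tangible: once the cluster we build below is running, you can inspect the claims that the cloud providers will later validate by decoding a projected service account token yourself. This is just a hedged illustration (it assumes kubectl 1.24+ for the create token subcommand and uses the Azure audience as an example):

TOKEN=$(kubectl create token default --audience=api://AzureADTokenExchange)
# decode the JWT payload (second dot-separated segment); pad it so base64 accepts it
PAYLOAD=$(echo "$TOKEN" | cut -d '.' -f2)
while [ $(( ${#PAYLOAD} % 4 )) -ne 0 ]; do PAYLOAD="${PAYLOAD}="; done
echo "$PAYLOAD" | tr '_-' '/+' | base64 -d | jq '{iss, aud, sub}'

The iss (issuer), aud (audience), and sub (subject) claims are exactly what the federation configuration on the cloud side will be matched against.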

Our stack

As previously mentioned, a variety of tools are utilized to achieve our objectives. Although it might be possible to deploy more efficiently with less complexity, optimization was not my primary concern at this stage.

KinD (Kubernetes in Docker)

KinD serves as a lightweight command-line tool that enables engineers to establish local Kubernetes clusters comprising one or multiple nodes operating within their standard Docker or Podman environment. It's ideal for setting up local testing environments and offers extensive customization options, even on the individual nodes themselves. KinD also comes equipped with its dedicated Load Balancer cloud provider tool, easy integration with Ingress Controllers, and supports custom node images. We will simply employ it to initiate a cluster while configuring additional Apiserver settings. For further details on KinD, I strongly suggest reviewing their comprehensive documentation here.

Kyverno

Next up is Kyverno, which functions as a policy engine designed to manage and automate Kubernetes environments. It includes an extensive collection of pre-defined policies available right out of the box to streamline your operations without much effort. In our configuration, we will utilize it merely for automation tasks that can also be accomplished with specific Helm charts or independent admission webhooks. However, Kyverno’s potent policies allow us to implement a solution that's universally applicable across all three cloud providers. If you wish to explore more about Kyverno, I recommend diving into their official documentation here.

Ngrok Ingress Controller

After some initial setup challenges that caused unnecessary complexity, my colleague Rene - a huge shout-out to him here - found a way to streamline the traffic management so that only the Ngrok ingress controller is needed. This ingress controller takes a unique approach to exposing the service that is essential for our plans. Unlike other controllers that merely expose workloads through a cluster load balancer service, Ngrok leverages its Edge network to tunnel traffic between your cluster and the Ngrok backbone. Ngrok can then expose these tunneled endpoints publicly, which is precisely what we need for implementing OIDC.

How it works

The core concept behind this setup is to use Ngrok as the ingress controller, which lets EntraID and the other cloud identity providers reach the necessary OIDC endpoints via the public internet. This is not a problem for the end user either, since we can further restrict external access through the configured ingress resources.

In an ideal scenario, it would suffice to have a single Ngrok ingress manifest that configures both paths for issuer discovery in the OIDC process. Moreover, to ensure seamless operation, we'll need to permit unauthenticated requests to these endpoints, as they are typically safeguarded by Kubernetes' Role-Based Access Control (RBAC).

Below is a straightforward example demonstrating how a workload might authenticate to identity providers from within our cluster in the future.

Rough sketch of the auth-flow

Let’s get started!

First the preparation

Before you can start deploying, make sure you have kind, helm, and kubectl installed, the Docker daemon running, and that you have signed up for Ngrok to claim your free domain. Then save the domain name and obtain an Ngrok API key as well as an auth token via the GUI. More information can be found here.
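If you want to double-check the prerequisites first, the following quick sanity checks (standard CLI commands, nothing project-specific) should all succeed before you continue:

kind version
helm version --short
kubectl version --client
docker info --format '{{.ServerVersion}}'   # confirms the Docker daemon is reachable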

Note: Please replace the values accordingly in the rest of the deployment guide every time you see: <your-ngrok-domain>, <your-ngrok-api-key>, <your-ngrok-auth-token>

Now, you are ready to go!

Deploy your kind cluster

As explained before, kind is very flexible, but we will barely use any of the tooling it offers. What we will use is the option to pass extra arguments to the Apiserver through our kind configuration file, which then looks like this:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: local-ngrok-oidc
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: ClusterConfiguration
    apiServer:
        extraArgs:
          anonymous-auth: "true"
          api-audiences: https://<your-ngrok-domain>,https://kubernetes.default.svc.cluster.local
          service-account-issuer: https://<your-ngrok-domain>
          service-account-jwks-uri: https://<your-ngrok-domain>/openid/v1/jwks
- role: worker
- role: worker
- role: worker        

Save this file somewhere and then create your cluster with the command “kind create cluster --config <your-file-path>”. Verify that everything is running accordingly by checking with kubectl - e.g. “kubectl get pods -A”.
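For example, assuming the configuration was saved as kind-config.yaml (the filename is arbitrary):

kind create cluster --config kind-config.yaml

# verify that the nodes and system pods come up
kubectl get nodes
kubectl get pods -A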

Deploy the helm charts

After the cluster itself and the system pods are running, we can now install the helm charts for Kyverno and the Ngrok ingress controllers:

# Adding helm repositories
helm repo add ngrok https://ngrok.github.io/kubernetes-ingress-controller
helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update

# Installing the helm charts
helm upgrade -i kyverno kyverno/kyverno \
   --namespace kyverno \
   --create-namespace
helm upgrade -i ngrok-ingress-controller ngrok/kubernetes-ingress-controller \
   --namespace ngrok-ingress \
   --create-namespace \
   --set credentials.apiKey=<your-ngrok-api-key> \
   --set credentials.authtoken=<your-ngrok-auth-token>

# Waiting until the ingress controller is up and running
NGROK_REPLICAS=$(kubectl get deployments -n ngrok-ingress ngrok-ingress-controller-kubernetes-ingress-controller-manager -o=jsonpath='{.status.replicas}')
kubectl wait --for=jsonpath='{.status.readyReplicas}'=$NGROK_REPLICAS -n ngrok-ingress deployment ngrok-ingress-controller-kubernetes-ingress-controller-manager        
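The wait above only covers the ingress controller. If you want to double-check Kyverno as well, a quick look at both namespaces does the trick:

kubectl get pods -n kyverno
kubectl get pods -n ngrok-ingress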

Adding a Ngrok-specific annotation to the Kubernetes service

Now that the helm charts are deployed, we need to prepare the ingress setup for Ngrok, because the Kubernetes Apiserver service is exposed via HTTPS instead of HTTP. To make the Ngrok ingress controller aware that the backend service runs on HTTPS, please run:

kubectl patch service kubernetes --patch-file <file-path>

Using the following manifest file:

apiVersion: v1
kind: Service
metadata:
  name: kubernetes
  namespace: default
  annotations:
    k8s.ngrok.com/app-protocols: '{"https":"HTTPS"}'        
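To confirm that the annotation actually landed on the service, you can print its annotations afterwards:

kubectl get service kubernetes -n default -o jsonpath='{.metadata.annotations}'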

Creating the ingress and RBAC configurations

Ok, now that the ingress controller is running, we can apply the ingress configuration we need and also allow unauthenticated access to the OIDC issuer discovery paths - create the following manifest file and apply it with kubectl.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: oidc-ingress
spec:
  ingressClassName: ngrok
  rules:
  - host: <your-ngrok-domain>
    http:
      paths:
      - path: /.well-known/openid-configuration
        pathType: Prefix
        backend:
          service:
            name: kubernetes
            port:
              number: 443
      - path: /openid/v1/jwks
        pathType: Prefix
        backend:
          service:
            name: kubernetes
            port:
              number: 443
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: crb:oidc-viewer
subjects:
  - kind: Group
    name: system:unauthenticated
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:service-account-issuer-discovery
  apiGroup: rbac.authorization.k8s.io        
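Once Ngrok has reconciled the ingress (this can take a short moment), both discovery documents should be reachable from the public internet. A quick check from your local machine:

curl -s https://<your-ngrok-domain>/.well-known/openid-configuration | jq .
curl -s https://<your-ngrok-domain>/openid/v1/jwks | jq '.keys[].kid'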

Create the Kyverno policies for label-based workload identity injection

In this section, you do not necessarily need to apply the whole manifest; it is fine to deploy only the policy for the cloud you want to test against. As before, just create the manifest file and apply it with kubectl. What the policies have in common is that they all project the service account token with the audience specific to the cloud provider and add dedicated environment variables based mostly on the custom labels. Interestingly enough, with a few more minor changes and some fine-tuning, these policies could even replace the provider-offered helm charts, like the one Azure published on GitHub.

Azure

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-azure-workload-identity
  annotations:
    policies.kyverno.io/title: Add Azure Workload Identity Configuration to Pod
    policies.kyverno.io/category: Sample
    policies.kyverno.io/subject: Pod, Volume
    policies.kyverno.io/minversion: 1.6.0
    policies.kyverno.io/description: >-
      Add the workload identity configuration in case the pod has the labels "azure.sa.volume/inject=enabled".
      It requires following labels for configuration if enabled:
      - azure.sa.volume/client-id: Azure client ID
      - azure.sa.volume/tenant-id: Azure tenant ID
spec:
  rules:
  - name: add-azure-volume
    match:
      any:
      - resources:
          kinds:
          - Pod
          selector:
            matchLabels:
              azure.sa.volume/inject: "enabled"
    mutate:
      patchesJson6902: |-
        - op: add
          path: /spec/volumes/-
          value:
            name: azure-iam-token
            projected:
              sources:
              - serviceAccountToken:
                  audience: api://AzureADTokenExchange
                  expirationSeconds: 86400
                  path: token
        - op: add
          path: /spec/containers/0/volumeMounts/-
          value:
            mountPath: /var/run/secrets/azure/tokens/azure-identity-token
            name: azure-iam-token
        - op: add
          path: /spec/containers/0/env/-
          value:
            name: AZURE_FEDERATED_TOKEN_FILE
            value: /var/run/secrets/azure/tokens/azure-identity-token/token
        - op: add
          path: /spec/containers/0/env/-
          value:
            name: AZURE_CLIENT_ID
            value: {{ request.object.metadata.labels."azure.sa.volume/client-id" }}
        - op: add
          path: /spec/containers/0/env/-
          value:
            name: AZURE_TENANT_ID
            value: {{ request.object.metadata.labels."azure.sa.volume/tenant-id" }}        
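As a hedged illustration of how a workload opts in to this policy, a pod only needs to carry the three labels - the client and tenant IDs below are placeholders for your own values:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: azure-demo
  namespace: default
  labels:
    azure.sa.volume/inject: "enabled"
    azure.sa.volume/client-id: "<your-azure-client-id>"
    azure.sa.volume/tenant-id: "<your-azure-tenant-id>"
spec:
  serviceAccountName: default
  containers:
  - name: azure-cli
    image: mcr.microsoft.com/azure-cli
    command: ["sleep", "infinity"]
EOF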

GCP

Note: In GCP I added an initContainer to create the credential file automatically and pass it via volume mounts to the main container. This was not required in Azure or AWS.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-gcp-workload-identity
  annotations:
    policies.kyverno.io/title: Add GCP Workload Identity Configuration to Pod
    policies.kyverno.io/category: Sample
    policies.kyverno.io/subject: Pod, Volume
    policies.kyverno.io/minversion: 1.6.0
    policies.kyverno.io/description: >-
      Add the workload identity configuration in case the pod has the labels "gcp.sa.volume/inject=enabled".
      It requires following labels for configuration if enabled:
      - gcp.sa.volume/project-number: GCP project number
      - gcp.sa.volume/wl-pool: Workload Identity Pool name
      - gcp.sa.volume/wl-pool-provider: Workload Identity Pool provider name
spec:
  rules:
  - name: add-gcp-volume
    match:
      any:
      - resources:
          kinds:
          - Pod
          selector:
            matchLabels:
              gcp.sa.volume/inject: "enabled"
    mutate:
      patchesJson6902: |-
        - op: add
          path: /spec/volumes/-
          value:
            name: gcp-iam-token
            projected:
              sources:
              - serviceAccountToken:
                  audience: sts.googleapis.com
                  expirationSeconds: 86400
                  path: token
        - op: add
          path: /spec/volumes/-
          value:
            name: credential-config
            emptyDir: {}
        - op: add
          path: /spec/containers/0/volumeMounts/-
          value:
            mountPath: /var/run/secrets/sts.googleapis.com/serviceaccount
            name: gcp-iam-token
        - op: add
          path: /spec/containers/0/volumeMounts/-
          value:
            mountPath: /var/run/secrets/gcloud/config
            name: credential-config
        - op: add
          path: /spec/containers/0/env/-
          value:
            name: GOOGLE_APPLICATION_CREDENTIALS
            value: /var/run/secrets/gcloud/config/federation.json
  - name: add-gcp-volume-init
    match:
      any:
      - resources:
          kinds:
          - Pod
          selector:
            matchLabels:
              gcp.sa.volume/inject: "enabled"
    mutate:
      patchStrategicMerge: 
        spec:
          initContainers:
          - name: bootstrap-gcp-credentials-config
            image: google/cloud-sdk:slim
            env:
            - name: GCP_WORKLOAD_IDENTITY_POOL_PROVIDER
              value: "{{ request.object.metadata.labels.\"gcp.sa.volume/wl-pool-provider\" }}"
            - name: GCP_IDENTITY_TOKEN_FILE
              value: /var/run/secrets/sts.googleapis.com/serviceaccount/token
            - name: GCP_PROJECT_NUMBER
              value: "{{ request.object.metadata.labels.\"gcp.sa.volume/project-number\" }}"
            - name: GCP_WORKLOAD_IDENTITY_POOL
              value: "{{ request.object.metadata.labels.\"gcp.sa.volume/wl-pool\" }}"
            command:
            - /bin/sh
            - -c
            - |
              GCP_EXT_ACCOUNT_AUDIENCE="https://iam.googleapis.com/projects/$GCP_PROJECT_NUMBER/locations/global/workloadIdentityPools/$GCP_WORKLOAD_IDENTITY_POOL/providers/$GCP_WORKLOAD_IDENTITY_POOL_PROVIDER"
              mkdir -p /var/run/secrets/gcloud/config/
              cat <<EOF > /var/run/secrets/gcloud/config/federation.json
              {
                "type": "external_account",
                "audience": "$GCP_EXT_ACCOUNT_AUDIENCE",
                "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
                "token_url": "https://sts.googleapis.com/v1/token",
                "credential_source": {
                  "file": "$GCP_IDENTITY_TOKEN_FILE",
                  "format": {
                    "type": "text"
                  }
                }
              }
              EOF
            volumeMounts:
            - mountPath: /var/run/secrets/gcloud/config
              name: credential-config        

AWS

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-aws-workload-identity
  annotations:
    policies.kyverno.io/title: Add AWS Workload Identity Configuration to Pod
    policies.kyverno.io/category: Sample
    policies.kyverno.io/subject: Pod, Volume
    policies.kyverno.io/minversion: 1.6.0
    policies.kyverno.io/description: >-
      Add the workload identity configuration in case the pod has the labels "aws.sa.volume/inject=enabled".
      It requires following labels for configuration if enabled:
      - aws.sa.volume/role-name: AWS role name
      - aws.sa.volume/account-id: AWS account ID
spec:
  rules:
  - name: add-aws-volume
    match:
      any:
      - resources:
          kinds:
          - Pod
          selector:
            matchLabels:
              aws.sa.volume/inject: "enabled"
    mutate:
      patchesJson6902: |-
        - op: add
          path: /spec/volumes/-
          value:
            name: aws-iam-token
            projected:
              sources:
              - serviceAccountToken:
                  audience: sts.amazonaws.com
                  expirationSeconds: 86400
                  path: token
        - op: add
          path: /spec/containers/0/volumeMounts/-
          value:
            mountPath: /var/run/secrets/aws/tokens/aws-identity-token
            name: aws-iam-token
        - op: add
          path: /spec/containers/0/env/-
          value:
            name: AWS_IDENTITY_TOKEN_FILE
            value: /var/run/secrets/aws/tokens/aws-identity-token/token
        - op: add
          path: /spec/containers/0/env/-
          value:
            name: AWS_ROLE_ARN
            value: "arn:aws:iam::{{ request.object.metadata.labels."aws.sa.volume/account-id" }}:role/{{ request.object.metadata.labels."aws.sa.volume/role-name" }}"        

Setup the OIDC configuration on the cloud(s)

For demonstration purposes, I will provide small scripts for 3 hyperscalers to set up an OIDC federation.

Azure

SPN="<your-app-display-name>"   # display name for the app registration - pick any name
ISSUER_URI="https://<your-ngrok-domain>"

app_config=$(az ad app create --display-name $SPN --required-resource-accesses '[
  {
    "resourceAccess": [
      {
        "id": "e1fe6dd8-ba31-4d61-89e7-88639da4683d",
        "type": "Scope"
      }
    ],
    "resourceAppId": "00000003-0000-0000-c000-000000000000"
  }
]')

# Create federated identity credential
cat >federation_config.json <<EOF
{
  "name": "localoidc",
  "issuer": "$ISSUER_URI",
  "subject": "system:serviceaccount:default:default",
  "description": "This token is used to auth from local oidc cluster.",
  "audiences": ["api://AzureADTokenExchange"]
}
EOF
az ad sp create --id $(echo $app_config | jq -r '.appId')
az ad app federated-credential create --id $(echo $app_config | jq -r '.appId') --parameters federation_config.json
rm federation_config.json        

GCP

PROJECT_NUMBER="<your-gcp-project-number>"
WORKLOAD_IDENTITY_POOL_ID="<some-oidc-pool-name>"
ISSUER_URI="https://<your-ngrok-domain>"
WORKLOAD_IDENTITY_POOL_PROVIDER_ID="<some-oidc-pool-provider-name>"

# Create workload identity pool
gcloud iam workload-identity-pools create $WORKLOAD_IDENTITY_POOL_ID --location global --description "Onprem Kubernetes Cluster" --project $PROJECT_NUMBER

# Create workload identity pool provider
gcloud iam workload-identity-pools providers create-oidc $WORKLOAD_IDENTITY_POOL_PROVIDER_ID --issuer-uri $ISSUER_URI --allowed-audiences "sts.googleapis.com" --attribute-mapping "google.subject=assertion.sub" --workload-identity-pool $WORKLOAD_IDENTITY_POOL_ID --location global --project $PROJECT_NUMBER

# Bind service account to workload identity pool provider
gcloud projects add-iam-policy-binding projects/$PROJECT_NUMBER \
    --role=roles/storage.objectViewer \
    --member=principal://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$WORKLOAD_IDENTITY_POOL_ID/subject/system:serviceaccount:default:default \
    --condition=None

AWS

ISSUER_DOMAIN="<your-ngrok-domain>"
OIDC_PROVIDER_ARN=$(aws iam create-open-id-connect-provider --url "https://$ISSUER_DOMAIN" --client-id-list sts.amazonaws.com --output json | jq -r '.OpenIDConnectProviderArn')
cat >trust-relationship.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "$OIDC_PROVIDER_ARN"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${ISSUER_DOMAIN}:aud": "sts.amazonaws.com",
          "${ISSUER_DOMAIN}:sub": "system:serviceaccount:default:default"
        }
      }
    }
  ]
}
EOF
AWS_PAGER="" aws iam create-role --role-name my-role --assume-role-policy-document file://trust-relationship.json --description "my-role-description" --no-paginate
rm trust-relationship.json        

Verify that everything works correctly

Now that we also have the configuration on the cloud provider side, we can finally test what we built by running the following command adjusted with the labels from the policies above and the cli image for your respective cloud provider:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: test-cli
  namespace: default
  labels:
    your-cloud-provider-labels: "check the kyverno rule description above for details"
spec:
  serviceAccountName: default
  containers:
  - name: test-pod
    image: google/cloud-sdk:slim or amazon/aws-cli or mcr.microsoft.com/azure-cli
    command: ["sleep","infinity"]
EOF
kubectl exec -it test-cli -- /bin/bash

Once you are connected to the pod itself execute the login command for your cloud’s cli utilizing the environment variables that we inject automatically via policy:

# Azure
az login --service-principal --tenant "${AZURE_TENANT_ID}" --username "${AZURE_CLIENT_ID}" --federated-token "$(cat ${AZURE_FEDERATED_TOKEN_FILE})"
# GCP
gcloud auth login --cred-file=${GOOGLE_APPLICATION_CREDENTIALS}
# AWS
aws sts assume-role-with-web-identity --role-arn $AWS_ROLE_ARN --role-session-name temp1234 --web-identity-token $(cat $AWS_IDENTITY_TOKEN_FILE)        
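If the login succeeds, a few hedged follow-up checks confirm that the federation really works (adjust to your provider; the bucket name and the AWS credential values are placeholders taken from the output of the commands above):

# Azure - shows the signed-in service principal (assumes it can see at least one subscription)
az account show

# GCP - list the active federated credential and read from a bucket covered by storage.objectViewer
gcloud auth list
gcloud storage ls gs://<some-bucket-you-can-read>

# AWS - export the temporary credentials returned by assume-role-with-web-identity, then verify
export AWS_ACCESS_KEY_ID=<AccessKeyId-from-the-output-above>
export AWS_SECRET_ACCESS_KEY=<SecretAccessKey-from-the-output-above>
export AWS_SESSION_TOKEN=<SessionToken-from-the-output-above>
aws sts get-caller-identity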

If you are not able to log in, check again that the policies are applied correctly and that Kyverno picks up your pod when it gets deployed. Also take a second look at the cloud provider side to make sure the domain and the OIDC configuration in general match your details.
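A few commands that helped me while debugging (the Kyverno label selector may differ slightly depending on your chart version):

# are the policies installed and ready?
kubectl get clusterpolicy

# did Kyverno actually mutate the pod? The injected volume and env vars should show up here
kubectl get pod test-cli -o yaml | grep -iA5 "iam-token"

# the admission controller logs often point to JMESPath or label issues
kubectl logs -n kyverno -l app.kubernetes.io/component=admission-controller --tail=50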

Yes, this concept also works for on-prem and cross-cloud authentication!

After working on this configuration for a few days, I realized how easy it is to build these automation policies with Kyverno and how the approach can be replicated not only in local clusters and home labs but also in hybrid and multi-cloud setups. Some of the steps are not even needed when you are running managed cloud Kubernetes, which makes it even more straightforward. For on-prem, the required Apiserver flags and related settings would be fairly similar - just look at the rke2 docs, where you also configure apiserver flags to enable public OIDC.

Final thoughts

There are two key takeaways from this setup that we should keep in mind. First, the exposure of our Kubernetes OIDC issuer, which allows us to set up the trust relationship between the cloud and our local cluster. Only through this can the workloads access cloud resources and authenticate against the necessary APIs.

Second, the way we can leverage Kyverno to automate the injection of the required manifest configuration for exactly these workloads. As mentioned in the article, there are dedicated helm charts for some providers that set up a similar mutation webhook, but with Kyverno we can easily cover all cloud providers just by defining policies.

Soon, we will also show you how Kyverno can help with even more complex scenarios and if you have any ideas feel free to share them in the comments!

Cheers,

Felix
