Job is Done Right When Reported Right: Leverage K8S Events for Job Status Reporting

Kubernetes, the powerful container orchestration platform, provides a variety of resources to manage containerized applications. Among these resources, Jobs play a crucial role in running tasks that need to be executed once or a specified number of times. Jobs are designed for tasks that need to be run to completion, such as batch processing, data transformation, or running scripts. Unlike other controllers like Deployments, which manage long-running services, Jobs are meant for finite tasks.

Seen through a DevOps lens, Jobs are a good fit for pre- or post-deployment tasks such as database migrations or automated tests.

In this article, we will explore how to make the accomplishments of a Job more visible by emitting Kubernetes Events.

Traditional Job status experience

To see the results of a Job, you can use the following methods:

kubectl get jobs: This command lists all Jobs in the specified namespace, showing their status. Because Jobs run in Pods, kubectl get all is also useful, as it lists the Jobs together with their Pods:


kubectl get all returns the common resources of the namespace, including the Job and the associated Pod

kubectl describe job <job-name>: This command provides detailed information about a specific Job, including its status, conditions, and events. The status of a Job can be found in its status fields, such as succeeded, failed, and active. Jobs have conditions that provide more detailed status information, such as Complete or Failed.


kubectl describe job <job-name> returns Job status fields

kubectl logs <pod-name>: Since Jobs create pods to execute tasks, you can check the logs of these pods to see the output of the Job.

Events: Kubernetes emits events that provide insights into the lifecycle of a Job, such as when it starts, completes, or fails.
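When a script or pipeline needs the outcome rather than a human-readable listing, the same status can be read from the Job's conditions. The following is a minimal sketch, not part of the original setup: the function name job_condition is hypothetical, and the namespace falls back to "default" purely as an assumption.

```shell
#!/bin/sh
# Print the status ("True", "False" or empty) of one condition of a Job.
# Usage: job_condition <job-name> <Complete|Failed> [namespace]
# Hypothetical helper; it only wraps "kubectl get job" with a JSONPath filter.
job_condition() {
  job="$1"
  cond="$2"
  ns="${3:-default}"
  kubectl get job "$job" -n "$ns" \
    -o jsonpath="{.status.conditions[?(@.type==\"$cond\")].status}"
}
```

For example, job_condition <job-name> Complete poc prints True once the Job has finished successfully. To block until that point instead of polling, kubectl wait --for=condition=complete job/<job-name> --timeout=120s is an alternative.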

Kubernetes Events to the rescue

Kubernetes events are objects that provide a record of what happened to a resource, such as a Job, Pod, or Node. Events are useful for debugging and understanding the state changes and actions taken by the Kubernetes control plane.

Events include information such as:

  • Type: The type of event (e.g., Normal, Warning).
  • Reason: A short, machine-readable string that describes the reason for the event.
  • Message: A human-readable description of the event.
  • Source: The component that generated the event.
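The fields listed above map directly onto the Event object, so they can be printed as columns. This is a small sketch under assumptions: the function name job_events is made up for illustration, and sorting by creation timestamp is just one reasonable choice.

```shell
#!/bin/sh
# List the events recorded for one Job, one column per field described above.
# Usage: job_events <job-name> <namespace>
# Hypothetical helper around "kubectl get events".
job_events() {
  kubectl get events -n "$2" \
    --field-selector "involvedObject.kind=Job,involvedObject.name=$1" \
    --sort-by=.metadata.creationTimestamp \
    -o custom-columns='TYPE:.type,REASON:.reason,SOURCE:.source.component,MESSAGE:.message'
}
```

Calling job_events <job-name> <namespace> prints a compact table, which is handy when the default output is too wide for a terminal.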

Let's see the events associated with a Job:

kubectl events --watch: displays and follows the Kubernetes events of the current namespace

kubectl events command iterates over the events of a namespace

kubectl get events --field-selector involvedObject.kind=Job: returns only events for Jobs

kubectl get events --field-selector involvedObject.name=<name of the job>,involvedObject.kind=Job: returns only events for a given Job

As we can see, the default events carry only very basic information. If a Job performs multiple activities, we will likely want more detailed status information about what each of them accomplished.

Emit custom Events

To provide more visibility into the Job, you can emit events programmatically while it runs, using the Kubernetes client libraries for languages like Go or Python. If your Job is just a shell script and its image does not otherwise need kubectl, you can post events to the Kubernetes REST API with curl instead. The following example demonstrates this option.

Helm chart

To create a simple Job for the demo, we set up a new Helm chart in our project directory:

# create a separate directory in our project for Helm charts
mkdir helm

# create Helm chart for the POC
helm create helm/job-poc

# remove unnecessary template files
rm helm/job-poc/templates/*.yaml        

Service Account for the Job

The account is granted access to the Kubernetes API to create new events in the current Namespace:

# file: helm/job-poc/templates/service-account.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ include "job-poc.fullname" . }}-job
  namespace: {{ .Release.Namespace }}

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: {{ include "job-poc.fullname" . }}-job
  namespace: {{ .Release.Namespace }}  
rules:
- apiGroups: [""]
  resources: ["events"]
  verbs: ["create"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ include "job-poc.fullname" . }}-job
  namespace: {{ .Release.Namespace }}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: {{ include "job-poc.fullname" . }}-job
subjects:
  - kind: ServiceAccount
    name: {{ include "job-poc.fullname" . }}-job
    namespace: {{ .Release.Namespace }}
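
Once these resources are installed, the grant can be checked from a workstation before the Job ever runs. The sketch below assumes the caller is allowed to impersonate ServiceAccounts (cluster admins usually are); the function name can_create_events is hypothetical.

```shell
#!/bin/sh
# Ask the API server whether a ServiceAccount may create Events in a namespace.
# Prints "yes" or "no". Usage: can_create_events <namespace> <service-account>
# Hypothetical helper around "kubectl auth can-i" with impersonation.
can_create_events() {
  kubectl auth can-i create events \
    --namespace "$1" \
    --as "system:serviceaccount:$1:$2"
}
```

Pass the namespace of the release and the rendered ServiceAccount name. A "no" answer usually points to a missing RoleBinding or a name mismatch between the Role, the RoleBinding and the ServiceAccount.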

        

The Job

The following Kubernetes resource defines a Job that demonstrates sending events to the Kubernetes API server. This resource is configured to run a container that generates and sends a custom event upon completion.

apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "job-poc.fullname" . }}-job-{{ randAlphaNum 8 | lower }}
  labels:
    {{- include "job-poc.labels" . | nindent 4 }}  

spec:
  ttlSecondsAfterFinished: 600  # 10 mins
  template:
    metadata:
      labels:
        {{- include "job-poc.labels" . | nindent 8 }}  
    spec:
      serviceAccountName: {{ include "job-poc.fullname" . }}-job

      containers:
      - name: job
        image: "curlimages/curl:8.10.1"
        imagePullPolicy: IfNotPresent
        # imagePullSecrets is a Pod-level field (a sibling of serviceAccountName);
        # add it there if the image comes from a private registry.

        command: ["/bin/sh", "-c"]
        args: 
          - |
            echo "POC to demo sending K8S events"

            K8S_API_SERVER="https://kubernetes.default.svc"
            TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
            CACERT="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"

            EVENT_NAME="PoC-Event-$(date +%s%N)-$RANDOM"

            # Create the event JSON payload
            EVENT_PAYLOAD=$(cat <<EOF
            {
              "metadata": {
                "name": "$EVENT_NAME",
                "namespace": "$NAMESPACE"
              },
              "involvedObject": {
                "kind": "Job",
                "name": "$JOB_NAME",
                "namespace": "$NAMESPACE"
              },
              "reason": "Accomplishment",
              "message": "Test [ OK ]",
              "type": "Normal",
              "source": {
                "component": "$JOB_NAME"
              }
            }
            EOF
            )

            # Send the event to the Kubernetes API.
            # --cacert verifies the API server certificate against the cluster CA,
            # so -k/--insecure (which would disable that check) is not needed.
            curl -s -S -i \
              --cacert "$CACERT" \
              -H "Authorization: Bearer $TOKEN" \
              -H "Content-Type: application/json" \
              -X POST \
              -d "$EVENT_PAYLOAD" \
              "$K8S_API_SERVER/api/v1/namespaces/$NAMESPACE/events" \
              --fail-with-body --retry 3 --retry-delay 5 --retry-connrefused --connect-timeout 10 -m 10

        env:
          # for event reporting:
          - name: NAMESPACE
            value: "{{ .Release.Namespace }}"
          - name: JOB_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.labels['job-name']

      restartPolicy: Never
        

name: The Job name is dynamically generated using a template function to ensure uniqueness. Consequently, every time the Helm chart is deployed, a new Job will be started.

ttlSecondsAfterFinished: 600 (10 minutes) - The time-to-live of the Job after it has finished. After this period Kubernetes deletes the Job together with its Pods, so it no longer shows up in CLI results or admin UIs.

serviceAccountName: Specifies the previously defined service account to be used by the Job, which grants the Job permission to publish K8S events.

image: curlimages/curl:8.10.1 - The example container image used to run the Job. (This Job needs nothing but curl and a shell.)

command + args: Specifies the shell command with the script to send a custom event to the Kubernetes API server.

env: The Job is supplied with two environment variables. NAMESPACE is the namespace in which the Job is running. JOB_NAME is the name of the Job, read from the job-name label that the Job controller sets on every Pod it creates.

restartPolicy: Never - The container is not restarted in place when it exits. If it fails, the Job controller starts a new Pod instead, up to the Job's backoffLimit.

Event Reporting Script

The script executed by the job performs the following steps:

Environment Variables: Sets up the variables the script needs for the Kubernetes API server address, the service account token, and the CA certificate.

Event Name: Event names need to be unique, therefore an event name is generated using the current timestamp and a random number.

Event Payload: Constructs the JSON payload for the event, including metadata, involved object details, reason, message, type, and source.

Curl Command: Sends the event payload to the Kubernetes API server using the curl command with appropriate headers.
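
A Job that performs several activities can wrap the steps above in a reusable function and call it once per activity. The following is a sketch under assumptions rather than the script used in the PoC: the names json_escape, build_event_payload, emit_event and EVENT_SEQ are hypothetical, messages are assumed to be single-line, and the event name is built from the Job name, a timestamp and a per-script counter, which is only unique within one Pod.

```shell
#!/bin/sh
# Hypothetical helpers; NAMESPACE and JOB_NAME come from the Job's env section.
K8S_API_SERVER="https://kubernetes.default.svc"
SA_DIR="/var/run/secrets/kubernetes.io/serviceaccount"
EVENT_SEQ=0

# Escape backslashes and double quotes so a single-line string fits in JSON.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Print the Event JSON. Arguments: <reason> <message> [type]
build_event_payload() {
  reason=$(json_escape "$1")
  message=$(json_escape "$2")
  type="${3:-Normal}"
  name="$JOB_NAME.$(date +%s).$EVENT_SEQ"
  cat <<EOF
{
  "metadata": { "name": "$name", "namespace": "$NAMESPACE" },
  "involvedObject": { "kind": "Job", "name": "$JOB_NAME", "namespace": "$NAMESPACE" },
  "reason": "$reason",
  "message": "$message",
  "type": "$type",
  "source": { "component": "$JOB_NAME" }
}
EOF
}

# Post one event. Arguments: <reason> <message> [type]
emit_event() {
  EVENT_SEQ=$((EVENT_SEQ + 1))   # keeps names distinct within the same second
  payload=$(build_event_payload "$@")
  curl -s -S --cacert "$SA_DIR/ca.crt" \
    -H "Authorization: Bearer $(cat "$SA_DIR/token")" \
    -H "Content-Type: application/json" \
    -X POST -d "$payload" \
    "$K8S_API_SERVER/api/v1/namespaces/$NAMESPACE/events" \
    --fail-with-body --retry 3 --retry-delay 5 --connect-timeout 10 -m 10
}
```

With these in place, each step reports itself in one line, for example emit_event "Migration" "Schema upgraded" for a success or emit_event "SmokeTest" "2 checks failed" Warning for a problem. Keeping the payload builder separate from the curl call makes it possible to inspect the JSON locally without a cluster.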

Deployment of the chart

The following command deploys the chart which provisions the Service Account and a new instance of the POC-Job:

helm upgrade --install job-poc ./helm/job-poc --namespace poc --create-namespace        

Once the chart is deployed, the namespace can be explored with kubectl get all -n poc and kubectl events --watch -n poc as seen previously.


kubectl events outputs the custom event the Job sent

As we can see, the Job has reported "Test [ OK ]" as its "Accomplishment".

Events are visible on associated K8S dashboards and monitoring systems, which enhances the observability experience:

Screenshot of the Headlamp UI displaying the events of the Job, including our custom one

Conclusion

Understanding and correctly reporting the status of Kubernetes Jobs is essential for ensuring that tasks are executed as expected and for debugging any issues that arise. By leveraging the tools and methods provided by Kubernetes, you can effectively monitor and manage your Jobs, ensuring that they are done right and reported right.

In this article, we demonstrated with a proof of concept how a Job can generate native Kubernetes events to report its own progress and results.


