Troubleshooting Ingress Controller Issues in Kubernetes

Information: Container Service for Kubernetes (ACK) provides an NGINX Ingress controller that is optimized based on the open source version. The NGINX Ingress controller provided by ACK is compatible with the open source version and supports all the annotations provided by the open source version. You can install the NGINX Ingress controller when you create an ACK cluster.

NGINX Ingress Controller

Ingresses work as expected only if an NGINX Ingress controller is deployed in the cluster to parse their routing rules. After the NGINX Ingress controller receives a request that matches a routing rule, it routes the request to the corresponding backend service. The backend service then forwards the request to pods.

The NGINX Ingress controller acquires Ingress rule changes from the API server and dynamically generates configuration files, such as nginx.conf. These configuration files are required by a load balancer, such as NGINX. Then, the NGINX Ingress controller reloads the load balancer.

For example, the NGINX Ingress controller runs the nginx -s reload command to reload NGINX so that the new Ingress rules take effect.

Cause: Misconfigured ingress or SSL/TLS problems.

Solution: Verify ingress configuration and check SSL/TLS certificates.

Dealing with Kubernetes Ingress Controller issues? Dive into the troubleshooting process:

1. Review Ingress Configuration:

  • Paths and Services: Ensure that the paths and backend services are correctly specified in your Ingress YAML.
  • Annotations: Check for any custom annotations that might impact the behaviour of your Ingress resource.
  • Service Discovery: Verify that the backend services mentioned in Ingress are discoverable within the cluster.
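The three checks above can be sketched with kubectl. This is a minimal sketch, not a definitive procedure: the function name review_ingress and the KUBECTL override variable are hypothetical helpers introduced here for illustration, and the namespace and Ingress name are placeholders.

```shell
# review_ingress <namespace> <ingress-name>
# Prints the Ingress rules, its annotations, and the Endpoints objects in the
# namespace so that paths, backends, and ready pod IPs can be compared.
# KUBECTL is a hypothetical override (defaults to kubectl) so the function can
# be pointed at a wrapper script.
review_ingress() {
  # Rules, backend services, and recent events for the Ingress.
  "${KUBECTL:-kubectl}" describe ingress "$2" -n "$1"
  # Annotations only, printed as a JSON map.
  "${KUBECTL:-kubectl}" get ingress "$2" -n "$1" -o jsonpath='{.metadata.annotations}'
  echo
  # A backend Service that lists no endpoints has no ready pods behind it.
  "${KUBECTL:-kubectl}" get endpoints -n "$1"
}
```

For example, review_ingress default my-ingress would inspect a hypothetical Ingress named my-ingress in the default namespace.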

2. SSL/TLS Certificate Check:

  • Certificate Validity: Confirm the validity and expiration dates of your SSL/TLS certificates.
  • Certificate Chain: Inspect the certificate chain to make sure it is complete and correctly ordered, so that clients can build a trusted path to a root certificate.
  • Secrets Configuration: Double-check that the Ingress resource points to the correct Kubernetes secret where SSL/TLS certificates are stored.
  • Renewal Process: If using automated certificate management (e.g., Let's Encrypt), ensure the renewal process is functioning correctly.
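To check the certificate that an Ingress actually references, one option is to read it straight from the Kubernetes secret. The sketch below assumes the secret is of type kubernetes.io/tls, so the certificate is stored under the tls.crt key. The function name fetch_secret_cert and the KUBECTL override variable are hypothetical.

```shell
# fetch_secret_cert <namespace> <secret-name>
# Writes the PEM certificate stored under the tls.crt key of a TLS secret
# to stdout. KUBECTL is a hypothetical override that defaults to kubectl.
fetch_secret_cert() {
  "${KUBECTL:-kubectl}" get secret "$2" -n "$1" -o jsonpath='{.data.tls\.crt}' \
    | base64 -d
}

# Possible usage (my-tls-secret is a placeholder):
#   fetch_secret_cert default my-tls-secret | openssl x509 -noout -subject -issuer -dates
#   fetch_secret_cert default my-tls-secret | openssl x509 -noout -checkend 604800
# The second command exits with a non-zero status if the certificate expires
# within the next 604800 seconds (7 days).
```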

3. Logging and Monitoring:

  • Logging Setup: Implement detailed logging for your Ingress Controller to capture events and errors.
  • Monitoring Tools: Use monitoring tools to set up alerts for any unusual Ingress-related activity.
  • Error Analysis: Regularly analyse logs for error messages or patterns that might indicate issues.

4. Testing:

  • Kubectl Commands: Use kubectl to test the Ingress configuration for basic functionality.
  • Specialized Tools: Leverage specialized tools (e.g., kube-score) to assess the quality of your Ingress manifests.
  • SSL/TLS Testing: Employ SSL/TLS testing tools (e.g., SSL Labs) to validate certificates and encryption settings.
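As one way to test an Ingress manifest with kubectl before applying it, a server-side dry run sends the manifest through API server validation without persisting it. This also exercises a validating admission webhook where one is installed. The function name validate_manifest and the KUBECTL override variable are hypothetical.

```shell
# validate_manifest <file>
# Submits the manifest to the API server in dry-run mode: the object is
# validated but nothing is created or changed in the cluster.
validate_manifest() {
  "${KUBECTL:-kubectl}" apply --dry-run=server -f "$1"
}
```

A rejected manifest makes the command exit with a non-zero status and print the validation error.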

Commonly used diagnostic methods:

Diagnostics Procedure

Use the ingress diagnostics feature.

1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

2. On the Clusters page, click the name of the cluster that you want to manage and choose Inspections and Diagnostics > Diagnostics in the left-side navigation pane.

3. On the Diagnosis page, click Ingress Diagnosis.

4. In the Ingress Diagnosis panel, enter the URL that cannot be accessed, such as https://www.example.com. Select I know and agree and then click Create diagnosis.

After the diagnostic is completed, you can view the diagnostic result and try to fix the issue.

Diagnose the access log of the NGINX Ingress Controller pod in Simple Log Service.

You can check the access log format of the NGINX Ingress controller in the nginx-configuration ConfigMap in the kube-system namespace.

The following sample code shows the default format of the access log of the NGINX Ingress controller:

$remote_addr - [$remote_addr] - $remote_user [$time_local]
    "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_length
    $request_time [$proxy_upstream_name] $upstream_addr $upstream_response_length
    $upstream_response_time $upstream_status $req_id $host [$proxy_alternative_upstream_name]        
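Because the request, referer, and user agent are the only double-quoted fields in this format, a log line can be split on double quotes to reach the fields that follow. The sketch below assumes the default format shown above and lists requests whose $request_time is at or above a threshold. The function name slow_requests is hypothetical.

```shell
# slow_requests <seconds>
# Reads access-log lines in the default format on stdin and prints
# "<request_time> <request>" for every request that took at least <seconds>.
slow_requests() {
  awk -F'"' -v limit="$1" '
    NF >= 7 {
      # The 7th quote-delimited field starts with: $request_length $request_time
      split($7, rest, " ")
      if (rest[2] + 0 >= limit + 0) print rest[2], $2
    }'
}

# Possible usage:
#   kubectl logs <controller pod name> -n <namespace> | slow_requests 1
```

If the log format in the ConfigMap has been customized, the field positions change and the sketch needs to be adjusted.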

[Figure: the access log of the NGINX Ingress controller displayed in the Simple Log Service console]

By default, you can run the following command to query the recent access log of the NGINX Ingress controller.

kubectl logs <controller pod name> -n <namespace> | less        

Expected output:

42.11.**.** - [42.11.**.**] - - [25/Nov/2021:11:40:30 +0800] "GET / HTTP/1.1" 200 615 "-" "curl/7.64.1" 76 0.001 [default-nginx-svc-80] 172.16.254.208:80 615 0.000 200 46b79dkahflhakjhdhfkah**** 47.11.**.** []
42.11.**.** - [42.11.**.**] - - [25/Nov/2021:11:40:31 +0800] "GET / HTTP/1.1" 200 615 "-" "curl/7.64.1" 76 0.001 [default-nginx-svc-80] 172.16.254.208:80 615 0.000 200 fadgrerthflhakjhdhfkah**** 47.11.**.** []
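When there is too much output to read line by line, a quick tally of HTTP status codes shows whether failures are concentrated in 4xx or 5xx responses. This is a minimal sketch that assumes the default access log format, in which $status is the first field after the quoted request. The function name count_status is hypothetical.

```shell
# count_status
# Reads access-log lines on stdin and prints "<status> <count>" per HTTP
# status code, sorted by status. Lines that are not access-log entries
# (for example controller messages) are skipped.
count_status() {
  awk -F'"' '
    NF >= 3 {
      # The 3rd quote-delimited field starts with: $status $body_bytes_sent
      split($3, f, " ")
      if (f[1] ~ /^[0-9][0-9][0-9]$/) n[f[1]]++
    }
    END { for (s in n) print s, n[s] }' | sort
}

# Possible usage:
#   kubectl logs <controller pod name> -n <namespace> | count_status
```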

Diagnose the error log of the NGINX Ingress controller pod

You can diagnose the error log of the NGINX Ingress controller pod to narrow down the scope of troubleshooting. The error log of the Ingress controller pod includes the following types:

  • The log that records errors of the Ingress controller. Typically, this type of error log is generated due to invalid Ingress configurations. You can run the following command to query this type of error log (the pattern is quoted so that the shell does not expand the brackets):

kubectl logs <controller pod name> -n <namespace> | grep -E '^[WE]'

  • The log that records errors of the NGINX application. Typically, this type of error log is generated due to request processing failures. You can run the following command to query this type of error log:

kubectl logs <controller pod name> -n <namespace> | grep error        
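A plain grep error also matches access-log lines whose URL happens to contain the word "error". The filters below are a slightly stricter sketch. They assume that controller messages use the klog header (a severity letter followed by a four-digit month and day) and that NGINX messages carry a bracketed severity level such as [error]. The function names controller_issues and nginx_issues are hypothetical.

```shell
# controller_issues
# Keeps warning (W) and error (E) lines written by the controller process,
# for example: E1125 11:40:31.000000       7 controller.go:2] ...
controller_issues() {
  grep -E '^[WE][0-9]{4} '
}

# nginx_issues
# Keeps lines written by NGINX itself at warn level or above,
# for example: 2021/11/25 11:40:32 [error] 31#31: *1 ...
nginx_issues() {
  grep -E '\[(warn|error|crit|alert|emerg)\]'
}

# Possible usage:
#   kubectl logs <controller pod name> -n <namespace> | controller_issues
#   kubectl logs <controller pod name> -n <namespace> | nginx_issues
```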

Manually access the Ingress and backend pod by using the Ingress controller pod

1. Run the following command to log on to the Ingress controller pod:

kubectl exec <controller pod name> -n <namespace> -it -- bash

2. The Ingress controller pod is preinstalled with curl and OpenSSL, which allow you to test network connectivity and verify certificates.

Run the following command to test the network connectivity between the Ingress and the backend pod:

# Replace your.domain.com with the actual domain name of the Ingress.
curl -H "Host: your.domain.com" http://127.0.0.1/ # for HTTP
curl --resolve your.domain.com:443:127.0.0.1 https://your.domain.com/ # for HTTPS

Run the following command to verify the certificate:

openssl s_client -servername your.domain.com -connect 127.0.0.1:443        

3. Test access to the backend pod.

Run the following kubectl command to query the IP address of the backend pod:

kubectl get pod -n <namespace> <pod name> -o wide        

Expected output:

NAME                      READY    STATUS    RESTARTS   AGE    IP            NODE                        NOMINATED NODE    READINESS GATES
nginx-dp-7f5fcc7f-****    1/1      Running   0          23h    10.71.0.146   cn-beijing.192.168.**.**    <none>            <none>

The output shows that the IP address of the backend pod is 10.71.0.146.

To test the network connectivity between the Ingress controller pod and the backend pod, run the following command to connect to the IP address by using the Ingress controller pod:

curl https://<your pod ip>:<port>/path

Capture packets

If you cannot identify the issue, capture and diagnose packets.

  • Check whether the issue is related to the Ingress controller pod or the application pod. If you cannot tell which one is at fault, capture packets for both the Ingress controller pod and the application pod.
  • Log on to the nodes on which the application pod and Ingress controller pod run.
  • Run the following command on the Elastic Compute Service (ECS) instances to capture the packets that are sent to or from the specified pod IP address. The -C 20 option starts a new capture file after about 20 MB, and -W 200 keeps at most 200 files before the oldest is overwritten:

tcpdump -i any host <Application pod IP or Ingress controller pod IP> -C 20 -W 200 -w /tmp/ingress.pcap        

  • If an error is identified in the log data, stop capturing packets.
  • Diagnose the packets that are transferred during the time period in which the error occurred.
