Resolving AKS and ACR Cross-Tenant Image Pull Issues: A Practical Guide
In multi-tenant Azure environments, getting an Azure Kubernetes Service (AKS) cluster to pull container images from an Azure Container Registry (ACR) in another tenant can be challenging. A common symptom is the dreaded ImagePullBackOff error when deploying workloads. I recently hit this scenario and resolved it. Here’s a step-by-step guide to help you do the same.
The Scenario
- AKS Cluster: Hosted in TenantAKS.
- ACR: Located in TenantACR.
- Issue: Pods in AKS were unable to pull container images from ACR due to cross-tenant authentication issues, resulting in an ImagePullBackOff error.
The Problem: Image Pull Failure Across Tenants
The deployment failed with the following error:
Failed to pull image "acr_name.azurecr.io/repository:tag": rpc error: code = Unknown desc = Error response from daemon: pull access denied
Despite having a correctly configured AKS cluster and a functioning ACR, the images refused to load. The culprit? Cross-tenant authentication gaps.
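Before changing anything, it can help to confirm that the failure really is an access problem and not, say, a typo in the image name. The following is a minimal sketch, assuming kubectl already points at the AKS cluster; the function name show_pull_failures is hypothetical and not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list events with reason "Failed" in a namespace,
# oldest first, so the full image pull error message is visible.
show_pull_failures() {
  local namespace="${1:?usage: show_pull_failures <namespace>}"
  kubectl get events \
    --namespace "$namespace" \
    --field-selector reason=Failed \
    --sort-by=.lastTimestamp
}
```

Calling show_pull_failures <namespace> should surface messages like the one above. A "pull access denied" or "unauthorized" message points at credentials rather than at the image reference.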
Solution Overview
The solution involves creating a Service Principal (SP) with appropriate permissions, securely storing credentials in AKS, and linking them to the deployment (or CronJob in my case).
The Solution: Establishing Secure Access Between AKS and ACR
To enable the AKS cluster to authenticate with an ACR in another tenant, we need to leverage Service Principal (SP) credentials.
Step 1: Create and Configure the Service Principal in TenantACR
There are two ways to create a Service Principal (SP) and generate the necessary credentials:
- Using Azure CLI (Command-Based Approach)
While signed in to TenantACR (for example, via az login --tenant <TenantACR-tenant-id>), execute the following command:
az ad sp create-for-rbac --name "acr-access-sp" --role AcrPull --scopes $(az acr show --name <ACR_NAME> --query id -o tsv)
The command output includes the essential details: appId, password, and tenant. Record the password right away, because it cannot be retrieved later.
- Using the Azure portal (Microsoft Entra ID):
a. Go to Microsoft Entra ID (formerly Azure AD) > App registrations > New registration.
b. Enter a name, select the appropriate account type, and click Register.
c. Once the app is registered, navigate to Certificates & secrets.
d. Click New client secret, provide a description, set an expiration, and copy the generated secret.
Whichever method you use, make sure the SP holds the AcrPull role on the registry. The CLI command above assigns it as part of creation; an SP registered through the portal needs it assigned explicitly:
az role assignment create --assignee <appId> --role "AcrPull" --scope $(az acr show --name <ACR_NAME> --query id -o tsv)
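To double-check that the role assignment actually landed, the assignments on the registry can be listed and searched for AcrPull. This is a sketch under assumptions: the Azure CLI is signed in to TenantACR, and the function name has_acrpull_role is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the Service Principal holds the
# AcrPull role directly on the registry.
has_acrpull_role() {
  local app_id="${1:?usage: has_acrpull_role <appId> <acr-name>}"
  local acr_name="${2:?usage: has_acrpull_role <appId> <acr-name>}"
  local acr_id

  # Resolve the registry's resource ID, which serves as the scope.
  acr_id="$(az acr show --name "$acr_name" --query id -o tsv)" || return 1

  # Print one role name per line and look for an exact AcrPull match.
  az role assignment list \
    --assignee "$app_id" \
    --scope "$acr_id" \
    --query "[].roleDefinitionName" -o tsv | grep -qx "AcrPull"
}
```

For example, has_acrpull_role <appId> <ACR_NAME> && echo "AcrPull is assigned". Assignments inherited from the resource group or subscription are not listed at this scope unless --include-inherited is added to the list command.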
Step 2: Create a Kubernetes Secret for ACR Authentication
Once the Service Principal is ready, its credentials must be securely stored in AKS.
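Before storing the credentials in the cluster, it may be worth confirming that they work against the registry at all. The sketch below assumes Docker is installed locally and that the SP password has been exported as SP_PASSWORD; both the variable and the function name verify_sp_login are hypothetical choices for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: log in to the registry as the Service Principal.
# The password is read from the SP_PASSWORD environment variable and passed
# on stdin so that it does not appear in the process arguments.
verify_sp_login() {
  local login_server="${1:?usage: verify_sp_login <acr-login-server> <appId>}"
  local app_id="${2:?usage: verify_sp_login <acr-login-server> <appId>}"
  printf '%s' "${SP_PASSWORD:?SP_PASSWORD must be set}" |
    docker login "$login_server" --username "$app_id" --password-stdin
}
```

A successful verify_sp_login <acr-login-server> <appId> shows that the appId and password are valid for the registry, which rules out a mistyped secret before Kubernetes is involved.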
- Run the following command to create the secret:
kubectl create secret docker-registry acr-auth \
--namespace=<namespace> \
--docker-server=<acr-login-server> \
--docker-username=<appId> \
--docker-password=<password> \
--docker-email=<email>
Example:
kubectl create secret docker-registry acr-auth \
--docker-server=myregistry.azurecr.io \
--docker-username=12345abc-de67-89f0-g123-hijklmn45678 \
--docker-password=superSecretPassword \
--docker-email=user@example.com
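To confirm what was actually stored, the secret's registry configuration can be decoded and inspected. This is a minimal sketch; the function name show_registry_secret is hypothetical. Note that the output contains the SP password in clear text, so it should not be run in a shared terminal or logged.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the decoded .dockerconfigjson of a
# docker-registry secret. WARNING: the output includes the password.
show_registry_secret() {
  local secret_name="${1:?usage: show_registry_secret <secret-name> <namespace>}"
  local namespace="${2:?usage: show_registry_secret <secret-name> <namespace>}"
  kubectl get secret "$secret_name" \
    --namespace "$namespace" \
    --output 'jsonpath={.data.\.dockerconfigjson}' | base64 --decode
}
```

Running show_registry_secret acr-auth <namespace> should print a JSON document whose auths entry is keyed by the registry login server. If that key is not the server used in the image reference, the secret will not be applied to the pull.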
Step 3: Update the CronJob to Reference the Secret
To resolve the image pull issue, the CronJob’s configuration must reference the created secret.
- Edit the CronJob:
kubectl edit cronjob <cronjob-name> -n <namespace>
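Add or modify imagePullSecrets in the pod template, which for a CronJob sits under spec.jobTemplate.spec.template.spec:
spec:
  jobTemplate:
    spec:
      template:
        spec:
          imagePullSecrets:
          - name: acr-auth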
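If an interactive edit is inconvenient (for example, in a script), the same change can be sketched as a JSON merge patch. The helper names below are hypothetical. One caveat: a merge patch replaces the whole imagePullSecrets list, so any secrets already listed there would need to be included in the patch as well.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a JSON merge patch that sets imagePullSecrets
# on a CronJob's pod template (spec.jobTemplate.spec.template.spec).
build_pull_secret_patch() {
  local secret_name="${1:?usage: build_pull_secret_patch <secret-name>}"
  printf '{"spec":{"jobTemplate":{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"%s"}]}}}}}}' "$secret_name"
}

# Hypothetical helper: apply the patch to a CronJob in a namespace.
patch_cronjob_pull_secret() {
  local cronjob="${1:?usage: patch_cronjob_pull_secret <cronjob> <namespace> <secret>}"
  local namespace="${2:?usage: patch_cronjob_pull_secret <cronjob> <namespace> <secret>}"
  local secret_name="${3:?usage: patch_cronjob_pull_secret <cronjob> <namespace> <secret>}"
  kubectl patch cronjob "$cronjob" \
    --namespace "$namespace" \
    --type merge \
    --patch "$(build_pull_secret_patch "$secret_name")"
}
```

For example: patch_cronjob_pull_secret <cronjob-name> <namespace> acr-auth. Only Jobs created after the patch pick up the new pod template.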
Step 4: Manually Trigger the CronJob for Verification
To ensure the setup works, manually start the CronJob:
kubectl create job --from=cronjob/<cronjob-name> <new-job-name> -n <namespace>
Check for running jobs and their logs:
kubectl get jobs -n <namespace>
kubectl logs <pod-name> -n <namespace>
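The trigger, wait, and log steps can also be chained so that the check either finishes with logs or fails within a bounded time. This is a sketch, not the only way to do it; the function name run_cronjob_once and the 300-second timeout are arbitrary choices.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a one-off Job from a CronJob, wait for it to
# complete, then print its logs. Stops at the first step that fails.
run_cronjob_once() {
  local cronjob="${1:?usage: run_cronjob_once <cronjob> <job-name> <namespace>}"
  local job="${2:?usage: run_cronjob_once <cronjob> <job-name> <namespace>}"
  local namespace="${3:?usage: run_cronjob_once <cronjob> <job-name> <namespace>}"

  kubectl create job "$job" --from="cronjob/$cronjob" --namespace "$namespace" &&
    kubectl wait --for=condition=complete "job/$job" --namespace "$namespace" --timeout=300s &&
    kubectl logs "job/$job" --namespace "$namespace"
}
```

If the image still cannot be pulled, the Job never completes and the wait step times out. At that point kubectl describe pod on the Job's pod shows the pull error.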
Key Takeaways
- Cross-Tenant Authentication Requires Precision: Ensure proper SP configuration and role assignment.
- Use the Right Secret Name: Verify that the name listed under imagePullSecrets matches the secret you created.
- Manual Execution for Testing: Triggering the CronJob manually lets you verify the configuration immediately.
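The secret-name check in particular is easy to script. The sketch below (the function name cronjob_uses_secret is hypothetical) succeeds only if the CronJob's pod template lists the given secret and a secret with that name exists in the same namespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that a CronJob references a pull secret by
# name and that the secret exists in the CronJob's namespace.
cronjob_uses_secret() {
  local cronjob="${1:?usage: cronjob_uses_secret <cronjob> <namespace> <secret>}"
  local namespace="${2:?usage: cronjob_uses_secret <cronjob> <namespace> <secret>}"
  local secret_name="${3:?usage: cronjob_uses_secret <cronjob> <namespace> <secret>}"

  # jsonpath prints the referenced secret names separated by spaces;
  # split them onto lines and look for an exact match.
  kubectl get cronjob "$cronjob" --namespace "$namespace" \
    --output 'jsonpath={.spec.jobTemplate.spec.template.spec.imagePullSecrets[*].name}' |
    tr ' ' '\n' | grep -qxF "$secret_name" &&
    kubectl get secret "$secret_name" --namespace "$namespace" >/dev/null
}
```

For example: cronjob_uses_secret <cronjob-name> <namespace> acr-auth || echo "secret missing or not referenced".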
Lessons Learned
- Role Assignment Visibility: Always double-check the role assignments on ACR.
- Namespace Accuracy: Ensure the secret and deployments reside in the correct namespace.
- Manual Testing Helps: Running jobs manually reveals misconfigurations faster than waiting for schedules.
Pro Tip
Regularly update your Azure CLI and Kubernetes tools to avoid compatibility issues with cross-tenant scenarios.
Conclusion
Managing Kubernetes workloads across Azure tenants adds complexity but also reinforces the value of understanding service-to-service authentication. If you’re navigating similar waters, I hope this guide helps you steer clear of common pitfalls. Feel free to share your experiences or ask questions in the comments!