Automate self-hosted GitHub Runner creation using Azure Container Instances & GitHub Actions
Owain Osborne-Walsh
Driving Cloud Native Adoption @ Microsoft | App Platforms Lead | MSFT SAE | AWS SAP & SCS | RHCE
Introduction
Recently, as part of a larger demo I am creating for an upcoming LunchBytes on the Microsoft Developer YouTube Channel (with an accompanying blog coming soon), I have been experimenting with self-hosted runners within GitHub Actions. This lets me configure private resources in my VNET after provisioning, dynamically fetching access tokens and URLs and passing them through to a container image. This very much felt like a throwback to my cloud engineer days at IBM.
As I would like my LunchBytes demo to be reusable and available to anyone at the click of a button, automating the creation and configuration of the private self-hosted runner became very important. Traditionally this work might be done by static VMs that sit and wait for jobs; however, those VMs are expensive, inefficient and prone to errors.
Instead, Azure Container Instances (ACI) lend themselves very well to this type of work. ACI is a PaaS offering that provides individually provisioned, managed containers, allowing users to deploy individual applications into isolated container instances.
ACI with GitHub Actions allows us to dynamically provision our self-hosted runners for individual runs.
Although I could find plenty of blog posts detailing the steps required to manually set up ACI (or any PaaS) runners, I could not find one that detailed the nuances of automating such a deployment. As a result, I have created this mini blog post and GitHub repository with the required workflows and files to help others get up and running faster.
This blog includes creation of the entire ACI group using Terraform. In production, some small tweaks could be made to provision additional instances within the same group.
Implementation
Subscription Setup
Because I provision all of the infrastructure in my demo with Terraform, I also used Terraform to create my container instance group. I would advise forking the repository so that you only need to update the repo secrets to get up and running.
To start with, we need to set up our environment and Azure subscription. I have created a script that creates the Terraform state storage and the Azure service principal that we will use later. (Although all code snippets are included in the article, I don't recommend copying and pasting from here; use the GitHub repository instead.)
This script can be run from your terminal after changing some variables, such as the resource group name and your subscription ID.
It can be found here (GitHub Link):
#!/bin/bash
# This Azure CLI script helps prepare everything you need to run Terraform in GitHub Actions. It sets up:
#   - A Storage Account and Container to store Terraform state remotely.
#   - A Service Principal with the Owner role on the subscription set below. Note: you may wish to reduce this role or scope for your deployment!
# Please change the variables to suit your requirements! Requires the Azure CLI and jq.

az login
########################################
# Set the below
########################################
export location="uksouth"                  # This sets the Resource Group and Storage Account location.
export rgname="example-rg-name"            # This sets the Resource Group name the Storage Account will be deployed into.
export strname="oowghexample"              # This sets the Storage Account name - note this must be unique!
export containername="tfstate"             # This sets the Container name.
export envtag="Environment=TFStorage"      # This sets the Environment Tag applied to the Resource Group and Storage Account.
export spname="tfdeploy"                   # This sets the Service Principal Name
# Below Subscription should be the Management Subscription
export mansub="xxxx-xxxx-xxxx-xxxx-xxxx"   # This is the ID of the Subscription to deploy the Resource Group and Storage Account into.
########################################
# Creates Resource Group and Storage Account for TF State File Storage
az account set -s "$mansub"
az group create --location "$location" --name "$rgname" --tags "$envtag"
az storage account create --location "$location" --resource-group "$rgname" --name "$strname" --tags "$envtag" --https-only --sku Standard_LRS --encryption-services blob --subscription "$mansub"
export storageacckey=$(az storage account keys list --resource-group "$rgname" --account-name "$strname" --query '[0].value' -o tsv)
az storage container create --name "$containername" --account-name "$strname" --account-key "$storageacckey"
# Creates Service Principal for TF to use and captures its JSON output (appId, password, tenant).
spjson=$(az ad sp create-for-rbac -n "$spname" --role Owner --scopes "/subscriptions/$mansub" -o json)
########################################
# Information to setup GitHub Secrets and Terraform backend configuration is output by the script below.
########################################
cat <<EOF

Below are the details of the storage account that will need to be in the Terraform Backend Configuration:
Resource Group: $rgname
Storage Account: $strname
Container Name: $containername
Below are the details of the Service Principal that will need to be in the GitHub Repo Secrets:
ARM_CLIENT_ID: $(echo "$spjson" | jq -r .appId)
ARM_CLIENT_SECRET: $(echo "$spjson" | jq -r .password)
ARM_TENANT_ID: $(echo "$spjson" | jq -r .tenant)
ARM_SUBSCRIPTION_ID: $mansub
EOF
The key things to note from this script are the outputs at the bottom. We will need these outputs to provision our ACI resources through GitHub Actions.
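To try the Terraform configuration from your own terminal before wiring it into GitHub Actions, the storage details printed by the script map onto `terraform init` backend flags. The following is a minimal sketch: `backend_args` is a hypothetical helper name, and it assumes the repository's backend block already declares the azurerm backend and a state key.

```shell
#!/bin/bash
# Hypothetical helper: turn the storage details printed by the setup script
# into the -backend-config flags that `terraform init` accepts.
# $1 = resource group, $2 = storage account, $3 = container name.
backend_args() {
  printf '%s\n' \
    "-backend-config=resource_group_name=$1" \
    "-backend-config=storage_account_name=$2" \
    "-backend-config=container_name=$3"
}

# Show the flags for the example values used in the setup script.
backend_args "example-rg-name" "oowghexample" "tfstate"

# Usage (assumes you are logged in with `az login` and are in the Terraform folder):
#   terraform init $(backend_args "$rgname" "$strname" "$containername")
```

These are the same three settings the workflow passes to `terraform init`, so a successful local init is a quick check that the secrets will be usable.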
GitHub Actions
The first thing we have to do in GitHub is create a personal access token (PAT) or reuse an existing one. This is an unfortunate step that I was hoping to avoid by using the GITHUB_TOKEN that is created for the workflow at run time. However, that token does not have enough permissions to generate a runner registration token, so a PAT has to be created.
We can create a PAT through the GitHub portal at the following link. It is advisable to grant the PAT only the least privilege it needs. Once the PAT is created, take note of the value as we will use it in the next step.
Now let's take a look at our GitHub Actions workflow. To start with, we need to take the values we created in the previous script and store them as GitHub Actions secrets. Assuming you have forked the repo, we can do that by navigating to Settings > Security > Secrets and variables > Actions and updating the following secrets with the values we set or output in the previous script.
Two secrets do not come directly from the script output: TF_GH_REPO_URL, which needs to be set to your own repo URL, and GH_PAT, which needs to be set to the PAT we created earlier.
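A missing secret only shows up as an empty string at run time, so it can help to check for gaps early. The sketch below is one way to do that in a shell step. `missing_vars` is a hypothetical helper, and the names in `required` are the secrets the workflow in this article reads, assumed here to be mapped into environment variables of the same name.

```shell
#!/bin/bash
# Hypothetical helper: print the name of every listed variable that is unset or empty.
missing_vars() {
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

# Secret names read by the workflow in this article.
required="ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_TENANT_ID ARM_SUBSCRIPTION_ID RESOURCE_GROUP STORAGE_ACCOUNT CONTAINER_NAME TF_GH_REPO_URL GH_PAT"

# Usage inside a step whose env block maps each secret to a variable:
#   missing=$(missing_vars $required)
#   [ -z "$missing" ] || { echo "Missing secrets: $missing" >&2; exit 1; }
```

Only variable names are printed, never their values, so the check is safe to leave in workflow logs.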
Once these values are set, we can look at how they are used in our GitHub Actions workflow.
name: Provision and Configure Azure Infrastructure
# Controls when the workflow will run
on:
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:
# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  Terraform_Provision:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest
    permissions: write-all

    # Set the working directory to main for the config files
    defaults:
      run:
        shell: bash
        working-directory: /home/runner/work/aci-gh-runner-automation/aci-gh-runner-automation/Terraform
    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v3

      # Install the preferred version of Terraform CLI
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1
        with:
          terraform_version: 1.4.5
      - name: Generate runner url from secret
        run: |
          export gh_runner_url="https://api.github.com/repos/${{ github.repository }}/actions/runners/registration-token"
          echo "GH_runner_url=$gh_runner_url" >> $GITHUB_ENV
      - name: Generate runner registration token
        run: |
          export token="$(curl -L -X POST -H "Accept: application/vnd.github+json" -H "Authorization: Bearer ${{ secrets.GH_PAT }}" -H "X-GitHub-Api-Version: 2022-11-28" $GH_runner_url | jq -r .token)"
          echo "GH_runner_token=$token" >> $GITHUB_ENV

      # Run Terraform Init
      - name: Terraform Init for Initial Terraform Config
        working-directory: /home/runner/work/aci-gh-runner-automation/aci-gh-runner-automation/Terraform
        id: init
        env:
          ARM_CLIENT_ID: ${{ secrets.ARM_CLIENT_ID }}
          ARM_CLIENT_SECRET: ${{ secrets.ARM_CLIENT_SECRET }}
          ARM_TENANT_ID: ${{ secrets.ARM_TENANT_ID }}
          ARM_SUBSCRIPTION_ID: ${{ secrets.ARM_SUBSCRIPTION_ID }}
          RESOURCE_GROUP: ${{ secrets.RESOURCE_GROUP }}
          STORAGE_ACCOUNT: ${{ secrets.STORAGE_ACCOUNT }}
          CONTAINER_NAME: ${{ secrets.CONTAINER_NAME }}
        run: terraform init -backend-config="storage_account_name=$STORAGE_ACCOUNT" -backend-config="container_name=$CONTAINER_NAME" -backend-config="resource_group_name=$RESOURCE_GROUP"

      # Run Terraform Apply with Auto Approve
      - name: Terraform Apply for Initial Terraform Config
        working-directory: /home/runner/work/aci-gh-runner-automation/aci-gh-runner-automation/Terraform
        env:
          ARM_CLIENT_ID: ${{ secrets.ARM_CLIENT_ID }}
          ARM_CLIENT_SECRET: ${{ secrets.ARM_CLIENT_SECRET }}
          ARM_TENANT_ID: ${{ secrets.ARM_TENANT_ID }}
          ARM_SUBSCRIPTION_ID: ${{ secrets.ARM_SUBSCRIPTION_ID }}
          RESOURCE_GROUP: ${{ secrets.RESOURCE_GROUP }}
          STORAGE_ACCOUNT: ${{ secrets.STORAGE_ACCOUNT }}
          CONTAINER_NAME: ${{ secrets.CONTAINER_NAME }}
          gh_repo_url: ${{ secrets.TF_GH_REPO_URL }}
        run: terraform apply -auto-approve -var="gh_pat=$GH_runner_token" -var="gh_repo_url=$gh_repo_url"
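There are a few things to highlight in this workflow: it is triggered manually through workflow_dispatch, it calls the GitHub REST API with our PAT to generate a runner registration token, and it passes that token and the repository URL to Terraform as the gh_pat and gh_repo_url variables.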
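One failure mode is worth guarding against in the token step. When the PAT is missing or lacks permission, the API response has no token field and `jq -r .token` prints the literal string null, which the workflow would then hand to Terraform as if it were a token. A minimal sketch of a guard follows; the function names are hypothetical helpers, not part of the repository.

```shell
#!/bin/bash
# Build the REST endpoint that issues runner registration tokens.
# $1 is "owner/repo", the same form as ${{ github.repository }}.
registration_token_url() {
  echo "https://api.github.com/repos/$1/actions/runners/registration-token"
}

# Succeed only when the value looks like a real token:
# not empty, and not the "null" that jq prints for a missing field.
valid_token() {
  [ -n "$1" ] && [ "$1" != "null" ]
}

# Request a token. Needs curl and jq, plus GH_PAT in the environment.
fetch_registration_token() {
  curl -sS -L -X POST \
    -H "Accept: application/vnd.github+json" \
    -H "Authorization: Bearer $GH_PAT" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    "$(registration_token_url "$1")" | jq -r .token
}

# Usage:
#   token=$(fetch_registration_token "owner/repo")
#   valid_token "$token" || { echo "No registration token returned; check the PAT" >&2; exit 1; }
```

Failing at this point gives a clearer error than a container that starts and then cannot register.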
Let's now take a look at how we provision and configure our ACI instance for our self-hosted runners. I will add a disclaimer here: I don't consider myself a Terraform expert, so there may be more efficient ways of writing these files, but I can confirm that they work.
To start with, we have main.tf, variables.tf and output.tf (empty) files in the root of our Terraform folder. These files just set the context and call the ACI module. The three variables that can be changed in the root variables.tf are the name of the resource group you want to create your instances in, the region you want to deploy to and the VNET name.
It is likely you will want to edit the VNET creation in main.tf to use data from an already created VNET, but for the purpose of the repository I create a new one.
These files can be found below (GitHub Link):
data "azurerm_client_config" "current" {}
resource "azurerm_subnet" "runners_subnet" {
  name                 = "runners-subnet"
  resource_group_name  = var.resourceGroupName
  virtual_network_name = var.vnetName
  address_prefixes     = ["10.240.10.0/26"]
  delegation {
    name = "delegation"
    service_delegation {
      name    = "Microsoft.ContainerInstance/containerGroups"
      actions = ["Microsoft.Network/virtualNetworks/subnets/join/action", "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action"]
    }
  }
}
resource "azurerm_container_group" "self_hosted_runners" {
  depends_on          = [azurerm_subnet.runners_subnet]
  name                = "github-runners"
  location            = var.location
  resource_group_name = var.resourceGroupName
  ip_address_type     = "Private"
  os_type             = "Linux"
  subnet_ids          = [azurerm_subnet.runners_subnet.id]
  container {
    name   = "runner"
    image  = "owain.azurecr.io/selfhostedrunner:latest"
    cpu    = "1"
    memory = "1.5"
    environment_variables = { "GH_REPO_URL" = var.gh_repo_url }
    secure_environment_variables = {
      GH_PAT = var.gh_pat
    }
    ports {
      port     = 443 # Not open as private but required for tf creation
      protocol = "TCP"
    }
  }
}
variable "location" {
  type    = string
  default = "EastUs"
}
variable "rg_name" {
  type    = string
  default = "rg-aci-ghrunners"
}
variable "vnet_name" {
  type    = string
  default = "test-vnet"
}
variable "gh_pat" {
  default = ""
}
variable "gh_repo_url" {
  default = ""
}
You can change the subnet to a data source if it already exists, or edit the VNET range. In our main.tf we call the ACI module.
Azure Container Instance Module:
Our Azure Container Instance group is created by the files in the ACI module. In this module it's worth highlighting that we create a new dedicated subnet within our VNET for our runner. We also declare variables with empty defaults in variables.tf to pass the GitHub runner token and repo URL from our workflow through to our container.
The files look as follows (GitHub Link):
data "azurerm_client_config" "current" {}
resource "azurerm_subnet" "runners_subnet" {
  name                 = "runners-subnet"
  resource_group_name  = var.resourceGroupName
  virtual_network_name = var.vnetName
  address_prefixes     = ["10.240.10.0/26"]
  delegation {
    name = "delegation"
    service_delegation {
      name    = "Microsoft.ContainerInstance/containerGroups"
      actions = ["Microsoft.Network/virtualNetworks/subnets/join/action", "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action"]
    }
  }
}
resource "azurerm_container_group" "self_hosted_runners" {
  depends_on          = [azurerm_subnet.runners_subnet]
  name                = "github-runners"
  location            = var.location
  resource_group_name = var.resourceGroupName
  ip_address_type     = "Private"
  os_type             = "Linux"
  subnet_ids          = [azurerm_subnet.runners_subnet.id]
  container {
    name   = "runner"
    image  = "owain.azurecr.io/selfhostedrunner:latest"
    cpu    = "1"
    memory = "1.5"
    environment_variables = { "GH_REPO_URL" = var.gh_repo_url }
    # The registration token is passed as a secure value so that it is not shown in the container group's properties.
    secure_environment_variables = {
      GH_PAT = var.gh_pat
    }
    ports {
      port     = 443 # Not open as private but required for tf creation
      protocol = "TCP"
    }
  }
}
It is worth noting here that if you want to use ACI for private configurations that rely on private DNS zones, ACI does not automatically inherit the private DNS zones of the VNET it is deployed in. You will need to add a dns_config block to the azurerm_container_group resource in the Terraform above.
variable "location" {
  type    = string
  default = "EastUs"
}
variable "resourceGroupName" {
  type    = string
  default = ""
}
variable "vnetName" {
  type    = string
  default = ""
}
variable "gh_pat" {
  default = ""
}
variable "gh_repo_url" {
  default = ""
}
Container Image:
I have provided the Dockerfile that I used to create the "selfhostedrunner" image hosted in my public Azure Container Registry. Feel free to tweak it and build it to fit your needs. The public version I released uses the variables passed through from the workflow and has no hardcoded values; it simply executes the configure script, which can be found in the scripts folder within the ACI module.
The Dockerfile looks as follows (GitHub Link):
FROM ubuntu:22.04 AS base
# Install build tools, curl, the .NET SDK, tar and git without interactive prompts
RUN apt-get update -y && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y build-essential curl dotnet-sdk-6.0 tar git
# Fetch the repository that contains the runner configuration script
RUN git clone https://github.com/owainow/aci-gh-runner-automation.git
RUN chmod +x /aci-gh-runner-automation/Terraform/modules/aci/scripts/configureLinuxRunner.sh
ENTRYPOINT ["/aci-gh-runner-automation/Terraform/modules/aci/scripts/configureLinuxRunner.sh"]
The configure script that is called looks as follows (GitHub Link):
#!/bin/bash
# Create a folder
mkdir actions-runner && cd actions-runner
# Download and extract the runner package
curl -o actions-runner-linux-x64-2.304.0.tar.gz -L https://github.com/actions/runner/releases/download/v2.304.0/actions-runner-linux-x64-2.304.0.tar.gz
tar xzf ./actions-runner-linux-x64-2.304.0.tar.gz
# Allow the runner to be configured and run as root inside the container
export RUNNER_ALLOW_RUNASROOT=1
# Create the runner and start the configuration experience.
# GH_PAT holds the runner registration token generated by the workflow, not the PAT itself.
./config.sh --url "$GH_REPO_URL" --token "$GH_PAT"
# Last step, run it!
nohup ./run.sh
sleep 60
We can see in the script above that we take the variables initially set in our GitHub Actions workflow and pass them to the config.sh script used to register our GitHub runner.
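Because the workflow requests a repository-level registration token, config.sh expects the repository URL in the form https://github.com/OWNER/REPO rather than an API URL or an organisation URL. A small guard in the configure script could catch a mistyped TF_GH_REPO_URL before registration is attempted. It is sketched here with a hypothetical function name.

```shell
#!/bin/bash
# Hypothetical helper: succeed only for URLs shaped like https://github.com/OWNER/REPO.
is_repo_url() {
  case "$1" in
    https://github.com/*/*/*) return 1 ;;  # extra path segment or trailing slash
    https://github.com/?*/?*) return 0 ;;  # exactly an owner and a repository
    *) return 1 ;;                         # anything else, e.g. an API URL
  esac
}

# Usage near the top of the configure script:
#   is_repo_url "$GH_REPO_URL" || { echo "GH_REPO_URL must look like https://github.com/OWNER/REPO" >&2; exit 1; }
```

The check is purely syntactic; it does not confirm that the repository exists or that the token belongs to it.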
Once we run the workflow, we can monitor the success of the deployment.
If we go to the Azure portal and check our container logs, we can see that the ACI container has been registered successfully.
If we go to the settings for our GitHub repository and open Actions > Runners, we can also see the registered instance ready to be used.
Conclusion
By setting the appropriate values in our GitHub secrets, we can use GitHub-hosted runners, Terraform and GitHub Actions to build an automated method of creating self-hosted ACI runners, which are great for configuring infrastructure within our private networks.
Although outside the scope of this demo, we could also implement clean-up of these runners, both in Azure and in GitHub Actions, in the same workflow using the same toolset once the workflow is complete.
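As a rough illustration of that clean-up, the sketch below destroys the Terraform-managed container group and then removes the runner's registration through the GitHub REST API. The function names are hypothetical, the runner ID has to be looked up first, and the PAT needs permission to administer the repository's runners. Treat it as an outline rather than a tested workflow step.

```shell
#!/bin/bash
# Build the REST endpoint for a repository's self-hosted runners.
# $1 = "owner/repo"; optional $2 = runner id, to address a single runner.
runners_url() {
  if [ -n "${2:-}" ]; then
    echo "https://api.github.com/repos/$1/actions/runners/$2"
  else
    echo "https://api.github.com/repos/$1/actions/runners"
  fi
}

# Call the runners API with the PAT from the environment (GH_PAT).
# $1 = HTTP method, $2 = URL.
gh_api() {
  curl -sS -L -X "$1" \
    -H "Accept: application/vnd.github+json" \
    -H "Authorization: Bearer $GH_PAT" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    "$2"
}

# Usage, run from the Terraform folder after `terraform init`:
#   terraform destroy -auto-approve                # removes the resources Terraform created
#   gh_api GET "$(runners_url owner/repo)"         # lists runners together with their ids
#   gh_api DELETE "$(runners_url owner/repo 42)"   # 42 is a placeholder runner id
```

Destroying the container group alone leaves an offline runner listed in GitHub, which is why the second API call is included.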
Azure Container Instances could also be swapped for other PaaS offerings such as Azure Container Apps.