The simple perfect CI pipeline: AWS (w/Control Tower), GitHub Actions, Terraform & OIDC running from a monorepo.

The user story

As a [ put your role here ], I want to set up a CI/CD pipeline that enables the seamless deployment of Infrastructure as Code (IaC) from development to production using a single repository with one branch per environment. This pipeline should adhere to an identity standard for authentication and authorization. Additionally, the pipeline should include stages for linting, Static Application Security Testing (SAST), Software Composition Analysis (SCA), and Financial Operations (FinOps) before execution. By adopting a monorepo strategy, we aim to centralize version control, simplify collaboration, and ensure consistent deployment across the different environments of the application project.

The tool selection

  • AWS, ideally with Control Tower Landing Zone implemented.
  • GitHub with Actions for pipeline configuration.
  • OpenID Connect (OIDC) for authentication and authorization.
  • Terraform (HCL) for describing IaC.

Disclaimer for purists: while OIDC's primary role is authentication, it works within a framework (OAuth 2.0) that handles authorization.


The implementation rationale

You are in a situation where you either want or have:

  • a single application monorepo for application code, libraries and IaC.
  • a centralized pipeline that can leverage, e.g., webhooks to deploy multiple applications through the same centralized pipeline (less preferable, obviously).

An example monorepo structure can look like this:

/.github
    /workflows
/iac
    /dev
    /prod
    /sandbox
/app
    /service1
    /service2
/modules
    /module1
    /module2        

Advantages of a monorepo, especially with IaC in mind:

  • A monorepo becomes a single source of truth for application and IaC code.
  • It consolidates IaC configuration for testing, which can be important for database testing, queues, event streaming and/or data pipelines.

Disadvantages of a monorepo:

  • Depending on how tidy the project's code is, scaling can become an issue.
  • Stricter access control should be applied to the project's code, e.g. using a CODEOWNERS file in GitHub to require review from designated owners.


The technical implementation

AWS w/Control Tower

AWS Control Tower is the AWS-recommended way to implement a landing zone. In short, Control Tower gives you the tidiness needed for a long-term, successful AWS footprint: a single sign-on mechanism (using IAM Identity Center, formerly AWS SSO), security guardrails (controls), and a product-like landing zone that you can scale over time. It also gives you a structured mechanism for extending the landing zone's functionality with IaC that you define and maintain yourself, in the form of AWS Service Catalog products that you register.

A newly created landing zone using AWS Control Tower can look like this:

Figure 1: AWS Control Tower Landing Zone organization structure


AWS IAM Identity Center (formerly known as AWS SSO)

Although not especially important to this implementation, AWS IAM Identity Center brings single sign-on functionality to the AWS Control Tower landing zone. The rest of the identity configuration required here is performed in AWS IAM, not in IAM Identity Center.


GitHub project structure

GitHub is today the go-to option for many developers. As in other git-based tools, the code is structured in branches. In this monorepo implementation the branches are these:

[13:37:49] jgf:app-monorepo-pipeline git:(devel) $ git branch -l

* devel
  live
  main
  sandbox
(END)        


The idea behind this monorepo is that application code and IaC coexist in the same git repository for all the environments the application passes through. This approach, as described in the previous section "The implementation rationale", has pros and cons that should be weighed.
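
To illustrate how one branch per environment might tie back to the /iac directories shown earlier, here is a minimal Python sketch. The branch names and directory names come from this article, but pairing them (devel with dev, live with prod, and main deploying nothing) is an assumption made for illustration, as are the names BRANCH_TO_ENV and iac_dir_for_branch.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping from branch name to the environment directory under /iac.
# The article lists the branches (devel, live, main, sandbox) and the directories
# (dev, prod, sandbox); pairing them up like this is an assumption.
BRANCH_TO_ENV = {
    "sandbox": "sandbox",
    "devel": "dev",
    "live": "prod",
}


def iac_dir_for_branch(branch: str) -> Optional[PurePosixPath]:
    """Return the IaC working directory for a branch, or None if no environment is mapped."""
    env = BRANCH_TO_ENV.get(branch)
    return PurePosixPath("iac") / env if env else None


if __name__ == "__main__":
    for name in ("sandbox", "devel", "live", "main"):
        print(name, "->", iac_dir_for_branch(name))
```

A pipeline step could use a lookup like this to decide which directory Terraform should run in for the branch that triggered it.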


GitHub variables and secrets

GitHub can pass variables and secrets to Terraform in multiple ways, but the preferred solution here is to use environments. Environments are used to describe a general deployment target like production, staging, or development.

An example environment configuration can look like this:

Figure 2-A
Figure 2-B: Example GitHub environments configuration


In the above setup, GitHub Environments not only keep secrets secret, but also organise the variables in two logical groups:

  • CI_* for the variables used in GitHub Actions (pipeline) configuration itself.
  • TFVAR_* for the variables that should be passed to Terraform (by whatever mechanism).

It must be said that this naming convention is used only for simplicity while setting up the pipeline configuration, to avoid mistakes while using variables.


IMPORTANT NOTE:

TFVAR_* is not meant to be Terraform's TF_VAR_*; it is just a naming scheme used to differentiate the expected uses of the variables.
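
One possible way to bridge the two conventions is to copy each TFVAR_* value into the TF_VAR_* environment variable that Terraform reads for input variables. The Python sketch below is only an illustration of that idea, not the mechanism used by the pipelines in this article. The function name to_terraform_env is hypothetical, and lower-casing the suffix assumes the Terraform variables are declared in lower case.

```python
import os
from typing import Dict, Mapping

SOURCE_PREFIX = "TFVAR_"   # naming convention from the article (GitHub side)
TARGET_PREFIX = "TF_VAR_"  # prefix Terraform reads input variables from


def to_terraform_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Copy every TFVAR_<NAME> entry to TF_VAR_<name>.

    Lower-casing the suffix is an assumption: it presumes the Terraform
    variables are declared in lower case (e.g. variable "aws_region").
    Entries without the TFVAR_ prefix (such as CI_*) are ignored.
    """
    return {
        TARGET_PREFIX + key[len(SOURCE_PREFIX):].lower(): value
        for key, value in env.items()
        if key.startswith(SOURCE_PREFIX)
    }


if __name__ == "__main__":
    # Print only the names, never the values, so secrets do not reach the log.
    for name in sorted(to_terraform_env(os.environ)):
        print(name)
```

Keeping the translation explicit in one place makes it harder to confuse a pipeline-only CI_* variable with one that Terraform should receive.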


GitHub OIDC configuration to AWS

GitHub has documented the process of registering as an Identity Provider (IdP) in multiple clouds, starting at https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services.

The linked technical article describes the step-by-step process to configure OpenID Connect in GitHub to authenticate with Amazon Web Services IAM and STS. Long story short, the process requires adding a new identity provider in every AWS account that needs to be visible to the GitHub Actions pipeline, that is, every account that Terraform will provision from GitHub pipelines.

A typical configuration will require the following steps:

Figure 3
Figure 4


The GitHub related values for OIDC provider in AWS IAM are:

For the "Provider URL": 
- Use "https://token.actions.githubusercontent.com"

For the "Audience": 
- Use "sts.amazonaws.com" (1)        

(1) if you are using the official "aws-actions/configure-aws-credentials" action, located at https://github.com/aws-actions/configure-aws-credentials


Figure 5
Figure 6
Figure 7
Figure 8


As you can see in the picture above, AWS raises a warning because, on line 15, it prefers an exact github_organization/repository_name over a wildcard that matches many repositories. A wildcard is what we want here, however, and the warning does not cause a failure.

Figure 9


Using a wildcard for all the repositories in an organization will work fine. Therefore, the trust policy for the new role being created, according to the documentation, should look like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:YourGitHubOrg/*"
        }
      }
    }
  ]
}        
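
To get a feel for what the "StringLike" wildcard on the "sub" claim admits, the following Python sketch approximates IAM's matching rules ('*' matches any run of characters, '?' matches exactly one) and applies them to a few sample subject claims. The function string_like is a local approximation written for this article, not an AWS API, and the repository names in the samples are illustrative. The subject formats follow GitHub's documented patterns (repo:ORG/REPO:ref:refs/heads/BRANCH for branch-triggered jobs, repo:ORG/REPO:environment:NAME for jobs that reference an environment).

```python
import re


def string_like(pattern: str, value: str) -> bool:
    """Approximate IAM's StringLike operator.

    '*' matches any run of characters (including none), '?' matches exactly
    one character, and everything else is a case-sensitive literal.
    """
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


ORG_WIDE = "repo:YourGitHubOrg/*"                         # pattern from the trust policy above
ONE_REPO = "repo:YourGitHubOrg/app-monorepo-pipeline:*"   # a tighter, hypothetical alternative

SAMPLES = [
    "repo:YourGitHubOrg/app-monorepo-pipeline:ref:refs/heads/devel",
    "repo:YourGitHubOrg/app-monorepo-pipeline:environment:prod",
    "repo:YourGitHubOrg/another-repo:ref:refs/heads/main",
    "repo:SomeoneElse/app-monorepo-pipeline:ref:refs/heads/devel",
]

if __name__ == "__main__":
    for sub in SAMPLES:
        print(f"{sub}\n  org-wide: {string_like(ORG_WIDE, sub)}"
              f"  single repo: {string_like(ONE_REPO, sub)}")
```

The organization-wide pattern accepts any repository, branch or environment under YourGitHubOrg, which is the behavior this article relies on. If that is broader than you need, a pattern scoped to one repository is a simple way to narrow it.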


The next step is to assign a scoped-down policy. In this example we just use AdministratorAccess for brevity in the screen capture, but remember to use a least-privilege policy.

Figure 10


Finally, it's time to save the role with a name that we can easily identify as related to GitHub Actions and CI/CD pipelines.

Figure 11
Figure 12


GitHub Actions pipeline configuration

Up to this point, all the steps completed worked towards integrating GitHub with AWS IAM, so that when a pipeline in GitHub Actions is running, the job can assume a role in the destination account and provision resources via IaC using Terraform.

Now you need to shape your pipeline as GitHub Actions workflows. For this, you are going to implement first a minimalistic GitHub Actions configuration for committing or PR-ing code on branches like sandbox or devel, and later a more complete GitHub Actions configuration for committing to branches like pre-production, production or live.

IMPORTANT NOTE: explaining how GitHub Actions works falls outside the scope of this article. It is taken for granted that this is something you already know or that you can easily learn by reading the official GitHub Actions documentation at https://docs.github.com/en/actions/learn-github-actions/understanding-github-actions


Simple GitHub Actions pipeline

You can grab the source code at https://github.com/safebytelabs/code_examples/blob/main/github_actions/example0.yml

Figure 13


Complete GitHub Actions pipeline

You can grab the source code of this pipeline at https://github.com/safebytelabs/code_examples/blob/main/github_actions/example1.yml

Figure 14-A
Figure 14-B
Figure 14-C


GitHub Actions pipeline execution details

One of the most important details during the pipeline execution is ensuring that the GitHub runner (the software that executes the pipeline on a physical or virtual machine) can assume the AWS IAM role in the destination account to perform its duty, which is deploying IaC. We can double-check this on the pipeline job detail page:

Figure 15


As you can observe, GitHub Actions performs the AWS credential configuration, and the GitHub Actions runner is able to assume the role named "github-actions-oidc-role", receiving temporary session credentials from AWS STS as "AROAabcdefghijklm:GitHubActions".


Terraform state file and lock table

As discussed at the beginning of this "short" article, one of the objectives of this deployment type is to keep the Terraform state file and its associated DynamoDB table centralized.

In many deployments the Terraform state file either lives in the destination account or is centralized, and in many more cases the DynamoDB table used for locking is local to the account where the IaC is being deployed.

In this article the Terraform state file AND the DynamoDB table used for locking are BOTH centralized in an account named "infra1", a kind of shared services account for the landing zone.

To make this configuration work, these are the details:


backend.tf

Replace the label NNNNNNNNN with the account number of your equivalent of the "infra1" account.

Figure 16
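
As an alternative to hard-coding every value in backend.tf, Terraform also accepts backend settings at init time through -backend-config=key=value arguments (partial backend configuration). The Python sketch below assembles those arguments for a centralized S3 backend with DynamoDB locking. It is not the configuration shown in Figure 16: the bucket name, table name, state key layout and region are hypothetical placeholders, and the helper names are invented for illustration. The setting names themselves (bucket, key, region, dynamodb_table, encrypt) are the ones used by Terraform's S3 backend.

```python
from typing import Dict, List


def central_backend(environment: str, account_id: str, region: str) -> Dict[str, str]:
    """Describe an S3 backend hosted in the shared "infra1"-style account."""
    return {
        "bucket": f"terraform-state-{account_id}",   # hypothetical bucket name
        "key": f"{environment}/terraform.tfstate",   # one state object per environment (assumption)
        "region": region,
        "dynamodb_table": "terraform-state-lock",    # hypothetical lock table name
        "encrypt": "true",
    }


def backend_config_args(settings: Dict[str, str]) -> List[str]:
    """Turn backend settings into -backend-config=key=value arguments for `terraform init`."""
    return [f"-backend-config={key}={value}" for key, value in settings.items()]


if __name__ == "__main__":
    # NNNNNNNNN stands for the shared account number, as in the article.
    settings = central_backend("dev", "NNNNNNNNN", "eu-west-1")
    print(" ".join(["terraform", "init", *backend_config_args(settings)]))
```

Whichever way the values are supplied, the DynamoDB table used for locking needs a string partition key named LockID.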

DynamoDB state locking table

Figure 17


Terraform state file on a centralized S3

Figure 18
Figure 19


Finally, the last configuration item is the S3 resource policy that makes the whole setup operational:

You can grab the source code at https://github.com/safebytelabs/code_examples/blob/main/aws/resource_policies/example0.json

Figure 20
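
For readers who want to see the general shape of such a policy without opening the link, here is a minimal Python sketch that builds a bucket policy granting roles in other accounts access to a central state bucket. It is an illustration under stated assumptions, not a copy of the policy in Figure 20: the bucket name and account numbers are placeholders, the function name is hypothetical, and the action list is the minimum Terraform's S3 backend documents for reading and writing state (s3:ListBucket on the bucket, s3:GetObject and s3:PutObject on the objects).

```python
import json
from typing import Any, Dict, List


def state_bucket_policy(bucket: str, role_arns: List[str]) -> Dict[str, Any]:
    """Build a bucket policy that lets the given IAM roles read and write state objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListStateBucket",
                "Effect": "Allow",
                "Principal": {"AWS": role_arns},
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "ReadWriteStateObjects",
                "Effect": "Allow",
                "Principal": {"AWS": role_arns},
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder account numbers; the role name is the one seen in the job log above.
    roles = [
        "arn:aws:iam::111111111111:role/github-actions-oidc-role",
        "arn:aws:iam::222222222222:role/github-actions-oidc-role",
    ]
    print(json.dumps(state_bucket_policy("terraform-state-example", roles), indent=2))
```

Listing each deployment role explicitly keeps the central bucket closed to everything except the pipelines that need it.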


Conclusion

Building a GitHub Actions pipeline using a monorepo has been an interesting journey. Centralising the Terraform state file and the DynamoDB table used for state locking was an important objective to meet. Finally, leveraging OpenID Connect (OIDC) between GitHub and AWS has been a satisfactory experience that avoids the always unwanted creation of roles, policies, S3 buckets and DynamoDB tables in every account where Terraform should deploy.

The objective of this article was purely educational, not to criticise technology or technology implementations.
