Reproducible builds
Source: https://www.pagerduty.com/resources/learn/what-is-continuous-integration/

Every software company struggles with the pressure of delivering software on time and at a constant pace. Although the technology ecosystem has evolved tremendously over the last 15 years (continuous integration as a service, high-level frameworks that promise to accelerate development, new programming languages, generative AI, Devin, ...), building software is still a complex activity.

In the last few years, I have collaborated with multiple companies, and one common issue was making their builds reproducible. While this may seem counter-intuitive at first, let me try to visualize the problem.

Dependencies overview

In many programming languages, Libraries 1, 2 and 3 are downloaded from a remote registry whenever you build your application. Moreover, Library 3 is a transitive dependency, so its version might change between builds.

The immediate consequence is that the binary a developer compiles locally might differ from the one the CI pipeline builds. This can cause very subtle bugs that are hard to reproduce and fix.
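
To make the drift concrete, here is a minimal Python sketch. The package names and version numbers are made up for illustration; they only mirror the Library 1, 2 and 3 picture above. It compares the versions resolved by two builds and reports the ones that differ.

```python
from __future__ import annotations


def find_drift(local: dict[str, str], ci: dict[str, str]) -> dict[str, tuple]:
    """Return the packages whose resolved version differs between two builds."""
    drift = {}
    for name in sorted(set(local) | set(ci)):
        local_version, ci_version = local.get(name), ci.get(name)
        if local_version != ci_version:
            drift[name] = (local_version, ci_version)
    return drift


# Hypothetical resolution results: only the third library moved between builds.
local_build = {"library1": "1.4.0", "library2": "2.0.1", "library3": "0.9.2"}
ci_build = {"library1": "1.4.0", "library2": "2.0.1", "library3": "0.9.3"}

print(find_drift(local_build, ci_build))  # {'library3': ('0.9.2', '0.9.3')}
```

An empty result means both builds resolved exactly the same versions.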

In this article, we discuss "hermetic builds" and how to achieve them in a simple manner (no advanced infrastructure setup required).

Bazel hermeticity

Bazel is one of the few build systems that tackle the hermeticity issue. Here is an excerpt from its official documentation:

In order to isolate the build, hermetic builds are insensitive to libraries and other software installed on the local or remote host machine. They depend on specific versions of build tools, such as compilers, and dependencies, such as libraries. This makes the build process self-contained as it doesn't rely on services external to the build environment.

The two important aspects of hermeticity are:

  • Isolation: Hermetic build systems treat tools as source code. They download copies of tools and manage their storage and use inside managed file trees. This creates isolation between the host machine and local user, including installed versions of languages.
  • Source identity: Hermetic build systems try to ensure the sameness of inputs. Code repositories, such as Git, identify sets of code mutations with a unique hash code. Hermetic build systems use this hash to identify changes to the build's input.
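
The "source identity" idea can be sketched in a few lines of Python. This is not how Bazel implements it; it is only an illustration, under the assumption that hashing every file's relative path and content is enough to identify a set of inputs. The file name and contents in the demo are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(root: Path) -> str:
    """Hash the relative path and content of every file under root, in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


# Demo on a throwaway directory: changing one byte of input changes the fingerprint.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "main.cpp").write_text("int main() { return 0; }\n")
    before = fingerprint(root)
    (root / "main.cpp").write_text("int main() { return 1; }\n")
    after = fingerprint(root)

print(before != after)  # True
```

Two builds that start from the same fingerprint start from the same inputs, which is the precondition for expecting the same output.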

Visit the excellent Bazel Hermeticity article for more information.

Hermeticity without Bazel

As a software engineer, I like to focus on the core issues and remove unpleasant, non-functional surprises. Compiling software and packaging it for production is one of those activities where teams can spend a considerable amount of time without ever properly solving the problem.

IMHO, hermeticity and hermetic builds are extremely important, especially for teams with more than one developer, and they can be achieved quite simply.

The following diagram emphasizes the approach for a C++ project:

High-level overview of the toolchain image approach

In the above diagram, the Toolchain Docker Image contains all the required libraries, tools and compilers agreed on at the organization and/or team level. The image is versioned and can evolve over time. Nevertheless, once a specific version is published, it cannot be altered.
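
One possible way to enforce the "published versions never change" rule is to reference the image by content digest instead of by tag alone, since a tag can be pushed again while a digest identifies exact content. The check below is a minimal sketch of that idea; the registry and image names are placeholders, not the ones used in the demo repository.

```python
import re

# A reference is considered pinned only if it ends with a sha256 content digest.
_DIGEST_SUFFIX = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image_ref: str) -> bool:
    """Return True when the image reference carries a sha256 digest."""
    return bool(_DIGEST_SUFFIX.search(image_ref))


# Hypothetical references for illustration only.
by_tag = "registry.example.com/toolchains/cpp:1.2.0"
by_digest = by_tag + "@sha256:" + "a" * 64

print(is_pinned(by_tag))     # False
print(is_pinned(by_digest))  # True
```

A check like this could run in CI to reject build configurations that point at a movable tag.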

Before discussing the benefits and implications further, let us see the approach in action:

  1. Run the toolchain locally.
  2. Code your application.

Run the toolchain locally

./cpp/build/run-remote-development.sh        
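
The contents of run-remote-development.sh are not reproduced in this article. As a rough idea of what such a script could do, the Python sketch below assembles a `docker run` command that starts a toolchain container in the background and publishes its SSH port on the local machine. The image name, container name and host port are assumptions made for illustration, not values taken from the repository.

```python
from __future__ import annotations

import shlex


def toolchain_run_command(image: str, host_port: int, name: str = "cpp-toolchain") -> list[str]:
    """Build the argument list for starting the toolchain container."""
    return [
        "docker", "run",
        "--detach",                                # keep the container running in the background
        "--rm",                                    # remove the container when it stops
        "--name", name,
        "--publish", f"127.0.0.1:{host_port}:22",  # expose the container's SSH port locally only
        image,
    ]


# Hypothetical image and port; replace them with your organization's values.
cmd = toolchain_run_command("example/cpp-toolchain:1.0.0", 2222)
print(shlex.join(cmd))
```

Keeping the command as a list makes it safe to hand to `subprocess.run(cmd, check=True)` without shell quoting issues.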

Code your application

Luckily, most IDEs and code editors already support the setup shown in the above diagram (usually referred to as Remote Development or a remote toolchain). I will highlight the steps using the CLion IDE.

Configure a remote toolchain

Open Settings (CMD + , on macOS or CTRL + , on Windows) and go to the Toolchains section:

Toolchains section

Press the add new toolchain button (+ sign) and select Remote Host.

Remote Host toolchain

In the form, fill in the details for the new remote host:

Toolchain settings

The SSH credentials for the demo (hermetic-build repository) are:

  • Username: root
  • Password: test

Now you can open your project (e.g., a CMake-based one) and configure it to use the new toolchain:

Open project
Overview

Once the project is open, you must configure its CMake settings (press CMD + , or CTRL + , depending on your operating system).

Hermetic build final setting

Note: for the hermetic-build-demo from GitHub, make sure you also set the following CMake options:

-GNinja
-DCMAKE_TOOLCHAIN_FILE=/home/dev/vcpkg/scripts/buildsystems/vcpkg.cmake        
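
Outside the IDE, the same two options can be passed to CMake on the command line inside the toolchain container. The sketch below only assembles that command. `-S` and `-B` are standard CMake flags (available since CMake 3.13); the source and build directory names are assumptions, while the generator and toolchain file are the ones listed above.

```python
from __future__ import annotations

import shlex

VCPKG_TOOLCHAIN = "/home/dev/vcpkg/scripts/buildsystems/vcpkg.cmake"


def cmake_configure_command(source_dir: str, build_dir: str) -> list[str]:
    """Build the CMake configure command with the options used by the demo."""
    return [
        "cmake",
        "-S", source_dir,   # where CMakeLists.txt lives
        "-B", build_dir,    # where build files are generated
        "-GNinja",
        f"-DCMAKE_TOOLCHAIN_FILE={VCPKG_TOOLCHAIN}",
    ]


# Hypothetical directories for illustration.
configure = cmake_configure_command(".", "build")
print(shlex.join(configure))
```

After configuring, `cmake --build build` would compile the project with the tools shipped in the image.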

Run the example

Running the example

You can find the complete source code that allows you to run the commands on GitHub: https://github.com/rcosnita/hermetic-build

Hermetic builds conclusion

The approach presented in this article works for all programming languages and operating systems. Moreover, if you adopt hermeticity and hermetic build concepts, you gain the following benefits:

  1. Extremely easy to set up the development environment.
  2. Always compile and test the binary that is going to be deployed in production.
  3. Extremely easy to deliver the source code into an escrow account. Many software vendors are actually legally obliged to ensure this for business continuity reasons.
  4. Strict audit and control over the dependencies used and their licenses.
  5. Complete parity between local builds and CI builds.
  6. No ambiguity about how developers set up their development environment.
  7. Each new developer will be able to start building the code in minutes without workarounds.



Evgeny Petrov

Software Engineer | Driving Innovation in Charging Stations | Expert in C++, Rust and Bazel

7 months

This could be a nice start for teams that don't want to deal with complex build systems like Bazel. There are a couple of ways to extend and organize the way you use containers. For better IDE integration and user experience, take a look at dev-containers (https://www.jetbrains.com/help/clion/connect-to-devcontainer.html); it is supported by VS Code as well. It lets you store the container configuration for the IDE in the source code. For CI integration there is Earthly (https://earthly.dev/), which lets you define multiple build targets and the containers that should be used.
