Infrastructure As Code Is a Joke
Cliff Berg
Co-Founder and Managing Partner, Agile 2 Academy; Executive level Agile and DevOps advisor and consultant; Lead author of Agile 2: The Next Iteration of Agile
The idea behind “infrastructure as code” is that the actions involved in defining servers, networks, and application deployments are automated. More than that, the automation is a scripted process captured in an artifact that can be maintained in a source code control system. That way one can see what changes have been made to it at any point in time, and progressively make the script more robust - if there is a problem, one can fall back to the previous version of the script. One can also copy an existing script that is known to work, and incrementally improve it.
In this approach, the “script” is the “code” - as per the expression “infrastructure as code”.
The problem is, the technologies for doing this are not really code: they are most often templates written using a syntax called JSON, or another called YAML. These are quite terrible, compared to true “code” as we know it today.
Through the 1950s and into the 1960s it was common to program in something called assembly language. Assembly language is just a step above what is known as “machine language”. Machine language is the actual bitwise instructions that a computer can interpret. It is ones and zeros. A human cannot look at it and “read it” - not without unusual ability.
Since humans cannot - without great effort - write machine language, assembly language was invented. Assembly language is a slight step up from machine language: instead of dealing with bits, one uses mnemonic instruction names, and one can express numbers in decimal instead of as a series of bits; and sequences of human readable characters - e.g., for an output message - can be expressed as, well, characters - instead of a sequence of bits.
An important characteristic of assembly language is that it is - like machine language - tied to the kind of machine it is written for. So one cannot, for example, write assembly language for, say, an Intel architecture computer and run it on an ARM architecture computer. It would be like taking a Ford part and trying to insert it into a Toyota.
It was not long before people created a better way. Languages such as Fortran and COBOL in the late 1950s, and C in the early 1970s, enabled one to write portable programs: programs that could be run on many kinds of computer, regardless of manufacturer or design. Actually, such programs do not run directly: they are “compiled” into machine language. But the program - the code that a programmer writes - is in a language that is the same regardless of where the program will be run.
These languages are called “higher level languages” - meaning that they are at a level of abstraction higher than assembly language. Since they are at a higher level, they are also more concise: ten lines of assembly language will likely require only one or two lines of code in a higher level language.
Early higher level languages such as the ones I mentioned had a largely fixed set of data types. That is, one could use numbers - integers or floating point numbers - and sequences of characters. Then “object oriented” languages, which began with Simula in the 1960s and became mainstream in the 1980s, enabled one to define new kinds of data - so-called “objects”. An object is a programmer-defined type of data: the programmer specifies the name of the type, the fields that it may contain, and the operations that can be performed on it. (Such object type definitions are usually called “classes”.)
In a language such as Fortran, the operations that can be performed on a type such as Integer are fixed and pre-defined by the language: one can perform arithmetic operations, and that is that. But if one uses an object oriented language and defines a new type, say “Vector”, one can define what makes up a Vector and the operations that can be performed on it.
This was a breakthrough because it helped usher in the age of “type safe” languages: the computer can tell if you have made a mistake, using, say, an Integer instead of a Vector, because it knows what kinds of data are allowed for any given operation. This sounds like a triviality, but it actually has an immense impact on software maintainability: if one later changes the code, one tends to make all kinds of mistakes, specifying the wrong kinds of data. With a type safe language, those kinds of mistakes are detected when the changes are made, before the program ever runs.
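To make the Vector example concrete, here is a minimal sketch in Python. The names `Vector` and `try_add` are mine, chosen for illustration. Note the caveat: Python itself rejects the bad operation only when the line executes, whereas a compiled type safe language (or a static checker run over these same annotations) would report it before the program runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    """A programmer-defined type: two fields plus the operations allowed on it."""
    x: float
    y: float

    def __add__(self, other: "Vector") -> "Vector":
        # Only Vector + Vector is defined; anything else is rejected.
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)

    def dot(self, other: "Vector") -> float:
        if not isinstance(other, Vector):
            raise TypeError("dot() requires a Vector")
        return self.x * other.x + self.y * other.y


def try_add(a: Vector, b: object) -> str:
    """Report whether the addition was accepted or rejected as a type error."""
    try:
        return f"ok: {a + b}"
    except TypeError:
        return "rejected: not a Vector"
```

Calling `try_add(Vector(1, 2), 3)` is rejected, because an Integer is not one of the kinds of data that Vector addition allows.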
If only infrastructure as code were like that. Today’s “infrastructure as code” is akin to assembly language: it is tied to the architecture - the cloud vendor, such as AWS. There are nominally cloud-neutral frameworks such as Terraform, but they rely on vendor-specific providers and resource types, so that one’s purportedly cloud-neutral code is full of vendor-specific constructs.
Cloud infrastructure code is generally expressed in JSON or YAML - template syntaxes. These are not type safe: they use strings and integers. Even those tools that use actual code instead of these template formats generally have an attribute-centric syntax, and so one must know what attributes to use for each kind of cloud resource, and often an attribute error is not found until runtime. That’s the issue: if a tool cannot detect an error until runtime, then it is much less helpful - the “code” is much more “brittle”.
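A small sketch of the attribute problem, in Python. The `Server` type and its attributes are hypothetical, invented for this illustration and not taken from any cloud vendor. The dictionary accepts a misspelled attribute without complaint; the typed definition refuses it as soon as the object is constructed.

```python
from dataclasses import dataclass

# Template style: a plain dictionary. The misspelled key "instance_tpye"
# is accepted silently, so the mistake surfaces only when something reads it.
template_server = {"name": "web-1", "instance_tpye": "small"}


@dataclass(frozen=True)
class Server:
    """Hypothetical resource type with a fixed, known set of attributes."""
    name: str
    instance_type: str


def build_server(**attributes: str) -> str:
    """Try to construct a Server; report an attribute mistake right away."""
    try:
        return f"created {Server(**attributes).name}"
    except TypeError as error:
        return f"rejected: {error}"
```

Passing `instance_tpye="small"` to `build_server` is rejected at construction, well before any deployment would be attempted. In a compiled language the same mistake would not get past the compiler.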
So we are really at the level of assembly language for infrastructure as code. Such “code” - I hesitate to call it that - is very difficult to maintain. It is also verbose - just like assembly language, compared to a higher level language. And it is difficult to write: since it is not object oriented, the system cannot tell if you mistakenly leave out the fields that are required, or if you provide, say, an AWS resource name when an AWS “ARN” is required - since both are string values. So one is faced with writing a huge amount of so-called code, without any type safety, and it is arcane and difficult to maintain.
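The name-versus-ARN confusion is easy to sketch. The types below (`ResourceName`, `Arn`, `QueuePolicy`) are hypothetical and written in Python only for illustration; the one thing assumed about AWS is the general ARN layout of colon-separated parts beginning with `arn`. Once the two kinds of string have distinct types, mixing them up or omitting a required field stops being a silent mistake.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceName:
    """Hypothetical wrapper for a plain resource name such as 'my-queue'."""
    value: str

    def __post_init__(self) -> None:
        if self.value.startswith("arn:"):
            raise ValueError(f"expected a name, got an ARN: {self.value!r}")


@dataclass(frozen=True)
class Arn:
    """Hypothetical wrapper for an ARN (arn:partition:service:region:account:resource)."""
    value: str

    def __post_init__(self) -> None:
        # Split into at most six parts; the resource part may itself contain colons.
        parts = self.value.split(":", 5)
        if len(parts) != 6 or parts[0] != "arn":
            raise ValueError(f"not an ARN: {self.value!r}")


@dataclass(frozen=True)
class QueuePolicy:
    """Hypothetical resource: both fields are required, and queue must be an Arn."""
    queue: Arn
    statement: str

    def __post_init__(self) -> None:
        if not isinstance(self.queue, Arn):
            raise TypeError("queue must be an Arn, not a bare string or name")
```

Here `QueuePolicy("my-queue", "allow")` fails immediately, as does leaving out `statement`. With plain strings in a template, both mistakes would go unnoticed until deployment.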
Amazon has recently introduced something called Cloud Development Kit (CDK). This is an API, but it is declarative: you define the target state that you want. It acts essentially as a higher level language for their cloud infrastructure. It is true “infrastructure as code”. It is what they should have created when they created Amazon Web Services.
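The general shape of this approach can be sketched without CDK itself. The Python below is not the CDK API: `Bucket` and `Stack` are hypothetical stand-ins that I made up to show the idea of declaring a target state with typed objects and letting a tool generate the verbose template. The only AWS-specific details assumed are the `AWS::S3::Bucket` resource type and its `VersioningConfiguration` property.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Bucket:
    """Hypothetical typed description of a storage bucket (not the real CDK class)."""
    logical_id: str
    versioned: bool = False

    def to_resource(self) -> dict:
        # Emit the verbose, attribute-centric form that a template expects.
        properties: dict = {}
        if self.versioned:
            properties["VersioningConfiguration"] = {"Status": "Enabled"}
        return {"Type": "AWS::S3::Bucket", "Properties": properties}


@dataclass
class Stack:
    """Hypothetical container that declares the desired set of resources."""
    resources: list = field(default_factory=list)

    def add(self, bucket: Bucket) -> None:
        if any(r.logical_id == bucket.logical_id for r in self.resources):
            raise ValueError(f"duplicate logical id: {bucket.logical_id}")
        self.resources.append(bucket)

    def synthesize(self) -> str:
        """Render the declared target state as a JSON template."""
        body = {"Resources": {r.logical_id: r.to_resource() for r in self.resources}}
        return json.dumps(body, indent=2, sort_keys=True)
```

The programmer writes `stack.add(Bucket("Logs", versioned=True))` and never touches the JSON; the template becomes generated output, much as machine language is the output of a compiler.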
Amazon’s native template format is the CloudFormation template (CFT). It is now arguably a legacy system, to be hidden from the public, and I expect that when CDK matures, they will start to do just that, by replacing their myriad online CFT code examples with CDK equivalents.
CloudFormation templates are written in JSON or YAML. The result is horrible - verbose, error-prone, hard to maintain, and arcane. Why didn’t they create CDK instead of CFT? One cannot say; but it is probably because some engineer at Amazon back when the service was being designed - probably someone not there anymore - saw that JSON was fashionable (even though it is a technological throwback) and thought it would be cool to use JSON. That was most likely some very smart tech lead, someone up on all the latest tools, but without much programming experience - someone who did not appreciate the importance of maintainability and ease of coding; someone who perhaps had never used assembly language and so never learned what a savior higher level object oriented languages were.
So much for the idea that experience does not matter.
I look forward to the evolution of CDK. It is not quite ready for prime time. Parts of it seem to not work yet, and it is very sparsely documented. I have filed a couple of bug reports, and critiqued the docs, and was told that they are working hard to improve the docs. In contrast, CFT is rock solid, and it is well documented, although to use it, there is a very long learning curve - one that will be filled with frustration. And if it is now a legacy tool, why bother?
The dilemma is a choice between two unappealing options: use CFT, which is legacy and a nightmare to use, or use CDK, which is not quite ready and poorly documented. My advice would be to use neither if you can help it, and wait for CDK to mature; or use CDK but be prepared for some instability and guesswork for the next year.
And I expect that other cloud vendors will see CDK and say, “Why didn’t we think of doing that?”
But what we really need is a higher level language that is not tied to AWS or Azure or Google Cloud or whatever. We need a portable language for infrastructure as code: an object oriented higher level language.
Today most of the complexity of applications is not in the software code, which defines the deployable components: today the complexity is in the interactions between components. That is called the “outer architecture”. But today the outer architecture is not defined in a higher level language: it is defined in the JSON and YAML deployment templates - which, as I have explained, amount to a kind of assembly language. That is not a good state of affairs, and it is quite surprising that we are doing that now in the year 2020, given that higher level languages have existed since the late 1950s.
The outer architecture of today’s applications, which are highly distributed real time systems, should be written in a higher level object oriented cloud-neutral language. A language that can detect errors before you perform a deployment. A language that one can compile for any target kind of environment. A language in which one can define a secure, robust, and composable system. There is no excuse for how we are doing things today. It is terrible. We should expect better.
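What might that look like? The following is only a toy sketch in Python, under heavy assumptions: `ObjectStore`, the two back ends, and `compile_for` are all hypothetical, and “examplecloud” is a fictional vendor. The point is the shape: one vendor-neutral definition, errors detected before any deployment, and a separate translation step per target environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectStore:
    """Hypothetical cloud-neutral resource: a named store for objects."""
    name: str
    versioned: bool = False


def to_aws(store: ObjectStore) -> dict:
    """Hypothetical back end: emit an AWS CloudFormation-style resource."""
    properties: dict = {"BucketName": store.name}
    if store.versioned:
        properties["VersioningConfiguration"] = {"Status": "Enabled"}
    return {"Type": "AWS::S3::Bucket", "Properties": properties}


def to_example_cloud(store: ObjectStore) -> dict:
    """Back end for a fictional vendor, to show that only this layer changes."""
    return {"kind": "example.store", "id": store.name, "keep_versions": store.versioned}


TARGETS = {"aws": to_aws, "examplecloud": to_example_cloud}


def compile_for(target: str, stores: list) -> list:
    """Validate every definition first, then translate for the chosen target."""
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target}")
    names = [s.name for s in stores]
    if len(names) != len(set(names)):
        raise ValueError("duplicate store names")
    return [TARGETS[target](s) for s in stores]
```

The same list of `ObjectStore` definitions can be passed to `compile_for("aws", ...)` or `compile_for("examplecloud", ...)`. A real language would also need an escape hatch for vendor-specific services, which is the hard part.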
Managing Consultant Network Security at operational services GmbH & Co. KG Microsoft Azure Network Engineer Associate (AZ-700)
10 months ago: While hyperscalers offer a lot of resources and concepts that are common among them, they also offer unique types, services, and concepts, which may be why you select them in the first place. You either limit your deployments to common resource types, which may result in clean and portable "code", or you will have to use some vendor-specific resources to get the most out of cloud services. Perhaps something like C's #ifdef could be an approach to handling vendor-specific resources.
DevOps & Agile Engineering Senior Leader
4 years ago: I think when "Agile Infrastructure" (a.k.a. "Infrastructure as Code") was first coined (by the likes of Andrew Clay Shafer and Patrick Debois) it was not with JSON or XML in mind, but rather more like what Chef and Puppet offered (or CFEngine more than a decade before): a human readable DSL (whether Ruby-like, or Groovy/Python/Perl-ish) that allows declaration-like expression of "what" to build/configure (e.g., the target), with the ability to provide imperative-like rules for HOW to build it - preferably expressed as a reusable, parameterizable "pattern" so that most of the time, only the declarations (and key dependencies) are needed, and the recipes to build/derive them can be inferred from the built-up "cookbook" of pattern-rules. JSON and XML/YAML do this for infrastructure about as (un)well as Ant did for Java. It didn't read very rule-like or script-like, and the signal-to-noise ratio (here, the ratio of meaningful text to delimiters) was not all that great (e.g. "A Joke!"). BUT the other aspect of "as-code" was that it could be formally codified as an "executable specification" of what to build/configure, and how to build/configure it (if it didn't follow known established rules/patterns). And thus it lent itself to being checked in to version control, and to other good (agile) development practices (including TDD/BDD, CI/CD of the infra-as-code, refactoring, even test automation).
Enthusiastic Business Value Obsessed Nerd
4 years ago: Great article! I love the Assembly Language metaphor.
Senior Vice President Artificial Intelligence & Dataspaces
4 years ago: Having worked with domain specific languages for a long, long while and supporting model-based approaches, I strongly support your criticism of the JSON/YAML approach. We have progressed so much more in other areas (MBSE, etc.). JSON and YAML are just very accessible (text editor, parsers in all languages), which makes them so popular.
Expert Developer (Html, Bootstrap 5, CSS, C#, Objective C, Python, PHP, Javascript, .NET), Odoo.com Certified Expert, API Integration & Dev. Databricks, Cloud Computing Integration
4 years ago: Or as JCL