Safety Engineering: Lessons for Software Development
(c) Rob Baskerville 2019


A recent experience[5] of a seatbelt pre-tensioner firing set me thinking about just how poor much software engineering is by comparison.

The pre-tensioner fires in a collision which meets certain parameters, rapidly pulling the seatbelt tighter by about 10cm. This holds the occupant firmly against the seat, significantly reducing both the risk of “submarining” - sliding out under the belt, with serious injury as the likely result - and the distance the occupant travels before the belt restrains them. It is surprisingly difficult to find a specification sheet stating just how rapidly it does this, but having watched some video taken at 28,000fps, it is somewhere between 0.003 and 0.01 seconds. That's fast.

So here is a component which sits inside a vehicle for year upon year (12 years in this specific case), doing absolutely nothing at all - until the moment arises, if it ever does, when the precise conditions are met for it to fire. It must not fire when those conditions are not met, and it must fire when they are. False positives are unacceptable. False negatives equally so.
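
To make the shape of that requirement concrete, here is a deliberately tiny sketch in C. The threshold, units, and sensor range are invented for illustration, and a real firing algorithm is calibrated against crash-pulse data and is considerably more involved; the point is only that the decision can be expressed as a small, fully bounded function.

    /* Illustrative only: the thresholds and ranges below are invented,
     * not taken from any real pre-tensioner design. */
    #include <stdbool.h>
    #include <stdint.h>

    #define DECEL_FIRE_THRESHOLD_MG  3000   /* hypothetical: 3 g, in milli-g */
    #define MIN_VALID_DECEL_MG          0
    #define MAX_VALID_DECEL_MG      50000   /* assumed sensor full-scale range */

    /* Returns true only when a validated deceleration reading crosses the
     * firing threshold. Out-of-range readings are treated as invalid and
     * never fire: a false positive is unacceptable. */
    bool pretensioner_should_fire(int32_t decel_mg)
    {
        if (decel_mg < MIN_VALID_DECEL_MG || decel_mg > MAX_VALID_DECEL_MG) {
            return false;   /* invalid input: do nothing */
        }
        return decel_mg >= DECEL_FIRE_THRESHOLD_MG;
    }

A function this small, over an input domain this tightly bounded, is the kind of thing you can actually reason about and test to destruction.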

The level of design, engineering, and reliability required to achieve this is astonishing - and not matched by most modern software systems.

So consider some questions:

  1. When does code never need[0] patching to correct errors or security issues?
  2. When does code never need updating for new functionality?
  3. What sort of code can you rely upon to work without flaws every time?

And their answers:

  1. When its functions are simple and well defined enough to be provably correct. Where the inputs are constrained such that edge cases, if any, are exhaustively and correctly covered. And of course when it is comprehensively tested - only really possible with simple code (a toy example follows after this list).
  2. When the requirements[1] are simple enough, clear enough and well thought out enough that this is demonstrably not required.
  3. Simple code meeting simple, unambiguous, fully defined requirements.
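
As promised above, here is a toy example of comprehensive testing. The function is invented purely for illustration and has nothing to do with seatbelts; the point is that a sufficiently constrained input space can be tested exhaustively rather than sampled.

    /* Illustrative only: a function whose input space is small enough to
     * test exhaustively. Every one of the 65,536 possible input pairs is
     * checked against a simple specification. */
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Saturating 8-bit addition: the result is clamped at 255. */
    static uint8_t sat_add_u8(uint8_t a, uint8_t b)
    {
        unsigned int sum = (unsigned int)a + (unsigned int)b;
        return (uint8_t)(sum > 255u ? 255u : sum);
    }

    int main(void)
    {
        for (unsigned int a = 0; a <= 255; a++) {
            for (unsigned int b = 0; b <= 255; b++) {
                unsigned int expected = a + b > 255 ? 255 : a + b;
                assert(sat_add_u8((uint8_t)a, (uint8_t)b) == expected);
            }
        }
        puts("all 65536 cases pass");
        return 0;
    }

Once the input space grows beyond what can be enumerated, that luxury disappears - which is exactly why keeping components small pays off.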

And all this tends to suggest simple autonomous components. This is what you find in the safety systems within a car: the multiple airbags, the ABS, the traction control. The complexity is kept to a minimum by carefully and precisely defining each system's functional requirements.

[Image: cowsay outputting the result of the uptime command]

This reminds me of the Unix philosophy: develop minimalist, modular software. Use excellent simple tools in combination. Each tool does one specific type of task, but does it really, really well. You'll see this if you are familiar with the classic command-line text processing tools - grep, sed, awk, and their many relatives.

[Image: a frame from the output of sl, a command to “train” you not to type too fast and get ls backwards]

Build your system from well-defined components[2], with extremely well-defined interfaces and interactions. Get each one right, and your system will function well. Of course, that does mean getting a solid set of requirements in place up front - requirements which will not change on a whim half-way through development. I appreciate that this can be difficult, indeed sometimes virtually impossible, but in all honesty, if the requirements cannot yet be defined, what is one doing building a system at this point at all? There are advanced techniques which, whilst they might not make you popular, may assist in extracting solid requirements[3].
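
A minimal sketch of that idea, again in C and with every name and number invented for illustration: each component exposes one narrow, precisely specified interface, so it can be specified, implemented, and tested in isolation before being composed with the others.

    /* Illustrative only: module boundaries, names, and figures are made up. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* --- Component 1: wheel-speed sensing (interface: one function) --- */
    typedef struct {
        uint16_t wheel_rpm[4];      /* one reading per wheel */
    } wheel_speeds_t;

    static wheel_speeds_t read_wheel_speeds(void)
    {
        /* Stubbed readings; a real implementation would talk to hardware. */
        wheel_speeds_t s = { { 800, 802, 640, 801 } };
        return s;
    }

    /* --- Component 2: lock detection (interface: one pure function) --- */
    static bool wheel_is_locking(const wheel_speeds_t *s, int wheel,
                                 uint16_t tolerance_rpm)
    {
        /* Flag a wheel that runs more than the tolerance slower than the
         * fastest wheel. Deliberately simplistic. */
        uint16_t max_rpm = 0;
        for (int i = 0; i < 4; i++) {
            if (s->wheel_rpm[i] > max_rpm) {
                max_rpm = s->wheel_rpm[i];
            }
        }
        return (uint16_t)(max_rpm - s->wheel_rpm[wheel]) > tolerance_rpm;
    }

    int main(void)
    {
        wheel_speeds_t s = read_wheel_speeds();
        for (int w = 0; w < 4; w++) {
            printf("wheel %d locking: %s\n", w,
                   wheel_is_locking(&s, w, 50) ? "yes" : "no");
        }
        return 0;
    }

Whether a flagged wheel then triggers any intervention is somebody else's clearly defined problem - which is rather the point.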

Having seen many software development projects fall into the same tar pits that Brooks[4] warned of all the way back in 1975, I wonder if we should actually change our approach to one which is less sure to go wrong.

But what am I saying - the track-record of software projects speaks for itself.

In the meantime, I tip my hat to the Toyota engineers who designed and built this component some time before 2007 - a component I rarely considered but which did its job perfectly the first time of asking.

Footnotes

[0] The need for patching should not be confused with the lack of patching. The number of systems I see which are unacceptably behind with required patching still shocks me. I'm talking about components which do not need patching at all rather than ones which need it but don't get it.

[1] People working in “agile” environments may find it useful to look this word up, judging from the level of understanding I have frequently met. Most reputable dictionaries can assist.

[2] Hey, you could even work intensively on one or a few of these components at once for a fortnight, say, if you like that sort of cadence.

[3] See Wikipedia for examples.

[4] Brooks' The Mythical Man-Month is available from many online bookstores and is highly recommended reading.

[5] No serious injury, low speed collision, sprained intercostal muscles from the belt, nothing that anti-inflammatories plus painkillers can't handle.....

Toby Seaman

Cyber Security Consultant

3 years ago

In this case, the CER (Crossover Error Rate) of false positives and false negatives *must* be close to zero. But not all systems need the CER to be that low. Setting it that low for everything would be 'hard'.
