Lunar Landing Lessons from 2020
2020 was certainly a year to remember!
Just before the 50th anniversary of the Apollo 11 Moon Landing in 2019, I started writing a series of stories that show how modern-day Site Reliability Engineers, Sysadmins, Operators and DevOps Engineers can learn valuable lessons from the actions and practices of NASA’s astronauts and flight controllers as they took humanity to the Moon.
I've been publishing these articles on Medium at https://medium.com/ibm-garage
Here's a list of all the articles:
- The 1201 program alarm
- Glenn’s flight
- Functional vs non-functional requirements
- RACI
- ChatOps
- SRE & Transparency
- Operational Scorecards
- MVP vs PoC
- DevSecOps Quarantine
- Reliability
- Day-2 Operations
- Flying to the Moon is like developing on your laptop
- Three minutes without breathable air
- If Neil Armstrong were your engineer, you wouldn't need this!
- Computer-aided responses and reflexes
- Overcoming chaos on the way to the Moon
- Managing an Agile product launch — over Christmas
In the second half of the year, there was enough interest in these lessons that my proposals were accepted by three different conferences:
- Prevail 2020 is an IBM conference, organized by the IBM Academy of Technology around the subjects of Performance, Availability, Security, and SRE. You can find my session “Failure Is Not An Option” in the 4th track of the conference.
- SRECon Americas 20 is a gathering of engineers who care deeply about site reliability, systems engineering, and working with complex distributed systems at scale, organized by USENIX: The Advanced Computing Systems Association. You can find my session “Failure is not an option: SRE Lessons 50 Years after the Apollo 13 Flight” on the schedule for the 2nd day and on YouTube too.
- Since all the local face-to-face conferences were cancelled this year, TLV Dev Community was a combination conference, covering DevOps Days TLV, Statscraft, Cloud Native Day TLV, DevSecCon TLV and DevRel IL. I delivered a short 5-minute Ignite session covering a lesson from the Challenger disaster.
For those of us who prefer to read, the most popular article of the year was Resilience and redundancy on the way to the Moon, which reminded us that when you design a system to be resilient and highly available, you need to take humans into account: making sure you have a backup if the person on call doesn’t respond, making sure the system is well documented so that you can continue if a lead developer leaves, or having another astronaut ready to go in case one of the prime crew members gets exposed to German measles just before the flight.
Speaking of infectious diseases, the most topical article of the year was probably DevSecOps Quarantine, which detailed the relationship between DevOps pipeline security and protecting mankind from lethal space organisms!
For IBM SREs, Sysadmins and operators, 2020 was the year Watson AIOps first appeared, and a number of articles discussed it. Watson AIOps takes the best of existing IBM solutions like Netcool Operations Insight, Predictive Insights, and Agile Service Manager and adds some cutting-edge AI and Natural Language Processing to create a solution that is much more than the sum of its parts.
As I’ve said in an article, if Neil Armstrong were your engineer, perhaps you wouldn’t need Watson AIOps. For the rest of us mere mortals, though, Watson AIOps will help separate the wheat from the chaff and alert us about problems with a clarity that would be impossible without it.
I think that while 2020 was the year Watson AIOps first appeared, 2021 will be the year it really flexes and shows us what it’s capable of.
For myself, I’ve set the following goals for 2021:
- I’ll start co-publishing some articles, especially the more technical ones, right here in the Management, ITSM, and AI Ops Global IBM community group.
- More articles. My goal is to average at least one a month and end up with 12–13 articles next year.
- The nascent space program of the 50s, 60s and 70s is an endless well of stories and lessons, but I think that the time has come to start covering more modern projects such as the Space Shuttle, the International Space Station and perhaps the various commercial space endeavours. I will try to keep the historical perspective and not discuss the latest missions too much.
- While I enjoy the fact that my articles do not require any previous knowledge and can be understood (and enjoyed) by people from across the spectrum of technical expertise, I do want to deliver some more technical deep dives into the newer IBM technologies and solutions in a proportion of upcoming articles.
I look forward to the end of 2021, when I’ll see how much I’ve achieved.
Finally, since I’ll be widening the scope of the articles beyond the Moon landings of the 60s and 70s, the overall title of the series might need to be changed from “Lessons from the Lunar Landings”:
- Perhaps “SRE Space Solutions”?
- Perhaps “Service Management Space Mistakes”?
If you have any ideas for a good title, please let me know. If you have any requests or suggestions for future articles, I’d love to hear them.
In the meantime, for all future lessons and articles, follow me on Medium as Robert Barron, on Twitter as @flyingbarron or on LinkedIn here!
Wishing you all a happy and healthy New Year.
#SRE #DevOps #AIOps #IBM #Apollo11 #Apollo13 #Apollo50
Senior PreSales Consultant BigData & Analytics
4 years ago: Perfect match! What a great idea to visualize #SRE through some historical insights from the crewed lunar space missions. An easy read to get into the topic. Thx.
UX Researcher, and Senior Info Developer at IBM UK Ltd
4 years ago: Very powerful presentation, Robert. Thank you.
Executive IT Architect - IBM Consulting
4 years ago: SRE... The final frontier... These are the voyages of humans into space.
Senior Manager, Communications, Social Media, AMD
4 years ago: Robert, this is awesome and I love the whole 2020 series you have done with this subject exploring engineering. One of the highlights of my work in 2019 was working on the Apollo 50th anniversary IBM story and meeting Apollo 17 Astronaut/Geologist Jack Schmitt in Cleveland at an INCOSE conference. This year, I was proud to co-host a LinkedIn Live chat with and Space Historian & Author Rod Pyle and Dr. Larry Kennedy (Data Systems Analyst for the Super Specialists during the Apollo program!). Love this view of engineering, SRE and DevOps.