Pets vs Cattle - vegan edition!
Courtesy of - https://www.youtube.com/watch?v=ljm6RU6lRuM

I heard a story recently that someone disconnected from a talk about microservices and containers because of the introduction of pets vs cattle and the idea of what you do with sick cattle vs sick pets. Maybe you're not vegan and think this is ridiculous, oh well! I thought I'd attempt a new take on this old analogy, and as I worked through the idea I found I actually like it a bit better and think it's a little more relevant. If someone has already had this idea, cool, let me know and I'll drop you kudos in the comments, but I've not come across it before.

Oak Trees vs Corn Fields!

As kids we used to play on a grand old oak tree at my grandparents' house, a very typical British affair of a big farm field with one big oak tree in the middle. It was magnificent and we had many fun hours climbing, playing and swinging on that tree. We were all very emotionally attached to it, and the idea of anything happening to that tree would have been heartbreaking. When my grandmother finally needed to move into full-time care, one of the biggest losses was the childhood memories of her family home, and our beloved oak tree. I daren't go back to see if it's still standing today! We have a similar tree in our garden now that my kids have swings and ropes hanging from, so it's a core part of their childhood play. Again, the thought of anything happening to that tree would be devastating. We look after that tree and add attachments (swings, ropes, maybe a tree house when the kids get older), which makes that single tree even more of a personal attachment. It didn't take much Googling to find a whole industry around caring for our beloved trees, making braces and cabling, or righting and repairing trees that have been dislodged by storms.

Our house backs onto farm land, and our farmer neighbour generally plants corn in the fields for animal feed. There is lots of it, and most of the fields in the area have corn year after year. The farmer must plant thousands, if not millions, of individual corn plants every year. There is no doubt a carefully calculated quantity that they plant every year based on estimated yield percentages. I did a bit of Googling and didn't get a decent figure, so I'm just going to make one up: maybe the farmer figures on an 80-90% yield from the crops being planted. So what about the plants that don't yield? Does the farmer worry over the potential 10-20% loss? Of course not, it's part of the calculated yield. Too much is bad (wasteful across a number of resources = wasted cost), too little is bad (not enough to feed their animals).

In pets vs. cattle a farmer sells or has as many cattle as possible to make as much money as possible (either for meat or dairy). Of course there's still a calculation, but I don't believe it's as directly comparable to what we do in IT. In oak trees vs corn, a farmer needs to carefully calculate the estimated yield required to feed their livestock, based on failure averages, seasonal variations and so on.

When you create an IT application you need to estimate what it takes to deliver the required user experience, based on acceptable response times, seasonal load, number of users and so on, while also considering your N+x failure tolerance. You do exactly what the farmer does with their corn. You do not do what my brothers and I did and look after a single tree in the middle of a field, spending time having fun, adding cool widgets (swings) to make that single thing more productive, but infinitely more susceptible to failure.
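The farmer's sum can be sketched in a few lines. This is only an illustration under assumptions of my own: the function name, the parameters and the numbers (1000 requests per second at peak, 50 per instance) are made up, and the 10-20% loss is the same invented figure as for the corn. It plants enough instances that the survivors still cover peak demand, then adds x spares for the N+x part.

```python
def instances_to_provision(peak_demand: int, per_instance_capacity: int,
                           expected_failure_pct: int, spare: int = 0) -> int:
    """Instances to run so that the healthy ones still cover peak demand.

    All names and units are hypothetical. Integer arithmetic is used
    throughout so the rounding is exact.
    """
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    if not 0 <= expected_failure_pct < 100:
        raise ValueError("expected_failure_pct must be between 0 and 99")
    # N: healthy instances needed at peak (ceiling division).
    needed_healthy = -(-peak_demand // per_instance_capacity)
    # Plant extra to absorb the expected loss, as the farmer does.
    provisioned = -(-needed_healthy * 100 // (100 - expected_failure_pct))
    # +x: explicit spare capacity on top of the expected loss.
    return provisioned + spare


print(instances_to_provision(1000, 50, 20))           # 25
print(instances_to_provision(1000, 50, 10))           # 23
print(instances_to_provision(1000, 50, 20, spare=2))  # 27
```

Too high a number is wasted cost, too low a number is hungry animals, which is the same trade-off the farmer makes.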

The pets vs cattle analogy also has its flaws, because most farmers aren't totally heartless and most individual cattle do get veterinary attention when they are sick. Raising cattle is a very expensive business, so of course farmers spend money on taking care of individual animals. I know a few smaller farms that even name all their cattle!

In a field of corn, every single plant gets the same treatment. I see the full spectrum across the year from my home office window, from spreading fertiliser, to muck spreading, to tilling, then planting, then spraying, etc. As much as possible, every single corn kernel gets the same treatment, and if one area of the field is less fertile, or gets less water or sun, well, it doesn't matter because that's all part of the calculated yield.

So it is with microservices based architectures.

Individual components should be:

  • As high performing as possible & practical (high yield, low maintenance)
  • As failure tolerant as possible, auto-recovery, checkpoints, etc. (drought and disease resistant corn strains)
  • But ultimately designed to fail, a single component or number of components failing should not be a catastrophic event!
  • As individually complete and re-usable as possible (this might make more sense when we talk about Agile)
  • Given the same health treatment as every other component: security, testing, monitoring, etc.

A full microservice architecture should be:

  • Scaled to meet the estimated demands after taking into consideration estimated failure rates
  • Efficient (not wasteful of resources = high cost!)
  • Resilient to component failures. (While recovery is important, it shouldn't be a DR event. The platform should have enough estimated additional capacity to self heal / self recover with no noticeable impact.)
  • As componentised (is that a real word?) as possible so that individual products can be replaced if it makes sense. (Maybe my farmer switches to barley or oats next year, the net result in their full business is the same, but it's a discretely different component.)
  • Measured for overall health, to make sure we are on target to meet the customer requirements and demands
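To make the "designed to fail" and "self heal" points concrete, here is a minimal sketch of replacing rather than repairing. It isn't taken from any particular platform: `Instance`, `Plan` and `reconcile` are names I've invented for illustration. Nobody nurses a sick instance back to health. The unhealthy ones are discarded and fresh ones are started until the field is back to its calculated size.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Instance:
    name: str
    healthy: bool


@dataclass(frozen=True)
class Plan:
    keep: List[Instance]     # healthy instances left running
    discard: List[Instance]  # unhealthy or surplus instances to remove
    start: int               # number of fresh instances to launch


def reconcile(instances: List[Instance], desired: int) -> Plan:
    """Compare the current fleet with the desired size and plan the fix."""
    healthy = [i for i in instances if i.healthy]
    keep = healthy[:desired]  # never keep more than we need
    discard = [i for i in instances if i not in keep]
    start = desired - len(keep)  # replace whatever is missing
    return Plan(keep=keep, discard=discard, start=start)


fleet = [Instance("a", True), Instance("b", False), Instance("c", True)]
plan = reconcile(fleet, desired=3)
print([i.name for i in plan.discard], plan.start)  # ['b'] 1
```

Run on a schedule, a loop like this is what turns a component failure into a non-event rather than a DR event.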

Not everything needs to be geo-resilient either. If my farmer neighbour loses an entire field of crop then they are in trouble, but the circumstances around losing an entire field are so extreme that it isn't worth calculating for on their farm (larger farms no doubt will). Additionally, the potential causes would have significant other knock-on effects. For example, extensive flooding can damage major areas of crop-land, but this has such big implications for their business that it's not worth spending too much time trying to cover this failure scenario. Instead they try to have financial reserves, join co-operatives or lean on subsidies.

If you have multiple locations across a single country and your user base is in that country, is it really worth the effort of having full HA into a second location? I am talking active high availability here and not data backups, which have a different set of considerations.

Hopefully I've used language like should/might, rather than need/must (quick search to check myself!). The reason is that this type of architecture is meant to be very flexible. The important thing is to consider why and what-if. So long as you consider the eventualities and you either cover them, or know them as acceptable risks, then you will design your application well. A core part of good design is efficiency, so do enough to meet your requirements and demands. This is exactly what my farmer neighbour does when estimating the yield of the amount of corn they need to plant.

Here's a challenge, please have a think about your answer before going to the next paragraph:

"What is the minimum number of components needed to provide a cluster?"

Usually I get quorum answers of 3, sometimes I get legacy answers of 2. For a start, 2 is a terrible number for clustering: you have no quorum, so you either need to tightly couple these systems (like traditional storage arrays / core networking), or you need to build an independent witness to avoid split-brain, and even then it isn't flawless.
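The arithmetic behind the 2 vs 3 answer is short enough to write down. This sketch assumes the common simple-majority rule (more than half of the members must agree), and the function names are mine rather than any product's API.

```python
def quorum_size(members: int) -> int:
    """Smallest group that is a strict majority of the cluster."""
    if members < 1:
        raise ValueError("a majority quorum needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority still remains."""
    return members - quorum_size(members)


def has_quorum(reachable: int, members: int) -> bool:
    """True if this side of a partition can still make decisions."""
    return reachable >= quorum_size(members)


for n in (2, 3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 2 2 0
# 3 2 1
# 5 3 2

print(has_quorum(1, 2))  # False
print(has_quorum(2, 3))  # True
```

With two members, losing either one (or just the link between them) leaves each side with 1 out of 2, which is not a majority, so neither side can safely carry on alone. That is the split-brain problem the witness is there to break.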

My controversial answer is zero. The minimum number of components required to have a resilient cluster is zero. Why? Because of scaling and understanding demand. So long as my application can detect or predict demand and scale to the required level, then zero is a perfectly acceptable cluster size. If there's no demand, why have systems sitting there consuming resources? This thinking is also a great primer for serverless, but we haven't quite got there yet! I will cover this scale challenge a little later when we start the thought experiment of designing our own application.
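As a rough sketch of what "scale to the required level" could look like, here is a tiny scaling rule where zero is a perfectly valid answer. The function, its parameters and the numbers are hypothetical, and a real autoscaler would also have to deal with start-up time and with predicting demand rather than only reacting to it.

```python
def desired_replicas(pending_requests: int, per_replica_capacity: int,
                     max_replicas: int) -> int:
    """Replicas needed for the current demand, allowing zero."""
    if per_replica_capacity <= 0:
        raise ValueError("per_replica_capacity must be positive")
    if pending_requests <= 0:
        return 0  # no demand, so nothing sits there consuming resources
    needed = -(-pending_requests // per_replica_capacity)  # ceiling division
    return min(needed, max_replicas)  # cap the spend


print(desired_replicas(0, 50, 10))      # 0
print(desired_replicas(1, 50, 10))      # 1
print(desired_replicas(120, 50, 10))    # 3
print(desired_replicas(10000, 50, 10))  # 10
```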

