Continuous Monitoring #2 - What is under the hood?

This series aims to bring the technicalities of monitoring closer to people who don't have deep knowledge of how their business application works, or of how the Infrastructure, Application and Business parts connect in terms of monitoring.

TL;DR - "Should I worry if CPU is 100% on my Orders app?"

Let's see how our application (a store, a warehouse management system, or a website) usually looks in terms of how it's built and where it is hosted.

Three scenarios are most common: on-premise (the old way), in the cloud as IaaS (Infrastructure as a Service, one of the most common forms of cloud adoption, done as a lift & shift from the old server data center), and finally full cloud adoption with a PaaS/SaaS approach (Platform or Software as a Service in the cloud).

In each case, monitoring is a challenge. But no worries, it gets simpler in the end.

On-premise stack example

[Image: on-premise stack diagram]

In this scenario, everything from our data center to the very end of website access is in our hands and under the control of our admins. This also means that the admins (facility admins, hardware & system admins, database admins, and application admins) need to monitor a lot of layers.

From top to bottom:

  • Our users & customers layer - the business users of our applications & websites
  • Network layer - which connects users to applications, and applications with databases and other systems
  • Application layers (where our applications, databases, and websites reside)
  • Server layer - like database servers, web servers, or application hosting servers
  • Operating System layer - where our servers are installed, including system access and supporting processes
  • Hardware layer - the physical equipment (aka computers) used to host every piece of software we have, plus the network devices that interconnect it
  • Facility layer - simply the roof over our hardware, secured doors, power connections, air conditioners, etc.

I know, it's a bit simplified, but it should be enough for a discussion about the "how?" of monitoring. Besides monitoring, this picture should also help to understand what is connected with what, and what the impact is when errors start to occur between these layers.
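To make the "what is connected with what" idea a bit more tangible, here is a minimal Python sketch of the seven layers listed above. It rests on a deliberately simplified assumption that is mine, not a rule from the diagram: every layer depends on all the layers below it, so a failure travels upward until it reaches the business users. The function name `affected_layers` is just an illustrative choice.

```python
# Layers of the on-premise stack, listed from top (users) to bottom (facility).
ON_PREM_LAYERS = [
    "Users & customers",
    "Network",
    "Application",
    "Server",
    "Operating System",
    "Hardware",
    "Facility",
]


def affected_layers(failed_layer: str, layers: list[str] = ON_PREM_LAYERS) -> list[str]:
    """Return the failed layer plus every layer above it.

    Assumption (a simplification): each layer depends on all layers below it,
    so a failure propagates upward to the business users.
    """
    index = layers.index(failed_layer)
    # Walk from the failed layer up to the top of the stack.
    return list(reversed(layers[: index + 1]))


print(affected_layers("Operating System"))
```

In this toy model, a problem in the Facility layer (say, a power outage) touches all seven layers, while a problem in the Network layer touches only the network and the users above it.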

On-prem stack monitoring applied

[Image: on-premise stack with monitoring applied]

As you can see, we have to cover 7 layers with monitoring!

As the on-prem scenario is on its way out, let's not go too deep into it, but a few things are worth explaining.

Monitoring in this case is a chain of the following steps:

[Image: monitoring steps across the on-premise layers]

From a business user's perspective, it's a long chain of internal dependencies where something can break - and when it does, it will immediately hit the business layer and interrupt the business flow.

To get the most insightful information about the system's condition, we need to create a system map that describes each business service on the left side, together with indicators of how the components beneath it behave.

However, a simple diagram based on the flow pictured above may do the trick.
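A system map like this can also be sketched as a small data structure. The example below is only an illustration under my own assumptions: the class names (`Component`, `BusinessService`), the three status values, and the component names are all hypothetical, and the roll-up rule ("a service is only as healthy as its worst component") is one possible choice among many.

```python
from dataclasses import dataclass, field

# Status ordering: a higher number means a worse condition.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


@dataclass
class Component:
    name: str
    layer: str
    status: str = "ok"  # one of the SEVERITY keys


@dataclass
class BusinessService:
    name: str
    components: list[Component] = field(default_factory=list)

    def status(self) -> str:
        """Roll up: the service is only as healthy as its worst component."""
        if not self.components:
            return "ok"
        return max((c.status for c in self.components), key=SEVERITY.__getitem__)

    def culprits(self) -> list[Component]:
        """Components that are not 'ok', i.e. where an admin should look first."""
        return [c for c in self.components if c.status != "ok"]


# Hypothetical "Orders" service built from components on different layers.
orders = BusinessService(
    "Orders",
    [
        Component("orders-web", "Application"),
        Component("orders-db", "Server", status="critical"),
        Component("core-switch", "Network", status="warning"),
    ],
)

print(orders.status())
print([c.name for c in orders.culprits()])
```

The business user only needs the single rolled-up status, while the technical user can follow the culprits list down to the layer that actually misbehaves.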

And this is, frankly speaking, the worst-case scenario, so let's move on to the more familiar world.

Cloud IaaS stack example

[Image: cloud IaaS stack diagram]

This scenario is a bit simpler: we don't have to care about physical machines and the building at all, as we pay the cloud provider for that service (let's take Azure as an example).

IaaS stack monitoring applied

As in the previous scenario, the layers look similar:

[Image: IaaS stack with monitoring applied]

Visually, we have only 6 layers to cover - still with very heavy coverage in each:

  • Our users & customers layer - the business users of our applications & websites
  • Network layer - which connects users to applications, and applications with databases and other systems - as we're already in the cloud, this is a service rather than physical equipment
  • Application layers (where our applications, databases, and websites reside)
  • Server layer - like database servers, web servers, or application hosting servers
  • Operating System layer - where our servers are installed, including system access and supporting processes - we still need to cover this, even though it's part of our provisioned virtual machines
  • Virtual Machines layer - bought as an IaaS from the cloud

So the interconnections are still a bit complicated, but at least we need to focus on fewer of them:

[Image: interconnections between the IaaS layers]

The good thing about this approach is that whenever business users notice a lot of application errors caused by long transaction times, it can be traced down the line to one of the provisioned Virtual Machines being too weak in performance, so rehosting is possible (which VM it is depends on the service responsible - it can be the database VM, the application server VM, the web server VM, etc.).
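Tracing slow transactions down to a weak VM can be sketched roughly like this. The VM names, the sample values and the 90% threshold are all made up for illustration; in a real setup the samples would come from your monitoring tool rather than a hard-coded dictionary.

```python
from statistics import mean

# Hypothetical CPU utilisation samples (percent) per provisioned VM.
cpu_samples = {
    "web-server-vm": [35, 42, 38, 40],
    "app-server-vm": [55, 61, 58, 60],
    "database-vm": [97, 99, 100, 98],
}


def overloaded_vms(samples: dict[str, list[float]], threshold: float = 90.0) -> list[str]:
    """Return the VMs whose average CPU utilisation is at or above the threshold."""
    return [vm for vm, values in samples.items() if mean(values) >= threshold]


print(overloaded_vms(cpu_samples))
```

With these sample values only the database VM is flagged, which would make it the first candidate for rehosting on a stronger machine.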

However, in this scenario we also need to monitor the current costs of the VMs (OPEX - Operational Expenditures), which is another dimension of monitoring.
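Cost can be watched in the same spirit as CPU. Below is a minimal sketch that projects the month-end bill from the spend so far and compares it with a budget. The linear projection (constant daily spend) is my simplifying assumption, and the function names and figures are hypothetical.

```python
def projected_month_end_cost(spent_so_far: float, day_of_month: int, days_in_month: int) -> float:
    """Linear projection: assume the daily spend rate stays constant."""
    if day_of_month <= 0:
        raise ValueError("day_of_month must be positive")
    return spent_so_far / day_of_month * days_in_month


def over_budget(spent_so_far: float, day_of_month: int, days_in_month: int, monthly_budget: float) -> bool:
    """True when the projected month-end cost exceeds the monthly budget."""
    return projected_month_end_cost(spent_so_far, day_of_month, days_in_month) > monthly_budget


# Hypothetical figures: 600 spent after 10 days of a 30-day month, budget of 1500.
print(projected_month_end_cost(600, 10, 30))
print(over_budget(600, 10, 30, 1500))
```

The point of such a check is to raise the alarm on day 10, not on the invoice date.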

Let's jump to a more appealing (meaning a lot simpler) scenario.

PaaS/SaaS Cloud stack example

[Image: PaaS/SaaS cloud stack diagram]

PaaS (Platform as a Service) or even SaaS (Software as a Service), in turn, simplifies monitoring on a completely different level. In this scenario we can use, for example, platforms (like Azure Managed SQL Database), hosting services (such as an Application Service plan or a web-hosting service), and FTP-less file access based on Azure Blob.

PaaS/SaaS stack monitoring applied

From the monitoring perspective, we still need to be able to see some things:

[Image: PaaS/SaaS stack with monitoring applied]

It looks like we can focus on only 4 layers. But there's a change in the approach - we rely on managed services (PaaS), where cost control moves into a more prominent monitoring position (as overspending may occur and needs to be controlled against our expected performance).

From top to bottom:

  • Our users & customers layer - the business users of our applications & websites
  • Network layer - which connects users to applications, and applications with databases and other systems - as we're already in the cloud, this is a service rather than physical equipment
  • Application & Databases layer - a set of various application hosting services like Applications, APIs, Functions, Logic Apps or web hosting, and of course a wide range of managed databases such as Azure Managed SQL or CosmosDB (services which allow you to fully concentrate on database content & logic)
  • Service layer - even though we are on managed services (like PaaS, SaaS, and FaaS (Function as a Service), so basically serverless), on this layer we can still observe the overall condition, utilization, and most importantly - costs.


In the details:

[Image: detailed metrics per layer for the PaaS/SaaS stack]

I've marked in white the metrics that don't necessarily need to be monitored, as they play a lesser role in managed services, which can autoscale to the desired performance (but this comes at a price, just remember that).
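The "autoscaling comes at a price" remark can be put into numbers with a tiny sketch. Everything here is hypothetical: the hourly instance counts, the price per instance-hour, and the simple "instances times hours times price" billing model, which real managed services only loosely resemble.

```python
# Hypothetical number of running instances for each hour of one day:
# 8 quiet hours, 10 busy hours where autoscale kicks in, 6 moderate hours.
instances_per_hour = [1] * 8 + [4] * 10 + [2] * 6

PRICE_PER_INSTANCE_HOUR = 0.25  # hypothetical price, in any currency


def daily_cost(instance_counts: list[int], price: float) -> float:
    """Cost of one day, assuming billing per instance-hour."""
    return sum(instance_counts) * price


autoscaled = daily_cost(instances_per_hour, PRICE_PER_INSTANCE_HOUR)
baseline = daily_cost([1] * 24, PRICE_PER_INSTANCE_HOUR)

print(autoscaled, baseline, autoscaled / baseline)
```

In this made-up day the service keeps up with demand without anyone watching CPU, but the bill is 2.5 times the single-instance baseline, which is exactly why cost moves up the monitoring list.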

Summary

The key to understanding monitoring is a service map, available on the business monitoring level and, for technical users, on the application level. And behind that, the interconnections between each component.


I hope that after this article the answer to the question asked at the beginning is a bit clearer: "Yes, in case of CPU 100% on your database server, you need to worry, my dear business user". And in the best-case scenario, this information should be hidden under a simple "Service Availability = 1%".
