The Flaws of the average when collecting System Metrics

[Image]

The picture above tells the whole story: this is what happened to us when we were collecting only the average CPU usage. We were using the https://github.com/sensu-plugins/sensu-plugins-cpu-checks plugin to collect system metrics, but it did not help us figure out the issue. Our application crashed from time to time while the average CPU usage seemed fine.

Later, when I started writing a custom plugin to monitor each individual process that could contribute to high CPU usage, we realised that our Node application was the one consuming the CPU.

Here are the server stats, which look fine. Keep in mind, though, that you will be misled if you rely on the overall CPU average.
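To make the arithmetic concrete, here is a minimal sketch in Python. The 8-core host and its per-core numbers are hypothetical, not taken from our server. It assumes one process keeps a single core busy, which is plausible for a largely single-threaded runtime such as Node.js, while the other cores stay almost idle.

```python
def overall_cpu_average(per_core_percent):
    """Return the plain mean of per-core CPU usage percentages."""
    return sum(per_core_percent) / len(per_core_percent)


# Hypothetical 8-core host: one core is pinned at 100%,
# the remaining seven are nearly idle at 5% each.
cores = [100.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0]

average = overall_cpu_average(cores)
busiest = max(cores)

# The average hides the saturated core completely.
print(f"average={average:.1f}% busiest={busiest:.1f}%")
```

With these made-up numbers the overall average is below 17% even though one core is fully saturated, so an alert that only watches the average would never fire.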

[Image: overall server stats]

Our plugin did the job and identified a classic case of the flaw of averages. You can find it on GitHub. Here are the stats from our plugin during the application crash.

[Image: per-process stats from our plugin during the application crash]

It is very simple to integrate: all you need to do is create a simple check configuration with the name of the process to monitor.

{
  "checks": {
    "myapp-stats": {
      "type": "metric",
      "command": "/opt/sensu/embedded/bin/metrics_per_process.py -p /home/user/my-app.js",
      "interval": 60,
      "subscribers": [
        "my-app"
      ],
      "handler": "librato"
    }
  }
}
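To give a rough idea of what a per-process metric check could emit, here is a minimal sketch in Python. It is not the actual metrics_per_process.py. The function names, the `myapp` metric scheme, and the sample `ps` text are all hypothetical. It assumes the input looks like the output of `ps -eo pid,pcpu,pmem,args --no-headers` on Linux, and that the handler accepts Graphite plaintext lines (`name value timestamp`).

```python
def parse_ps_output(ps_text):
    """Parse lines shaped like 'pid pcpu pmem args' into dictionaries."""
    rows = []
    for line in ps_text.strip().splitlines():
        parts = line.split(None, 3)  # args may contain spaces, keep it whole
        if len(parts) < 4:
            continue  # skip malformed lines
        pid, cpu, mem, args = parts
        rows.append({"pid": int(pid), "cpu": float(cpu),
                     "mem": float(mem), "args": args})
    return rows


def graphite_lines(rows, pattern, scheme, timestamp):
    """Sum CPU and memory for processes whose command line contains pattern."""
    matched = [r for r in rows if pattern in r["args"]]
    cpu = sum(r["cpu"] for r in matched)
    mem = sum(r["mem"] for r in matched)
    return [
        f"{scheme}.cpu_percent {cpu:.1f} {timestamp}",
        f"{scheme}.memory_percent {mem:.1f} {timestamp}",
        f"{scheme}.process_count {len(matched)} {timestamp}",
    ]


# Hypothetical sample data; a real check would capture the ps output
# itself and use int(time.time()) as the timestamp.
sample = """\
 1201 97.3  4.1 node /home/user/my-app.js
 1350  0.4  0.8 filebeat -c /etc/filebeat/filebeat.yml
"""

lines = graphite_lines(parse_ps_output(sample),
                       "/home/user/my-app.js", "myapp", 1700000000)
for metric_line in lines:
    print(metric_line)
```

One caveat with this approach: the `pcpu` column of `ps` reports CPU time divided by the time the process has been running, so it is itself an average over the process lifetime. A check that needs to catch short spikes would have to sample twice and compare.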


In the next article I will share how to deal with such high CPU usage and how to automatically terminate a process that is not essential. For example, filebeat is not more important than the server itself, but what if filebeat is consuming all the resources?




