Linux Performance issues troubleshooting

Kaushik Banerjee ( He/Him/His )

SVP| Autonomous & Accountable DevOps, APAC SRE Head for Trading Tech| Execution, Empathy & Unleashing Team's Potential| I help Organizations reduce TOIL ,MTTR & MTTD while Improving Resiliency & Reliability

Published: August 20, 2022

You are having performance issues on your Linux server (bare metal or EC2). You log in and start looking for the underlying cause. What commands should be on your checklist to do this quickly and efficiently?

Below is my go-to list.

uptime: Check the last 3 numbers, which show the 1-, 5-, and 15-minute exponentially damped load averages.

18:44:08 up 14:52, 1 user, load average: 32.22, 26.17, 21.20

If the 1-minute number is much higher than the 5- and 15-minute load averages, then the load is still increasing. If it is lower, then the issue might have already subsided, and you have likely missed the bus.
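The comparison above can be sketched in a few lines of Python. This is only an illustration, not part of the original checklist: the function names and the 1.2 ratio used as the cut-off for "much higher" are assumptions chosen for the example.

```python
import re

def parse_load_averages(uptime_line):
    """Extract the 1-, 5-, and 15-minute load averages from an uptime line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load averages found in: " + uptime_line)
    return tuple(float(value) for value in match.groups())

def load_trend(one, five, fifteen, ratio=1.2):
    """Classify the trend; `ratio` is an arbitrary threshold for "much higher"."""
    if one > five * ratio and one > fifteen * ratio:
        return "rising"
    if one * ratio < five and one * ratio < fifteen:
        return "falling"
    return "steady"

# The sample line from the uptime output above.
sample = "18:44:08 up 14:52, 1 user, load average: 32.22, 26.17, 21.20"
one, five, fifteen = parse_load_averages(sample)
print(load_trend(one, five, fifteen))  # prints "rising"
```

In practice the right threshold depends on the host; comparing the load against the number of CPUs is a common extra step that this sketch leaves out.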

Drill down further into loads on your CPU with the below.

mpstat -P ALL 5: Shows CPU time per CPU, refreshed every 5 seconds. Check for any CPU running at 100% consistently while the others are mostly idle. If so, it is likely a single-threaded process that is using up that one core and causing performance issues for itself. A multi-threaded redesign of that process might be in order.
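As a rough sketch of the "consistently at 100%" check: assuming the per-CPU busy percentages (100 minus %idle) have already been collected from several mpstat intervals, the hypothetical helper below flags CPUs that stay above a threshold in every sample. The readings and the 95% threshold are invented for illustration and do not come from a real host.

```python
def pinned_cpus(samples, threshold=95.0):
    """Return CPU ids whose busy % is at or above `threshold` in every sample.

    `samples` maps a CPU id to a list of busy percentages, one per interval.
    """
    return sorted(
        cpu
        for cpu, busy in samples.items()
        if busy and all(value >= threshold for value in busy)
    )

# Hypothetical readings for a 4-CPU host, three intervals each.
readings = {
    0: [12.0, 15.5, 9.8],
    1: [99.6, 100.0, 99.9],  # busy in every interval
    2: [100.0, 40.2, 35.0],  # a single spike only
    3: [3.1, 2.7, 4.0],
}
print(pinned_cpus(readings))  # prints [1]
```

Requiring every sample to be above the threshold is what separates a pinned core from a short spike.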

dmesg: Shows kernel messages and is an excellent place to check for system errors, e.g., OOM-killer events, packet drops, etc. You can then take action accordingly.

Check memory consumption using free and vmstat.

vmstat 5: Virtual memory stats (the 5 means refresh every 5 seconds). The first line shows averages since boot, and the remaining lines cover each 5-second interval.

free -m: Shows free memory in megabytes. Note especially whether the buffers and cached values are healthy (i.e., decent non-zero numbers); if not, more reads have to go to disk, which can lead to I/O waits.

                     total       used       free     shared    buffers     cached
Mem:                285999      24546     261453         80         62        541
-/+ buffers/cache:              23945     262053
Swap:
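The "-/+ buffers/cache" row is derived from the "Mem" row: buffers and cached memory are subtracted from "used" and added to "free", because the kernel can hand that memory back to applications when needed. A minimal sketch of that arithmetic follows; the function name is made up for this example. Note that the figures printed in the sample output above differ from this calculation by a few MB, so treat the sketch as showing the relationship rather than reproducing free's exact output.

```python
def adjust_for_cache(used, free, buffers, cached):
    """Return (used, free) with buffers and cache counted as available memory."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable

# Values in MB, taken from the "Mem" row of the sample output.
app_used, app_free = adjust_for_cache(used=24546, free=261453, buffers=62, cached=541)
print(app_used, app_free)  # prints 23943 262056
```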

If you suspect I/O bottlenecks, then check those with iostat, iotop and dd:

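iostat -xd -k 2 5 or iostat -p: Shows I/O performance per device (-p adds partitions), including utilization and read/write response times. The first form prints extended stats in kB every 2 seconds, 5 times. Depending on the sysstat version, NFS mounts may need the separate nfsiostat tool.

Use man or --help to find out the various other options.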

iotop -aoP: Shows the processes using the most disk I/O, along with other neat stats like each process's I/O percentage.

dd:

dd if=/dev/zero of=mytest_write.txt bs=64k count=16k conv=fdatasync --> Creates a file of zeros called mytest_write.txt and writes 1 GiB to it. conv=fdatasync makes dd flush the data to disk before it reports, so the stats reflect the real write speed.

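dd if=mytest_read.txt of=/dev/null bs=64k count=16k --> Have a massive file (mytest_read.txt) available to read, and this command will show the read speed. If the file was read or written recently, it may be served from the page cache and look faster than the disk really is.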
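To sanity-check the numbers dd reports: with bs=64k and count=16k, each command moves 64 KiB x 16384 blocks = 1 GiB. The sketch below works that out and converts an elapsed time into throughput. The helper names are invented for this example, and the 8-second duration is a made-up figure, not a measurement.

```python
# Binary size suffixes as dd interprets them in bs= and count=.
SUFFIXES = {"k": 1024, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(text):
    """Convert a dd-style size such as '64k' into an integer."""
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)

def bytes_transferred(bs, count):
    """Total bytes moved by dd for the given bs= and count= values."""
    return parse_size(bs) * parse_size(count)

def throughput_mib_per_s(total_bytes, seconds):
    """Average throughput in MiB/s."""
    return total_bytes / (1024 ** 2) / seconds

total = bytes_transferred("64k", "16k")
print(total)                             # prints 1073741824 (1 GiB)
print(throughput_mib_per_s(total, 8.0))  # prints 128.0 for the hypothetical 8 s run
```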

lsof: Lists open files. Useful for finding which files are still open when a disk refuses to unmount. It can also show what is using a given port (use -i TCP:<port number>) or all network connections (use the -i flag).

Check process level resource consumption using the below.

pidstat 5: Shows a rolling summary of resource consumption per PID, refreshing every n seconds (5 in this example). Very useful for finding which processes are consuming the most resources.

htop (or top): Shows per-command resource consumption and can be sorted on various columns. More details at https://www.maketecheasier.com/power-user-guide-htop/

ps: ps has a large number of options. Combine them to get useful data, e.g., the two below.

ps -aeFH --sort=-%cpu,-%mem

ps -eah --format uid,pid,tty,%cpu,rss,cmd --sort %cpu,-rss

Check for network bottlenecks.

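sar:

sar -n TCP,ETCP 1 --> TCP connection and TCP error stats, every second.

sar -n DEV 5 --> Per-interface network throughput, every 5 seconds.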

netstat -a | more: Lists all sockets, listening and connected. Use netstat -i for per-interface packet counts and netstat -s for per-protocol statistics.

iftop: Similar to what top does, but for network usage stats.

tcpdump: Needs root/sudo, so it may not be available to most people in enterprise environments. But if you have access (I don't), then along with Wireshark, it's a very powerful tool.

There are a large number of other commands that can be useful. There are also a very large number of OSS tools, which I haven't mentioned here as most of them might not be installed on your enterprise Linux hosts.

Hope the above helps you track down performance issues on *nix systems faster in your daily work life.
