Diagnose a System Slowdown in Two Minutes

Diagnose a System Slowdown in Two Minutes

A recent client complaint highlighted severe system slowness. Initial suspicions fell on two lengthy SQL queries that lacked clarity. To investigate, we turned to our anomaly monitor and reviewed data spanning from the 21st to the 31st.


Contrary to the client’s initial assumption that the slowdown began on the 28th, the data revealed earlier signs of trouble on the 25th, extending right through the weekend.

The charts showed a distinctive pattern of slowness correlating with business hours, indicating that the performance degradation was not an isolated incident tied strictly to a single date. Instead, the slowdowns unfolded gradually, becoming more pronounced over time.

Linking the Slowdown to Disk Subsystem Issues: Moving Beyond the Suspicion of SQL Queries

By focusing on the patterns uncovered in the anomaly monitor, it became evident that the root cause did not lie in complex SQL queries or recent system changes. The real clues emerged through “buffer IO” problems, which were a clear sign of disk performance bottlenecks. To refine our understanding, we zeroed in on a single day—Friday—and analyzed its data. The issues aligned closely with normal working hours, confirming that the degraded performance matched periods of heavier disk usage rather than query complexity.



With more than two weeks elapsed since the incident, our daily averages still clearly showed a dramatic increase in write and read operations, surging from manageable rates (0.3 MB/s writes and 1 MB/s reads) to values more than sixty times higher. These spikes peaked on the days the client reported the worst slowdowns, unambiguously pointing to the disk subsystem’s struggles as the source of the ongoing issues.


Confirming the Root Cause and Reporting Back: Poor Disk Performance as the Core Problem

With the timeline established and the problem verified through multiple data points, the conclusion was straightforward.

The slowdowns did not stem from complicated queries or subtle configuration changes. Instead, poor disk performance created a bottleneck that affected the entire system’s responsiveness. This diagnosis provided the client with a clear action plan—address the disk performance issues directly rather than attempting to fine-tune SQL queries or hunt down elusive code-level inefficiencies.

Armed with these insights, the client could take precise steps to restore normal operations, saving time and effort by focusing on the actual culprit behind their system’s prolonged slowdown.

Check Out the Whole Story Below:


要查看或添加评论,请登录

DBPLUS Better Performance的更多文章

社区洞察

其他会员也浏览了