ASHviz: Accidentally good
This is a short note on staying alert to whether a visualization that works well in one specific case will carry over to more general cases.
The plot below shows Average Active Sessions aggregated separately by minute for each STATE_CLASS and over all instances, plotted as a simple line chart.
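For readers who want to reproduce the aggregation behind this plot, here is a minimal sketch in Python. It assumes the ASH dump has already been reduced to (sample time, STATE_CLASS) pairs, one per sampled active session, and that samples were taken once per second, so each row stands for roughly one second of session time. The function name `aas_by_minute` and the demo data are made up for illustration and are not taken from the cluster discussed here.

```python
from collections import Counter
from datetime import datetime, timedelta


def aas_by_minute(samples, sample_interval_s=1):
    """Average Active Sessions per (minute, state_class).

    samples: iterable of (sample_time, state_class) tuples, one tuple per
    sampled active session, pooled over all instances.
    sample_interval_s: assumed seconds between ASH samples (1 is an assumption).
    """
    counts = Counter()
    for sample_time, state_class in samples:
        # Truncate the timestamp to the start of its minute.
        minute = sample_time.replace(second=0, microsecond=0)
        counts[(minute, state_class)] += 1
    # Each row represents about sample_interval_s seconds of session time,
    # so dividing by 60 gives the average number of active sessions.
    return {key: n * sample_interval_s / 60 for key, n in counts.items()}


# Synthetic demo: two sessions waiting on User I/O for the whole minute,
# and one session on CPU for every other second.
t0 = datetime(2024, 1, 1, 11, 53)
samples = [(t0 + timedelta(seconds=s), "User I/O") for s in range(60)] * 2
samples += [(t0 + timedelta(seconds=s), "CPU") for s in range(0, 60, 2)]
aas = aas_by_minute(samples)
```

If the data came from a source that keeps only one sample in ten, `sample_interval_s=10` would be the matching setting.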
When I first saw the plot it struck me as quite informative about what was happening on the cluster during the time interval:
- database activity (time) is spent mostly in three classes: User I/O, Cluster, and CPU
- time spent waiting on User I/O dominates time on CPU or in Cluster wait by a factor of 3 or more
- two very large spikes in Cluster wait time (5-8x normal) occurred, one around 11:53 and a second at 12:02
- preceding or concurrent with the second spike, activity increased in two other wait classes
Filled with confidence, I next wanted to investigate whether the two "incidents" were instance-specific or cluster-wide. This is accomplished by faceting the plot on INSTANCE_NUMBER:
Here we see that both spikes were concurrent and similar in size on all instances, so the incidents were cluster-wide. Observe also that session activity (load) in each of the three dominant classes is very similar across instances, from which we tentatively conclude that the cluster processes a highly consistent and load-balanced workload. Finally, notice the almost 10-minute variation in the time ranges of the separate ASH dumps. Recall that ASH samplers are independent and samples are variable-length records, so each circular ASH buffer wraps around at a different time.
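Because each instance's dump covers a different time range, comparisons across instances are only fair inside the window that all dumps share. A minimal sketch of how that window could be found, assuming the data is available as (INSTANCE_NUMBER, sample time) pairs; the function names and the timestamps in the demo are hypothetical and not read off the plots:

```python
from datetime import datetime


def coverage_by_instance(samples):
    """Return {instance_number: (first_sample_time, last_sample_time)}."""
    windows = {}
    for inst, ts in samples:
        lo, hi = windows.get(inst, (ts, ts))
        windows[inst] = (min(lo, ts), max(hi, ts))
    return windows


def common_window(windows):
    """Time range covered by every instance, or None if there is no overlap."""
    start = max(lo for lo, _ in windows.values())
    end = min(hi for _, hi in windows.values())
    return (start, end) if start <= end else None


# Synthetic demo: instance 2's buffer starts nine minutes later than instance 1's.
samples = [
    (1, datetime(2024, 1, 1, 11, 40)),
    (1, datetime(2024, 1, 1, 12, 10)),
    (2, datetime(2024, 1, 1, 11, 49)),
    (2, datetime(2024, 1, 1, 12, 12)),
]
windows = coverage_by_instance(samples)
shared = common_window(windows)
```

Anything outside `shared` reflects only some of the instances, so an apparent drop in load there may simply be missing data.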
OK, at this point I really think I know something about this cluster, and in fact I actually do. But why do these line charts seem to work? The answer is that only three lines show any real activity, and there are very few intersections among them. Visually independent lines can each tell their own story without much interference, but crossing lines muddle the message. So the truth here is that the DATA made the line chart visualization work in this case. In another case, with other data, it could work heavily against interpretation.
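One rough way to put a number on "few line intersections" is to count how often each pair of series swaps order from one time step to the next. This is only a heuristic of my own choosing for illustration, not an established readability metric, and the series values below are invented:

```python
from itertools import combinations


def count_crossings(a, b):
    """Count how many times series a and b swap order along the time axis."""
    crossings = 0
    prev = 0  # sign of (a - b) at the last point where the series differed
    for x, y in zip(a, b):
        sign = (x > y) - (x < y)
        if sign != 0:
            if prev != 0 and sign != prev:
                crossings += 1
            prev = sign
    return crossings


def total_crossings(series):
    """Sum of crossings over every pair of named series."""
    return sum(
        count_crossings(series[p], series[q])
        for p, q in combinations(series, 2)
    )


# Invented per-minute values: one dominant line and two that interleave.
series = {
    "User I/O": [9, 10, 9, 11],
    "Cluster": [2, 3, 2, 3],
    "CPU": [3, 2, 3, 2],
}
n = total_crossings(series)
```

A low count relative to the number of lines and time steps suggests a line chart may read cleanly; a high count is a hint that the lines will tangle.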
For instance, reversing the roles of INSTANCE_NUMBER and STATE_CLASS from the plot above (coloring by instance and faceting by state) we get the following:
What can be said about it, other than "it's squiggly"...?
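In terms of the underlying data, swapping the roles of INSTANCE_NUMBER and STATE_CLASS is nothing more than exchanging two grouping keys. The sketch below makes that explicit, assuming rows are dictionaries with `INSTANCE_NUMBER`, `STATE_CLASS`, and `SAMPLE_TIME` keys and one-second sampling. The helper `panel_series` is hypothetical, and the demo rows are synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def panel_series(rows, facet_key, color_key, sample_interval_s=1):
    """Return {facet: {color: {minute: average active sessions}}}."""
    counts = defaultdict(int)
    for row in rows:
        minute = row["SAMPLE_TIME"].replace(second=0, microsecond=0)
        counts[(row[facet_key], row[color_key], minute)] += 1
    panels = {}
    for (facet, color, minute), n in counts.items():
        line = panels.setdefault(facet, {}).setdefault(color, {})
        line[minute] = n * sample_interval_s / 60
    return panels


# Synthetic demo: two instances, each with one session on User I/O
# and a half-busy session on CPU during a single minute.
t0 = datetime(2024, 1, 1, 12, 2)
rows = [
    {"INSTANCE_NUMBER": inst, "STATE_CLASS": cls,
     "SAMPLE_TIME": t0 + timedelta(seconds=s)}
    for inst in (1, 2)
    for cls, step in (("User I/O", 1), ("CPU", 2))
    for s in range(0, 60, step)
]

# One panel per instance, one line per state class.
by_instance = panel_series(rows, "INSTANCE_NUMBER", "STATE_CLASS")
# Roles reversed: one panel per state class, one line per instance.
by_class = panel_series(rows, "STATE_CLASS", "INSTANCE_NUMBER")
```

The numbers are identical in both layouts. Only the assignment of lines to panels changes, which is exactly why the second layout can be so much harder to read when the instances carry nearly the same load.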
So when a specific data visualization seems to tell a good story, think about whether this will be true generally, for you will want to tell more stories. Don't fall in love with an "accidentally good" visualization.