ASHviz: Fiddling with violins

John Beresniewicz

Architect at Oracle

Published: May 6, 2019

The last ASHviz installment, Densities and dark matter, was a bit of a cognitive burden, but the concepts introduced are fundamental to many of the ASHviz investigations. Here we continue the thread with a new visualization of the sampled and estimated latency density functions.

Violin plots: geom_violin

The ggplot2 package includes a "violin" geom that produces a variation on the probability density plot that is both pleasing to the eye and convenient for certain visual comparisons. A so-called "violin plot" is simply the probability density curve reflected around the x-axis and then rotated 90 degrees (a coordinate flip). The smoothly curved outline (sometimes) makes it look like a violin or cello.

Here is code to create a violin plot of sampled latencies from the Events data frame:

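library(ggplot2)

# Map log10 of the sampled wait time to the y-axis
p <- ggplot(data = Events, aes(y = log10(TIME_WAITED)))

# A constant x value draws a single violin labeled "SAMPLE"
p + geom_violin(aes(x = "SAMPLE"))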

With the resulting plot:

Recall the standard density plot and observe how the violin shape is simply a variation on the density using the reflection and coordinate rotation.

The violin plot is also solid-filled with white by default, so it has a much more tangible visual impact. The standard line plot is much better for reading off latencies for features like values at the peaks, mostly due to the horizontally oriented latency axis. We will soon see the advantages of violin plots for comparisons.
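As a rough illustration of the reflection described above, the sketch below builds a violin outline by hand from a base R kernel density estimate and pairs it with the built-in geom. The `toy` data frame is synthetic (log-normal wait times invented for this example, not ASH data), and the names `outline`, `p_manual`, and `p_violin` are hypothetical.

```r
library(ggplot2)

# Synthetic stand-in for the Events data frame (NOT real ASH data):
# log-normally distributed wait times in microseconds.
set.seed(42)
toy <- data.frame(TIME_WAITED = rlnorm(1000, meanlog = 6, sdlog = 2))

# Kernel density estimate of log10 latency, as in a standard density plot.
d <- density(log10(toy$TIME_WAITED))

# Mirror the density around zero and put latency on the vertical axis:
# walking up one side and back down the other traces a violin outline.
outline <- data.frame(
  latency = c(d$x, rev(d$x)),
  width   = c(d$y, -rev(d$y))
)

p_manual <- ggplot(outline, aes(x = width, y = latency)) +
  geom_polygon(fill = "white", colour = "black")

# The built-in geom draws a comparable shape; it rescales the width and,
# by default, trims the tails to the range of the data.
p_violin <- ggplot(toy, aes(x = "SAMPLE", y = log10(TIME_WAITED))) +
  geom_violin()
```

Printing `p_manual` and `p_violin` in turn should show two closely matching shapes, differing mainly in width scaling and tail trimming.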

Estimate-weighted violin plots

Just as with the standard density, we can plot violin densities over the estimated count of events to get an unbiased view of the distribution of event latencies from those sampled by the ASH dump.

Plotting weighted densities using geom_violin frequently results in the following warning message:

Warning message in density.default(x, weights = w, bw = bw, adjust = adjust, kernel = kernel, :
“sum(weights) != 1  -- will not get true density”

Just as with the standard density plot, the solution is to set weight = EST_COUNT / sum(EST_COUNT) as follows:

p + geom_violin(aes(x = "EST_COUNT", weight = EST_COUNT / sum(EST_COUNT)))

This produces the following plot without warning messages:

Note that incorrect weighting (with warning messages) still produces a correct plot, so the function seems to do simple scaling automatically. These warnings might be ignored when rapidly prototyping visualizations. However, before drawing conclusions or making important observations from a visualization, the program should be made to execute warning-free to ensure plot accuracy.
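The scaling behavior can be checked outside of ggplot2 by calling base R's density() directly, which is the function named in the warning. This is a minimal sketch on synthetic data: the `toy` data frame and its EST_COUNT values are invented for illustration, and the bandwidth is fixed explicitly so that both calls smooth identically.

```r
# Synthetic stand-in for Events (NOT real ASH data); EST_COUNT is made up.
set.seed(1)
toy <- data.frame(
  TIME_WAITED = rlnorm(500, meanlog = 6, sdlog = 2),
  EST_COUNT   = runif(500, min = 1, max = 1000)
)
x  <- log10(toy$TIME_WAITED)
bw <- bw.nrd0(x)  # explicit bandwidth, identical for both calls

# Raw weights: capture the warning text instead of printing it.
raw_warning <- tryCatch(
  { density(x, weights = toy$EST_COUNT, bw = bw); NA_character_ },
  warning = function(w) conditionMessage(w)
)
d_raw <- suppressWarnings(density(x, weights = toy$EST_COUNT, bw = bw))

# Normalized weights sum to 1, so the sum(weights) warning is not raised.
w_norm <- toy$EST_COUNT / sum(toy$EST_COUNT)
d_norm <- density(x, weights = w_norm, bw = bw)

# Compare the raw-weight curve, divided by the weight total,
# against the normalized curve.
scale_check <- all.equal(d_raw$y / sum(toy$EST_COUNT), d_norm$y)
```

If `scale_check` is TRUE, the two curves differ only by a constant factor. A violin's width is rescaled when it is drawn, which would explain why the plot looks the same either way.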

Note again how much of the weighted density plot lies below 100 microseconds. It looks like about half of the area, which means event latencies are just as likely to be below as above 100 microseconds.

Recall the visual difficulties encountered when plotting the estimated and sampled densities together. Crossing lines were distracting, and we ended up drawing one plot as a filled area and the other as a line. Comparing the two densities remained a bit of an issue.

Side-by-side with medians

To assist in comparing the two probability densities, we can plot the two violins side-by-side. This really improves the ability to visually compare and contrast features of the densities at different latencies.

Horizontal lines have also been added at the density medians using the draw_quantiles parameter of geom_violin. Now we see that the actual median of the estimate-weighted density is closer to 1 millisecond than the earlier 100 microsecond guesstimate. We can also roughly quantify the sampler bias in the sense that the median values of the two densities differ by close to 2 orders of magnitude, in other words quite a bit.
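One way to produce such a side-by-side plot is to add two geom_violin layers with different constant x labels, each with draw_quantiles = 0.5. The sketch below is not taken from the notebook. It runs on synthetic data, and the EST_COUNT formula is an assumption made only so that the weighting visibly shifts the distribution. Recent ggplot2 releases may flag draw_quantiles as deprecated.

```r
library(ggplot2)

# Synthetic stand-in for Events (NOT real ASH data).
set.seed(7)
toy <- data.frame(TIME_WAITED = rlnorm(1000, meanlog = 6, sdlog = 2))
# Made-up estimate for illustration: short events count for more.
toy$EST_COUNT <- pmax(1, 1e6 / toy$TIME_WAITED)

p <- ggplot(toy, aes(y = log10(TIME_WAITED)))

# Two layers, two x labels: sampled density next to weighted density,
# each with a horizontal line at its median.
p_side <- p +
  geom_violin(aes(x = "SAMPLE"), draw_quantiles = 0.5) +
  geom_violin(aes(x = "EST_COUNT", weight = EST_COUNT / sum(EST_COUNT)),
              draw_quantiles = 0.5)

# Numeric cross-check of the two medians, in log10 microseconds.
med_sample <- median(log10(toy$TIME_WAITED))
ord   <- order(toy$TIME_WAITED)
cum_w <- cumsum(toy$EST_COUNT[ord]) / sum(toy$EST_COUNT)
med_weighted <- log10(toy$TIME_WAITED[ord][which(cum_w >= 0.5)[1]])
```

The difference `med_sample - med_weighted` expresses the gap between the medians in orders of magnitude, which is one rough way to put a number on the sampler bias.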

Comparing instances

A natural question when looking at system-level data in a RAC environment is "how do instances compare to each other?" Violin plots of the density of sampled latencies split up by instance look like this:

To my eye these look extremely similar, meaning that the statistical properties of the sampled event latencies are almost identical across instances. This would seem to indicate a strong similarity in workload processing with no instance-level aberrations like CPU saturation. The smoothed and symmetric shape of the violins makes comparison for likeness quite direct: they all line up at all the peaks and valleys, and if one didn't, it would surely be noticeable.

Count-estimate weighted densities

Plots of the instance latency densities weighted by estimated counts yield similar results:

Here again we see a very high level of visual agreement in the number, location, and prominence of the violin features. Comparing four densities against each other for similarity feels not much different than comparing two, so there may be good scalability properties for that use case. The fact that the agreement is strong even after the weighting transformation seems to indicate strong consistency across instances, especially in the lower latencies. At the very lowest latencies some subtle differences can be observed that were of course invisible in the unweighted plot.

Note that we had to compute instance subtotals of EST_COUNT in order to weight the values properly and plot without the warning messages.
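The per-instance subtotals can be computed in several ways. The sketch below uses base R's ave() to repeat each instance's EST_COUNT total on every row, so that the weights within each violin sum to 1. The data are synthetic, and the column names INSTANCE_NUMBER, INST_TOTAL, and W are assumptions for illustration rather than names confirmed by the notebook.

```r
library(ggplot2)

# Synthetic stand-in for Events (NOT real ASH data), four instances.
set.seed(11)
toy <- data.frame(
  INSTANCE_NUMBER = factor(rep(1:4, each = 250)),  # hypothetical column name
  TIME_WAITED     = rlnorm(1000, meanlog = 6, sdlog = 2)
)
toy$EST_COUNT <- pmax(1, 1e6 / toy$TIME_WAITED)  # made-up estimate

# Instance subtotal of EST_COUNT, repeated on each row of that instance.
toy$INST_TOTAL <- ave(toy$EST_COUNT, toy$INSTANCE_NUMBER, FUN = sum)

# Weights normalized within each instance rather than over the whole frame.
toy$W <- toy$EST_COUNT / toy$INST_TOTAL

p_inst <- ggplot(toy, aes(x = INSTANCE_NUMBER, y = log10(TIME_WAITED))) +
  geom_violin(aes(weight = W))
```

Dividing by sum(EST_COUNT) over the whole data frame would instead give each violin weights summing to roughly one quarter, which would bring the warning back.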

Conclusions

This investigation explored the use of violin plots to visualize probability density functions of ASH dump event latencies both as sampled and weighted using the count estimation technique. Violin plots facilitate visual comparisons for similarity by transforming 1D lines into 2D shapes that are more amenable to direct visual cognition.

notebook:

github/jberesni/ASHviz/Jupyter/eventEst.ipynb
