QQ Plots for Statistical Analysis of Errors in GPR Pavement Thickness Surveys
Payman Hajiani, PHD, PGP
Engineering Geophysicist at California Department of Transportation
Statistical error analysis is necessary to confirm the validity of pavement thickness data acquired from GPR surveys. Mean thickness, variance, and standard error are typical statistics obtained by comparing GPR-derived thicknesses to pavement cores. But how many users ask this critical question: are my statistics valid? Common statistical analysis assumes that the data follow a normal (Gaussian) distribution. If that assumption does not hold, statistics based on the normal distribution may be meaningless.
So, how can you test for a normal distribution? Here at the Caltrans Geophysics Branch, we use quantile-quantile (QQ) plots to determine whether the discrepancies between the interpreted GPR layers and the core logs are normally distributed. A QQ plot is created by plotting two sets of quantiles against one another. Quantiles are cut points that divide a data set or a probability distribution into equal-sized parts. A type of quantile in everyday use is the percentile, which splits data into 100 parts. Basically, a QQ plot takes your sample data, sorts the values in ascending order, and plots them against quantiles calculated from a theoretical distribution. The number of quantiles is selected to match the size of your sample. If both sets of quantiles come from the same distribution, the QQ plot should form a straight line. Therefore, if our measured quantiles plot on (or at least close to!) the theoretical quantiles, we can be reasonably confident that normal statistics are valid for our data.
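The construction described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the workflow used at Caltrans: the function name `normal_qq_points`, the plotting positions (i - 0.5)/n, and the sample values are all assumptions made for the example.

```python
from statistics import NormalDist

def normal_qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    observed = sorted(sample)  # sample quantiles: the data in ascending order
    n = len(observed)
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # One theoretical quantile per data point. The plotting position
    # (i - 0.5) / n is one common convention, assumed here.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))

# Hypothetical GPR-minus-core thickness differences in inches (illustrative only).
differences = [0.3, -0.1, 0.2, -0.4, 0.0, 0.1, -0.2, 0.5, -0.3]
qq_points = normal_qq_points(differences)

for z, d in qq_points:
    print(f"{z:+.3f}  {d:+.2f}")
```

Plotting the first column on the horizontal axis and the second on the vertical axis gives the QQ plot. Points that fall close to a straight line suggest that the differences are approximately normally distributed.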
Here's an example from a GPR survey performed on Interstate 5 near Sacramento, California. An air-launched Kontur GPR array was deployed at near highway speed (50 mph) along 18 miles in both southbound and northbound directions. The GPR unit was coupled with an Applanix POS-LV GNSS (Global Navigation Satellite System) for accurate location of measurements. For validation of the GPR results, 18 locations along the surveyed alignment were cored.
The pavement structure was traced and picked using Kontur Examiner. Based on the cores and as-builts, the layers were interpreted as hot-mix asphalt (HMA), Portland cement concrete (PCC), cement-treated base (CTB), and lime-treated subbase (LTS). An example GPR profile is displayed in Figure 1, and the resulting QQ plot for HMA is shown in Figure 2. Goodness of fit (R-squared) for all plots varied between 0.925 and 0.988, which is consistent with normally distributed data. That gave us confidence that our statistical analysis was valid.
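The article does not state how the R-squared values were computed. One plausible approach, sketched below as an assumption, is to fit a least-squares line through the QQ points and report the squared correlation between theoretical and observed quantiles. The helper names and the (i - 0.5)/n plotting positions are hypothetical choices for this example.

```python
from statistics import NormalDist, mean

def theoretical_quantiles(n):
    """Standard normal quantiles at plotting positions (i - 0.5) / n (assumed convention)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def qq_r_squared(sample):
    """R-squared of a least-squares line fitted through the normal QQ points."""
    observed = sorted(sample)
    theoretical = theoretical_quantiles(len(observed))
    mx, my = mean(theoretical), mean(observed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, observed))
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in observed)
    # For a simple linear fit, R-squared equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)

# A synthetic sample that lies exactly on a line in QQ space,
# so its R-squared is 1 up to floating-point rounding.
line_like = [1.0 + 0.25 * z for z in theoretical_quantiles(10)]
r2_line = qq_r_squared(line_like)

# Hypothetical thickness differences in inches (illustrative only).
r2_sample = qq_r_squared([0.3, -0.1, 0.2, -0.4, 0.0, 0.1, -0.2, 0.5, -0.3])
print(r2_line, r2_sample)
```

Values close to 1 mean the sorted data track the theoretical quantiles closely. The function assumes the sample contains at least two distinct values, since otherwise the denominator is zero.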
QQ plots provide another benefit: we were able to identify a few outliers, five data points that deviated significantly from the theoretical reference line. When we examined the corresponding core logs more closely, we identified two sources of error in that subset: 1) mechanical core breaks with sample loss (incomplete cores), visible in three of the core photos, and 2) measurement or transcription errors in the other two logs. With that information, we were able to either correct those logs or exclude them from the analysis.
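Outliers like these are usually spotted by eye on the plot, but the same idea can be automated. The sketch below is one possible approach and not the procedure described in this article: it draws a reference line through the first and third quartiles of the data, so the line is not pulled toward extreme points, and flags any point whose distance from the line exceeds a tolerance. The function name, the tolerance value, and the sample data are hypothetical.

```python
from statistics import NormalDist, quantiles

def flag_qq_outliers(sample, tolerance):
    """Return (index, value, residual) for points far from a robust QQ reference line.

    `tolerance` is in the same units as the data and is a user-chosen,
    hypothetical threshold.
    """
    n = len(sample)
    std_normal = NormalDist()
    # Indices of the sample in ascending order of value, so that flagged
    # points can be traced back to their core locations.
    order = sorted(range(n), key=lambda i: sample[i])
    theoretical = [std_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]

    # Reference line through the first and third quartiles of data and theory.
    q1, _, q3 = quantiles(sample, n=4)
    z1, z3 = std_normal.inv_cdf(0.25), std_normal.inv_cdf(0.75)
    slope = (q3 - q1) / (z3 - z1)
    intercept = q1 - slope * z1

    flagged = []
    for rank, idx in enumerate(order):
        residual = sample[idx] - (intercept + slope * theoretical[rank])
        if abs(residual) > tolerance:
            flagged.append((idx, sample[idx], residual))
    return flagged

# Hypothetical thickness differences in inches; index 7 mimics an incomplete core.
differences = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 2.5, 0.05, -0.05, 0.15]
flagged = flag_qq_outliers(differences, tolerance=0.5)
print(flagged)
```

A flagged point is only a prompt to go back to the core log and photos. Whether to correct or exclude it remains a judgment based on what the records show.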
In summary, QQ plots are a good tool for confirming the validity of your statistics. Consider including them in your QC program!