Correlation: Updating Font size/Linear Regression/R2 for Chart.Correlation

Correlation: Updating Font size/Linear Regression/R2 for Chart.Correlation

Note: Original package for this function: https://www.rdocumentation.org/packages/PerformanceAnalytics/versions/2.0.4/topics/PerformanceAnalytics-package

If you are using the chart.Correlation function from the PerformanceAnalytics package and you cannot change the font size, consider using the modified code provided here. This customized code allows you to adjust the font size by leveraging the functions from the pairs package. Additionally, the modified code sets a fixed size for the R values and their significance indicators.

first, load the library

library(PerformanceAnalytics)

1. Use this code to access the function for chart.Correlation :

trace("chart.Correlation", edit=T)        

2. Then replace the code with this and click save:

# Custom function for enhanced chart.correlation
function (R, histogram = TRUE, method = c("pearson", "kendall", "spearman"), ...) {
  # Data validation and method selection
  x = checkData(R, method = "matrix")
  if (missing(method)) 
    method = method[1]
  cormeth <- method

  # Panel function for displaying correlation
  panel.cor <- function(x, y, digits = 2, prefix = "", use = "pairwise.complete.obs", 
                        method = cormeth, cex.cor, ...) {
    usr <- par("usr")
    on.exit(par(usr))
    par(usr = c(0, 1, 0, 1))
    r <- cor(x, y, use = use, method = method)
    txt <- format(c(r, 0.123456789), digits = digits)[1]
    txt <- paste(prefix, txt, sep = "")
    if (missing(cex.cor)) 
      cex <- 2
    test <- cor.test(as.numeric(x), as.numeric(y), method = method)
    Signif <- symnum(test$p.value, corr = FALSE, na = FALSE, 
                     cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1), 
                     symbols = c("***", "**", "*", ".", " "))
    text(0.5, 0.5, txt, cex = cex)
    text(0.8, 0.8, Signif, cex = 2, col = 2)
  }

  # Function for density plot
  f <- function(t) {
    dnorm(t, mean = mean(x), sd = sd.xts(x))
  }

  dotargs <- list(...)
  dotargs$method <- NULL
  rm(method)

  # Histogram panel function
  hist.panel = function(x, ... = NULL) {
    par(new = TRUE)
    hist(x, col = "light blue", probability = TRUE, axes = FALSE, 
         main = "", breaks = "FD")
    lines(density(x, na.rm = TRUE), col = "red", lwd = 2)
    rug(x)
  }

  # Pair plot creation with conditional histogram
  if (histogram) 
    pairs(x, gap = 0, lower.panel = panel.smooth, upper.panel = panel.cor, 
          diag.panel = hist.panel, ...)
  else 
    pairs(x, gap = 0, lower.panel = panel.smooth, upper.panel = panel.cor, ...)
}        

3. Example plot

To increase the size of the variable labels (present along the diagonal), you can use the cex.labels parameter and set it to a numeric value. For example:
chart.Correlation(mtcars[,-1], cex.labels = 3)        

In this example, the cex.labels parameter is set to 3, which enlarges the variable labels in the diagonal of the scatter plot matrix.

and a few more updates:

4. the hollow points (scatterplot) will be black now, and the trend is in blue :

function (R, histogram = TRUE, method = c("pearson", "kendall", 
    "spearman"), ...) 
{
    x = checkData(R, method = "matrix")
    if (missing(method)) 
        method = method[1]
    cormeth <- method
    panel.cor <- function(x, y, digits = 2, prefix = "", use = "pairwise.complete.obs", 
        method = cormeth, cex.cor, ...) {
        usr <- par("usr")
        on.exit(par(usr))
        par(usr = c(0, 1, 0, 1))
        r <- cor(x, y, use = use, method = method)
        txt <- format(c(r, 0.123456789), digits = digits)[1]
        txt <- paste(prefix, txt, sep = "")
        if (missing(cex.cor)) 
            cex <- 2
        test <- cor.test(as.numeric(x), as.numeric(y), method = method)
        Signif <- symnum(test$p.value, corr = FALSE, na = FALSE, 
            cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1), symbols = c("***", 
                "**", "*", ".", " "))
        text(0.5, 0.5, txt, cex = cex)
        text(0.8, 0.8, Signif, cex = 2, col = 2)
    }
    f <- function(t) {
        dnorm(t, mean = mean(x), sd = sd.xts(x))
    }
    dotargs <- list(...)
    dotargs$method <- NULL
    rm(method)
    hist.panel = function(x, ... = NULL) {
        par(new = TRUE)
        hist(x, col = "light blue", probability = TRUE, axes = FALSE, 
            main = "", breaks = "FD")
        lines(density(x, na.rm = TRUE), col = "red", lwd = 2)
        rug(x)
    }
    panel.smooth <- function(x, y, col = "black", bg = NA, pch = 20, 
        cex = 1, col.smooth = "blue", ...) {
        points(x, y, pch = pch, col = col, bg = bg, cex = cex)
        ok <- is.finite(x) & is.finite(y)
        if (any(ok)) 
            lines(stats::lowess(x[ok], y[ok]), col = col.smooth, 
                ...)
    }
    if (histogram) 
        pairs(x, gap = 0, lower.panel = panel.smooth, upper.panel = panel.cor, 
            diag.panel = hist.panel, ...)
    else pairs(x, gap = 0, lower.panel = panel.smooth, upper.panel = panel.cor, 
        ...)
}        

save this, and then you can run the:

chart.Correlation like this:

chart.Correlation(mtcars[,-1], label.pos = 0.8, cex.labels = 3)        

and you should get something like this: the size of the variable increases and the scatter plot is changed.

5. With the following code, you can get a linear model (regression) in the scatter plot side or lower panel with R square.

# Enhanced chart.correlation function with histogram and multiple correlation methods
function (R, histogram = TRUE, method = c("pearson", "kendall", "spearman"), ...) {
    # Validate input data and set default correlation method
    x = checkData(R, method = "matrix")
    if (missing(method)) 
        method = method[1]
    cormeth <- method

    # Inner function to create a correlation matrix panel
    panel.cor <- function(x, y, digits = 2, prefix = "", use = "pairwise.complete.obs", 
                          method = cormeth, cex.cor, ...) {
        # Set up graphics parameters and restore upon exit
        usr <- par("usr")
        on.exit(par(usr))
        par(usr = c(0, 1, 0, 1))

        # Calculate and format the correlation coefficient
        r <- cor(x, y, use = use, method = method)
        txt <- format(c(r, 0.123456789), digits = digits)[1]
        txt <- paste(prefix, txt, sep = "")

        # Default font size for correlation coefficient
        if (missing(cex.cor)) 
            cex <- 2

        # Perform correlation significance test
        test <- cor.test(as.numeric(x), as.numeric(y), method = method)
        Signif <- symnum(test$p.value, corr = FALSE, na = FALSE, 
                         cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1), 
                         symbols = c("***", "**", "*", ".", " "))

        # Display the correlation coefficient and its significance
        text(0.5, 0.5, txt, cex = cex)
        text(0.8, 0.8, Signif, cex = 2, col = 2)
    }

    # Function for density plot calculation
    f <- function(t) {
        dnorm(t, mean = mean(x), sd = sd.xts(x))
    }

    # Prepare additional arguments, excluding 'method'
    dotargs <- list(...)
    dotargs$method <- NULL
    rm(method)

    # Histogram panel function
    hist.panel = function(x, ... = NULL) {
        par(new = TRUE)
        hist(x, col = "light blue", probability = TRUE, axes = FALSE, main = "", breaks = "FD")
        lines(density(x, na.rm = TRUE), col = "red", lwd = 2)
        rug(x)
    }

    # Panel function for smooth scatter plots with linear regression and R^2 display
    panel.smooth <- function(x, y, col = "black", bg = NA, pch = 20, cex = 1, col.smooth = "blue", ...) {
        points(x, y, pch = pch, col = col, bg = bg, cex = cex)
        ok <- is.finite(x) & is.finite(y)
        if (any(ok)) {
            fit <- lm(y ~ x)
            abline(fit, col = col.smooth, ...)
            r2 <- summary(fit)$r.squared
            r2_rounded <- round(r2, 2)
            r2_text <- bquote("R"^2 ~ "=" ~ .(r2_rounded))
            text(x = mean(x), y = max(y), labels = r2_text, pos = 1, cex = 1.5)
        }
    }

    # Generate pairs plot with optional histogram and smooth scatter plot panels
    if (histogram) 
        pairs(x, gap = 0, lower.panel = panel.smooth, upper.panel = panel.cor, diag.panel = hist.panel, ...)
    else 
        pairs(x, gap = 0, lower.panel = panel.smooth, upper.panel = panel.cor, ...)
}        

and the same code as the previous :

chart.Correlation(mtcars[,-1], label.pos = 0.8, cex.labels = 3)        

should produce something like this:


Did you find this post useful? If so, and you'd like to show your appreciation, why not treat me to a coffee? It's just a click away: Buy Me a Coffee. Your support means a lot, and it keeps the coffee (and the ideas) flowing! ??

要查看或添加评论,请登录

Dr. Saurav Das的更多文章

  • Synthetic Data for Soil C Modeling

    Synthetic Data for Soil C Modeling

    Note: The article is not complete yet My all-time question is, do we need all and precise data from producers (maybe I…

  • Bootstrapping

    Bootstrapping

    1. Introduction to Bootstrapping Bootstrapping is a statistical resampling method used to estimate the variability and…

  • Ecosystem Service Dollar Valuation (Series - Rethinking ROI)

    Ecosystem Service Dollar Valuation (Series - Rethinking ROI)

    The valuation of ecosystem services in monetary terms represents a critical frontier in environmental economics…

  • Redefining ROI for True Sustainability

    Redefining ROI for True Sustainability

    It’s been a while since I last posted for Muddy Monday, but a few thoughts have been taking root in my mind, growing…

  • Linear Plateau in R

    Linear Plateau in R

    When working with data in fields such as agriculture, biology, and economics, it’s common to observe a response that…

    2 条评论
  • R vs R-Studio

    R vs R-Studio

    R: R is a programming language and software environment for statistical computing and graphics. Developed by Ross Ihaka…

    1 条评论
  • Backtransformation

    Backtransformation

    Backtransformation is the process of converting the results obtained from a transformed dataset back to the original…

    3 条评论
  • Spectroscopic Methods and Use in Soil Organic Matter & Carbon Measurement

    Spectroscopic Methods and Use in Soil Organic Matter & Carbon Measurement

    Spectroscopic methods comprise a diverse array of analytical techniques that quantify how light interacts with a…

    2 条评论
  • Regression & Classification

    Regression & Classification

    Regression and classification are two predictive modeling approaches in statistics and machine learning. Here's a brief…

    2 条评论
  • Vectorization over loop

    Vectorization over loop

    Vectorization Vectorization in R refers to the practice of applying a function to an entire vector or array of data at…

社区洞察

其他会员也浏览了