Why is transforming the response in regression analysis and hypothesis testing so dangerous?
Adrian Olszewski
Clinical Trials Biostatistician at 2KMM (100% R-based CRO) | Frequentist (non-Bayesian) paradigm | NOT a Data Scientist (no ML/AI/Big data) | Against anti-car/-meat/-cash and C40 restrictions
Introduction
When reading scientific articles, blogs, tutorials and even textbooks, you will quickly notice that transformation of variables (especially the response, DV, and especially the log-transformation, but not only) in regression analysis or hypothesis testing is common. Some researchers at least try to justify this step, some try the maximum-likelihood approach (e.g. via Box-Cox), but many transform data almost automatically whenever the data are skewed or "non-normal" (as if it ever mattered!), completely ignoring the basic questions: "But what for? What will you gain with that? Can you explain what happened after you did it?". It is not wrong to do something - and transforming the response is no exception - but one needs to explain the reasons (e.g. whether you are dealing with a truly multiplicative relationship with a multiplicative error).
/ Let's ignore for now the fact that no regression analysis needs either the IVs (independent variables, predictors, covariates, "X" - pick the name used in your field) or the DV (dependent variable, response, "Y") to follow any specific distribution. That belief is wrong - never repeat it. The assumption of normality pertains only to the conditional distribution of the response, not to the raw DV. But because that is unverifiable directly, we assess the equivalent thing: the distribution of the model residuals. And please note - this applies only to the residuals from the General Linear Model (the Generalized LM has many types of residuals, none of which need to be normal), and only if you perform inference over the model coefficients. The Gauss-Markov theorem doesn't need it for the estimates of the model parameters to be BLUE. If you want to learn more about this topic, visit Jim Frost's excellent article titled 7 Classical Assumptions of Ordinary Least Squares (OLS) Linear Regression, and/or read the book "Understanding Regression Analysis: A Conditional Distribution Approach" by Peter H. Westfall and Andrea L. Arias, CRC Press. /
OK, you're right - transformation of the response may be a worthwhile technique when doing predictions and not caring about anything else, so you just feed the black box with some data and enjoy the output. It may also be useful in econometric or pharmacokinetic (and -dynamic) problems with numerical predictor variables (covariates; IVs). But you should definitely think twice if you're going to use it for exploratory or confirmatory analysis and inference (testing hypotheses). In a moment you will learn how many things such a transformation affects. Did you ever think about that before? Let me tell you - I didn't either, in the past. This is rarely taught, or even discussed.
/ BTW, as many people asked me - what about the "log-linear" models, Adrian? First, don't confuse log-transformed-response OLS with the Poisson GLM here. Log-linear models (often applied to analyze contingency tables) account for the conditional distributions and transform the conditional expectation. /
To what extent this happens depends on whether categorical IVs are present, or on how discrete an underlying continuous IV looks (e.g. only a few of all possible values were actually measured), mimicking "categories". Then Jensen's inequality "starts messing things up", as the conditional DV != the observed (mixture) DV. In the absence of categorical IVs and in the presence of "nicely continuous" IVs, it may work well, as the conditional DV ~ the observed DV.
The things that will be affected. Don't ignore them.
Interpretation
Transformation completely changes the model formulation, and thus affects the interpretation. Only in "clean" cases will you get an interpretable outcome, like log-transformed data generated by a multiplicative process (not just any right-skewed data!). The log, exp, reciprocal, square/cube-root, and power-of-2 or -3 transformations may be meaningful in special scenarios, e.g. velocity, area, volume, concentration, length (square root of area). But if Box-Cox returned "-0.67" for your DV - what does that even mean?! And most of your audience will have no idea how your response changes with the predictor unless you draw the curves. Doable for a single IV, but what if you have a multi-variable model? You will need an analysis of marginal effects to give some idea. And it still won't be a piece of cake for your non-technical audience.
Yes, you're right - sometimes we can decide to approximate the obtained coefficient with a well-known one, e.g. 0.45 is close to 0.5 (square root), -0.89 is close to -1 (reciprocal), but it's not that easy in general. So why do that?
Yes, log-transformed models are common in econometrics, for instance. When fitting them, just remember what happens - and then decide what's OK for your case.
The log-transformation is special and gives you some extra benefits when interpreting the results. Let's look at the 3 cases:
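In brief, the standard reading of the three log cases is:
- log-level, log(Y) = b0 + b1*X: a one-unit increase in X multiplies the expected (geometric-mean) response by exp(b1), i.e. roughly a 100*b1 % change for small b1;
- level-log, Y = b0 + b1*log(X): a 1% increase in X shifts Y by about b1/100 units;
- log-log, log(Y) = b0 + b1*log(X): b1 is an elasticity - a 1% increase in X gives about a b1 % change in Y.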
Forcing the data to tell your story
By transforming, you *force your variables to follow a certain distribution* and to *tell your story*. For example, the log-transformation assumes your data come from a log-normal distribution, and thus that they were generated by a multiplicative process. Is this true? How do you know? Did you do any research or have any expectation about that? If so - go ahead.
/ Some people messaged me and said: "but when you propose the GLM, the same problem appears - you have to decide which distribution describes your conditional data best!" Yes. Fair point. The GLM isn't a panacea relieving you from thinking. Both approaches may be correct as long as you can justify the choice. With Box-Cox you don't decide - it mechanistically decides for you. Always remember that. If it fits your needs, use it, but do it consciously. /
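A minimal sketch of that mechanistic choice, assuming the MASS package (the data below are made up purely for illustration):

library(MASS)
set.seed(1)
x <- runif(100, 1, 10)
y <- (2 + 3 * x + rnorm(100))^2                      # a positive response, deliberately not log-normal
bc <- boxcox(lm(y ~ x), lambda = seq(-2, 2, 0.05))   # profiles the log-likelihood over lambda
bc$x[which.max(bc$y)]                                # the "chosen" lambda - now try to interpret it!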
Nature of errors
It changes the model along with its errors - in our case, from additive to multiplicative, errors included.
Whether it's good or not depends on your case. If you have thoroughly considered this issue - sure, go ahead.
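To see the switch explicitly: fitting log(Y) = b0 + b1*X + e implies Y = exp(b0 + b1*X) * exp(e), i.e. a multiplicative error on the original scale. A toy simulation (all names and numbers here are mine, for illustration only):

set.seed(1)
x <- runif(200, 1, 10)
y_add <- 2 + 3 * x + rnorm(200)                      # additive error: scatter constant across x
y_mul <- exp(0.5 + 0.3 * x + rnorm(200, sd = 0.2))   # additive on the log scale = multiplicative on Y
layout(matrix(1:2, ncol = 2))
plot(x, y_add, main = "additive error")              # constant spread
plot(x, y_mul, main = "multiplicative error")        # spread grows with the level of y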
The mean-variance relationship
It will also affect the mean-variance relationship. Well, that can be useful if we want to stabilize the variance, BUT it changes more! For example, in the normal distribution the mean and variance are independent, but in the log-normal they are not.
Yes, this is an idealized case, but Box-Cox may return any weird coefficient - guess how the model and the mean-variance relationship will change then? If you know what you're doing - go ahead.
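For the log-normal this can be written down exactly: if log(Y) ~ N(mu, sigma^2), then E(Y) = exp(mu + sigma^2/2) and Var(Y) = (exp(sigma^2) - 1) * exp(2*mu + sigma^2), so the standard deviation is proportional to the mean (a constant coefficient of variation). A quick check:

mu <- c(1, 2, 3); sigma <- 0.5
m <- exp(mu + sigma^2 / 2)                        # log-normal means
v <- (exp(sigma^2) - 1) * exp(2 * mu + sigma^2)   # log-normal variances
sqrt(v) / m                                       # constant CV: the variance is tied to the mean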
What you see and believe is important may actually not be
Remember the intro? By running a regression you are interested in modelling a conditional statistic (e.g. the expected value) of the response, E(Y|X=x), rather than the transformed response itself. But transformation of the response (DV) affects the entire variable, which is… a mixture of conditional (on the predictor) responses.
Read that sentence one more time. It is very important to understand.
What does it mean? One may have perfectly normal conditional distributions of the response, yet together their mixture may look totally skewed.
Let me show you an example where the IV is categorical. Look at the picture at the top of this article (sadly, LinkedIn doesn't allow me to insert images in the text!).
The coloured distributions are generated from normal distributions. Pooled together, they form a hugely skewed whole. That's what we see when we look at the DV itself. But the regression doesn't care about the pooled DV at all.
Many people decide then (incorrectly) to log-transform such response "because it is skewed".
If one blindly log-transforms the response (the DV), it may immediately spoil the residuals. What a disastrous action… The transformation will work only if the conditional distributions are themselves skewed. But how do we know whether it was right, if we cannot easily check the conditional distributions? That's the role of checking the residuals (after the model is fit): if they are normal, it means the conditional responses were skewed, so the (say) log-transformation worked.
Please find two examples with a simple linear model with one categorical IV and a numerical DV; in R it's lm(DV ~ IV). I set up the data so that the conditional (on the levels of the IV) distributions are pretty normal, but the "pooled" DV looks seriously skewed.
In other words (different data, same rule):
When we fit the linear model to data with normal conditionals, the residuals come out approximately normal too; but when we log-transform the entire DV "to fix the skewness in the DV", we spoil the residuals (right-side graphs in each panel), because the transformation affects the conditionals as well!
Now let's have a look at the figure below. I bet this is not what you expected to achieve!
You can reproduce it with the following R code:
set.seed(1000)

cnts  <- c(50, 30, 20, 10, 10, 10, 5)      # group sizes
means <- c( 3,  6, 10, 15, 20, 25, 35)     # per-group means: normal conditional distributions

DV <- unlist(mapply(rnorm, cnts, means))
IV <- unlist(mapply(rep, LETTERS[seq(cnts)], cnts))
data <- data.frame(DV, logDV = log(DV), IV = factor(IV))

m  <- lm(DV ~ IV, data = data)
m1 <- lm(logDV ~ IV, data = data)

layout(matrix(1:6, ncol = 2))

hist(data$DV, main = sprintf("raw DV\nShapiro-Wilk: %.3f", shapiro.test(data$DV)$p.value))
hist(m$residuals, main = sprintf("residuals; raw DV\nShapiro-Wilk: %.3f", shapiro.test(m$residuals)$p.value))
car::qqPlot(m$residuals, main = "residuals; raw DV", grid = FALSE)

hist(data$logDV, main = sprintf("log DV\nShapiro-Wilk: %.3f", shapiro.test(data$logDV)$p.value))
hist(m1$residuals, main = sprintf("residuals; log DV\nShapiro-Wilk: %.3f", shapiro.test(m1$residuals)$p.value))
car::qqPlot(m1$residuals, main = "residuals; log DV", grid = FALSE)
And the opposite case: here I made the conditional DV log-normal. The residuals from the untransformed model look really bad, but after the log-transformation the residuals are approximately normal. Why? Because this time the data were truly conditionally skewed.
You can reproduce it with the following R code:
set.seed(1000)

cnts  <- c(50, 30, 20, 10, 10, 10, 5)      # group sizes
means <- c( 1, 2, 3, 4, 5, 6, 7)           # per-group meanlog: log-normal conditional distributions

DV <- unlist(mapply(rlnorm, cnts, means))
IV <- unlist(mapply(rep, LETTERS[seq(cnts)], cnts))
data <- data.frame(DV, logDV = log(DV), IV = factor(IV))

m  <- lm(DV ~ IV, data = data)
m1 <- lm(logDV ~ IV, data = data)

layout(matrix(1:6, ncol = 2))

hist(data$DV, main = sprintf("raw DV\nShapiro-Wilk: %.3f", shapiro.test(data$DV)$p.value))
hist(m$residuals, main = sprintf("residuals; raw DV\nShapiro-Wilk: %.3f", shapiro.test(m$residuals)$p.value))
car::qqPlot(m$residuals, main = "residuals; raw DV", grid = FALSE)

hist(data$logDV, main = sprintf("log DV\nShapiro-Wilk: %.3f", shapiro.test(data$logDV)$p.value))
hist(m1$residuals, main = sprintf("residuals; log DV\nShapiro-Wilk: %.3f", shapiro.test(m1$residuals)$p.value))
car::qqPlot(m1$residuals, main = "residuals; log DV", grid = FALSE)
Do you see what happened to the residuals after "fixing" the skewness with the logarithm while looking at the overall DV rather than the conditional ones?
But wait a moment, not all is lost.
Ah, and remember that no transformation can properly handle certain response distributions, like counts (it simply makes no sense - you won't obtain a continuous variable from a discrete one).
BTW, there IS a technique that properly handles transformation(E(Y)): the Generalized Linear Model (GLM). That's exactly what you may want in many cases where you feel the urge to transform the DV, because it transforms the conditional expected value (via the link function) and NOT the raw response (which is a mixture of conditional distributions).
/ Some of you messaged me: "Adrian, you're not right. The general linear model with a transformed response is a special case of the GLM with Gaussian conditional response and identity link. So log(Y) = … is still a GLM." Sure, it is, but that doesn't matter: with the identity link you do not transform the conditional expectation, so admitting this fact changes nothing - we end up with the difference between a response-transformed GLM with identity link vs. a GLM with a non-identity link :) /
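A minimal sketch of that difference, reusing the simulated data object from the second example above (the gamma family is my arbitrary pick for a positive, right-skewed response):

m_log <- lm(log(DV) ~ IV, data = data)                            # models E(log DV | IV)
m_glm <- glm(DV ~ IV, family = Gamma(link = "log"), data = data)  # models log(E(DV | IV))
cbind(OLS_on_logs = coef(m_log), gamma_log_link = coef(m_glm))    # different targets, different estimates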
And what happened to the hypotheses? o_O
In testing, a transformation changes the null hypothesis, which may not be the one you wanted to assess. For example, the famous log-transformation switches you from testing a shift in arithmetic means to testing a ratio of geometric means. You knew that, didn't you? If yes - go ahead.
I can hear you: “I was told it leads to valid inference!”.
Yes, it leads to valid inference... about a hypothesis you did not start with, unless you can justify that transition and the equivalence between the hypotheses on transformed and untransformed data. Maybe it's OK - at the end of the day we use monotone transformations, which preserve the ordering of the data. On the other hand, parametric tests are built on top of moments, and moments are not invariant to general monotone transformations (logarithms, square roots). So maybe you... obtained a technically valid answer to a question you never asked.
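A small illustration of that switch on simulated data (the group names and parameters are mine):

set.seed(1)
g1 <- rlnorm(50, meanlog = 1.0, sdlog = 0.5)
g2 <- rlnorm(50, meanlog = 1.3, sdlog = 0.5)
t.test(log(g1), log(g2))                # H0 now concerns the means of log(Y)...
exp(mean(log(g1)) - mean(log(g2)))      # ...i.e., back-transformed, the ratio of geometric means
mean(g1) - mean(g2)                     # NOT the shift in arithmetic means you started with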
And yes, the analysis of log(Y) may give different results than a GLM with a log link (e.g. Poisson or gamma regression). You will have to decide which one to choose. What now?
Sometimes there are industry guidelines, like those given by the FDA for clinical biostatistics, which advise using the log on PK data (for a good reason) - but *even those guidelines* warn you against unconditional and *unjustified* transformations!
Check this article (free PDF): Becker, T., Robertson, M., & Vandenberg, R. (2019). Nonlinear Transformations in Organizational Research: Possible Problems and Potential Solutions. Organizational Research Methods, 22, 831-866. doi:10.1177/1094428118775205
Consider also the approach described on pages 688-689 of (free PDF): Hill, N., & Bailey, J. (2019). Best Practices in Data Collection and Preparation: Recommendations for Reviewers, Editors, and Authors. Organizational Research Methods, 24. doi:10.1177/1094428119836485
What about the back-transformed confidence intervals?
Your back-transformed confidence intervals and predictions on the original scale will be biased - remember Jensen's inequality. Another disease for the collection. But if you evaluated your predictions with some measures and they worked well - sure, use them. At the end of the day, the results matter.
But if your goal is exploratory or confirmatory data analysis, think twice about this very issue and decide.
Remember also that - according to many guidelines - you should generally report appropriate CIs together with the results of hypothesis tests (where possible). What if they disagree in conveying statistical significance, because you reported the p-value on the transformed data and the CI on the untransformed data?
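A two-line illustration of that bias on simulated log-normal data:

set.seed(1)
y <- rlnorm(1e5, meanlog = 0, sdlog = 1)
mean(y)            # arithmetic mean, close to exp(0 + 1/2) ~ 1.65
exp(mean(log(y)))  # naive back-transform of the log-scale mean, close to exp(0) = 1 - biased low

(Corrections such as Duan's smearing estimator exist, but they are an extra step that is routinely forgotten.)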
At least the transformation gives me the normality I wanted. Right? Right?!
No, it does NOT guarantee the properties you expect. It can help, but you should not blindly rely on it alone. And if it does not help, what then? Transform the data again, and again, until satisfied? :)
Not to mention that you may turn your right-skewed data into... left-skewed data and fall into even more trouble :)
I have 0 or negative values in my data...
For zeros it's a rather minor issue: just shift all values by some small constant to get rid of the problem (keeping in mind that the choice of that constant is arbitrary). But if you have skewed data that start from negative values, then, well, you're in trouble. And YES, such data do exist, e.g. the T-score in densitometry of patients treated for osteoporosis or osteopenia.
On the other hand, if the entire conditional expectation takes a negative value (e.g. the mean in a whole treatment arm is negative), the problem remains and you need a different transformation.
Well, I cannot give out-of-the-box solutions, but maybe switching to quantile-based methods will solve your problem. Make quantile regression your friend. It also extends to mixed models, so you can account for clustered or repeated observations (e.g. in longitudinal studies).
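A minimal sketch, assuming the quantreg package and reusing the simulated data object from the examples above:

library(quantreg)
qm <- rq(DV ~ IV, tau = 0.5, data = data)   # median regression: no normality assumption, negative values welcome
summary(qm)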
Summary
I know there are many proponents of unconditional data transformation ("skewed data? → just transform it!"), even on ResearchGate. They were taught this for decades. Moreover, some of them were told by authorities "to continue using it".
But in light of the arguments collected above, I strongly advise you to consider the (practically always better) alternatives and not let anyone force you to perform such an analysis "just because".
Otherwise you (not they) will be held responsible for all the nonsense you may obtain.
Except for the few scenarios mentioned, transformation of the response in regression modelling can be really problematic in confirmatory and exploratory modelling.
Can it be useful? Yes, it may be OK in predictive modelling, especially if you accept a "black box" approach, where you care mostly about the predicted outcomes and not the rest of the story. If the predicted outcomes agree with expectations, you are fine with that. Then it's OK. And if your case is justified - e.g. the data truly come from a multiplicative process, as often in pharmacokinetics (and, in general, in clinical laboratory diagnostics: hormone concentrations, enzyme activity) - sure, do it.
But never do that blindly.
You only criticize. Propose something constructive!
In the 21st century we have plenty of models, estimation methods and other techniques (some available for ca. 50 years) that let us deal with violations of certain assumptions (normality, homoscedasticity, independence of observations and so on), including:
0) do nothing, if your residuals are sufficiently close to normal, regardless of the distribution of the IVs and the DV. Check my example here: https://www.dropbox.com/s/3y41agu5w37t6vk/skewness_IV_DV.pdf?dl=0
1) generalized models (GLM and GAM): gamma, beta, logistic (and probit), fractional logit, Poisson, negative binomial, etc. regressions; also truncated regression (most real variables have a truncated domain, keep that in mind!) and censored regression (e.g. the tobit model). I'm sure you will be able to find a tool suitable for you. Remember that this generalizes nicely to mixed-effect models and to marginal (population-average) models via GEE estimation.
2) truly non-linear models
3) robust and non-parametric methods and tests (there are over 550 statistical tests! Lots of them do not require, or relax, certain parametric assumptions: Yuen, Brunner-Munzel, ATS, WTS, ART ANOVA, Welch, Mann-Whitney/Wilcoxon, and many, many more). At the end of this document I added a longer list of the literature that I have read and can wholeheartedly recommend.
4) If you need AN(C)OVA on non-normal or heterogeneous data, remember that you can run a more advanced model relaxing certain - or even most - assumptions (e.g. robust regression, quantile regression, mixed models), or use GLS or GEE estimation, and follow the modelling with an assessment of the main and interaction effects to obtain a type-2 or type-3 ANOVA (also repeated-measures); see the sketch after this list.
Yes, you read that right - this is what the anova() (or car::Anova(…, type = 2/3)) function does in R for so many kinds of models. It either compares nested models via likelihood-ratio tests (looking at the reduction of residual variance - or deviance, for the GLM - per model term) or jointly tests the model coefficients using the Wald approach (e.g. emmeans::joint_tests()). Which, in the case of the simple general linear model, reduces to the analysis of certain contrasts, which is nothing but comparing group means. See? The dots connect!
5) quantile regression (which also handles mixed and additive effects) - one of the most powerful methods, requiring no distributional assumptions yet still offering good interpretability! A bonus: it also handles censored data.
6) Generalized Least Squares (GLS) and Generalized Estimating Equations (GEE) estimation
7) Passing-Bablok and Deming regression
8) resampling (permutation/exact tests, approximate permutation tests, bootstrap interval estimation). Only remember that these methods aren't yet widely accepted by regulators in the clinical-research industry when used to analyse primary outcomes; consult your local regulatory authority before applying them in such a scenario.
9) in case of serious skewness you can also try adding categorical covariate(s) to your model, which may split your dataset into more homogeneous subgroups. Why? Because skewness often arises when data come from 2+ mixed populations with different variability. You saw this already - the DV is a mixture of conditional distributions, right? So what if there is a missing categorical covariate (IV) which could, potentially, split your DV into nicely symmetric (approximately normal) subgroups? The "omitted covariate" is a serious problem, not only in this case - research this topic.
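The sketch promised in point 4: ANOVA-style marginal (type-3-like) tests from a model that relaxes homoscedasticity, assuming the nlme package and reusing the simulated data object from the examples above:

library(nlme)
g <- gls(DV ~ IV, data = data,
         weights = varIdent(form = ~ 1 | IV))   # a separate residual variance per IV level
anova(g, type = "marginal")                     # Wald-type tests of each model term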
Afterword
As always in statistics, there is no easy solution covering all cases. There are justified cases where transformations are not only applicable but even demanded by regulations - see, for example: FDA: Guidance for Industry - Statistical Approaches to Establishing Bioequivalence, or: EMA - ICH Topic E 9 Statistical Principles for Clinical Trials, step 5.
Link to a picture: https://qph.cf2.quoracdn.net/main-qimg-390427b41189ece0bf92be1de6db343a
Also: "THE LOG TRANSFORMATION IS SPECIAL" (Keene, Statistics in Medicine, 1995)
Also, find my diagram (on Dropbox) showing a few families of models (along with their relationships) and estimation methods that may serve you better than data transformation: https://www.dropbox.com/s/5a8w8kckyfeaix0/statistical%20models%20-%20diagram.pdf
A few URLs linking to discussions and resources worth reading on this topic:
- The Analysis Factor: The Difference Between Link Functions and Data Transformations - https://www.theanalysisfactor.com/the-difference-between-link-functions-and-data-transformations/
- CrossValidated: Family of GLM represents the distribution of the response variable or residuals?
Books worth reading:
1. Alan Agresti, Foundations of Linear and Generalized Linear Models
2. John Fox, Applied Regression Analysis and Generalized Linear Models
3. Roger Koenker, Victor Chernozhukov, Xuming He, Limin Peng, Handbook of Quantile Regression
4. Young, Derek S, Handbook of regression methods
5. Andreas Ziegler, Generalized Estimating Equations
6. Daryl S. Paulson, Handbook of Regression and Modeling Applications for the Clinical and Pharmaceutical Industries
7. Myles Hollander, Douglas A. Wolfe, Eric Chicken, Nonparametric Statistical Methods
8. Jason C. Hsu, Multiple Comparisons, Theory and methods
9. Alex Dmitrienko, Ajit C. Tamhane, Frank Bretz, Multiple Testing Problems in Pharmaceutical Statistics
10. Michael G. Akritas and Dimitris N. Politis, Recent Advances and Trends in Nonparametric Statistics
11. W. J. Conover, Practical Nonparametric Statistics
12. P. H. Westfall, A. L. Arias, Understanding Regression Analysis: A Conditional Distribution Approach
+ some more literature about modern and flexible non-parametric methods (there's a lot more beyond Mann-Whitney-Wilcoxon, Friedman, and Kruskal-Wallis!), so you don't have to transform your data ;]
1. Erceg-Hurn, David & Mirosevich, Vikki. (2008). Modern Robust Statistical Methods An Easy Way to Maximize the Accuracy and Power of Your Research. The American psychologist. 63. 591-601. 10.1037/0003-066X.63.7.591.
https://pdfs.semanticscholar.org/88cb/15520b2f84fd2a5a09e0341e791f40ab4118.pdf
2. Edgar Brunner, Madan L. Puri, Nonparametric Methods in Factorial Designs. https://www.researchgate.net/profile/Jos_Feys/post/What_statistical_tests_can_I_use_to_compare_mean_values_for_my_study/attachment/59d6558b79197b80779acad7/AS:526088510111744@1502440683536/download/Brunner.pdf
3. Brunner, E., & Puri, M. L. (1996). Nonparametric methods in design and analysis of experiments. In Design and Analysis of Experiments (Vol. 13, pp. 631–703). Elsevier. https://doi.org/https://doi.org/10.1016/S0169-7161(96)13021-2
4. Wobbrock, J.O., Findlater, L., Gergle, D. and Higgins, J.J. (2011). The Aligned Rank Transform for nonparametric factorial analyses using only ANOVA procedures. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '11). Vancouver, British Columbia (May 7-12, 2011). New York: ACM Press, pp. 143-146. https://faculty.washington.edu/wobbrock/pubs/chi-11.06.pdf
5. Christophe Leys, Sandy Schumann, A nonparametric method to analyze interactions: The adjusted rank transform test https://cescup.ulb.be/wp-content/uploads/2015/04/Leys_and_Schumann_nonparametric_interactions.pdf
6. Haiko Lüpsen, The Aligned Rank Transform and discrete Variables -a Warning https://kups.ub.uni-koeln.de/7554/1/ART-discrete.pdf
7. Friedrich, S., Konietschke, F., & Pauly, M. (2017). GFD: An R Package for the Analysis of General Factorial Designs. Journal of Statistical Software, 79(Code Snippet 1), 1 - 18. doi:https://dx.doi.org/10.18637/jss.v079.c01
8. Kimihiro Noguchi, Yulia R. Gel, Edgar Brunner, Frank Konietschke,“nparLD: An R Software Package for the Nonparametric Analysis of Longitudinal Data in Factorial Experiments”
9. Edgar Brunner, Arne C. Bathke, Frank Konietschke, Rank and Pseudo-Rank Procedures for Independent Observations in Factorial Designs: Using R and SAS, Springer, 2019, ISBN: 303002914X, 9783030029142, page 134 https://books.google.pl/books?id=t9KiDwAAQBAJ&lpg=PA134&ots=_Jgi9Rt0Kz&hl=pl&pg=PA134#v=onepage&q&f=false
10. Feys, Jos. "New Nonparametric Rank Tests for Interactions in Factorial Designs with Repeated Measures." Journal of Modern Applied Statistical Methods 15.1 (2016): 78-99. Web. https://digitalcommons.wayne.edu/cgi/viewcontent.cgi?article=1924&context=jmasm
11. Friedrich, S., Konietschke, F., & Pauly, M. (2017). GFD: An R Package for the Analysis of General Factorial Designs. Journal of Statistical Software, Code Snippets 79(1), 1-18. doi:10.18637/jss.v079.c01; Pauly, M., Brunner, E., & Konietschke, F. (2015). Asymptotic Permutation Tests in General Factorial Designs. Journal of the Royal Statistical Society, Series B, 77, 461-473.
12. Akritas, M. G., & Politis, D. N. (2003). Recent Advances and Trends in Nonparametric Statistics. Elsevier B.V. https://doi.org/10.1016/B978-0-444-51378-6.X5000-5
13. Peterson, K.M. (2002). Six Modifications Of The Aligned Rank Transform Test For Interaction. https://pdfs.semanticscholar.org/ad4b/54e104acf7356b53c075e959ba8c24e23fea.pdf
14. Schacht, A., Bogaerts, K., Bluhmki, E., & Lesaffre, E. (2008). A new nonparametric approach for baseline covariate adjustment for two-group comparative studies. Biometrics, 64 4, 1110-6
15. Shah DA, Madden LV. Nonparametric analysis of ordinal data in designed factorial experiments. Phytopathology. 2004;94(1):33-43. doi:10.1094/PHYTO.2004.94.1.33, https://apsjournals.apsnet.org/doi/pdf/10.1094/PHYTO.2004.94.1.33
16. Versace, V., Schwenker, K., Langthaler, P. B., Golaszewski, S., Sebastianelli, L., Brigo, F., Pucks-Faes, E., Saltuari, L., & Nardone, R. (2020). Facilitation of Auditory Comprehension After Theta Burst Stimulation of Wernicke's Area in Stroke Patients: A Pilot Study. Frontiers in Neurology, 10, 1319. https://doi.org/10.3389/fneur.2019.01319, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6960103/
17. Prossegger, J., Huber, D., Grafetstätter, C., Pichler, C., Braunschmid, H., Weisböck-Erdheim, R., & Hartl, A. (2019). Winter Exercise Reduces Allergic Airway Inflammation: A Randomized Controlled Study. International Journal of Environmental Research and Public Health, 16(11), 2040.
18. Akritas, M.G. and E. Brunner. 1997. A unified approach to rank tests for mixed models. Journal of Statistical Planning and Inference. 61:249–277.
19. Haiko Lüpsen, Anova with binary variables - Alternatives for a dangerous F-test
20. Haiko Lüpsen, Comparison of nonparametric analysis of variance methods a Monte Carlo study - Part A: Between subjects designs - A Vote for van der Waerden
+ my list of various non-parametric and robust alternatives to the classic n-way ANOVA: