The Difference Between Interaction and Association

It’s easy to mix up the concepts of association (as measured by correlation) and interaction, or to assume that if two variables interact, they must be associated. But that isn’t actually true.

In statistics, they have different implications for the relationships among your variables. This is especially true when the variables you’re talking about are predictors in a regression or ANOVA model.

Association

Association between two variables means the values of one variable relate in some way to the values of the other. It is usually measured by correlation for two continuous variables and by cross tabulation and a chi-square test for two categorical variables.

Unfortunately, there is no nice, descriptive measure for association between one categorical and one continuous variable. Point-biserial correlation works only if the categorical variable is binary. But either one-way analysis of variance or logistic regression can test an association (depending upon whether you think of the categorical variable as the independent or the dependent variable).

Essentially, association means the values of one variable generally co-occur with certain values of the other.
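If you want to compute these in practice, here’s a minimal sketch in Python using pandas and SciPy. The data and column names (height, weight, group, smoker) are made up purely to show the calls.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Made-up data just to illustrate the association measures described above
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "height": rng.normal(170, 10, n),    # continuous
    "group": rng.integers(0, 2, n),      # binary categorical
    "smoker": rng.integers(0, 2, n),     # another binary categorical
})
df["weight"] = 0.5 * df["height"] + rng.normal(0, 5, n)  # continuous, related to height

# Two continuous variables: Pearson correlation
r, p = stats.pearsonr(df["height"], df["weight"])

# Two categorical variables: cross tabulation plus a chi-square test
table = pd.crosstab(df["group"], df["smoker"])
chi2, p_chi, dof, expected = stats.chi2_contingency(table)

# One binary categorical and one continuous variable: point-biserial correlation
r_pb, p_pb = stats.pointbiserialr(df["group"], df["height"])

print(r, chi2, r_pb)
```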

Interaction

Interaction is different. Whether two variables are associated says nothing about whether they interact in their effect on a third variable. Likewise, if two variables interact, they may or may not be associated.

An interaction between two variables means the effect of one of those variables on a third variable is not constant—the effect differs at different values of the other.
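In a regression model, that kind of non-constant effect is captured with a product (interaction) term. Here’s a minimal sketch with statsmodels’ formula interface on simulated data; the coefficient values are arbitrary and only there to show the syntax.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate data where the effect of X1 on Y depends on X2 (arbitrary values)
rng = np.random.default_rng(1)
n = 200
X2 = rng.integers(0, 2, n)
X1 = rng.normal(0, 1, n)
Y = 1 + 0.5 * X1 + 1.0 * X2 + 1.5 * X1 * X2 + rng.normal(0, 1, n)
df = pd.DataFrame({"Y": Y, "X1": X1, "X2": X2})

# X1 * C(X2) expands to X1 + C(X2) + X1:C(X2); the X1:C(X2) term is the
# interaction: it lets the slope of X1 differ across the levels of X2.
model = smf.ols("Y ~ X1 * C(X2)", data=df).fit()
print(model.params)
```

If the X1:C(X2) coefficient is meaningfully different from zero, the effect of X1 on Y is not constant across values of X2.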

What Association and Interaction Describe in a Model

The following examples show three situations for three variables: X1, X2, and Y. X1 is a continuous independent variable, X2 is a categorical independent variable, and Y is the continuous dependent variable. I chose these types of variables to make the plots easy to read, but any of these variables could be either categorical or continuous.

Association without Interaction

In scenario 1, X1 and X2 are associated. If you ignore Y, you can see that the mean of X1 is lower when X2=0 than when X2=1. But they do not interact in how they affect Y: the regression lines are parallel. X1 has the same effect on Y (the slope) for both X2=1 and X2=0.

A simple example is the relationship between height (X1) and weight (Y) in male (X2=1) and female (X2=0) teenagers. There is a relationship between height (X1) and gender (X2): male teenagers are taller, on average. But for both genders, the relationship between height and weight is the same.

This is the situation you’re trying to take care of by including control variables. If you didn’t include gender as a control, a regression would fit a single line to all these points. It would attribute all variation in weights to differences in heights.

This single line would also be steeper, because it has to pass through both the male and female clusters of points. As a result, it would overestimate the size of the unique effect of height on weight.
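A quick simulation (with invented parameter values) shows the same point: fit the model with and without the X2 control, and the slope for X1 gets steeper when the control is dropped.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Scenario 1: X1 and X2 are associated, but the X1 slope is the same in both groups
rng = np.random.default_rng(2)
n = 400
X2 = rng.integers(0, 2, n)                           # 0 = female, 1 = male
X1 = rng.normal(160 + 15 * X2, 6, n)                 # height depends on gender
Y = -60 + 0.7 * X1 + 10 * X2 + rng.normal(0, 5, n)   # same slope (0.7) in both groups
df = pd.DataFrame({"Y": Y, "X1": X1, "X2": X2})

with_control = smf.ols("Y ~ X1 + C(X2)", data=df).fit()
without_control = smf.ols("Y ~ X1", data=df).fit()

print(with_control.params["X1"])     # close to the true within-group slope, 0.7
print(without_control.params["X1"])  # steeper, because it absorbs the group difference
```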

Interaction without Association

In the second scenario, X1 and X2 are not associated. The mean of X1 is the same for both categories of X2. But how X1 affects Y differs for the two values of X2, which is the exact definition of an interaction. The slope of X1 on Y is greater for X2=1 than for X2=0, where it is nearly flat.

An example of this would be an experiment in which X1 was a pretest score and Y a posttest score. Imagine you randomly assigned participants to a control (X2=1) or a training (X2=0) condition.

If randomization is done well, the assigned condition (X2) should be unrelated to the pretest score (X1). But they do interact: the relationship between pretest and posttest differs in the two conditions.

In the control condition, without training, the pretest and posttest scores would be highly correlated. But in the training condition, if the training worked well, pretest scores would have less effect on posttest scores.
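Here’s a simulation sketch of this scenario, again with made-up numbers: X1 is generated independently of X2, but the pretest-to-posttest slope differs sharply between the two conditions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Scenario 2: X1 and X2 are unrelated, but their effects on Y interact
rng = np.random.default_rng(3)
n = 400
X2 = rng.integers(0, 2, n)             # 1 = control, 0 = training (as in the text)
X1 = rng.normal(50, 10, n)             # pretest score, independent of condition

# Control: posttest tracks pretest closely. Training: pretest matters much less.
Y = np.where(X2 == 1,
             5 + 0.9 * X1,
             60 + 0.1 * X1) + rng.normal(0, 5, n)
df = pd.DataFrame({"Y": Y, "X1": X1, "X2": X2})

# No association between X1 and X2: group means of the pretest are about equal
print(stats.ttest_ind(df.loc[df.X2 == 0, "X1"], df.loc[df.X2 == 1, "X1"]))

# But a clear interaction: the X1:C(X2) coefficient is far from zero
print(smf.ols("Y ~ X1 * C(X2)", data=df).fit().params)
```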


Both Association and Interaction

In the third scenario, we’ve got both an association and an interaction. X1 and X2 are associated: once again the mean of X1 is lower when X2=0 than when X2=1. They also interact in their effect on Y. The slopes of the relationship between X1 and Y are different when X2=0 and X2=1, so X2 affects the relationship between X1 and Y.

A good example here would be if Y is the number of jobs in a county, X1 is the percentage of the workforce that holds a college degree, and X2 is whether the county is rural (X2=0) or metropolitan (X2=1).

It’s clear rural counties have, on average, lower percentages of college-educated citizens than metropolitan counties. They also have fewer jobs.

It’s also clear that the workforce’s education level in metropolitan counties is related to how many jobs there are. But in rural counties, it doesn’t matter at all.

This situation is also what you would see if the randomization in the last example did not go well or if randomization was not possible.
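And one more sketch for the third scenario, with invented numbers for the county example: the mean of X1 shifts with X2, and so does the slope of X1 on Y.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Scenario 3: X1 and X2 are associated AND they interact in their effect on Y
rng = np.random.default_rng(4)
n = 400
X2 = rng.integers(0, 2, n)                    # 0 = rural, 1 = metropolitan
X1 = rng.normal(20 + 15 * X2, 5, n)           # metro counties have more degrees

# Rural: education level barely relates to jobs. Metro: strong positive slope.
Y = np.where(X2 == 0,
             5_000 + 10 * X1,
             20_000 + 900 * X1) + rng.normal(0, 1_000, n)
df = pd.DataFrame({"Y": Y, "X1": X1, "X2": X2})

fit = smf.ols("Y ~ X1 * C(X2)", data=df).fit()
print(fit.params)   # both the C(X2)[T.1] and X1:C(X2)[T.1] terms are large
```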

The differences between interaction and association will become clearer as you analyze more data. It's always a good idea to stop and explore your data. Use graphs or try different terms in your model to figure out exactly what's going on with your variables.


Originally published at https://www.theanalysisfactor.com/interaction-association/. Updated April 17, 2024.





