THE Secret Weapon for People Analytics: Quasi-Experiments

Preface: This article serves as a primer on quasi-experiments, which might be the most underutilized method in people analytics. It will focus on providing a mental toolkit for quasi-experiments: how to think about them, how to spot opportunities to use them, and how to leverage them successfully.

In people analytics, causality matters

I have a soft spot for “data mining”: using any and all available data to optimize predictive accuracy for some target variable. There’s nothing like plugging a bunch of data into an algorithm, watching your computer go “brrrrr,” and experiencing the modern-day magic of good predictions.

On the other hand, for people analytics projects to succeed, we often need to explain the “why.” Our projects typically seek to influence decisions about how programs, policies, and processes are run, and to make those interventions effective, we need to understand the “true relationships” between variables. This is easier said than done. Even if we find a correlation between X and Y, we still face the following barriers to inferring a causal relationship:

  • Confounding variables: The apparent relationship between X and Y could actually be caused by the presence of some outside factor Z
  • Spurious correlations: There are myriad ways a correlation between X and Y could be an artifact of something else. Simpson’s paradox, where a correlation reverses or disappears after accounting for another factor, is a classic example (see the short demo after this list)
  • Temporal precedence: In order for X to have caused Y, changes in X should be observed before we see changes in Y
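
To make Simpson’s paradox concrete, here is a minimal, self-contained demo with invented numbers (the departments, hours, and scores are all hypothetical): pooled, training hours and performance are negatively correlated, yet within each department the relationship is perfectly positive.

```python
# A toy illustration of Simpson's paradox (all numbers are made up).
import pandas as pd

# Two departments: within each, more training hours go with higher
# performance, but the heavily trained department has lower scores overall.
data = pd.DataFrame({
    "dept":         ["A"] * 4 + ["B"] * 4,
    "training_hrs": [1, 2, 3, 4, 7, 8, 9, 10],
    "performance":  [6, 7, 8, 9, 1, 2, 3, 4],
})

# Pooled, the correlation is strongly negative (about -0.71)...
print(data["training_hrs"].corr(data["performance"]))

# ...but within each department it is perfectly positive (1.0).
for dept, grp in data.groupby("dept"):
    print(dept, grp["training_hrs"].corr(grp["performance"]))
```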

This list is not exhaustive, and I highly recommend checking out this blog post by Eduardo Valencia Tirapu if you’re interested in diving deeper into the challenges of establishing causal relationships.

Academic researchers can use randomized experiments to get around these barriers, but so-called field experiments are not always feasible in real-life organizations. In many cases, treating a randomized group of employees differently may be unethical or even illegal. The effectiveness of field experiments can also suffer from issues like subject turnover, “outside events” like organizational changes, or “leakage” (employees talk to each other, so the control group might find out about the treatment).

So, do we just give up on experiments and run correlational analyses? Absolutely not. Quasi-experiments are our secret weapon.

A quasi-experiment is a study design that shares the same goal as a true experiment: to infer a causal relationship between two variables. The difference is that quasi-experiments are the tool of choice when random assignment of people to experimental groups is not possible. This is particularly useful in people analytics for two reasons. First, it’s often not feasible to assign people to conditions randomly. Second, it’s often not feasible to manipulate the variables we want to study, such as working remotely or receiving training. For example, we might use a quasi-experiment to understand the impact of a training program on job performance, accommodating the business imperative to roll the training out to all relevant workers while still evaluating it rigorously. This ability to balance flexibility with methodological rigor is what makes quasi-experiments such a valuable tool.

Spotting opportunities to use quasi-experiments

Quasi-experiments are an underutilized tool in people analytics, which is a significant missed opportunity. This may be partly due to the limited emphasis they receive in I/O psychology and business analytics degree programs. The other barrier is conceptual: the hard part is not the statistical methods themselves but identifying suitable opportunities to apply them effectively. These methods are as much art as they are science, to reference Adam Grant’s paper on the subject.

Here are some key indicators to help identify situations where a quasi-experiment may be useful:

  1. Have your ears up: When leaders debate the causes and effects of specific decisions or data points, being plugged into those conversations and identifying the right questions is the first step toward spotting opportunities to create insights.
  2. Notice big events or initiatives: Reorganizations, layoffs, or the rollout of new interventions like trainings, programs, or processes can each present opportunities for quasi-experiments. I mentioned events like these as potential confounding factors, but they are also chances to study impact, because they often affect some workers more than others.
  3. Look out for natural experiments: When otherwise similar people are inadvertently exposed to different conditions, this is called a natural experiment. These can be tricky to spot but are very powerful for causal inference. In one famous example, a law was passed in one state but not in another, allowing for comparisons between similar individuals across state borders. This situation is commonplace in organizations, where different departments or divisions may subject similar workers to distinct policies or work environments, creating variation that can be utilized to establish quasi-treatment and control groups.

Go-to methods for quasi-experiments

Once you’ve identified an opportunity to use a quasi-experiment to answer a research question, you’ll need to determine an appropriate quasi-experimental design. There are many such designs, so the choice really depends on the nature of the situation, data availability, and potential confounds.

Here are a few of the most popular quasi-experimental methods and why I’ve found them useful:

  1. Matching - one of the most basic methods. The idea is to identify potential confounding variables (often demographic variables) and match each subject in the “treatment group” to an otherwise similar subject who did not receive the treatment, creating a control group. Matching algorithms are simple to use and work well at making the groups as similar as possible on the characteristics you select; any remaining difference in the outcome variable is then attributed to the treatment effect (see the first sketch after this list).
  2. Regression discontinuity - an elegant method that uses regression with a twist: in situations where some cutoff point determines who receives a treatment or intervention, a regression discontinuity design compares observations that lie closely on either side of the cutoff. In a famous example, researchers looked at the difference in lifetime earnings between people who had received 3.5 years of college education but not graduated and people who completed 4 years and did graduate. Spoiler: the graduates ended up earning a ton more, despite being only slightly more educated. This is called the “diploma effect” (a second sketch follows this list).
  3. Difference in differences (DID) - a simple yet powerful technique that compares changes in an outcome over time between two groups: one that receives the treatment and one that does not. The difference between those two changes becomes the estimated causal effect of the treatment. This is extremely helpful when you know there is some underlying trend already. For example, if you want to study the effect of a new TA program on time-to-fill over time, DID lets you adjust for other events, programs, or outside factors that are already moving the time-to-fill trend (see the sketch after the figure below).
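
Here is a minimal sketch of matching, assuming a hypothetical dataset with a treated flag, an outcome column, and two confounders (tenure_years and job_level); in a real analysis you would also check covariate balance after matching:

```python
# Hedged sketch: match each treated employee to the most similar untreated
# one on observed confounders. The file and column names are hypothetical.
import pandas as pd
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("employees.csv")              # assumed data file
confounders = ["tenure_years", "job_level"]

treated = df[df["treated"] == 1]
control = df[df["treated"] == 0]

# Standardize so each confounder contributes comparably to the distance.
scaler = StandardScaler().fit(df[confounders])
nn = NearestNeighbors(n_neighbors=1).fit(scaler.transform(control[confounders]))
_, idx = nn.kneighbors(scaler.transform(treated[confounders]))
matched_control = control.iloc[idx.ravel()]

# Naive effect estimate: mean outcome gap between the matched groups.
effect = treated["outcome"].mean() - matched_control["outcome"].mean()
print(f"Estimated treatment effect: {effect:.2f}")
```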
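
And a sketch of regression discontinuity, assuming a hypothetical program that employees enter only when an assessment score reaches a cutoff of 50; the coefficient on the above indicator estimates the jump in the outcome at the threshold:

```python
# Hedged sketch of a sharp regression discontinuity. The file, columns,
# cutoff, and bandwidth are all hypothetical choices for illustration.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("program_data.csv")           # assumed data file
cutoff = 50
df["centered"] = df["score"] - cutoff
df["above"] = (df["score"] >= cutoff).astype(int)

# Local linear fit within a simple bandwidth around the cutoff; the
# interaction term lets the slope differ on each side.
local = df[df["centered"].abs() <= 10]
model = smf.ols("outcome ~ above + centered + above:centered", data=local).fit()
print(model.params["above"])                   # estimated jump at the cutoff
```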

Visualization of the difference-in-differences technique. Credit: https://shopify.engineering/using-quasi-experiments-counterfactuals
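
Here is a minimal sketch of the two-by-two computation the figure visualizes, assuming a hypothetical requisitions dataset with treated and post indicators and a time_to_fill column:

```python
# Hedged sketch of difference-in-differences. All names are hypothetical.
import pandas as pd

df = pd.read_csv("time_to_fill.csv")   # assumed: columns treated (0/1),
                                       # post (0/1), time_to_fill (days)

means = df.groupby(["treated", "post"])["time_to_fill"].mean()

# Change over time within each group, then the difference of those changes.
change_treated = means.loc[(1, 1)] - means.loc[(1, 0)]
change_control = means.loc[(0, 1)] - means.loc[(0, 0)]
print(f"DID estimate: {change_treated - change_control:.1f} days")
```

The same estimate falls out of a regression with an interaction term (time_to_fill ~ treated * post in statsmodels’ formula API), which also makes it easy to add controls.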

This is by no means an exhaustive breakdown of quasi-experimental methods, so I highly encourage you to dig deeper. If you’d like to learn how to perform one of these techniques, programming tutorials for all of them are a quick search away.

Limitations

Quasi-experiments do have their limitations. Remember, inferring causality is different from establishing it, full stop. If you’ve run a brilliant quasi-experiment, it’s still possible that you missed a confounding factor that would nullify your findings. The results of the quasi-experiment also may not be generalizable beyond the population that you examined.

Even if executed flawlessly, results can be difficult to explain. When presenting the results of quasi-experiments, it’s best to be clear and concise and to use visuals where possible. Like a good product, good analytics are complicated behind the scenes but simple on the outside. It’s hard to strike this balance, and as mentioned, there is a real art to it, but that’s what makes it fun.

Siddharth Mehta

Workforce Strategy leader at SABIC

1y

Have you considered the work of Judea Pearl on using DAGs to extract causal-mechanisms from observational data? I’ve found that very useful. I’ll be happy to chat with you if this interests you. Loads of good material out there to tap into. Let me know. Keep doing the good work that you’re doing!

Siddharth Mehta

Workforce Strategy leader at SABIC

1y

Thanks for amplifying the need to move away from the ML machinery and to instead focus on causal-modeling to get dividends in the people-analytics space.

Rory O'Gallagher

Sr. Behavioral Scientist @ GE HealthCare | Collaboration, Productivity & Culture

1y

If you have org network data, you can manage the spillover effect by sampling from different communities. Does that potentially introduce some other confound through the sampling technique? Yes, but it can be worth it if you anticipate the spillover effect of employees talking to each other or influencing each other in some way will create too much noise across all conditions. Depending on what you're doing, it can transition nicely into an incubation space for spreading change. Damon Centola's book on change talks about this principle and how spreading interventions is often best adopted organically through seeding interventions towards the periphery in separate communities where people have less countervailing pressures to conform.

Leonidas Guadalupe, Ph.D.

People Analytics Strategist and Scientist

1y

I always say that because you don’t have causality, you have to build a preponderance of evidence to make your decision. Where there’s a lot of smoke there is probably fire.
