Which econometric method should you use for health policy causal inference?

Jason Shafrin

Senior Managing Director, Center for Healthcare Economics & Policy at FTI Consulting; Adjunct Professor, University of Southern California

Published: October 25, 2024

TL;DR

A paper by Ress and Wild (2024) provides the following recommendations in answering this question.

When aiming to control for a large covariate set, consider using the superlearner to estimate nuisance parameters.
When employing the superlearner to estimate nuisance parameters, consider using doubly robust estimation approaches, such as AIPW and TMLE.
When faced with a small covariate set, consider using regression to estimate nuisance parameters.
When employing regression to estimate nuisance parameters, consider using singly robust estimation approaches, such as propensity score matching or IPW.

How did they arrive at these recommendations? To find out, read on.

Description of plasmode simulation on study methodology

To answer the question posed in the title, one has to make a number of research decisions.

First, one must decide whether to simulate the effect of a policy intervention or incorporate real-world data into the simulation. The advantage of the former approach is that we know the truth and can create any data generating scenario we want; because we (the researchers) have constructed the data generating process ourselves, we have a gold standard to compare against and can test out various data generating processes. The problem with this approach is its hypothetical nature. Specifically, Ress and Wild (2024) write:

Many simulation studies…are characterized by relatively simple confounding structures with few variables, leading to varying results depending on the data structure modeled and the methods under consideration...Because the optimal choice for an estimation strategy depends on the research question, data features, population characteristics and method assumptions, simulation results are only applicable to the specific simulation setting.

Instead, the authors opt for a plasmode simulation. What is a plasmode simulation?

In a plasmode simulation, the covariates from a real dataset are used without alteration, while the values for the outcome variables are simulated based on the estimated associations between covariates and outcomes from the original data, ensuring that the true effect size is known. The advantage of this approach is that it preserves the high‐dimensional and complex covariate structure of the source data, providing a simulation environment that closely resembles real‐world conditions.

In short, while the underlying covariates are not changed, researchers can test the robustness of different estimation methods through controlled modifications to the real dataset, such as artificially inserting or removing certain relationships, introducing or removing biases, adding noise, or altering specific variables. This allows for the controlled examination of how statistical methods perform under different known conditions.

A second research decision is to determine which estimation methods should be evaluated. Ress and Wild (2024) consider the following approaches:

Propensity Score Matching: This method involves estimating the probability of treatment assignment based on observed covariates, allowing researchers to match treated and untreated units with similar propensity scores, thereby reducing selection bias in observational studies.
Inverse Probability of Treatment Weighting (IPTW): IPTW assigns weights to individuals based on the inverse of their probability of receiving the treatment they actually received, allowing for the creation of a pseudo-population where treatment assignment is independent of observed covariates, thus facilitating causal inference.
Entropy Balancing: This technique, developed in Hainmueller (2012), reweights the control units so that specified covariate moments (such as means) match those of the treated group, while keeping the weights as close as possible to a set of base weights. The result is a covariate distribution that is similar across groups.
Difference-in-Differences Analysis (DID): DID is a quasi-experimental design that compares the changes in outcomes over time between a treatment group and a control group, helping to estimate causal effects while controlling for unobserved confounding factors that are constant over time.
Augmented Inverse Probability Weighting (AIPW): AIPW combines IPTW with regression adjustment to improve efficiency and reduce bias by incorporating both the propensity score and a model for the outcome, allowing for more robust causal estimates. Specifically, AIPW is a doubly robust estimator because it produces unbiased estimates whenever either the propensity score model or the outcome regression is correctly specified.
Targeted Maximum Likelihood Estimation (TMLE): TMLE is a semi-parametric method that optimally combines machine learning and traditional statistical techniques to estimate causal effects while targeting specific parameters of interest, thus providing robust estimates even in complex settings.
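
To make the weighting estimators above more concrete, here is a minimal sketch of IPTW and AIPW in Python using only the standard library. It is not the authors' implementation: the data are simulated, the single binary confounder, the function names and the effect sizes are all assumptions chosen for illustration, and the nuisance parameters are estimated with simple stratum means rather than regression or the superlearner.

```python
import random

def simulate(n=20000, true_effect=2.0, seed=0):
    """Hypothetical data: a binary confounder x raises both treatment uptake and the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < (0.7 if x else 0.3) else 0
        y = 1.0 + 3.0 * x + true_effect * a + rng.gauss(0, 1)
        rows.append((x, a, y))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def fit_nuisance(rows):
    """Nuisance parameters as stratum means: propensity e[x] and outcome model m[(a, x)]."""
    e = {x: mean(a for xi, a, _ in rows if xi == x) for x in (0, 1)}
    m = {(a, x): mean(y for xi, ai, y in rows if xi == x and ai == a)
         for a in (0, 1) for x in (0, 1)}
    return e, m

def naive(rows):
    # Unadjusted difference in mean outcomes between treated and untreated units.
    return (mean(y for _, a, y in rows if a == 1)
            - mean(y for _, a, y in rows if a == 0))

def iptw(rows, e):
    # Weight each unit by the inverse probability of the treatment it actually received.
    return mean(a * y / e[x] - (1 - a) * y / (1 - e[x]) for x, a, y in rows)

def aipw(rows, e, m):
    # Outcome-model contrast plus an inverse-probability-weighted residual correction.
    def psi(x, a, y):
        mu1, mu0 = m[(1, x)], m[(0, x)]
        return (mu1 - mu0
                + a * (y - mu1) / e[x]
                - (1 - a) * (y - mu0) / (1 - e[x]))
    return mean(psi(x, a, y) for x, a, y in rows)

rows = simulate()
e, m = fit_nuisance(rows)
print(f"naive: {naive(rows):.2f}  IPTW: {iptw(rows, e):.2f}  AIPW: {aipw(rows, e, m):.2f}")

# Deliberately misspecified propensity score to illustrate double robustness.
wrong_e = {0: 0.5, 1: 0.5}
print(f"wrong PS -> IPTW: {iptw(rows, wrong_e):.2f}  AIPW: {aipw(rows, wrong_e, m):.2f}")
```

Under this assumed data generating process the unadjusted difference is biased upward (about 3.2 in expectation against a true effect of 2.0), while IPTW and AIPW with correctly estimated nuisance parameters should land near the truth. With the deliberately wrong propensity score, IPTW inherits the bias but AIPW is rescued by the correct outcome model, which is the doubly robust property described above.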

Third, the authors must consider how to estimate nuisance parameters. The key nuisance parameters are the propensity score and the outcome model. Estimation of the nuisance parameters was performed using the superlearner package.

...we used the superlearner algorithm implemented in the SuperLearner [R] package (Polley et al., 2021), which allowed us to incorporate non‐parametric approaches. We included the following five algorithms as baselearners: generalized linear model with penalized maximum likelihood (glmnet function) (Friedman et al., 2010), random forest (ranger function) (Wright & Ziegler, 2017), gradient boosting (xgboost function) (Chen et al., 2015), support vector machines (svm function) (...Karatzoglou et al., 2006), and multivariate adaptive regression splines (earth function) (Friedman, 1991).

Fourth, one must consider a specific intervention to evaluate and how to simulate the data. The intervention the authors considered was a German initiative aiming to improve health care in a socially deprived urban area. Specifically, the intervention included (i) a cross-sectoral network of health, social and community care providers and (ii) a community health advice and navigation service (for more details, see Ress and Wild 2023). To simulate the plasmode data for this intervention, Ress and Wild (2024) use the following procedure:

1. Estimate the association between treatment, outcome and covariates.
2. Use the estimated coefficients to predict the outcomes but modify the treatment coefficient to the desired effect size.
3. Draw J subsets of size s by resampling‐with‐replacement and perform steps 4 and 5 for each of those subsets.
4. Introduce noise by sampling the outcomes from suitable distributions using the simulated values from step 3 as expected values.
5. Analyze the simulated data.
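
The five steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' code: the "source" data are a simulated stand-in for a real dataset, the outcome model is a plain linear regression fitted with a small hand-written solver, the noise is assumed to be Gaussian, and names such as `TRUE_EFFECT`, `J` and `s` are chosen here for illustration.

```python
import random

def solve(A, b):
    """Solve the linear system A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

rng = random.Random(42)

# Hypothetical stand-in for the real source data: covariate x, treatment a, outcome y.
source = []
for _ in range(500):
    x = rng.gauss(0, 1)
    a = 1 if rng.random() < (0.7 if x > 0 else 0.3) else 0
    source.append((x, a, 1.0 + 0.8 * x + 0.5 * a + rng.gauss(0, 1)))

# Step 1: estimate the association between treatment, outcome and covariates.
design = [[1.0, a, x] for x, a, _ in source]
b0, b_treat, b_x = ols(design, [y for _, _, y in source])
sse = sum((y - (b0 + b_treat * a + b_x * x)) ** 2 for x, a, y in source)
residual_sd = (sse / (len(source) - 3)) ** 0.5

# Step 2: predict outcomes, replacing the treatment coefficient with the desired effect.
TRUE_EFFECT = 2.0
expected = [b0 + TRUE_EFFECT * a + b_x * x for x, a, _ in source]

J, s = 200, 300
estimates = []
for _ in range(J):
    # Step 3: draw a subset of size s by resampling with replacement.
    idx = [rng.randrange(len(source)) for _ in range(s)]
    # Step 4: add noise around the expected values (Gaussian noise is an assumption).
    sim_y = [rng.gauss(expected[i], residual_sd) for i in idx]
    # Step 5: analyze the simulated data, here with the same linear specification.
    X = [[1.0, source[i][1], source[i][0]] for i in idx]
    estimates.append(ols(X, sim_y)[1])

mean_estimate = sum(estimates) / J
print(f"mean estimate over {J} replicates: {mean_estimate:.3f} (truth {TRUE_EFFECT})")
```

The covariates and treatment assignments are never altered; only the outcome is regenerated, so the true effect is known by construction. In a real application, step 5 would be repeated once per estimation method under comparison.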

Fifth, one must determine the set of performance metrics to use to evaluate the study. The performance metrics considered included:

Bias: calculated as the mean difference between the estimated and true treatment effect. Since the true treatment effect is known through the plasmode, bias can be calculated.
Standard error. The empirical standard error (SE) reflects the dispersion of the estimated effects around their mean. In other words, it measures the precision of the estimator.
Confidence interval coverage. This is calculated as the proportion of confidence intervals (CIs) that contain the true effect. Let's say we are using a 95% CI. If only 80% of the CIs contained the true effect, the CIs would be considered too narrow; conversely, if 99% of the CIs contained the true effect, the CIs would be considered too wide.
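
As a rough illustration of how these three metrics could be computed across simulation replicates, consider the sketch below. The function name `performance`, the normal-approximation interval (estimate ± 1.96 × SE) and the simulated replicates are assumptions made for this example, not details taken from the paper.

```python
import random
import statistics

def performance(estimates, std_errors, true_effect, z=1.96):
    """Bias, empirical SE and CI coverage across simulation replicates."""
    bias = statistics.fmean(estimates) - true_effect
    empirical_se = statistics.stdev(estimates)
    covered = [est - z * se <= true_effect <= est + z * se
               for est, se in zip(estimates, std_errors)]
    return bias, empirical_se, sum(covered) / len(covered)

# Hypothetical replicates: an unbiased estimator whose true sampling SD is 0.5.
rng = random.Random(1)
TRUE_EFFECT = 2.0
estimates = [rng.gauss(TRUE_EFFECT, 0.5) for _ in range(5000)]

for label, reported_se in [("correct SE", 0.5), ("SE too small", 0.3)]:
    bias, emp_se, coverage = performance(
        estimates, [reported_se] * len(estimates), TRUE_EFFECT)
    print(f"{label}: bias={bias:.3f}, empirical SE={emp_se:.3f}, coverage={coverage:.3f}")
```

When the reported standard error matches the true sampling variability, coverage should sit close to the nominal 95%. When the reported standard error is too small, the intervals are too narrow and coverage falls well below 95%, even though bias and empirical SE are unchanged.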

Based on this approach, the authors find that there is no clear winner:

We found that TMLE combined with the superlearner performed best in terms of bias and SE, but exhibited shortcomings in terms of CI coverage. When considering all performance measures and outcomes, the combination of matching and subsequent DiD analysis in conjunction with regression for nuisance parameter estimation performed best.

What are the takeaways from this research? The authors nicely lay this out at the end of their article:

When aiming to control for a large covariate set, consider using the superlearner to estimate nuisance parameters.
When employing the superlearner to estimate nuisance parameters, consider using doubly robust estimation approaches, such as AIPW and TMLE.
When faced with a small covariate set, consider using regression to estimate nuisance parameters.
When employing regression to estimate nuisance parameters, consider using singly robust estimation approaches, such as propensity score matching or IPW.

You can read the full article here. What do you think of the use of plasmode simulations?

Originally posted at Healthcare Economist.

The views expressed herein are those of the author and not necessarily the views of FTI Consulting, Inc., its management, its subsidiaries, its affiliates, or its other professionals.

Justin Rao, Ph.D.

CEO @ Gencomm.ai - Disruptive AI & ML-powered pricing product suite

4 months ago

Hey Jason, nice post. A few things to think about: 1) The ensemble "superlearner" should have the property that it will fit at least as well as any base learner it uses, so for low-dimensional data, why not always use the super learner and include base learners that do well on these sorts of data? I don't understand why we wouldn't always use the most powerful approach to estimating nuisance parameters, rather than leave it to a researcher decision to switch between regression and approaches that should be guaranteed to beat it. 2) I see they used the R package for the ensemble learner. I would encourage interested folks to look at other automated ML algorithms that follow this base approach (H2O, AutoGluon, etc.) -- they all handle the encoding of data differently and when you have dates and strings in your data, this will really matter.

Juan Moreira

Administrative employee at Honorable Cámara de Diputados de la Nación

4 months ago

Hello. Where do you see the next pandemic going to start? What are the traits? What do you have in mind to solve it? How much do vaccines cost financially? Thanks for sharing. Hug from Argentina.
