Meta-analysis part 5: Meta-regression and Moderators in R

Darko Medin

Data Scientist and a Biostatistician. Developer of ML/AI models. Researcher in the fields of Biology and Clinical Research. Helping companies with Digital products, Artificial intelligence, Machine Learning.

Published: December 10, 2022

The long-awaited part 5 of Meta-analysis in R. This part covers the practical use of a method called Meta-regression. Meta-regression is important because we can use it to evaluate confounders and moderators of the main effects in our Meta-analysis (e.g. therapy effects in an Oncology Meta-analysis). These moderators can often significantly influence the main effects, so it is essential to identify them (e.g. some subjects are more or less likely to show a therapy effect depending on the moderator value).

Before we start coding, let's explore the dataset that will be used in this tutorial. The dataset is simulated Meta-analysis data containing RRs (risk ratios) and their corresponding 95% confidence intervals. Since these risk ratios are simulated, we can treat them as hypothetical therapy-effect risk ratios from potential studies. R [1], RStudio [2] and the 'metafor' package [3] will be used throughout the tutorial.

Here is the dataset opened in Microsoft Excel:

Just to clarify the column names:

rr - risk ratio (keep in mind that in academia we use RR, but rr is just fine for coding)

lci - lower bound of the 95% confidence interval

uci - upper bound of the 95% confidence interval

Now let's get to the coding part:

Open R and load the libraries:

As you can see, we are using the 'metafor' package, which in my opinion is one of the best for Meta-regression in R. To load the dataset, just specify the 'path' of the dataset on your computer and call read_excel() as I did. You can view the loaded dataset to make sure everything loaded correctly.
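The loading step described above might look like the following minimal sketch. The file path is a placeholder, and the small fallback data frame is made up purely so that the snippet runs without the file; it is not the tutorial's dataset. read_excel() comes from the 'readxl' package.

```r
library(metafor)  # rma() and regplot() for meta-regression
library(readxl)   # read_excel() for .xlsx files

# Placeholder path (assumption): point this at your own copy of the dataset
path <- "path/to/your/dataset.xlsx"

if (file.exists(path)) {
  data <- as.data.frame(read_excel(path))
} else {
  # Made-up stand-in with the same three columns, used only if the file is missing
  data <- data.frame(
    rr  = c(0.85, 1.20, 1.90),
    lci = c(0.60, 0.95, 1.30),
    uci = c(1.20, 1.52, 2.78)
  )
}

head(data)  # or View(data) in RStudio
```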

Before I proceed with the actual meta-regression functions, I will show how to calculate log(Risk Ratios) from Risk Ratios and log-scale 95% confidence intervals from confidence intervals. This is very important, as much of Meta-regression involves converting between RRs and logRRs, and of course their corresponding confidence intervals, in different situations. Working with the standard errors of RRs and of logRRs is also important.

Now let's evaluate the practical part...

A simple log() function is used to convert the RRs and confidence intervals to the log scale, creating new variables: logRR (log Risk Ratio), loglci (log lower 95% CI bound) and loguci (log upper 95% CI bound). To calculate the se of logRR, keep in mind that the full width of a 95% CI on the log scale is 2 x 1.96 = 3.92 times the se. Reversing that calculation gives logRRse = CI width/3.92, or (loguci - loglci)/3.92. Always use the brackets as I did to avoid mistakes with the order of operations: SE(logRR) = (upperci(logRR) - lowerci(logRR))/3.92. Finally, I will create an artificial moderator variable 'mod' with a slope of 1.75, adding 6 plus some random noise.
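As a rough sketch of the conversion just described, using base R only: the three rows of data are invented for illustration, and the noise level and the reading of "slope of 1.75" as 1.75 * logRR + 6 are assumptions, since the text does not spell them out.

```r
# Illustrative values only (assumption), with the tutorial's column names
data <- data.frame(
  rr  = c(0.80, 1.25, 2.10),
  lci = c(0.60, 0.90, 1.40),
  uci = c(1.07, 1.74, 3.15)
)

# Move the risk ratios and their CI bounds to the log scale
data$logRR  <- log(data$rr)
data$loglci <- log(data$lci)
data$loguci <- log(data$uci)

# SE from the CI width: a 95% CI spans 2 * 1.96 = 3.92 standard errors.
# The brackets matter: without them only loglci would be divided by 3.92.
data$logRRse <- (data$loguci - data$loglci) / 3.92

# Artificial moderator: slope 1.75, offset 6, plus random noise
# (the noise sd of 0.25 is an assumed value)
set.seed(42)
data$mod <- 1.75 * data$logRR + 6 + rnorm(nrow(data), sd = 0.25)

data
```

The constant 3.92 is the rounded value of 2 * qnorm(0.975), so the same formula adapts to other confidence levels by changing the quantile.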

You can use View(data) again to view the newly added variables...

Now we have logRR and logRRse, the two 'ingredients' the 'metafor' functions need to produce the meta-regression outputs that follow.

Let's proceed to the next block of code, which implements meta-regression with the moderator variable I defined and the DerSimonian & Laird estimator [4].

Using the rma() function with the arguments yi=logRR, sei=logRRse, the moderator 'mod' and method='DL', the meta-regression framework is set up and stored in the object meta_dl. OK, let's evaluate the results summary in the console now... We can see output parameters similar to those in the previous Meta-analysis tutorials; however, this time there is another important piece of output: the Test of Moderators. Notice that the p-value is <0.0001, which is interpreted as the presence of a statistically significant moderator.
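A minimal sketch of such a call follows. The data are simulated stand-ins (an assumption, not the tutorial's dataset), so the numbers printed will differ from those quoted in the text. In 'metafor', moderators are passed to rma() through the mods argument, here as a one-sided formula.

```r
library(metafor)

# Simulated stand-in data (assumption): 12 studies with a moderator
# built as 1.75 * logRR + 6 + noise, mirroring the description above
set.seed(123)
k       <- 12
logRR   <- rnorm(k, mean = 0.3, sd = 0.5)
logRRse <- runif(k, min = 0.10, max = 0.30)
mod     <- 1.75 * logRR + 6 + rnorm(k, sd = 0.3)
data    <- data.frame(logRR, logRRse, mod)

# Mixed-effects meta-regression with the DerSimonian-Laird estimator
meta_dl <- rma(yi = logRR, sei = logRRse, mods = ~ mod,
               method = "DL", data = data)

# The printed summary includes the "Test of Moderators" and the
# coefficient table with estimate, ci.lb and ci.ub
summary(meta_dl)
```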

To further evaluate this hypothetical moderator, we should first look at its beta coefficient, which is presented in these results as estimate=0.604 with its 95% confidence interval ci.lb=0.544 and ci.ub=0.664. The whole 95% confidence interval is above 0 and not even close to 0, which means the moderator effect is very unlikely to be due to chance. It is also positive, meaning that an increase in the moderator variable is associated with an increase in logRR for the hypothetical therapy effect.
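To make these numbers more tangible, here is a small base R sketch that starts from the reported estimate and interval. It assumes a normal-approximation (z) test and recovers the standard error from the CI width, so the resulting p-value is an approximation rather than the exact console output.

```r
# Values reported in the results summary above
beta  <- 0.604
ci_lb <- 0.544
ci_ub <- 0.664

# Approximate SE recovered from the width of the 95% CI
se <- (ci_ub - ci_lb) / 3.92

# z statistic and two-sided p-value under a normal approximation
z <- beta / se
p <- 2 * pnorm(-abs(z))

# Back-transform: multiplicative change in RR per one-unit
# increase of the moderator, with its 95% CI
rr_ratio <- exp(c(estimate = beta, ci.lb = ci_lb, ci.ub = ci_ub))

round(c(se = se, z = z), 3)
p < 0.0001
round(rr_ratio, 2)
```

Read this way, a slope of 0.604 on the log scale means that each one-unit increase in the moderator multiplies the RR by roughly exp(0.604).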

But to investigate this moderator-logRR association further, let's plot the Meta-regression and analyze it visually.

This block of code will be used to create the Meta-regression plot. Notice that I used the regplot() function. This function is quite novel and is one of the best functions for plotting Meta-regressions in the 'metafor' package and in R overall. I set the confidence level to 0.95, which means a 95% confidence interval is drawn around the regression line.
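A sketch of a regplot() call along these lines is shown below. The model is refitted on simulated stand-in data (an assumption), so the picture will not match the tutorial's figure, and the axis labels are illustrative choices. The 'metafor' documentation describes level as a value between 0 and 100, so 95 is used here.

```r
library(metafor)

# Simulated stand-in data and model (assumption, not the tutorial's dataset)
set.seed(123)
k       <- 12
logRR   <- rnorm(k, mean = 0.3, sd = 0.5)
logRRse <- runif(k, min = 0.10, max = 0.30)
mod     <- 1.75 * logRR + 6 + rnorm(k, sd = 0.3)
data    <- data.frame(logRR, logRRse, mod)

meta_dl <- rma(yi = logRR, sei = logRRse, mods = ~ mod,
               method = "DL", data = data)

# Bubble plot on the log scale: one bubble per study, the fitted
# regression line and its 95% confidence band
regplot(meta_dl, level = 95,
        xlab = "Moderator (mod)", ylab = "log Risk Ratio")
```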

Let's view the plot...

The plot looks just fantastic (one of the reasons I said this is one of the best options for plotting meta-regressions). We can see the positive association on the plot: increasing values of the moderator (x-axis) are associated with increasing logRR (y-axis). The solid red line is the regression line, i.e. the best fit between the effect logRR and the moderator variable. The dashed lines are the boundaries of the 95% confidence interval, which is shaded in gray. Typically the confidence interval widens towards the extremes and is narrowest in the center.

When evaluating the Meta-regression plot always observe 3 aspects:

Is the trend quite clear? In this case it is.
Are the bubbles close to the regression line?
How many bubbles are outside the confidence interval?

Well, all 3 aspects look optimal on the regression plot, with a clear trend, bubbles close to the trend line and only 1 bubble slightly outside the confidence interval. In a real-world setting, this situation would clearly indicate a statistically significant moderator.

But interpreting logRR can be made even easier by converting it to RR. Let's do this by adding transf=exp to the regplot() function.

As you can see, I also added label=TRUE. These two modifications convert the logRR to the RR scale and add labels for the individual studies for even better interpretation.
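The two modifications could be sketched as follows, again on simulated stand-in data (an assumption), so study labels and values will differ from the tutorial's figure. The axis labels are illustrative.

```r
library(metafor)

# Simulated stand-in data and model (assumption, not the tutorial's dataset)
set.seed(123)
k       <- 12
logRR   <- rnorm(k, mean = 0.3, sd = 0.5)
logRRse <- runif(k, min = 0.10, max = 0.30)
mod     <- 1.75 * logRR + 6 + rnorm(k, sd = 0.3)
data    <- data.frame(logRR, logRRse, mod)

meta_dl <- rma(yi = logRR, sei = logRRse, mods = ~ mod,
               method = "DL", data = data)

# transf = exp back-transforms the y-axis from logRR to RR;
# label = TRUE labels every study's bubble
regplot(meta_dl, level = 95, transf = exp, label = TRUE,
        xlab = "Moderator (mod)", ylab = "Risk Ratio")
```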

Let's see the output.

Now we have the RR scale, meaning we can visually interpret how the Risk Ratio changes as the moderator variable increases or decreases. One hypothetical example would be how a certain therapy effect changes with age as a moderator.

Notice how the line is not straight anymore. Now it's exponential. This is because the logRRs are converted to RRs by exponentiating them, so the association between the RRs and the moderator becomes exponential too...
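The reason for the curvature can be checked with a few lines of base R. The slope is the estimate quoted earlier; the intercept and the moderator grid are hypothetical values chosen only for illustration.

```r
b1 <- 0.604   # slope reported in the results summary
b0 <- -3.5    # hypothetical intercept (assumption)

mod <- 4:8    # hypothetical moderator values (assumption)

# Linear on the log scale...
pred_logRR <- b0 + b1 * mod

# ...exponential after back-transformation
pred_RR <- exp(pred_logRR)

# Equal steps in the moderator give equal differences in logRR
step_diff <- diff(pred_logRR)

# but equal ratios in RR, each one equal to exp(b1)
step_ratio <- pred_RR[-1] / pred_RR[-length(pred_RR)]

data.frame(mod, pred_logRR, pred_RR)
```

Because every step multiplies the RR by the same factor, the absolute increase grows with the moderator, which is what produces the sharp rise at the upper end of the plot.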

Individual study labels are added, so individual studies can now be compared to the overall model. Studies 10, 1 and 12 carry the highest weight (larger bubbles mean a smaller confidence interval, which means higher certainty about the effect). They are virtually on the regression line, while the majority of the others are very close to it. We can also see a sharp increase in RR from a moderator value of about 5-6 onwards (typical for an exponential relationship). Most importantly, each individual value of the moderator can now be associated with its corresponding value of RR, or the effect (the hypothetical therapy effect in this tutorial).

Meta-regression plots are essential for Clinical Research that tests different therapies and develops strategies for new ones, because they allow a very detailed evaluation of moderators in a meta-analysis.

Thanks for reading and learning!

by,

Darko Medin

References:

1. https://www.r-project.org/

2. https://posit.co/

3. https://www.metafor-project.org/doku.php/metafor

4. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986.


Here is the exemplar tutorial repository with the full R code: https://github.com/DarkoMedin/Metaregression_R
