Meta-analysis part 5: Meta-regression and Moderators in R
Darko Medin
Data Scientist and a Biostatistician. Developer of ML/AI models. Researcher in the fields of Biology and Clinical Research. Helping companies with Digital products, Artificial intelligence, Machine Learning.
The long-awaited Meta-analysis in R part 5. This part covers the practical application of a method called Meta-regression. Meta-regression is important because we can use it to evaluate confounders and moderators of the main Meta-analysis effects (e.g. therapy effects in an Oncology Meta-analysis). These moderators can often significantly influence the main effects, so identifying them is essential (e.g. some subjects are more or less likely to show a therapy effect depending on the moderator value).
Before we start coding, let's explore the dataset that will be used in this tutorial. The dataset is simulated Meta-analysis data which contains RRs (risk ratios) and their corresponding 95% confidence intervals. Since these risk ratios are simulated, we can treat them as hypothetical therapy-effect risk ratios from a set of studies. R [1], RStudio [2] and the 'metafor' package [3] will be used throughout the tutorial.
Here is the dataset opened in Microsoft Excel:
Just to clarify the column names:
rr - risk ratio (keep in mind that in academia we use RR, but rr is just fine for coding)
lci - lower bound of the 95% confidence interval
uci - upper bound of the 95% confidence interval
Now let's get to the coding part:
Open R and load the libraries:
As can be seen, we are using the 'metafor' package, which in my opinion is one of the best for Meta-regression in R. To load the dataset, just specify the 'path' of the dataset on your computer and call read_excel() (from the 'readxl' package) as I specified it. You can view the loaded dataset to make sure everything loaded correctly.
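A minimal sketch of the loading step could look like the code below. The file path is a hypothetical placeholder (the real location depends on your computer), and the file-existence check is my addition so the snippet does not stop with an error when the path has not been edited yet.

```r
# Packages used in this tutorial (install once with install.packages()).
library(metafor)  # meta-analysis and meta-regression functions
library(readxl)   # read_excel() for Excel files

# Hypothetical placeholder: replace with the location of the dataset on your computer.
path <- "path/to/meta_data.xlsx"

if (file.exists(path)) {
  data <- read_excel(path)  # expected columns: rr, lci, uci
  print(head(data))         # quick console check; View(data) opens the viewer in RStudio
} else {
  message("File not found, check the path: ", path)
}
```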
Before I proceed with the actual meta-regression functions, I will show how to calculate log(Risk Ratios) from Risk Ratios and log confidence intervals (95%) from confidence intervals. This is very important, as much of Meta-regression involves converting between RRs and logRRs (and their corresponding confidence intervals) in different situations. Working with the standard errors of RRs and of logRRs is also important.
Now let's move on to the practical part...
A simple log() function is used to convert the RRs and confidence intervals to the log scale, creating new variables: logRR (log Risk Ratio), loglci (log lower 95% CI bound) and loguci (log upper 95% CI bound). To calculate the SE of logRR, we must keep in mind that the width of a 95% CI on the log scale is 2 x 1.96 = 3.92 standard errors. Reversing the calculation gives logRRse = (loguci - loglci)/3.92. Make sure to always use the brackets as I did to avoid mistakes with the order of calculation: SE(logRR) = (upperci(logRR) - lowerci(logRR))/3.92. Finally, I will create an artificial moderator variable 'mod' with a slope of 1.75, adding 6 plus some random noise.
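The conversion steps can be sketched as follows. The three rows of data are made-up stand-in values, not the tutorial dataset, and the seed and the noise standard deviation of the moderator are assumptions; I also assume the slope of 1.75 is applied to logRR.

```r
# Stand-in data frame with made-up values (NOT the tutorial dataset).
data <- data.frame(
  rr  = c(0.80, 1.10, 1.45),
  lci = c(0.60, 0.85, 1.05),
  uci = c(1.07, 1.42, 2.00)
)

# Move the risk ratios and their CI bounds to the log scale.
data$logRR  <- log(data$rr)
data$loglci <- log(data$lci)
data$loguci <- log(data$uci)

# SE of logRR: width of the 95% CI on the log scale divided by 2 * 1.96 = 3.92.
# The brackets matter: without them only loglci would be divided by 3.92.
data$logRRse <- (data$loguci - data$loglci) / 3.92

# Artificial moderator: slope 1.75 (assumed to apply to logRR), plus 6, plus noise.
set.seed(123)  # assumption: only makes the random noise reproducible
data$mod <- 1.75 * data$logRR + 6 + rnorm(nrow(data), mean = 0, sd = 0.3)  # sd is assumed

print(data)
```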
You can use View(data) again to view the newly added variables...
Now we have logRR and logRRse, the two 'ingredients' the 'metafor' functions need to produce the meta-regression outputs.
Let's proceed to the next block of code, which implements meta-regression with the moderator variable I defined and the DerSimonian and Laird estimator [4].
Using the rma() function with the arguments yi=logRR, sei=logRRse, the moderator 'mod' passed through the mods argument, and method='DL', the meta-regression framework is set and the object meta_dl is created. OK, let's evaluate the results summary at the console now... We can see output parameters similar to those in the previous Meta-analysis tutorials; however, this time we have another important test in the output, the Test of Moderators. Notice the p value is <0.0001, which is interpreted as the presence of a statistically significant moderator.
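A hedged sketch of this model call is shown below. Because the tutorial dataset is not reproduced here, the snippet simulates stand-in data (12 studies, seed and distributions are my assumptions), so its numbers will not match the results discussed in the text.

```r
library(metafor)

# Simulated stand-in data (assumption: NOT the tutorial dataset).
set.seed(42)
k <- 12
data <- data.frame(logRR   = rnorm(k, mean = 0.2, sd = 0.5),
                   logRRse = runif(k, min = 0.10, max = 0.30))
data$mod <- 1.75 * data$logRR + 6 + rnorm(k, sd = 0.3)

# Random-effects meta-regression with the DerSimonian-Laird estimator.
meta_dl <- rma(yi = logRR, sei = logRRse, mods = ~ mod, method = "DL", data = data)

# Prints the heterogeneity statistics, the Test of Moderators and the coefficient table.
summary(meta_dl)
```

The moderator is given as a one-sided formula (mods = ~ mod), which is one of the forms the mods argument accepts.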
To further evaluate this hypothetical moderator, we should first look at its beta coefficient, which is presented in these results as estimate=0.604 with its 95% confidence interval ci.lb=0.544 and ci.ub=0.664. The whole 95% confidence interval is above 0 and not even close to 0, which means that the moderator effect is highly probable. It is also positive, meaning that an increase in the moderator variable is associated with an increase in logRR for the hypothetical therapy effect.
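One way to read this coefficient on the RR scale is to exponentiate it: exp(beta) is the factor by which the RR is multiplied for each one-unit increase in the moderator. The small sketch below just applies exp() to the three values quoted above; the variable names are my own.

```r
# Values quoted in the text above (log scale).
beta  <- 0.604
ci_lb <- 0.544
ci_ub <- 0.664

# exp(beta): multiplicative change in RR per one-unit increase in the moderator.
rr_ratio <- exp(c(estimate = beta, ci.lb = ci_lb, ci.ub = ci_ub))
print(round(rr_ratio, 2))
```

With these values the factor comes out at roughly 1.83, so each additional unit of the moderator is associated with an RR about 1.8 times higher.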
But to investigate this moderator-logRR association further, let's plot the Meta-regression and analyze it visually.
This block of code will be used to create the Meta-regression plot. Notice I used the regplot() function. This function is a fairly recent addition and is one of the best for Meta-regression plotting in the 'metafor' package and in R overall. I set the confidence level to 0.95, which means the band around the regression line will be plotted as a 95% confidence interval.
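A plotting call along these lines might look like the sketch below. It again fits the model on simulated stand-in data (my assumption, not the tutorial dataset), the axis labels are my own choice, and the level is written as 95 (percent), which is the documented form of this argument.

```r
library(metafor)

# Simulated stand-in data (assumption: NOT the tutorial dataset).
set.seed(42)
k <- 12
data <- data.frame(logRR = rnorm(k, 0.2, 0.5), logRRse = runif(k, 0.10, 0.30))
data$mod <- 1.75 * data$logRR + 6 + rnorm(k, sd = 0.3)
meta_dl <- rma(yi = logRR, sei = logRRse, mods = ~ mod, method = "DL", data = data)

# Bubble plot on the log scale with a 95% confidence band around the fitted line.
regplot(meta_dl, mod = "mod", level = 95,
        xlab = "Moderator (mod)", ylab = "log(Risk Ratio)")
```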
Let's view the plot...
The plot looks just fantastic (one of the reasons I said this is one of the best options for plotting meta-regressions). We can see the positive association on the plot: increasing values of the moderator (x-axis) are associated with increasing logRR (y-axis). The solid red line is the regression line, the best fit between the effect (logRR) and the moderator variable. The dashed lines are the boundaries of the 95% confidence interval, which is shaded in gray. Typically the confidence interval widens at the extremes and is narrow in the center.
When evaluating the Meta-regression plot always observe 3 aspects: whether there is a clear trend, how close the bubbles are to the trend line, and how many bubbles fall outside the confidence interval.
Well, all 3 aspects seem optimal on the regression plot, with a clear trend, bubbles close to the trend line and only 1 bubble slightly outside the confidence interval. Clearly, in a real-world setting this situation would be an indicator of a statistically significant moderator.
But interpreting logRR can be made even easier by converting it to RR. Let's do this by adding transf=exp to the regplot() function.
As you can see, I also added label=TRUE. These two modifications convert logRR to the RR scale and add labels for the individual studies for even better interpretation.
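The two modifications can be sketched as below, once more on simulated stand-in data (an assumption, so the picture will differ from the tutorial plot). Only transf = exp and label = TRUE come from the text; the axis labels are mine.

```r
library(metafor)

# Simulated stand-in data (assumption: NOT the tutorial dataset).
set.seed(42)
k <- 12
data <- data.frame(logRR = rnorm(k, 0.2, 0.5), logRRse = runif(k, 0.10, 0.30))
data$mod <- 1.75 * data$logRR + 6 + rnorm(k, sd = 0.3)
meta_dl <- rma(yi = logRR, sei = logRRse, mods = ~ mod, method = "DL", data = data)

# transf = exp back-transforms the y-axis from logRR to RR;
# label = TRUE adds the study labels to the bubbles.
regplot(meta_dl, mod = "mod", level = 95, transf = exp, label = TRUE,
        xlab = "Moderator (mod)", ylab = "Risk Ratio")
```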
Let's see the output.
Now we have the RR scale, meaning we can interpret visually how the risk ratio changes as the moderator variable increases or decreases. One hypothetical example would be how a certain therapy effect changes with age as a moderator.
Notice how the line is not straight anymore. Now it's exponential. This is because the conversion of logRRs to RRs is done by exponentiating them, so the association between RR and the moderator becomes exponential too...
Individual labels of the studies are added, so individual studies can now be compared to the overall model. Studies 10, 1 and 12 carry the highest degree of confidence (larger bubbles mean narrower confidence intervals, which means higher certainty of the effect). They are virtually on the regression line, while the majority of the others are very close to it. We can also see a sharp increase in RR from a moderator value of about 5-6 onwards (typical for an exponential relation). Most importantly, each individual value of the moderator can now be associated with its corresponding value of RR, or the effect (the hypothetical therapy effect in this tutorial).
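To read RR values for specific moderator values as numbers rather than off the plot, one option is predict() with newmods and transf = exp. This is my own illustrative addition, not a step from the tutorial code; the data are simulated stand-ins and the moderator values 5, 6 and 7 are arbitrary.

```r
library(metafor)

# Simulated stand-in data (assumption: NOT the tutorial dataset).
set.seed(42)
k <- 12
data <- data.frame(logRR = rnorm(k, 0.2, 0.5), logRRse = runif(k, 0.10, 0.30))
data$mod <- 1.75 * data$logRR + 6 + rnorm(k, sd = 0.3)
meta_dl <- rma(yi = logRR, sei = logRRse, mods = ~ mod, method = "DL", data = data)

# Predicted RR (with confidence and prediction intervals) at chosen moderator values.
pred_rr <- predict(meta_dl, newmods = c(5, 6, 7), transf = exp)
print(pred_rr)
```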
Meta-regression plots are essential for Clinical Research testing out different therapies and making strategies for new ones, because they allow for a very detailed evaluation of moderators in a meta-analysis.
Thanks for reading and learning!
by,
Darko Medin
References:
1. https://www.r-project.org/
2. https://posit.co/
3. https://www.metafor-project.org/doku.php/metafor
4. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986;7(3):177-188.
Here is the exemplar tutorial repository with the full R code: https://github.com/DarkoMedin/Metaregression_R