Meta-analysis: Iteration using Purrr


Author: Valirie Ndip Agbor


Aim: To demonstrate how to use the map function in the purrr package in R to repeat a meta-analysis of proportions across several datasets.

Pre-requisites:

  • Know how to fit a correct meta-analytic model
  • Understand base R and working with lists
  • Basic knowledge of the tidyverse
  • Basic understanding of for loops and the apply family of functions in R



The map function: brief overview

The map function

  1. takes a list as input,
  2. applies a function to each item in that list, and
  3. returns an output with the same length as the input list.

There are different variants of the map function depending on, for example,

  • The number of input lists: map( ) for a single list, map2( ) for two lists, and pmap( ) for more than two lists
  • The desired output: map( ) always returns a list, map_chr( ) returns a character vector, map_dbl( ) returns a double vector, etc.
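The variants listed above can be sketched on a small made-up list. The object names (x, totals, labels, sums2, sums3) are illustrative only and are not used elsewhere in this article.

```r
library(purrr)

# A small made-up list used only for illustration
x <- list(a = 1:3, b = 4:6)

# map() always returns a list, here with one sum per element
as_list <- map(x, sum)

# map_dbl() returns a double vector, map_chr() a character vector
totals <- map_dbl(x, sum)
labels <- map_chr(x, ~ paste(.x, collapse = "-"))

# map2() iterates over two inputs in parallel (.x and .y)
sums2 <- map2_dbl(c(1, 2, 3), c(10, 20, 30), ~ .x + .y)

# pmap() iterates over any number of inputs supplied together as a list
sums3 <- pmap_dbl(list(1:3, 4:6, 7:9), function(a, b, c) a + b + c)
```

Each result has the same length as its input: two elements for x, three for the map2( ) and pmap( ) calls.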


The map function takes two main arguments:

.x: can be a list or atomic vector

.f: the function to be applied to the elements of .x
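As a rough illustration of the .f argument, the sketch below passes the same computation to map_dbl( ) in three ways and compares it with sapply( ) from the apply family. The sample sizes are the first three rows of datasets A and B as printed later in this article; the object names are illustrative.

```r
library(purrr)

# First three sample sizes of datasets A and B (taken from the printed output)
sample_sizes <- list(A = c(98, 100, 51), B = c(98, 52, 51))

# 1. A named function; extra arguments are passed along after .f
by_name <- map_dbl(sample_sizes, mean, na.rm = TRUE)

# 2. A formula (the style used in this article); .x is the current element
by_formula <- map_dbl(sample_sizes, ~ mean(.x, na.rm = TRUE))

# 3. An anonymous function (the \(x) shorthand needs R >= 4.1)
by_lambda <- map_dbl(sample_sizes, \(x) mean(x, na.rm = TRUE))

# Base R equivalent from the apply family
by_sapply <- sapply(sample_sizes, mean, na.rm = TRUE)
```

All four calls return the same named numeric vector, so the choice between them is mostly a matter of readability.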


See Resources at the end for the documentation on the purrr package and the map function.



Load the datasets

It is helpful for the columns of your datasets to have the same names to facilitate the iteration.

Load the datasets and store them in a list using the map and read_excel functions.

Load required packages

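# Install any packages that you may be lacking, e.g. install.packages("meta")

library(tidyverse) # purrr, dplyr and the pipe
library(readxl)    # To read Excel spreadsheets into R
library(meta)      # Meta-analysis functions (metaprop, forest)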

Set working directory

setwd("C:/Users/Valirie N. Agbor/Documents/Purrr meta-analysis demo")

# Get list of names of datasets 
data_files_names <- list.files(path = "./datasets/")

# Get paths to datasets
data_file_paths <- paste0("./datasets/", data_files_names)

# Print file path
# You can see we have four paths, one for each of the four datasets

data_file_paths

[1] "./datasets/Prevalence of disease A.xlsx"
[2] "./datasets/Prevalence of disease B.xlsx"
[3] "./datasets/Prevalence of disease C.xlsx"
[4] "./datasets/Prevalence of disease D.xlsx"        


Read datasets

The arguments for the map function

  • .x = a vector of paths to the dataset files. The vector has a length of four, indicating we have four datasets
  • .f = the read_excel function passed as a formula using the tilde (~). See the documentation for map

data_list <- purrr::map(
  .x = data_file_paths, # Iterate through file paths 
  .f = ~readxl::read_excel(.x) 
  # Read each dataset using the read_excel function
  # Refer to the current data with ".x"
)
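One optional refinement, sketched below, is to name the file paths before mapping so that the resulting list (and everything derived from it) is labelled by dataset. The names dataset_labels, named_paths and demo are illustrative and not part of the original workflow; nchar( ) stands in for read_excel( ) so the sketch runs without the Excel files.

```r
library(purrr)

# File paths as printed earlier in this article
data_file_paths <- c(
  "./datasets/Prevalence of disease A.xlsx",
  "./datasets/Prevalence of disease B.xlsx",
  "./datasets/Prevalence of disease C.xlsx",
  "./datasets/Prevalence of disease D.xlsx"
)

# Derive a label from each file name by dropping the folder and the extension
dataset_labels <- tools::file_path_sans_ext(basename(data_file_paths))

# Attach the labels as names
named_paths <- set_names(data_file_paths, dataset_labels)

# map() keeps the names of its input, so a call such as
#   data_list <- map(named_paths, ~ readxl::read_excel(.x))
# would return a list named by dataset.
# nchar() is only a stand-in here so the example runs without the files.
demo <- map(named_paths, nchar)
names(demo)
```

With a named list, later output such as str( ) shows the dataset label next to each element instead of an empty name.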
        

The output is a list of four tibbles: tibbles are like data frames but with some enhanced properties.

Note: The map( ) variant always returns a list.

# Print data
data_list %>% map(head, n=3)


[[1]]
# A tibble: 3 × 5
  Author  Year Sample_size Cases Stduy_name
  <chr>  <dbl>       <dbl> <dbl> <chr>     
1 Name 1  2008          98    35 A         
2 Name 2  2005         100    30 A         
3 Name 3  2021          51    21 A         

[[2]]
# A tibble: 3 × 5
  Author       Year Sample_size Cases Study_name
  <chr>       <dbl>       <dbl> <dbl> <chr>     
1 Costa Quaio  2008          98    26 B         
2 name 1       2014          52     8 B         
3 name 2       2021          51     4 B         

[[3]]
# A tibble: 3 × 5
  Author  Year Sample_size Cases Study_name
  <chr>  <dbl>       <dbl> <dbl> <chr>     
1 Name 1  2021          51     7 C         
2 Name 2  2013          44    14 C         
3 Name 3  2017         163    62 C         

[[4]]
# A tibble: 3 × 5
  Author  Year Sample_size Cases Study_name
  <chr>  <dbl>       <dbl> <dbl> <chr>     
1 Name 1  2005          19     1 D         
2 Name 2  2008          98     1 D         
3 Name 3  2013          44     5 D
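In the printed output, the fifth column of the first tibble is spelled Stduy_name while the other three use Study_name. That column is not used in the analysis below, but since matching column names make iteration easier, here is a minimal sketch of how one might check and harmonise them. The toy_list object reuses the first row of datasets A and B; fixed_list and the other names are illustrative.

```r
library(purrr)
library(dplyr)

# First row of datasets A and B, as printed above
toy_list <- list(
  tibble(Author = "Name 1", Year = 2008, Sample_size = 98,
         Cases = 35, Stduy_name = "A"),
  tibble(Author = "Costa Quaio", Year = 2008, Sample_size = 98,
         Cases = 26, Study_name = "B")
)

# Do all datasets share exactly the same column names?
same_names <- length(unique(map(toy_list, names))) == 1

# Rename the misspelled column where it exists;
# any_of() silently skips datasets that do not have it
fixed_list <- map(
  toy_list,
  ~ rename(.x, any_of(c(Study_name = "Stduy_name")))
)

# Check again after renaming
same_names_after <- length(unique(map(fixed_list, names))) == 1
```

Here same_names is FALSE before the fix and same_names_after is TRUE.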



Perform meta-analysis of proportions

We will pool the results using a random-effects meta-analysis model and stabilise the variance using the Freeman-Tukey double arcsine transformation.
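For intuition, the sketch below writes out one commonly cited form of the Freeman-Tukey double arcsine transformation, the half-sum of two arcsines with approximate variance 1 / (4n + 2). The function name freeman_tukey is hypothetical, and metaprop( ) does this internally when sm = "PFT", so the code is not needed for the analysis and may differ in detail from the package's implementation.

```r
# Freeman-Tukey double arcsine transformation of x events out of n
# (half-sum convention; some texts use the plain sum with variance 1 / (n + 0.5))
freeman_tukey <- function(x, n) {
  t  <- 0.5 * (asin(sqrt(x / (n + 1))) + asin(sqrt((x + 1) / (n + 1))))
  se <- sqrt(1 / (4 * n + 2)) # approximate standard error, depends only on n
  data.frame(t = t, se = se)
}

# First three studies of dataset A, as printed above
ft <- freeman_tukey(x = c(35, 30, 21), n = c(98, 100, 51))
ft
```

Because the standard error depends only on the sample size, studies with very low or very high proportions keep a usable variance, which is the main reason for applying the transformation before pooling.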


The following columns will be used to run the meta-analysis

  • Cases: The number of participants with the condition
  • Sample_size: Sample size
  • Author: Surname of the first author
  • Year: Year of publication
  • study: This will be created by combining the columns Author and Year

# Create the "study" column for each of the datasets 

data_list <- map(
  .x = data_list,  # Loop through each dataset with map
  .f = ~mutate(    # Create the variable "study" using the mutate function
    .x,            # The current dataset
    study = paste0(Author, ", ", Year)
  )
)

# Have a look at the newly created column 
data_list[[1]] %>% select(study) %>% head(n=5)

# A tibble: 5 × 1
  study       
  <chr>       
1 Name 1, 2008
2 Name 2, 2005
3 Name 3, 2021
4 Name 4, 2022
5 Name 5, 2016        


Fit meta-analysis

MA_random <- 
  purrr::map(.x = data_list, 
       ~meta::metaprop(
            data        = .x,    
            event       = Cases, 
            n           = Sample_size,
            studlab     = study,  
            sm          = "PFT", 
            level       = 0.95, 
            method.tau  = "DL", 
            pscale      = 100,
            digits.pval = 4, 
            prediction  = TRUE, 
            level.ma    = 0.95, 
            random      = TRUE, 
            fixed       = FALSE
            )
       )

# The outcome is a list of four meta-analysis results 
# One for each dataset 
str(MA_random, max.level = 1)

List of 4
 $ :List of 140
  ..- attr(*, "class")= chr [1:2] "metaprop" "meta"
 $ :List of 140
  ..- attr(*, "class")= chr [1:2] "metaprop" "meta"
 $ :List of 140
  ..- attr(*, "class")= chr [1:2] "metaprop" "meta"
 $ :List of 140
  ..- attr(*, "class")= chr [1:2] "metaprop" "meta"        
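Since each element is itself a list, map_dbl( ) can pull the same component out of every result to build a summary table. The sketch below assumes the components k (number of studies), I2 and tau2 are stored under those names in the meta object, which is the case in the versions of meta I am aware of. It refits a small model on the first three rows of datasets A and B as printed earlier, so its numbers will not match the full analysis; toy_list, ma_list and ma_summary are illustrative names.

```r
library(purrr)
library(meta)

# First three rows of datasets A and B, as printed earlier in this article
toy_list <- list(
  A = data.frame(
    study = c("Name 1, 2008", "Name 2, 2005", "Name 3, 2021"),
    Cases = c(35, 30, 21),
    Sample_size = c(98, 100, 51)
  ),
  B = data.frame(
    study = c("Costa Quaio, 2008", "name 1, 2014", "name 2, 2021"),
    Cases = c(26, 8, 4),
    Sample_size = c(98, 52, 51)
  )
)

# Fit one meta-analysis per dataset, as in the main analysis
ma_list <- map(
  toy_list,
  ~ metaprop(event = Cases, n = Sample_size, studlab = study,
             data = .x, sm = "PFT", method.tau = "DL")
)

# A character .f extracts the component with that name from each element
ma_summary <- data.frame(
  dataset = names(ma_list),
  k       = map_dbl(ma_list, "k"),    # number of studies
  I2      = map_dbl(ma_list, "I2"),   # heterogeneity, stored as a proportion
  tau2    = map_dbl(ma_list, "tau2")  # between-study variance
)
ma_summary
```

The same pattern applied to MA_random would give one row per dataset.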


Make forest plots

We will use another function from the purrr package called walk.

walk( ) is called for its side effects: here, it draws each forest plot and returns its input invisibly instead of a list of results.

See the purrr documentation under Resources.

You will notice that the syntax for walk( ) is the same as that for map( ).

purrr::walk(
  .x = MA_random,   # Iterate through each meta-analytic object
  .f = ~meta::forest(.x)   # Make forest plot 
)
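If the plots need to be saved rather than only displayed, a variant called iwalk( ) also passes the name of each element, which can be used to build a file name. The sketch below is one possible approach and not part of the original workflow: the output folder (a temporary directory), the file names and the page size are all assumptions, and the models are refitted on the first three rows of datasets A and B so the example is self-contained.

```r
library(purrr)
library(meta)

# First three rows of datasets A and B, as printed earlier in this article
toy_list <- list(
  A = data.frame(
    study = c("Name 1, 2008", "Name 2, 2005", "Name 3, 2021"),
    Cases = c(35, 30, 21),
    Sample_size = c(98, 100, 51)
  ),
  B = data.frame(
    study = c("Costa Quaio, 2008", "name 1, 2014", "name 2, 2021"),
    Cases = c(26, 8, 4),
    Sample_size = c(98, 52, 51)
  )
)

ma_list <- map(
  toy_list,
  ~ metaprop(event = Cases, n = Sample_size, studlab = study,
             data = .x, sm = "PFT", method.tau = "DL")
)

# Hypothetical output location; replace with a real folder in practice
out_dir <- tempdir()

# iwalk() supplies each element as .x and its name as .y
iwalk(ma_list, ~ {
  pdf(file.path(out_dir, paste0("forest_", .y, ".pdf")),
      width = 10, height = 4)
  forest(.x)   # draw the forest plot on the open PDF device
  dev.off()    # close the device so the file is written
})

list.files(out_dir, pattern = "^forest_")
```

Opening and closing the graphics device inside the function keeps one plot per file.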
        
[Figure: forest plot for Outcome A]
[Figure: forest plot for Outcome B]
[Figure: forest plot for Outcome C]
[Figure: forest plot for Outcome D]


Resources

If you want to know more

Read

purrr documentation: https://www.tidyverse.org/blog/2022/12/purrr-1-0-0/

map documentation: https://purrr.tidyverse.org/reference/map.html

R for Data Science, Chapter 27: https://r4ds.hadley.nz/iteration


Watch

  1. Tutorial by Charlotte Wickham (recommended):

Part 1: Solving iteration problems with purrr

Part 2: Solving iteration problems with purrr

  2. Tutorial by Hadley Wickham: https://www.youtube.com/watch?v=EGAs7zuRutY&t=131s
