Network Update #65
AI
Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial (thanks to Jan Zachnik for sharing). This randomized clinical trial found that physician use of a commercially available LLM chatbot did not improve diagnostic reasoning on challenging clinical cases, despite the LLM alone significantly outperforming physician participants. The results were similar across subgroups of different training levels and experience with the chatbot. These results suggest that access to LLMs alone will not improve overall physician diagnostic reasoning in practice. These findings are particularly relevant now that many health systems offer Health Insurance Portability and Accountability Act–compliant chatbots that physicians can use in clinical settings, often with little or no training on how to use these tools. Link
The ASA Biopharm Report features AI/ML in Drug Development! Our theme for this winter issue is Implementation of AI/ML in Drug Development. We’re thrilled to feature articles by leading experts from the FDA, industry and academia, who share diverse perspectives on regulatory considerations, as well as the application of AI/ML in clinical trials and real-world data. Additionally, in our Leadership section, you’ll find a thoughtful discussion on statistical thinking in the age of AI, as well as reflections and experience-sharing from a past BIOP Section Chair.
Some of the topics covered:
Programming
Bartosz Jabłoński - Are you curious about how to wrap your SAS code into a reusable package, but not convinced yet? See this short article describing the advantages of using SAS packages. The paper was presented at the #PHUSE EU 2024 conference. Official proceedings will be available soon; in the meantime, here is the article. Link
Ted Laderas - Created a SAS - R cheat sheet. This guide aims to familiarize SAS users with R.
Jagadish K. - Tibbles are a modern, user-friendly version of data.frame that make data manipulation easier, more efficient, and less error-prone. Learn about 5 reasons why R programmers should prefer tibbles over data.frame. Link
Jagadish K. - When working with large datasets, choosing the right data import function in #R can significantly improve performance. I tested the efficiency of three popular functions (read.csv from base R’s utils, vroom, and read_csv from readr), using microbenchmark and system.time() for comparison. microbenchmark reports the results in milliseconds whereas system.time() reports the results in seconds.
Performance ranking (from fastest to slowest):
read.csv from utils: clearly the fastest, ideal for handling large datasets in R.
vroom: reliable but slower compared to utils, especially with larger files.
read_csv from readr: reliable but slower compared to utils and vroom. Link
Samarth Patel - I developed an #R #Shiny application to streamline #pharmacometric (PMx) programming, creating a one-stop solution for the team that covers everything from data requests to post-delivery exploratory data activities. The app features a live Gantt chart, a request tracker, a CDISC guide, an analysis request form, a pharmacometric data explorer, and more. Link
Bartosz Szymecki - If you create boxplots in SAS and R using small samples, beware of potential differences! Read this interesting post by Bartosz to learn more about the way R and SAS calculate quantiles by default. Link
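To make the small-sample difference concrete, here is a minimal Python sketch (Python is used purely for illustration; it is not code from the linked post). It assumes that R's quantile() defaults to type 7 (linear interpolation between order statistics) and that SAS procedures default to PCTLDEF=5 (empirical distribution function with averaging, which corresponds to R's type 2). The function names and the sample data are hypothetical.

```python
import math


def quantile_type7(values, p):
    """Linear interpolation between order statistics (assumed R default, type 7)."""
    x = sorted(values)
    h = (len(x) - 1) * p              # fractional 0-based position
    lo = math.floor(h)
    hi = min(lo + 1, len(x) - 1)
    return x[lo] + (h - lo) * (x[hi] - x[lo])


def quantile_type2(values, p):
    """Empirical CDF with averaging at jumps (R type 2; assumed SAS PCTLDEF=5)."""
    x = sorted(values)
    n = len(x)
    position = n * p
    j = math.floor(position)
    g = position - j
    if g > 0:
        # Between two jumps: take the next order statistic.
        return x[min(j, n - 1)]
    # Exactly on a jump: average the two neighbouring order statistics.
    lower = x[max(j - 1, 0)]
    upper = x[min(j, n - 1)]
    return (lower + upper) / 2


sample = [1, 2, 3, 4, 5, 6]           # hypothetical small sample
for p in (0.25, 0.5, 0.75):
    print(p, quantile_type7(sample, p), quantile_type2(sample, p))
```

On this six-point sample the two definitions agree on the median but give different quartiles, so the box edges of a boxplot would differ between the two conventions.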
Michael Rimler - Check out the latest PHUSE blog looking back on how it has enabled its community’s interest in using #opensource technology for clinical data research. Link
Biostatistics
Congratulations to Daniel Bruce and Robert Szulkin for their contribution to advancing insights into hepatocellular carcinoma (HCC), the most common form of liver cancer, with the paper Long-term risk of HCC in a DAA-treated national hepatitis C cohort, and a proposed risk score.
The new study, published in the Journal of Infection and Public Health, examines risk factors associated with HCC in a Hepatitis C population that, after treatment, has no detectable residual infection. The article also presents a prediction model identifying patient groups at low, medium, or high risk of developing HCC.
Conclusion: Pre-treatment liver stiffness is strongly associated with HCC risk, which remains stable during the first five years after Hepatitis C treatment. This prediction model can identify which patient groups need regular HCC monitoring and which may not. Read more - Link
Robert Rachford - Writes about Ethical Guidelines for Statistical Practice. I use this guide not only to remind myself of my duties as an unbiased evaluator and interpreter of data, but also to remind me what the best practices are when designing clinical trials and analyzing patient data. Link
Ryan Batten, PhD(c) - Writes about the article A Principled Approach to Characterize and Analyze Partially Observed Confounder Data from Electronic Health Records. Missing data is always a problem for real-world data. Why? If you don't correctly account for it, the results can be biased. There are lots of methods available for dealing with this! However, sometimes it can be tricky to know where to start. A recent article I came across might just be a new favourite: Weberpals et al. (2024). The authors clearly lay out diagnostics that can be used to characterize the missing data process. They even use an example and have the R code available! Link
F. Javier Rubio - New preprint with P. Basak, A. Linero, and C. Maringe: Relative Survival Analysis Using Bayesian Decision Tree Ensembles
We develop Bayesian additive regression trees (BART) for excess hazard and net survival estimation, including proportional and non-proportional hazards models. We also present methods to quantify variable importance and to produce posterior summaries.
We present an application using colon cancer data from England, highlighting the insights our proposed methodology offers when paired with state-of-the-art data linkage methods. Link
Denis Odinokov - Contrastive Principal Component Analysis (cPCA) is an extension of Principal Component Analysis (PCA) introduced by Abubakar Abid et al. in 2018 [1]. It was developed to identify patterns unique or enriched in a target dataset relative to a background dataset. While traditional PCA identifies directions of maximal variance within a single dataset, cPCA modifies this approach to focus on contrastive variance — variance that distinguishes the target dataset from the background. This method is particularly effective for comparative analyses, such as treatment-control experiments. Link
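The idea of contrastive variance can be illustrated with a small sketch. It assumes the usual cPCA formulation, in which the contrastive directions are the leading eigenvectors of C_target - alpha * C_background, and it is restricted to two-dimensional data so that the eigenvector has a closed form. The data, the function names, and the choice alpha = 1 are hypothetical, and this is not code from the linked post.

```python
import math


def covariance_2d(points):
    """Sample covariance of 2-D points, returned as (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy


def top_direction(sxx, sxy, syy):
    """Unit eigenvector for the largest eigenvalue of [[sxx, sxy], [sxy, syy]]."""
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    if sxy == 0:
        return (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    vx, vy = lam - syy, sxy
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)


def cpca_direction(target, background, alpha):
    """Leading direction of C_target - alpha * C_background (alpha=0 is plain PCA)."""
    t = covariance_2d(target)
    b = covariance_2d(background)
    return top_direction(*(ti - alpha * bi for ti, bi in zip(t, b)))


# Hypothetical data: both sets vary strongly along x; only the target varies along y.
target = [(-3, -1), (3, -1), (-3, 1), (3, 1)]
background = [(-3, 0.1), (3, -0.1), (-3, -0.1), (3, 0.1)]

pca_dir = cpca_direction(target, background, alpha=0.0)
cpca_dir = cpca_direction(target, background, alpha=1.0)
print("PCA direction:", pca_dir)
print("cPCA direction:", cpca_dir)
```

In this toy setup plain PCA picks the x axis, where both datasets vary, while the contrastive version picks the y axis, where only the target varies. In practice alpha is a tuning parameter and the data have many more dimensions, so a numerical eigensolver would replace the closed form used here.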
Toshimitsu Hamasaki - Statistics in Biopharmaceutical Research (SBR), an official publication of the American Statistical Association - ASA, has published its last issue for the year 2024. This issue focuses on "Randomization Methods to Design and Analyze Trial Estimands, Adjust for Covariates, and Implement Efficient Designs" and includes one editorial note, 14 articles, three comments, and one rejoinder. The topics discussed in this issue were motivated by the National Institute of Statistical Sciences (NISS) Ingram Olkin Forum on Randomization Tests, organized by Nancy Flournoy and James Rosenberger, which addressed disruptions in clinical trials due to the COVID-19 pandemic. Link
Tim Morris - Simulation studies for methodological research in psychology: A standardized template for planning, preregistration, and reporting.
It is written for psychologists but there is a lot of good stuff in there on simulation studies in general. Superbly led by Björn Siepe, František Bartoš and Samuel Pawel, joined by Anne-Laure Boulesteix, Daniel Heck and me.
Read more: Link
Hongfei Li - I’m delighted to share that our collaborative research with Qian H Li, Chuan Tian, and Kevin Hou, titled Issues in Cox proportional hazards model with unequal randomization, has been published in the Journal of Biopharmaceutical Statistics! This paper raises awareness about the non-zero bias and type I error inflation that can arise when unequal randomization is applied in Cox proportional-hazards models. It provides meaningful insights to enhance clinical trial design and analysis, contributing to more robust and reliable decision-making in drug development. Link
Alex Ocampo - I am happy to announce that our manuscript on Prognostic factors for disability worsening and improvement in multiple sclerosis using a multistate model has been published in Multiple Sclerosis Journal - MSJ!
I am especially proud of this work, as to overcome sparsity in the multi-state transition data, I wrote code for a Bayesian implementation of the popular msm() function in R. This allowed us to fit a very granular model with many disability states, to more honestly characterize MS prognosis and capture how the influence of different factors can change across the disability spectrum. Link
Bryan McComb - Randomization and Blinding: Protecting Trial Integrity. Randomization and blinding are two pillars of rigorous clinical trial design. For a biostatistician entering the pharma industry, mastering these concepts is critical to ensuring the credibility and validity of trial results. But how do you ensure that randomization and blinding methods are robust enough to maintain trial integrity while still being flexible enough to address practical challenges? Read more - Link
Antonio Remiro-Azócar, PhD - The article Effect modification and non-collapsibility leads to conflicting treatment decisions: a review of marginal and conditional estimands and recommendations for decision-making by Phillippo et al. is available as a pre-print. The paper relates to marginal versus conditional treatment effects. The context is evidence synthesis and indirect treatment comparisons. Nevertheless, many of the findings are applicable to the analysis of individual RCTs and transportability analyses. Link
EFSPI (European Federation of Statisticians in the Pharmaceutical Industry) - Q3 2024 Newsletter.
Event Highlights: A look back at the 9th Regulatory Statistics Workshop, which brought over 774 statisticians and experts together in Basel! Thanks to Kaspar Rufibach and Helle Lynggaard for this incredible success!
Collaboration with American Statistical Association - ASA: Strengthening transatlantic ties and sharing knowledge across statistical communities. Read more - Link
Miguel Hernán - An expert in causal inference, Miguel wrote the book Causal Inference: What If together with Jamie Robins. It provides a cohesive presentation of concepts of, and methods for, causal inference. Much of this material is currently scattered across journals in several disciplines or confined to technical articles. We expect that the book will be of interest to anyone interested in causal inference, e.g., epidemiologists, statisticians, psychologists, economists, sociologists, political scientists, computer scientists… The book is divided into 3 parts of increasing difficulty: causal inference without models, causal inference with models, and causal inference from complex longitudinal data. Link
Oliver Sailer - I'm happy to share that our paper on Pharmacometrics-Enhanced Bayesian Borrowing for Pediatric Extrapolation – A Case Study of the DINAMO Trial has been published in Therapeutic Innovation & Regulatory Science.
Bayesian borrowing is one way to integrate efficacy data from adults in the statistical analysis of pediatric trials. However, direct borrowing from data in adults might introduce bias due to differences in body weight and other differences in patient characteristics between adults and children. Instead, we perform an analysis that uses covariate adjusted pharmacometric model predictions of efficacy based on available data as the informative prior component in a Bayesian analysis of the pediatric trial. The prior is further robustified by adding a weakly informative robust mixture component.
In our paper we report this pre-specified Bayesian borrowing analysis of the DINAMO trial in young people with type 2 diabetes. Link
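The robustification step can be sketched generically. The example below is not the DINAMO analysis: it assumes a two-component normal mixture prior (an informative component standing in for a model-based prediction, plus a wide, weakly informative component) and a normally distributed trial estimate with known standard error. All numbers and function names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def update_mixture_prior(components, estimate, se):
    """Normal-normal conjugate update of a mixture prior.

    components: list of (weight, mean, sd) describing the prior.
    Returns the posterior mixture in the same (weight, mean, sd) format.
    """
    updated = []
    for weight, mean, sd in components:
        # Marginal likelihood of the observed estimate under this component.
        marginal = normal_pdf(estimate, mean, math.sqrt(sd ** 2 + se ** 2))
        precision = 1 / sd ** 2 + 1 / se ** 2
        post_mean = (mean / sd ** 2 + estimate / se ** 2) / precision
        updated.append((weight * marginal, post_mean, math.sqrt(1 / precision)))
    total = sum(w for w, _, _ in updated)
    return [(w / total, m, s) for w, m, s in updated]


# Hypothetical prior: 80% informative component, 20% weakly informative component.
prior = [(0.8, -0.8, 0.2), (0.2, 0.0, 2.0)]

# Trial estimate that agrees with the informative component.
agree = update_mixture_prior(prior, estimate=-0.8, se=0.3)
# Trial estimate that conflicts with the informative component.
conflict = update_mixture_prior(prior, estimate=0.5, se=0.3)

print("informative weight when data agree:   ", round(agree[0][0], 3))
print("informative weight when data conflict:", round(conflict[0][0], 3))
```

The point of the mixture is visible in the posterior weights: when the trial data agree with the informative component its weight increases, and when they conflict the weight shifts to the wide component, which limits how much is borrowed.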
RWE
Radek Wasiak - Cracking the Code: How #RealWorldData Unlocks Rare Disease Assets. 'To Sell is Human' has been one of my favourite business books. It has led me to re-evaluate the essence of business development, particularly its necessity to be firmly evidence-based within the life sciences sector. In my latest blog post, I discuss how #RealWorldData can bolster business development efforts related to the acquisition or in-licensing of rare disease assets, drawing on a series of recent discussions and experiences. Link
Open Case Studies: Statistics and Data Science Education through Real-World Applications. This educational resource provides self-contained, multimodal, peer-reviewed, and open-source guides (or case studies) from real-world examples for active experiences of complete data analyses. We developed an educator’s guide describing how to most effectively use the case studies, how to modify and adapt components of the case studies in the classroom, and how to contribute new case studies (opencasestudies.org/OCS_Guide). Link
Events & Webinars
Thomas Jaki - Bayesian Adaptive Trials Short Course. Bayesian methods allow prior information to be formally incorporated and updated when new data become available and while such approaches are now routinely used in dose-finding trials, their use for other trials is still rare. This two-day short course will provide an introduction to Bayesian methods with applications to clinical trials. Link
24th-25th February 2025, Regensburg, Germany
Summer School on Modern Methods in Biostatistics and Epidemiology. Immerse yourself in an enriching and memorable learning experience at the stunning Brandolini Colomban castle, where our comprehensive courses in biostatistics and epidemiology will enrich your knowledge of data science and elevate your research skills to new levels. Link
June 2-14, 2025, Castello Brandolini Colomban, Cison di Valmarino (Treviso), Italy
Response-adaptive randomization in clinical trials: myth vs reality. Response-adaptive randomization (RAR) is part of a wider class of data-dependent sampling algorithms, for which clinical trials have typically been used as a motivating application. In that context, patient allocation to treatments is defined by using the accrued data on responses to alter randomization probabilities, in order to achieve different experimental goals.
Presenter: Sofia S. Villar
LSHTM, Keppel Street London WC1E 7HT United Kingdom, Room G41, Hybrid
Tuesday 10 December 2024 Time 12:50 - 13:50
ASA Biopharmaceutical Section - 2025 ASA Biopharmaceutical Section STATBOLIC Conference. As we enter an era of groundbreaking therapies for metabolic diseases, it's more important than ever to come together and tackle the emerging questions in drug development. To address these challenges, 8 leading pharmaceutical companies, 2 renowned academic institutions, and 2 regulatory agencies are joining forces for a one-day, one-of-a-kind workshop focused on metabolic disorders. Topics covered:
Novel Designs in Evaluating Multiple Indications Related to Metabolic Disorders
Statistical Challenges in Clinical Trials in Evaluating Cardiovascular, Kidney, and Liver Diseases
Statistical Challenges in Clinical Trials in Evaluating Treatments in Weight Reduction
Date: Feb 6, 2025, Time: 08:30 AM - 05:30 PM, Location: Rockville, MD, US. Link
2025 UCSF-Stanford CERSI Bayesian Thinking in Clinical Research Course
This is a virtual course consisting of twelve 90-minute sessions delivered live by experts in the field of Bayesian statistics and its applications to clinical trials
from January 23, 2025, through April 10, 2025, from 10 – 11:30 am Pacific Time (1 – 2:30 pm Eastern Time). Link