Sample Size Estimation for NEDA as Primary Endpoint in Multiple Sclerosis Clinical Trials
Author: Manolo E. Beelke
Email: [email protected]
Web: manolobeelke.com
Abstract
This article explores the challenges and methodologies associated with using No Evidence of Disease Activity (NEDA) as a primary endpoint in multiple sclerosis (MS) clinical trials. Traditional endpoints such as Annualized Relapse Rate (ARR) and Expanded Disability Status Scale (EDSS) progression require large sample sizes and high disease activity at baseline. With the advent of new disease-modifying treatments (DMTs) and earlier MS diagnoses, overall disease activity in clinical trials has decreased, necessitating the adoption of more sensitive endpoints. NEDA-3 and NEDA-4, which combine several parameters of disease activity, offer promising alternatives. This article reviews sample size estimation methods, compares traditional and composite endpoints, and discusses the potential for reduced sample sizes using NEDA.
Introduction
Multiple sclerosis (MS) research has seen remarkable advancements in recent years, particularly with the development of new disease-modifying treatments (DMTs). These treatments demonstrate superior efficacy compared to older therapies, leading to a significant shift from placebo-controlled studies to head-to-head trials comparing active treatments. This change, driven by ethical concerns and regulatory requirements (Cohen et al., 2008), presents unique challenges for clinical trial design, especially in determining appropriate sample sizes and study durations (De Stefano et al., 2015).
The Shift in MS Clinical Trials: From Placebo-Controlled to Active Comparisons
Historically, placebo-controlled trials have been a standard in clinical research. However, with the advent of highly effective DMTs, such studies are increasingly viewed as unethical. Current MS research emphasizes head-to-head comparisons between active treatments to ensure that all patients receive potentially beneficial therapies. This shift necessitates larger sample sizes and longer study durations to detect smaller therapeutic effects, posing significant challenges for researchers (De Stefano et al., 2015).
Comparative Analysis of ARR and EDSS Endpoints
The tables below provide a comparative analysis of the efficacy of various disease-modifying treatments (DMTs) for multiple sclerosis (MS), using traditional endpoints: Annualized Relapse Rate (ARR) and Expanded Disability Status Scale (EDSS). These endpoints are still vital for assessing the impact of treatments on MS progression and patient well-being.
ARR Endpoint Analysis
The ARR endpoint measures the frequency of relapses over a specified period. A reduction in ARR indicates the efficacy of a treatment in reducing the occurrence of relapses.
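As a minimal sketch of how this endpoint is usually computed, ARR can be taken as the total number of relapses divided by the total person-years of follow-up, and a treatment effect expressed as the relative reduction versus the comparator. The function names and all input numbers below are hypothetical and are not taken from any of the trials discussed here.

```python
def annualized_relapse_rate(total_relapses: int, total_person_years: float) -> float:
    """ARR = relapses observed / person-years of follow-up."""
    if total_person_years <= 0:
        raise ValueError("total_person_years must be positive")
    return total_relapses / total_person_years


def relative_reduction(arr_treatment: float, arr_control: float) -> float:
    """Relative ARR reduction of treatment versus control (0.5 means 50%)."""
    if arr_control <= 0:
        raise ValueError("arr_control must be positive")
    return 1.0 - arr_treatment / arr_control


# Hypothetical illustration: two arms with 300 person-years of follow-up each.
arr_control = annualized_relapse_rate(120, 300.0)
arr_treatment = annualized_relapse_rate(60, 300.0)
reduction = relative_reduction(arr_treatment, arr_control)
print(f"ARR control={arr_control:.2f}, treatment={arr_treatment:.2f}, "
      f"relative reduction={reduction:.0%}")
```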
Table 1 highlights the reduction in relapse rates achieved by different DMTs compared to placebo. In the FREEDOMS II trial, fingolimod achieved a 55% reduction in ARR over 24 months, indicating a significant decrease in relapse frequency. The TOWER study reported a 36% reduction for the higher dose (14 mg) of teriflunomide and a 22% reduction for the lower dose (7 mg) over a variable treatment period of at least 48 weeks. The DEFINE and CONFIRM trials showed a 45% and 53% reduction in ARR for the twice-daily (BID) and thrice-daily (TID) dosages of dimethyl fumarate, respectively.
EDSS Endpoint Analysis
Table 2 focuses on the proportion of patients experiencing sustained disability progression, a critical measure of long-term MS impact. In FREEDOMS II, fingolimod treatment resulted in a 27% reduction in the risk of disability progression compared to placebo. The TOWER study showed a 32% reduction in risk for the higher dose of teriflunomide (14 mg) and a 12% reduction for the lower dose (7 mg). The DEFINE and CONFIRM studies demonstrated a 34% reduction in disability progression risk for the BID dosage and a 38% reduction for the TID dosage of dimethyl fumarate.
Implications of Classical Endpoints
The differing magnitudes of treatment effect on ARR and EDSS underscore the multifaceted nature of MS treatment effects. While ARR reflects short-term treatment efficacy by measuring relapse frequency, EDSS provides insight into long-term outcomes by assessing disability progression. Both endpoints are needed to understand the full benefit of DMTs in clinical trials.
Using ARR and EDSS together allows for a more complete evaluation of a treatment's impact, as each endpoint captures distinct aspects of the disease process and patient experience. This dual analysis can inform treatment decisions and guide future research directions in MS.
The Need for New Endpoints in MS Research
As the efficacy of new disease-modifying treatments (DMTs) continues to improve, traditional clinical endpoints like the annualized relapse rate (ARR) and the Expanded Disability Status Scale (EDSS) are becoming less sensitive in detecting differences between treatments. This trend is accompanied by increased awareness of multiple sclerosis (MS) and a shorter time window from the first clinical event to the diagnosis of MS. Moreover, the period from diagnosis to the start of treatment has also decreased, allowing for earlier intervention and better disease management.
For clinical trials, this evolving landscape means that the baseline disease activity in MS patients has been decreasing over time due to earlier diagnoses and the availability of more effective DMTs.
A notable trend in recent clinical trials is the increasing proportion of patients achieving NEDA at baseline. This trend may reflect improved treatment efficacy and earlier intervention strategies. For example:
This trend continues even further when these older studies are compared with more recent trials. The observed increase in baseline NEDA suggests that modern trials are increasingly enrolling patients with lower disease activity.
These changes reflect not only advancements in treatment but also shifts in clinical practice toward earlier and more aggressive therapeutic approaches (Cohen et al., 2012; Fox et al., 2014).
This trend might affect sample size calculations, as lower baseline disease activity could lead to smaller observed treatment effects, necessitating larger sample sizes to detect statistically significant differences.
The increased rates of NEDA among patients are indicative of both enhanced treatment efficacy and the shift toward earlier, more comprehensive treatment strategies.
Given these developments, the scientific community is advocating for the adoption of more comprehensive and sensitive endpoints in MS research. NEDA is considered a promising candidate due to its ability to provide a holistic assessment of disease activity and treatment effectiveness. The use of NEDA can better capture the benefits of new DMTs, offering a more nuanced understanding of their impact on MS progression and patient quality of life (Sormani & Bruzzi, 2015; Giovannoni et al., 2015).
Understanding NEDA: Definition and Components
NEDA is a composite measure of disease activity, commonly reported as NEDA-3 or NEDA-4. NEDA-3 is defined by three key parameters: no relapses, no new or enlarging MRI lesions, and no sustained EDSS progression.
NEDA-4 expands this definition by adding a fourth parameter: no significant brain volume loss.
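The composite logic can be sketched in a few lines: a patient counts as NEDA only if every component is free of activity over the observation period. The record fields and function names below are hypothetical, and the 0.4% annualized brain volume loss cut-off is an assumption used for illustration; a trial protocol would define its own threshold and confirmation rules.

```python
from dataclasses import dataclass


@dataclass
class PatientFollowUp:
    """Hypothetical per-patient summary over the observation period."""
    relapses: int
    new_or_enlarging_mri_lesions: int
    sustained_edss_progression: bool
    annualized_brain_volume_loss_pct: float


def has_neda3(p: PatientFollowUp) -> bool:
    """NEDA-3: no relapses, no MRI activity, no sustained EDSS progression."""
    return (
        p.relapses == 0
        and p.new_or_enlarging_mri_lesions == 0
        and not p.sustained_edss_progression
    )


def has_neda4(p: PatientFollowUp, bvl_threshold_pct: float = 0.4) -> bool:
    """NEDA-4: NEDA-3 plus brain volume loss at or below an assumed threshold."""
    return has_neda3(p) and p.annualized_brain_volume_loss_pct <= bvl_threshold_pct


def neda_proportion(patients, criterion) -> float:
    """Share of patients meeting the given NEDA criterion."""
    return sum(criterion(p) for p in patients) / len(patients)


# Hypothetical cohort of three patients.
cohort = [
    PatientFollowUp(0, 0, False, 0.2),  # NEDA-3 and NEDA-4
    PatientFollowUp(0, 0, False, 0.7),  # NEDA-3 only (brain volume loss too high)
    PatientFollowUp(1, 2, False, 0.3),  # neither
]
print(neda_proportion(cohort, has_neda3), neda_proportion(cohort, has_neda4))
```

The resulting proportions are the quantities that enter a two-proportion sample size calculation when NEDA is the primary endpoint.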
The Predictive Value of NEDA in MS
Recent studies have shown that achieving NEDA-3 or NEDA-4 over a two-year period can predict long-term disability outcomes with accuracy comparable to longer observation periods of five and even seven years (Sormani et al., 2015). This makes NEDA a valuable endpoint for assessing the long-term efficacy of MS treatments.
NEDA as a Primary Endpoint in MS Clinical Trials
Using NEDA as the primary endpoint represents a shift from traditional clinical measures to a more composite and sensitive measure of disease activity.
NEDA's composite nature, combining MRI and clinical assessments, enhances its sensitivity in detecting disease activity, potentially reducing required sample sizes and study durations (De Stefano et al., 2015).
The benefits of NEDA as a primary endpoint are the following:
While NEDA offers a comprehensive measure of disease control, it also presents several biostatistical challenges:
Therefore, it is essential to approach their construction in MS with comparable caution to avoid bias and ensure balanced representation of each component.
Despite these challenges, if properly implemented, NEDA could serve as a powerful new primary endpoint in clinical trials, especially in pivotal Phase III trials.
Methodology for Sample Size Estimation in MS Trials Using NEDA as Primary Endpoint
Sample size estimation is crucial for the design of clinical trials to ensure they are adequately powered to detect meaningful differences between treatment groups. For NEDA, sample size calculations must consider the composite nature of the endpoint and the specific proportions of patients expected to achieve NEDA in the treatment and control groups (Sormani et al., 2013).
Key Parameters for Sample Size Calculation
Sample Size Formula for Comparing Two Proportions
The sample size formula for comparing two proportions is given by:
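A commonly used normal-approximation form for this comparison, given here as an assumption about the intended expression rather than as the author's exact formula, is:

```latex
n = \frac{\left(Z_{1-\alpha/2} + Z_{1-\beta}\right)^{2}\,\bigl[P_1(1-P_1) + P_2(1-P_2)\bigr]}{\left(P_1 - P_2\right)^{2}}
```

Here n is the number of participants per group, P1 and P2 are the expected proportions of patients achieving NEDA in the two groups, Z_{1-α/2} is the standard normal quantile for the two-sided significance level (about 1.96 for α = 0.05), and Z_{1-β} is the quantile for the desired power (about 0.84 for 80% power).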
Example Calculation
Assume the following hypothetical data:
Using the formula:
Step-by-step calculation:
1. Calculate the numerator part of the formula:
2. Calculate the denominator part of the formula:
3. Substitute all values into the formula:
Thus, a sample size of 96 participants is required based on these assumptions. It is crucial to adjust this number according to the known or estimated withdrawal rate and the study's specific population.
This example demonstrates the methodology for calculating sample size, but actual trial design may require adjustments based on more detailed assumptions and variations in patient populations.
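The three steps can also be sketched in code. The version below assumes the standard unpooled normal-approximation formula for two proportions, a two-sided α of 0.05 and 80% power; the function name and the input proportions (0.60 and 0.40) are illustrative choices, not the values of the example above, so the result is not expected to match it exactly.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1: float, p2: float,
                                alpha: float = 0.05, power: float = 0.80):
    """Return (numerator, denominator, n_per_group) for comparing two proportions.

    Assumes the unpooled normal approximation:
    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("p1 and p2 must lie strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    # Step 1: numerator
    numerator = (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    # Step 2: denominator
    denominator = (p1 - p2) ** 2
    # Step 3: substitute and round up to whole participants
    return numerator, denominator, math.ceil(numerator / denominator)


# Illustrative inputs only: 60% vs 40% of patients achieving NEDA.
num, den, n = sample_size_two_proportions(0.60, 0.40)
print(f"numerator={num:.4f}, denominator={den:.4f}, n per group={n}")
```

Rounding up rather than to the nearest integer keeps the design at or above the targeted power.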
Analysis of Sample Size Variations
Influence of P1 and P2 Values
The sample size required for a clinical trial is heavily influenced by the values of P1 and P2. As the difference between P1 and P2 (effect size) increases, the required sample size decreases. Conversely, as P1 and P2 values become closer, indicating a smaller effect size, the required sample size increases to ensure the study has sufficient power to detect this smaller difference.
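This relationship can be made concrete with a short sweep. The sketch below again assumes the unpooled normal-approximation formula with two-sided α = 0.05 and 80% power; the comparator proportion of 0.40 and the range of P1 values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group sample size for two proportions (unpooled normal approximation)."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(z ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical scenario: comparator NEDA proportion fixed at 0.40.
p2 = 0.40
sweep = {p1: n_per_group(p1, p2) for p1 in (0.45, 0.50, 0.60, 0.70)}
for p1, n in sweep.items():
    print(f"P1={p1:.2f}  P2={p2:.2f}  difference={p1 - p2:.2f}  n per group={n}")
```

Because the difference enters the formula squared, halving the assumed effect roughly quadruples the required number of participants.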
The table below illustrates how sample sizes vary with different P1 and P2 values:
From Table 3, it is evident that as P1 and P2 values approach each other, indicating smaller differences between the treatment and control groups, the required sample size increases. For example, in the study by Sormani et al. (2015), the relatively small difference between P1 (0.75) and P2 (0.55) necessitates a larger sample size of 277 per group. The discrepancy between the calculated sample size and the sample size reported in this publication is, however, related to other factors that require further adjustment of the sample size:
1. Interim and Subgroup Analyses
Interim analyses are conducted to evaluate data at predefined points during the trial. They allow early stopping for efficacy, futility, or safety reasons. However, they require adjustments to the overall sample size to maintain the study's integrity and statistical power. The O'Brien-Fleming and Pocock methods are commonly used to adjust for interim analyses, often leading to an increase in the required sample size. Similarly, multiple sclerosis research often involves analyzing various subgroups (e.g., based on disease severity), requiring additional participants to maintain power within each subgroup.
2. Dropouts and Non-Compliance
Anticipated dropouts and non-compliance rates are factored into the initial sample size calculations. Researchers typically inflate the sample size to ensure that the final number of participants completing the study is adequate for statistical analysis. For instance, if a 20% dropout rate is expected, the required number of completers is divided by 1 - 0.20 = 0.80, which increases the sample size by 25%.
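The dropout adjustment amounts to dividing the required number of completers by the expected retention rate. A minimal sketch follows; the function name is hypothetical, and exact fractions are used only to avoid floating-point rounding surprises when rounding up.

```python
import math
from fractions import Fraction


def inflate_for_dropout(n_completers: int, dropout_rate: float) -> int:
    """Participants to enroll so that n_completers are expected to finish."""
    rate = Fraction(dropout_rate).limit_denominator(10_000)
    if not 0 <= rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n_completers / (1 - rate))


# A 20% dropout rate turns 100 required completers into 125 enrolled (+25%).
print(inflate_for_dropout(100, 0.20))
# Applied to the 96 participants from the worked example above.
print(inflate_for_dropout(96, 0.20))
```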
3. Design Effects in Clustered or Stratified Designs
Clustered or stratified designs, often used to improve the precision of effect estimates or to ensure balanced subgroups, require additional sample size adjustments. The design effect, calculated based on the intra-cluster correlation coefficient (ICC), inflates the sample size to account for the correlation within clusters.
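For clustered designs, the design effect is commonly approximated as 1 + (m - 1) × ICC, where m is the average cluster size. The sketch below assumes that approximation; the function names, the cluster size of 10, and the ICC of 0.05 are hypothetical values chosen for illustration.

```python
import math


def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Approximate design effect: 1 + (m - 1) * ICC."""
    if avg_cluster_size < 1:
        raise ValueError("avg_cluster_size must be at least 1")
    if not 0 <= icc <= 1:
        raise ValueError("icc must be between 0 and 1")
    return 1.0 + (avg_cluster_size - 1.0) * icc


def inflate_for_clustering(n_unclustered: int, avg_cluster_size: float, icc: float) -> int:
    """Sample size after multiplying by the design effect, rounded up."""
    return math.ceil(n_unclustered * design_effect(avg_cluster_size, icc))


# Hypothetical: 96 participants under simple randomization,
# clusters (e.g. sites) of 10 patients on average, ICC of 0.05.
print(design_effect(10, 0.05), inflate_for_clustering(96, 10, 0.05))
```

Even a small ICC can add a substantial number of participants when clusters are large, which is why the correlation assumption deserves the same scrutiny as the effect size.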
4. Effect Size Assumptions
The assumed effect size significantly impacts sample size calculations. Conservative estimates of smaller effect sizes typically result in larger sample sizes to ensure sufficient power. Conversely, optimistic estimates of larger effect sizes may underestimate the required sample size.
Sample Size Reduction Using More Sensitive Endpoints
In clinical trials for multiple sclerosis (MS), the choice of endpoint can significantly influence the required sample size. More sensitive endpoints, like No Evidence of Disease Activity (NEDA), can lead to smaller sample sizes compared to traditional endpoints such as the Annualized Relapse Rate (ARR) and Expanded Disability Status Scale (EDSS) progression. This efficiency is crucial for reducing costs, accelerating study timelines, and minimizing patient exposure to potentially ineffective treatments.
NEDA serves as a composite measure of disease activity, integrating multiple parameters to provide a comprehensive assessment of treatment efficacy. Specifically, NEDA-3 requires the absence of relapses, of new or enlarging MRI lesions, and of sustained EDSS progression. NEDA-4 extends this by also requiring the absence of significant brain volume loss. As highlighted in Table 4, these endpoints are not only more sensitive to changes in disease status but also more closely aligned with long-term clinical outcomes, making them well suited to evaluating DMTs.
Current research is exploring NEDA-5 and NEDA-6 as further developments of the NEDA composite endpoint. These extended endpoints aim to provide an even more comprehensive assessment of disease activity, potentially further reducing sample sizes and shortening trial durations. Whether they will indeed streamline the clinical trial process remains to be determined.
Conclusion
Accurate sample size estimation is crucial for designing effective MS clinical trials. By using composite endpoints like NEDA, researchers can potentially reduce the number of required participants, shorten study durations, and improve the efficiency of trials. Understanding and applying appropriate methodologies for sample size estimation will enhance the reliability of trial results and significantly contribute to the advancement of MS research.
Sample size determination is a multifaceted process influenced by numerous factors beyond the basic calculations of P1 and P2 values. Adjustments for interim analyses, dropouts, design effects, and conservative effect size assumptions are critical to ensuring that clinical trials are adequately powered to produce reliable and meaningful results. Recognizing these complexities is essential for researchers, clinicians, and stakeholders involved in the design and execution of clinical trials.
Looking forward, there is a growing need for continued innovation in clinical trial design, including the exploration of new biomarkers and personalized medicine approaches. These advancements could further refine the assessment of treatment efficacy and optimize patient outcomes. Additionally, the methodologies discussed in this article have broader implications for clinical trials in other chronic diseases, where sensitive and comprehensive endpoints are equally vital.
The evolving landscape of MS treatment and research underscores the importance of sophisticated trial designs. By embracing these innovative approaches, the scientific community can more accurately measure the efficacy of emerging treatments, leading to faster and more cost-effective clinical trials. This progress not only benefits the scientific understanding of MS but also has a direct impact on patient care, potentially offering more tailored and effective therapeutic options.
In conclusion, as the field of MS research advances, the adoption of more refined and sensitive endpoints like NEDA is essential. Continued collaboration among researchers, clinicians, and industry stakeholders is crucial to harness these advancements, ultimately leading to better therapeutic options and improved quality of life for individuals living with MS. The commitment to innovation and rigorous methodology in clinical trials will pave the way for future breakthroughs in MS treatment and beyond.
References