Whether to randomize or not has nothing to do with the sample size of a clinical trial

Statisticians -- and drug developers in any role:

In this post I will bust some myths around the size of RCTs.

Have you ever been involved in team discussions early in a development program where, once someone brought up

Should we not randomize?

the immediate reaction was

No, this is not possible, we are not planning a pivotal trial yet.

Here is the thing: randomization is completely independent of sample size.

Let me explain.

As discussed in this post, the purpose of randomization is to give an unbiased estimate of the relative causal effect. Now, this property (unbiasedness) has nothing to do with how many patients are in the trial. We can randomize 10 patients vs. 10 patients – this will give us an unbiased estimate of the effect of interest. Of course, given that we only have 20 patients in our trial, the variability of this estimate will be high, but again, that has nothing to do with the benefit of randomization.
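To make this concrete, here is a minimal simulation sketch; the true response proportions and arm sizes are assumptions purely for illustration:

```python
# Minimal sketch: a tiny randomized trial still gives an unbiased effect estimate.
# The true response proportions and arm sizes below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2024)

p_trt, p_ctrl = 0.40, 0.20   # assumed true response proportions
n_per_arm = 10               # 10 vs. 10 patients
n_sim = 100_000              # number of simulated trials

# Simulate the number of responders per arm and the estimated risk difference.
resp_trt = rng.binomial(n_per_arm, p_trt, n_sim)
resp_ctrl = rng.binomial(n_per_arm, p_ctrl, n_sim)
risk_diff = resp_trt / n_per_arm - resp_ctrl / n_per_arm

print(f"true risk difference   : {p_trt - p_ctrl:.3f}")
print(f"mean of the estimates  : {risk_diff.mean():.3f}")  # matches the truth: unbiased
print(f"std. dev. of estimates : {risk_diff.std():.3f}")   # large: imprecise, a separate issue
```

The average of the estimates sits at the true difference (unbiased), while their spread is large (imprecise) – exactly the distinction made above.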

So, where does the perception come from that randomization implies a large trial? I guess from the fact that pivotal trials typically have to be randomized and fully powered, which often leads to a large sample size. But again, fully powered and randomized are two completely independent concepts. Fully powered refers to the operating characteristics of a hypothesis test, which can be based on randomized or non-randomized data.
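To illustrate where the large sample sizes come from, here is a rough back-of-the-envelope calculation (a simple normal approximation; the assumed proportions, significance level, and power are illustrative only):

```python
# Back-of-the-envelope sample size for a "fully powered" two-arm comparison of
# proportions (normal approximation). All inputs are illustrative assumptions.
from scipy.stats import norm

p1, p2 = 0.40, 0.20        # assumed response proportions under treatment and control
alpha, power = 0.05, 0.80  # two-sided significance level and target power

z_alpha = norm.ppf(1 - alpha / 2)
z_beta = norm.ppf(power)
n_per_arm = (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2

print(f"approximate patients per arm: {n_per_arm:.0f}")  # roughly 80 per arm
```

Roughly 80 patients per arm – about four times the 40-patient trial discussed below. That is the price of being fully powered, not of randomization.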

So, in my opinion, getting an unbiased relative effect estimate also for decision-making in early development carries a lot of value and, importantly, we do not sacrifice much precision. My preferred way to think about this is: assume you develop an oncology molecule and want to run a trial with response proportion as primary endpoint (not rate, but that is for another day). Instead of asking

Should we run a single-arm or Phase 2 randomized trial?

rather ask

We plan to run a 40-patient trial; how can we generate the most evidence?

You have two options:

  • Single-arm trial. Assume the response proportion with the new treatment is 40%, so we get an estimate of 16 / 40, with a 95% Wilson confidence interval from 26.3% to 55.4%. However, we have no idea how large the potential bias of a comparison against an external control actually is.
  • 1:1 randomized trial (the randomization ratio could even be changed, but that does not change the message). Then we get an unbiased relative effect estimate comparing the two arms – very precious evidence for decision-making at this stage. In the treatment arm we get an estimate of the response proportion of 8 / 20, with a 95% Wilson confidence interval from 21.9% to 61.3%.

So, what is the bottom line? In the small RCT we get an unbiased relative effect estimate, and compared to the single-arm trial we sacrifice only a little precision in assessing the response proportion with the new treatment (compare the confidence intervals).
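If you want to reproduce the two intervals quoted above, here is a minimal sketch with a hand-rolled Wilson score interval (standard implementations, e.g. in statsmodels, should give the same numbers):

```python
# Minimal sketch to reproduce the two Wilson intervals quoted above.
from scipy.stats import norm

def wilson_ci(x, n, conf=0.95):
    """Wilson score confidence interval for a binomial proportion x / n."""
    z = norm.ppf(1 - (1 - conf) / 2)
    p_hat = x / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * (p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) ** 0.5 / denom
    return centre - half, centre + half

lo, hi = wilson_ci(16, 40)
print(f"single-arm trial, 16/40 : {lo:.1%} to {hi:.1%}")  # 26.3% to 55.4%
lo, hi = wilson_ci(8, 20)
print(f"randomized arm,  8/20   : {lo:.1%} to {hi:.1%}")  # 21.9% to 61.3%
```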

In a pivotal setting, e.g. in rare diseases or pediatrics, we might complement a small (underpowered) RCT with real-world data (RWD) using, e.g., dynamic borrowing.
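To give a flavor of what dynamic borrowing can look like for a binary endpoint, here is a sketch of one popular variant, a robust two-component beta mixture prior on the control response proportion. All numbers (historical data, prior weight, trial counts) are made-up assumptions for illustration only.

```python
# Sketch of dynamic borrowing via a robust two-component beta mixture prior on the
# control response proportion. Historical data, prior weight, and trial counts are
# made-up assumptions for illustration.
from math import comb, exp
from scipy.special import betaln

def betabinom_marginal(x, n, a, b):
    """Prior predictive probability of x responders out of n under a Beta(a, b) prior."""
    return comb(n, x) * exp(betaln(a + x, b + n - x) - betaln(a, b))

# Robust mixture prior: an informative component summarizing historical/RWD controls,
# plus a vague component that protects against prior-data conflict.
w_inf = 0.8                # prior weight on the informative component
a_inf, b_inf = 12.0, 28.0  # informative component, centred at ~30% response
a_vag, b_vag = 1.0, 1.0    # vague (uniform) component

# Concurrent control arm of the small RCT (assumed numbers).
x_ctrl, n_ctrl = 4, 20

# Posterior mixture weights: prior weights times each component's marginal likelihood,
# renormalized. The more the trial data conflict with the historical data, the smaller
# the weight on the informative component, i.e. the less we borrow.
m_inf = w_inf * betabinom_marginal(x_ctrl, n_ctrl, a_inf, b_inf)
m_vag = (1 - w_inf) * betabinom_marginal(x_ctrl, n_ctrl, a_vag, b_vag)
w_post = m_inf / (m_inf + m_vag)

print(f"posterior weight on the informative component: {w_post:.2f}")
# The full posterior is w_post * Beta(a_inf + x_ctrl, b_inf + n_ctrl - x_ctrl)
#             + (1 - w_post) * Beta(a_vag + x_ctrl, b_vag + n_ctrl - x_ctrl).
```

The posterior weight on the informative component drops automatically when the trial data conflict with the historical data – that is the "dynamic" in dynamic borrowing.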

So, to conclude: the following three concepts can be tuned almost completely independently of each other (the second and third are of course related).

  • Randomization gives an unbiased effect estimate.
  • Sample size determines how precisely we can estimate a relative treatment effect and absolute effects in each arm.
  • Fully powered refers to the operating characteristics of a hypothesis test for the effect of interest.

Again: Whether to randomize or not has nothing to do with the sample size of a clinical trial.

Jens Praestgaard

Retired from Novartis Institutes for BioMedical Research (NIBR)

1y

"Of course, given that we only have 20 patients in our trial the variability of this estimate will be high, but again, that has nothing to do with the benefit of randomizatio" This is so patenty not true just from first principles of statistical inference.

Arindam Pal

Director, Clinical Development

1y

Kaspar Rufibach it's economics, not sample size, that stops people from randomising. Taking the message to preclinical experiments, one can't dream of randomizing n = 6, 8, 10, 12 animals in typical crossover studies. It is no different for Phase 1 studies in humans.

Giusi Moffa

Accidental Statistician on a mission to promote (health) data analytics we can trust.

1y

aka: bias-variance trade-off?

Adrian Olszewski

Clinical Trials Biostatistician at 2KMM (100% R-based CRO) • Frequentist (non-Bayesian) paradigm • NOT a Data Scientist (no ML/AI/Big data) • Against anti-car/-meat/-cash and C40 restrictions

1y

Kaspar Rufibach Absolutely! That would be of big value for the community! (Consider also adding #clinicaltrials #biostatistics #statistics #clinicalresearch #research, as this applies to other experimental research too; BTW, the clinicaltrial tag that you used is rarely used on LinkedIn, I noticed.)

Dr. Alexander Schacht

Author, Speaker, Podcaster, Leadership Trainer. Fear is a reaction. Courage is a decision. The Effective Statistician! Medical affairs/RWE/HTA expert statistician.

1y

I wasn't even aware of this misconception.
