Bias Binding?
Some important books in the library of SJS


By randomizing the order in which the administrative regions change the treatment regimen, SWITCH SWEDEHEART overcomes many of the concerns of confounding results, which would be present in non–randomized pre- and post intervention evaluations. [1] (P71)

Ex and X

On 17 August 2023, in a debate on the platform formerly known as Twitter, Elmir Omerovic (@elmir1omerovic) drew attention to an interesting 'stepped-wedge, cluster-randomized, open-label multicenter trial to compare prasugrel and ticagrelor for treatment of patients with acute coronary syndrome' [1] (P70). In a related thread on 19 August 2023, I discussed bias with Judea Pearl (@yudapearl), who stated:

But then concepts such as ATE, CATE, Bias, Generalizability, etc., should reside outside Experimental Design, since these are asymptotic properties, not concerned with precision.

To which I (@stephensenn) replied:

Yes and no. You can talk about estimands without considering design but bias is a slippery concept. Some designs eliminate terms that others average over. From one perspective this averaging introduce a bias from another it contributes to the variance but not the bias.

It seems to me that the SWITCH SWEDEHEART study is a good one to illustrate this point, which I shall now proceed to do.

Divine Design?

Seven administrative regions will be divided into 3 clusters of similar size, with the smaller regions (based on population size) combined to form a single cluster (Figure 1). At the start of the study, all regions will utilize and prescribe ticagrelor as the P2Y12 inhibitor of choice for patients with ACS. The order in which the regions will implement the transition from ticagrelor to prasugrel will be randomly assigned using a random number generator (R studio, version 1.4). After 9 months, 1 cluster will change from ticagrelor to prasugrel as the P2Y12 inhibitor of choice for patients with ACS. An additional cluster will change to prasugrel every 9 months until all clusters have transitioned. Study enrollment will cease 9 months after the last cluster has changed to prasugrel. The study will be terminated when all patients have concluded a 1-year follow-up. [1] (P72)

The above lines from the paper succinctly and expertly summarise the design of the SWITCH SWEDEHEART study. Three treatment sequences are to be used, with a different cohort assigned to each. They are set out in Table 1 below, which is an extract from the Mathcad program I used to analyse the design.

Designs of the SWITCH SWEDEHEART study.

Table 1. Sequences to be used.

Note that there is a certain symmetry to the design in that period 4 is to prasugrel what period 1 is to ticagrelor and period 2 is to ticagrelor what period 3 is to prasugrel. Note also that the design defines 12 sets of observations that may be grouped together in terms of the cohort and period they occur in and of course the treatment given. I shall refer to these 12 combinations as cells.
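The layout just described can be written down in a few lines. The following is a minimal Python sketch, not the Mathcad program referred to above. It assumes that cohort I is the first to switch and cohort III the last, and the names `treatment`, `design` and `cells` are purely illustrative.

```python
COHORTS = ["I", "II", "III"]
PERIODS = [1, 2, 3, 4]


def treatment(cohort_index, period):
    """Return 'P' once a cohort has switched to prasugrel, otherwise 'T'.

    Assumption: the cohort with index k (0 for I, 1 for II, 2 for III)
    switches at the start of period k + 2.
    """
    return "P" if period >= cohort_index + 2 else "T"


design = [[treatment(i, p) for p in PERIODS] for i in range(len(COHORTS))]

# The 12 cells: one per combination of cohort and period.
cells = [(COHORTS[i], p, design[i][p - 1])
         for i in range(len(COHORTS)) for p in PERIODS]

# The symmetry noted above: turning the layout through 180 degrees
# exchanges T and P in every cell.
swap = {"T": "P", "P": "T"}
symmetric = all(design[2 - i][3 - j] == swap[design[i][j]]
                for i in range(3) for j in range(4))

for name, row in zip(COHORTS, design):
    print(f"{name:>3}: {' '.join(row)}")
print("cells:", len(cells), "symmetric:", symmetric)
```

Under this assumed labelling, six of the twelve cells are on T and six on P.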

I am now going to make some simplifying assumptions that I know will not be true, simply to illustrate some features of the design. They are

  1. There will be equal numbers of patients in each cell.
  2. There will be no further information available for any patient apart from period, cohort, treatment given and outcome.
  3. An appropriate estimate will be linear.

These assumptions together mean that a) any estimator will be a (weighted) linear combination of the 12 cell statistics (for example the proportion of patients having a bad outcome) and b) the efficiency of any estimate can be examined by looking at the weights (although it may depend on further information).

Weighing things up

Our task is now to find the 12 cell weights in order to estimate the contrast of interest (say the difference in the event rate between the two treatments). The following must apply to these weights.

  1. The weights must sum to -1 (say) over T and to 1 over P.
  2. For each of the six weights in a cell in which T appears, there must be a corresponding weight that is its negative in the matching cell in which P appears (bearing in mind the 'symmetry' previously mentioned). For example, this implies that the weight in period 4 of cohort III must be -1 times the weight in period 1 of cohort I.
  3. If we wish to eliminate trend effects, the weights must sum to 0 in any column.

What solution would these constraints imply? The answer is given in Table 2 below.

Weights that satisfy the linear constraints. Those applying to T have been circled.

Table 2. General scheme of weights.

If you look at these weights you will see that the constraints have reduced the problem to one of finding three unknowns only and not 12. The weights sum to 0 in any column, thus making sure that the trend effects will be eliminated from our estimate and they sum to -1 over T (the circled weights) and to 1 over P (the un-circled weights). They will thus, from one perspective, give us an unbiased estimate of the P-T contrast. This is a point to which I shall return.

But what values should the weights w1, w2 and w3 be given? If unbiasedness is your only criterion, any values at all will do. For example, I could assign the value 16 to w1 and 18 to w2 on the grounds that Pythagoreans liked these numbers, and 42 to w3 because Douglas Adams chose this to be the 'Answer to the Ultimate Question of Life, the Universe, and Everything' in The Hitchhiker's Guide to the Galaxy. These would be eccentric and extremely bad choices but they would still produce unbiased estimates.
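To see that the constraints leave exactly three numbers free, here is a hedged Python sketch. The parametrisation in `general_weights` is one reconstruction from the three constraints, assuming cohort I switches first and cohort III last. It need not match the layout of Table 2 cell for cell, and the function names are illustrative.

```python
from fractions import Fraction as F

# 1 marks a cell on prasugrel (P), 0 a cell on ticagrelor (T).
# Rows are cohorts I, II, III; columns are periods 1 to 4 (assumed layout).
DESIGN = [[0, 1, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 0, 1]]


def general_weights(w1, w2, w3):
    """Weights satisfying the three constraints for any w1, w2, w3."""
    w1, w2, w3 = F(w1), F(w2), F(w3)
    h = F(1, 2)
    return [[w1,       h,       h + w3, w1 + w2],
            [w2,       w3,      -w3,    -w2],
            [-w1 - w2, -h - w3, -h,     -w1]]


def check(w):
    """Return (column sums, sum over T cells, sum over P cells)."""
    cols = [sum(w[i][j] for i in range(3)) for j in range(4)]
    over_t = sum(w[i][j] for i in range(3) for j in range(4)
                 if DESIGN[i][j] == 0)
    over_p = sum(w[i][j] for i in range(3) for j in range(4)
                 if DESIGN[i][j] == 1)
    return cols, over_t, over_p


def sum_of_squares(w):
    return sum(x * x for row in w for x in row)


eccentric = general_weights(16, 18, 42)
print(check(eccentric))           # the constraints hold whatever the values
print(sum_of_squares(eccentric))  # but the weights are enormous
```

Any other triple can be passed to `general_weights` in the same way. The constraints continue to hold and only the sum of squares changes.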

What I could do, for example, is assume that the 12 cell statistics would be independent with identical variances. In that case the minimum variance solution would be the one that minimised the sum of the squares of the weights. This leads to the ordinary least squares solution given in Table 3, which corresponds to setting w1 and w2 to zero and w3 to -1/4 = -0.25.

Ordinary Least Squares Solution

Table 3. OLS weights

Note that period 1 and period 4 values will have no effect on the estimate whatsoever. Why does this happen? It happens because in period 1 only T is given and in period 4 only P is given but in order to eliminate period effects the weights have to sum to 0 in any column. Thus, whatever weights I assign, subject to this constraint, cannot contribute to the signal but only to the noise. Thus setting them to zero is optimal.
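The ordinary least squares weights can be reproduced with a short calculation. The sketch below relies on a standard least-squares result: with independent, equal-variance cells, the minimum-variance weights that remove period effects are the treatment indicator with its period mean subtracted, rescaled so that they sum to 1 over P. It assumes cohort I switches first, and the function name is illustrative.

```python
from fractions import Fraction as F

# 1 marks a cell on prasugrel (P), 0 a cell on ticagrelor (T).
# Rows are cohorts I, II, III; columns are periods 1 to 4 (assumed layout).
DESIGN = [[0, 1, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 0, 1]]


def period_adjusted_weights(design):
    """Centre the treatment indicator within each period, then rescale."""
    n_rows, n_cols = len(design), len(design[0])
    col_mean = [F(sum(design[i][j] for i in range(n_rows)), n_rows)
                for j in range(n_cols)]
    resid = [[design[i][j] - col_mean[j] for j in range(n_cols)]
             for i in range(n_rows)]
    scale = sum(r * r for row in resid for r in row)
    return [[r / scale for r in row] for row in resid]


weights = period_adjusted_weights(DESIGN)
for name, row in zip(["I", "II", "III"], weights):
    print(name, [str(x) for x in row])
print("sum of squares:", sum(x * x for row in weights for x in row))
```

Periods 1 and 4 drop out because the treatment indicator is constant there, so centring leaves nothing behind.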

Bias is in the eye of the beholder

I have described all the solutions so far as being unbiased. However, the attentive reader will have noticed that the weights do not add to zero for all cohorts. They do for cohort II but not for cohorts I and III. Thus, if there are cohort effects they will contribute to the estimate. Why then am I entitled to call the associated estimators unbiased?

I am exploiting a legal loophole here. SWITCH SWEDEHEART is randomised. Over all randomisations it is unbiased. Thus I can imagine repeating this experiment infinitely many times and the cohort effect will be eliminated by averaging.

Does this let me off the hook? Not in the world of statistics, although this point is often misunderstood. We cannot run infinitely many experiments, only one. However, if we have randomised we can make an allowance for the uncertainty that any cohort effect would bring. The cohort effect becomes a so-called random effect and contributes to the variance. (Incidentally, this will also mean that the simple variance structure I have assumed so far will not hold and the OLS solution will not be optimal.) Provided we don't obsess about the point estimate but consider estimation as being a matter of issuing a probability statement, we can deal with such random variation.
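The randomisation argument can be illustrated numerically. In this sketch the cohort totals 3/4, 0 and -3/4 are the row sums of the ordinary least squares weights (obtained under the assumption that cohort I switches first), and the three cluster effects are invented numbers used only for illustration. Averaged over the six equally likely orders the cohort contribution is zero, but in any single order it need not be, which is why it belongs in the variance.

```python
from fractions import Fraction as F
from itertools import permutations

# Row sums of the OLS weights for cohorts I, II and III (assumed layout).
ROW_SUMS = [F(3, 4), F(0), F(-3, 4)]

# Hypothetical cluster effects, invented for illustration only.
cluster_effects = [F(5), F(-2), F(1)]

# Each permutation is one possible random allocation of clusters to sequences.
contributions = [sum(s * e for s, e in zip(ROW_SUMS, order))
                 for order in permutations(cluster_effects)]

mean = sum(contributions) / len(contributions)
variance = sum((c - mean) ** 2 for c in contributions) / len(contributions)

print("contributions:", [str(c) for c in contributions])
print("mean over randomisations:", mean)
print("variance over randomisations:", variance)
```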

Difference or sum?

Unfortunately, with this design, one variance component, the one between cohorts, will be estimated very poorly, and both the optimal solution and the variance of the accompanying estimate will depend on this variance component. Many statisticians would therefore choose to eliminate it rather than average (conceptually) over it. We can do this by imposing a further constraint that the weights must add to zero in any row. This now treats the cohort effects as being fixed. If we do this and then apply the ordinary least squares solution, the result is as given in Table 4.

OLS weights if cohort effects have to be eliminated.

Table 4. OLS weights eliminating cohort effects.

Now the weights not only add to 0 over any period, to -1 over T and to 1 over P, but also add to 0 in any cohort.
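Under the same equal-variance assumption, removing cohort effects as well amounts to subtracting both the period mean and the cohort mean from the treatment indicator (adding back the overall mean) before rescaling. A minimal sketch, again assuming cohort I switches first. Its output is a calculation under that assumption, not a transcription of Table 4, and the function name is illustrative.

```python
from fractions import Fraction as F

# 1 marks a cell on prasugrel (P), 0 a cell on ticagrelor (T).
# Rows are cohorts I, II, III; columns are periods 1 to 4 (assumed layout).
DESIGN = [[0, 1, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 0, 1]]


def cohort_and_period_adjusted_weights(design):
    """Double-centre the treatment indicator, then rescale."""
    n_rows, n_cols = len(design), len(design[0])
    row_mean = [F(sum(row), n_cols) for row in design]
    col_mean = [F(sum(design[i][j] for i in range(n_rows)), n_rows)
                for j in range(n_cols)]
    grand = F(sum(map(sum, design)), n_rows * n_cols)
    resid = [[design[i][j] - row_mean[i] - col_mean[j] + grand
              for j in range(n_cols)] for i in range(n_rows)]
    scale = sum(r * r for row in resid for r in row)
    return [[r / scale for r in row] for row in resid]


weights = cohort_and_period_adjusted_weights(DESIGN)
for name, row in zip(["I", "II", "III"], weights):
    print(name, [str(x) for x in row], "row sum:", sum(row))
print("column sums:", [str(sum(col)) for col in zip(*weights)])
print("sum of squares:", sum(x * x for row in weights for x in row))
```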

No free lunch

However, all these bias corrections come at a price. If only I knew that the cohort effects were zero and that the period effects were zero, I could simply multiply every T cell, of which there are 6, by -1/6 and every P cell, of which there are also 6, by 1/6. The resulting estimate would have a variance proportional to 12 × (1/6 × 1/6) = 1/3. Instead, the sum of the squared weights in Table 3 is 0.75 and thus 2.25 times as high. If I take the weights in Table 4, the sum of their squares is 1.2 and thus 3.6 times as high. In fact, in order to eliminate the cohort effect, and despite the fact that I am estimating the P-T contrast, I have had to give one of the cells in which T is given a positive weight. This seems a very unnatural thing to do. Other things being equal, a poor response to T in this cell will have the effect of making T look better and P look less good.
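The three variance multipliers quoted above can be checked with exact arithmetic. This is a sketch under the equal-variance assumption and an assumed layout in which cohort I switches first. The helper `adjusted_weights` is illustrative and computes least-squares weights by centring the treatment indicator.

```python
from fractions import Fraction as F

# 1 marks a cell on prasugrel (P), 0 a cell on ticagrelor (T).
DESIGN = [[0, 1, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 0, 1]]


def adjusted_weights(design, remove_cohorts):
    """Weights removing period effects and, optionally, cohort effects."""
    n_rows, n_cols = len(design), len(design[0])
    col_mean = [F(sum(design[i][j] for i in range(n_rows)), n_rows)
                for j in range(n_cols)]
    row_mean = [F(sum(row), n_cols) for row in design]
    grand = F(sum(map(sum, design)), n_rows * n_cols)
    resid = [[design[i][j] - col_mean[j]
              - ((row_mean[i] - grand) if remove_cohorts else 0)
              for j in range(n_cols)] for i in range(n_rows)]
    scale = sum(r * r for row in resid for r in row)
    return [[r / scale for r in row] for row in resid]


def sum_sq(w):
    return sum(x * x for row in w for x in row)


naive = [[F(1, 6) if d else F(-1, 6) for d in row] for row in DESIGN]
periods_only = adjusted_weights(DESIGN, remove_cohorts=False)
both = adjusted_weights(DESIGN, remove_cohorts=True)

base = sum_sq(naive)
for label, w in [("no adjustment", naive),
                 ("periods removed", periods_only),
                 ("periods and cohorts removed", both)]:
    print(f"{label}: sum of squares {sum_sq(w)}, ratio {sum_sq(w) / base}")

# T cells that nevertheless receive a positive weight (zero-based indices).
odd = [(i, j) for i in range(3) for j in range(4)
       if DESIGN[i][j] == 0 and both[i][j] > 0]
print("T cells with positive weight (cohort index, period index):", odd)
```

Under the assumed layout, the single T cell with a positive weight is period 1 of the last cohort to switch.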

Note that the two solutions presented can be seen as special cases of a more general one, in which I allow for two variances, between and within clusters. The first solution corresponds to the between-cluster variance being zero and the second to its being infinite. However, as already explained, we don't have enough degrees of freedom to estimate the between-cluster variance reliably. In any case, this intermediate solution also pays a price for the lack of orthogonality in the design.
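As a rough illustration of this more general solution, suppose each cell statistic has variance σ² and each cohort adds an independent random effect with variance θσ². Using a three-parameter form of the weights reconstructed from the constraints (an assumption, with cohort I switching first), minimising the resulting variance gives a closed form, which the sketch below states and evaluates. The formula for `u` is a derivation under these assumptions, not a result quoted from the source, and all names are illustrative.

```python
from fractions import Fraction as F


def general_weights(w1, w2, w3):
    """Weights satisfying the period, T and P constraints (assumed layout)."""
    h = F(1, 2)
    return [[w1,       h,       h + w3, w1 + w2],
            [w2,       w3,      -w3,    -w2],
            [-w1 - w2, -h - w3, -h,     -w1]]


def variance(w, theta):
    """Variance of the estimate in units of the within-cell variance.

    Assumption: cohort effects are independent with variance theta times
    the within-cell variance, so each cohort contributes through its row sum.
    """
    within = sum(x * x for row in w for x in row)
    between = sum(sum(row) ** 2 for row in w)
    return within + theta * between


def best_parameters(theta):
    """Minimising (w1, w2, w3) for a given variance ratio theta."""
    u = 3 * theta / (4 + 10 * theta)
    return -u, F(0), -(1 + 2 * u) / 4


for theta in [F(0), F(1, 4), F(1), F(10), F(10**6)]:
    w1, w2, w3 = best_parameters(theta)
    v = variance(general_weights(w1, w2, w3), theta)
    print(f"theta={float(theta):g}: w1={float(w1):.4f}, "
          f"w3={float(w3):.4f}, variance={float(v):.4f}")
```

At θ = 0 the parameters reduce to the ordinary least squares values, and as θ grows they approach the weights that eliminate cohort effects, with the variance multiplier rising from 0.75 towards 1.2.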

What lessons can we draw?

The reader, of course, may not agree with the lessons that I draw but these are what seem important to me.

  1. Bias is a weasel word. One person's bias is another person's variance.
  2. It is a mistake to regard estimates as of primary importance and variances as secondary. Sometimes you cannot sensibly discuss estimates without examining variances carefully. In any case the result of an experiment ought to be a probabilistic inference, not a best guess with no knowledge as to how good the guess may be.
  3. Design matters.
  4. There are many different types of clinical trial, but doing better than the simple parallel group trial practising concurrent control is harder than one may suppose.


  1. E. Omerovic, D. Erlinge, S. Koul, O. Frobert, J. Andersson, J. Ponten, F. Bjorklund, R. Kastberg, M. Petzold, C. Ljungman, K. Bolin and B. Redfors (2022) Rationale and design of switch Swedeheart: A registry-based, stepped-wedge, cluster-randomized, open-label multicenter trial to compare prasugrel and ticagrelor for treatment of patients with acute coronary syndrome. Am Heart J, 70-77.

