Is surviving COVID in our blood? Real-world data offers support for an answer.
Bryan Farrow
Senior Director of Product Marketing leading a cross-functional marketing team at Certara
"How bad could this get?"
A lot of us are asking that question now, on topics from pandemics to political strife. Care providers hear it every day, from patients with more personal concerns. How is their case--of cancer, heart disease, depression, you name it--likely to progress? Clinicians don't have crystal balls. What they do have are prognostic biomarker studies.
These studies, which examine relationships between baseline observations and outcomes, beat at the heart of personalized care. The best ones draw on large, diverse, and carefully defined cohorts for good reasons: reducing bias, detecting signal from noise, and understanding how strata like sex, age, and race affect outcomes, to name just a few. But "large" and "diverse" are relative terms: healthcare institutions are often limited to enrolling their own patients. Do their findings hold up when tested against populations that are orders of magnitude larger than the study cohort?
Meta-analyses help us answer that question. Consider a recent article from Biomarkers in Medicine, Red blood cell distribution width as a prognostic biomarker for viral infections: prospects and challenges, authored by Oloche Owoicho and peers. The review surveys the prognostic power of RDW (red blood cell distribution width, or the range in volume and size of a patient's red blood cells) in the course of viral infections. Does elevated RDW at baseline predict greater severity in cases of hepatitis, COVID-19, and other infectious diseases? The answer, as expected, is nuanced, and we're well advised not to place extraordinary bets based on a single value from a complete blood count. But if one takeaway sticks right now, with one SARS-CoV-2 variant raging and another entering the fray, it's the association between elevated RDW and 30-day mortality from COVID-19. Most analyses included in the authors' table of studies point to a link. These studies enrolled anywhere from 45 to 1,641 patients each. The aggregate results are suggestive. Still, the limited sample sizes of the individual studies might temper our impulse to draw any final conclusion. The authors make explicit other limitations, too, including a lack of uniform lab standards and reference ranges.
So what can we do to ground or dismiss our intuition that a real association is at play here? We could search the literature for more related studies. There’s plenty to be said for that approach. Different researchers with different backgrounds, methodologies, and data sources can enlarge our view of a problem, suggesting hidden variables and alternate explanations. But two biases will hound us on this road: our own confirmation bias (are we taking contradictory results as seriously as confirmatory ones?) and publication bias (are those contradictory results seeing print in the first place?).
I want to suggest another way forward: trying to replicate the results on a new and larger dataset. As a TriNetX user, I can query more than 100 million de-identified patients on a single network--records whose data has been standardized to terminologies like ICD-10 and LOINC, and that have, in millions of cases, been matched to public and private death registries, medical claims, and pharmacy claims. Paired with our platform’s analytic features, the power of this resource has humbled even the most data-fluent of clinicians.
What follows is just sketch work, not a statistically rigorous exercise. I make some assumptions and take some risky shortcuts, such as treating continuous variables, like RDW, as binary ("elevated" vs. "non-elevated"). But I offer it as a recipe for forming hypotheses, quickly obtaining directional results, and inspiring more scrupulous work on a problem.
So...
Q: Can I show that elevated RDW at baseline predicts an increased risk of 30-day mortality from COVID?
Bonus: Can I establish anything about its independence as a predictor? (How predictive does elevated RDW remain after controlling for elevation in other markers, like WBC?)
To start, I'll define a cohort on the TriNetX query builder. From a pool of 152 million de-identified records at 125 healthcare organizations around the world, I ask for records showing:
Why did I require more than an RDW value at baseline? Eventually, I want to understand how variations in other values impact the predictive power of RDW on its own. (If I control for, say, C-reactive protein levels, is elevated RDW still associated with increased mortality? If the association is weakened, by how much?) The analytes listed here are some, but by no means all, of the markers that related literature has offered as predictors of severity. Naturally, I won't be able to say anything about the absolute independence of RDW as a predictor with this cohort. In fact, I'm not even aiming to show its complete independence from these analytes. Rather, I want to get a sense of how my analysis changes when co-variates are included, so I can better plan my next steps.
My query returned a cohort of 46,330 patients: a nice, robust set. Here are some baseline characteristics of interest:
Next, I'll split this cohort into two groups: High RDW (at least one RDW result >=14.5% and none <14.5% at baseline) and Normal RDW (at least one RDW result <14.5% and none >=14.5% at baseline). Why a cutoff of 14.5%? While sources vary in their definition of "elevated", and often set the thresholds differently for men and women, 14.5% was the most cited value in my unscientific survey of credible reference ranges.
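If it helps to see the stratification rule as logic rather than prose, here's a minimal sketch. The function name, and the idea of a patient's baseline results arriving as a plain list, are my own illustration, not how TriNetX represents the query:

```python
# Sketch of the stratification rule on hypothetical data, not TriNetX output.
# A patient lands in "high" only if every baseline RDW result is >= 14.5%,
# and in "normal" only if every result is < 14.5%; mixed results drop out.
RDW_CUTOFF = 14.5  # %, the most-cited upper reference limit in my survey

def classify_rdw(baseline_results):
    """Return 'high', 'normal', or None (mixed/empty) for a list of RDW %."""
    if not baseline_results:
        return None
    if all(r >= RDW_CUTOFF for r in baseline_results):
        return "high"
    if all(r < RDW_CUTOFF for r in baseline_results):
        return "normal"
    return None  # both high and normal results at baseline: excluded

print(classify_rdw([15.1, 14.9]))  # high
print(classify_rdw([13.2]))        # normal
print(classify_rdw([13.9, 14.7]))  # None (mixed results)
```

The `None` branch is what quietly shrinks the cohort, as noted below.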
Now, these are broad groups. If exactitude is the goal, categorizing continuous variables into high and normal bins is not a best practice. If I do find any differences between these groups, I can't assume a correlation between RDW and mortality. To show such a relationship, my best bet would be an analysis that treated the predictor and the dependent variable as continuous (in the latter case, by considering days' survival, without this artificial 30-day cutoff). Depending on the model I use, I may need to transform either or both of the variables, and censor mortality endpoints for patients lost to follow-up. And I could do that, by downloading the patient-level data from my initial cohort. But at this stage I'm looking for clues, not assurances.
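For the curious, the censoring step amounts to turning each patient record into a (duration, event) pair that a survival model can consume. The field names below are hypothetical stand-ins for whatever the downloaded patient-level data actually contains:

```python
from datetime import date

# Build (duration_days, event_observed) pairs for a survival model,
# right-censoring patients who were lost to follow-up before dying.
# Column/field names are illustrative, not the actual export schema.
def to_survival_pair(index_date, death_date, last_contact_date):
    if death_date is not None:
        return ((death_date - index_date).days, True)    # event observed
    return ((last_contact_date - index_date).days, False)  # right-censored

print(to_survival_pair(date(2021, 1, 1), date(2021, 1, 20), date(2021, 6, 1)))
# (19, True)
print(to_survival_pair(date(2021, 1, 1), None, date(2021, 2, 15)))
# (45, False)
```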
Running these modified queries produces two groups:
(Some patients from the original cohort fell out of my analysis during this stratification because they showed both high and normal results during the baseline window of +/- 1 day from the COVID diagnosis.)
To compare 30-day mortality between these two groups, I'll turn to our Compare Outcomes analytic. In under a minute, after specifying the COVID diagnosis (with lab values) as my index event, specifying death as my outcome, and limiting my time horizon to 30 days, I retrieve:
These results pique my interest. Both the relative risk of 2.32 to 2.52 and the hazard ratio of 2.25 (for curves that appear quite proportional) suggest a real and sizeable difference between these two groups. If I haven't quite established the predictive power of RDW to my own satisfaction, I have by now strengthened my conviction that there's something here.
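For readers who want the relative risk arithmetic spelled out, here's the standard 2x2-table calculation with a log-method 95% confidence interval. The counts are made up purely for illustration; the cohort's actual numbers live behind the TriNetX analytics:

```python
from math import exp, log, sqrt

# Relative risk with a 95% CI (log method) from a 2x2 table.
# The counts in the example call are invented, NOT the cohort's numbers.
def relative_risk(deaths_exposed, n_exposed, deaths_unexposed, n_unexposed):
    risk_e = deaths_exposed / n_exposed
    risk_u = deaths_unexposed / n_unexposed
    rr = risk_e / risk_u
    # Standard error of log(RR)
    se = sqrt(1/deaths_exposed - 1/n_exposed + 1/deaths_unexposed - 1/n_unexposed)
    lo, hi = exp(log(rr) - 1.96 * se), exp(log(rr) + 1.96 * se)
    return rr, (lo, hi)

rr, ci = relative_risk(500, 10_000, 200, 10_000)
print(f"RR = {rr:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
# RR = 2.50, 95% CI = (2.13, 2.94)
```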
What I haven't done, even a little bit, is assess the independence of RDW as a prognostic biomarker. I hinted at this earlier, but what exactly is independence? My favorite explanation comes from a 2005 article from Daniel Brotman and Michael Lauer in Archives of Internal Medicine.
"Gray hair is undoubtedly a risk factor for myocardial infarction (MI): on average, a gray-haired person is more likely to suffer an MI than a person without gray hair. However, if 2 men are identical in age but only 1 has gray hair, it is unlikely that the gray-haired man is at increased risk for MI. Thus, hair color is probably not an independent risk factor for MI when age is considered."
Is elevated RDW analogous to the gray hair in the example above?
I can begin answering that question by controlling for elevated and non-elevated levels of the other analytes before making my comparison. Here too, a rigorous approach would require exact values for all or most patients, an approach TriNetX supports by licensing patient-level data. For "coarse controlling", I will further stratify my groups: for every analyte (not just RDW), I will label each patient as above the normal range or not.
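The coarse labeling itself is simple to sketch. The reference limits below are common textbook-style values I'm using for illustration only, and the analyte list isn't exhaustive; real work would use each lab's own reference ranges:

```python
# "Coarse controlling": flag each analyte as above its upper reference
# limit or not. Limits here are illustrative, not authoritative.
UPPER_LIMITS = {
    "rdw_pct": 14.5,      # red cell distribution width, %
    "wbc_10e3_uL": 11.0,  # leukocytes, x10^3/uL
    "crp_mg_L": 10.0,     # C-reactive protein, mg/L
}

def flag_elevated(results, limits):
    """Map each analyte to True if any baseline result meets/exceeds its limit."""
    return {name: any(v >= limits[name] for v in values)
            for name, values in results.items()}

patient = {"rdw_pct": [15.1], "wbc_10e3_uL": [9.8, 12.3], "crp_mg_L": [4.0]}
print(flag_elevated(patient, UPPER_LIMITS))
# {'rdw_pct': True, 'wbc_10e3_uL': True, 'crp_mg_L': False}
```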
When I compare the risk of mortality between my two groups taking these strata into account, does elevated RDW remain predictive? If so, how predictive, compared to the uncontrolled results?
Rather than proceed stratum by stratum for every analyte, which could introduce a multiple comparisons problem requiring adjustment, I am going to match patients with propensity scores. Balancing my cohorts in this way helps me compare apples with apples and oranges with oranges. (Although, as "above normal" still allows for a lot of variation, you could say I am comparing fruit with fruit and vegetables with vegetables, at best.)
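To make the matching idea concrete, here's a toy sketch of greedy 1:1 nearest-neighbor matching on precomputed propensity scores with a caliper. The scores themselves would come from a logistic regression of group membership on the covariates; nothing here claims to be the algorithm TriNetX implements:

```python
# Greedy 1:1 nearest-neighbor matching on propensity scores, with a
# caliper. Toy illustration only; scores and IDs are invented.
def greedy_match(treated, control, caliper=0.05):
    """treated/control: dicts of id -> propensity score. Returns matched pairs."""
    available = dict(control)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        best = min(available.items(),
                   key=lambda kv: abs(kv[1] - t_score),
                   default=None)
        if best and abs(best[1] - t_score) <= caliper:
            pairs.append((t_id, best[0]))
            del available[best[0]]  # each control patient matched once
    return pairs

high = {"p1": 0.62, "p2": 0.35}
normal = {"q1": 0.60, "q2": 0.30, "q3": 0.90}
print(greedy_match(high, normal))  # [('p2', 'q2'), ('p1', 'q1')]
```

The caliper is what keeps a treated patient from being paired with a wildly dissimilar control when no close neighbor exists.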
Here are the measures of association after balancing the cohorts with propensity score matching:
Interesting. Elevated RDW still seems to show a strong association with greater mortality risk. My sense is that PSM, while helpful, might not be the best tool here. PSM relies on logistic regression, which works best when there is little to no correlation between the input variables. My four input variables all tend to rise when infection or inflammation is present, so in my cohorts, they are likely rising together. Assuming the sub-groups are large enough, maybe eight stratum-by-stratum comparisons are the way to go. Let's pick one feature, leukocytes, and focus on patients in the high range. Here's what we get when we compare high and normal RDW patients, both with a WBC >=11x10^3/uL:
Remarkable. High RDW still holds on to its predictive power, with a relative risk greater than 2.
It looks like RDW isn't gray hair after all. Of course, that doesn't mean it's the only or best predictor. In fact, the relative risk for high leukocyte patients, even when RDW is high, comes out to 1.72. Also, I didn't control for a whole host of clinical observations and comorbidities like COPD, any of which may be far better predictors of severity.
What I did do, in an hour's worth of exploration, is start confirming some important research. I need to get my hands on the raw data and keep feeding my data hobby.
Until then, I'm going to measure all my red blood cells.