The Hardest Problem There Is
Stephen Ruberg
Distinguished Statistical Scientist at Analytix Thinking, LLC -- Fellow of the American Statistical Association
Some former Lilly colleagues of mine and I just published a paper on subgroup identification and how we used a general platform for evaluating subgroup identification tools in practice. Subgroup identification is the Holy Grail of medicine and, accordingly, the statistical issues related to it are the hardest problem there is. For predictive subgroups, how do we sort through a plethora of covariates to find the one(s) that predict who will respond best (or worst) to a treatment (or other intervention)? The problem is also related to "digital medicine" and the many efforts there to identify prognostic subgroups. How do we sort through a multitude of covariates to identify patients who are most likely to progress toward some disease or deleterious outcome (Alzheimer's, sepsis, COVID hospitalization/ICU/death)? In both areas - predictive and prognostic subgroup identification - there are many, many, many more failures than successes. Yet the medical literature is still full of "promising subgroups" that are supposed to rescue mediocre or failed treatments, and the digital medicine literature is still filled with ML/AI algorithms and models that are supposed to revolutionize medicine but fail in practice.
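To make the "plethora of covariates" problem concrete, here is a minimal Python sketch. It is not taken from the paper or its platform; the function names, sample size, and number of covariates are all hypothetical choices for illustration. It simulates a trial in which the treatment has no effect in anyone, then searches every binary covariate for the subgroup with the most extreme treatment-versus-control difference.

```python
import numpy as np


def best_spurious_subgroup(n_patients=400, n_covariates=50, seed=0):
    """Simulate one null trial and return its most extreme subgroup finding.

    Returns (covariate_index, covariate_level, z_statistic).
    All names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    treated = rng.integers(0, 2, size=n_patients).astype(bool)
    covariates = rng.integers(0, 2, size=(n_patients, n_covariates)).astype(bool)
    # The outcome is pure noise: treatment helps no one, in any subgroup.
    outcome = rng.normal(size=n_patients)

    best = (-1, False, 0.0)
    for j in range(n_covariates):
        for level in (False, True):
            in_subgroup = covariates[:, j] == level
            trt = outcome[in_subgroup & treated]
            ctl = outcome[in_subgroup & ~treated]
            if len(trt) < 2 or len(ctl) < 2:
                continue  # too few patients to estimate a variance
            # Two-sample z statistic for treated vs. control in this subgroup.
            se = np.sqrt(trt.var(ddof=1) / len(trt) + ctl.var(ddof=1) / len(ctl))
            z = float((trt.mean() - ctl.mean()) / se)
            if abs(z) > abs(best[2]):
                best = (j, level, z)
    return best


def share_of_trials_with_false_lead(n_trials=200, seed=0):
    """Fraction of null trials whose best subgroup passes |z| > 1.96."""
    hits = 0
    for i in range(n_trials):
        _, _, z = best_spurious_subgroup(seed=seed + i)
        hits += abs(z) > 1.96
    return hits / n_trials


if __name__ == "__main__":
    j, level, z = best_spurious_subgroup()
    print(f"Most 'promising' subgroup: covariate {j} == {level}, z = {z:.2f}")
    print(f"Share of null trials with a 'significant' subgroup: "
          f"{share_of_trials_with_false_lead():.2f}")
```

With 50 binary covariates there are 100 subgroup comparisons per trial. If those comparisons were independent, the chance that at least one crosses the nominal 5% threshold by luck alone would be 1 - 0.95^100, which is over 99%. That is one simple reason the literature can stay full of "promising subgroups" even when nothing real is there.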
Here is our article that describes our work on predictive subgroups and how difficult it is to identify the right subgroup in a clinical trial setting. It is Open Access, so take a look. There is a lot more to be done, so feel free to join in the challenge.
Also, here is a paper that further describes why this is the Hardest Problem There Is.
And stay tuned for more research and work I am doing on prognostic and diagnostic algorithms. Hopefully I will have more to share in the coming months.
Finally, if you do not believe me about digital medicine and the misuse of AI, just read this.
Senior Solutions Architect, Healthcare and Life Sciences at NVIDIA
1 year ago: Looking forward to reading this - thanks for the insightful intro!
Digital Health | Clinical Innovation | Data Science | Design of Experiments | Business Analytics | Objectives -> Data -> Evidence -> Decisions
1 year ago: It's a great effort to make simulated data available to test methods in a structured way - long overdue ... While it's an interesting read as it is, I wonder if you would be open to sharing how the methods rank by score?