Factors to Consider When Building and Evaluating AI-Based Tools in Medicine
• Population at Risk: The population in which the model is developed and the population(s) for which the model is deployed should match closely if the model is to perform adequately.
• Outcome of Interest: The outcome used for modeling must closely match the outcome of interest in the clinical setting. Poor proxies will create more harm and less value.
• Time Horizon: The timeframes used in the model must be relevant and applicable to the populations to which the model is being applied.
• Predictors: Are clinical predictors measurable within the model’s clinical use context, and can they be measured without bias? If you are using multiple predictors, do they add value, or just complexity?
• Mathematical Model: Despite the current vogue for complex (and sometimes inscrutable) machine learning models, do the job with the simplest model that will achieve your goal.
• Model Evaluation: What can you truly do?
• Translation to Clinical Decision Support: How is the model going to be used in the clinical setting? What is the value of the tool, beyond the assessment of the model?
• Clinical Implementation: Monitoring and maintenance bring the model back full circle to the beginning.
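
As a rough illustration of the population, evaluation, and monitoring points above, the sketch below compares a model's discrimination and calibration in a development cohort and a deployment cohort. It is only one possible check, not a prescribed method: the function names (`auc`, `observed_to_expected`), the choice of these two metrics, and the toy cohorts are all assumptions made for illustration, and the numbers are invented, not real patient data.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Discrimination: probability that a randomly chosen event case is
    scored higher than a randomly chosen non-event case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def observed_to_expected(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Calibration in the large: observed events divided by the sum of
    predicted risks. Values near 1.0 mean the average risk is about right;
    values below 1.0 mean the model overpredicts."""
    expected = sum(y_score)
    if expected <= 0:
        raise ValueError("sum of predicted risks must be positive")
    return sum(y_true) / expected


# Hypothetical predicted risks, reused for both toy cohorts.
risk = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

# Hypothetical outcomes: the development cohort has a 50% event rate,
# the deployment cohort only 25% (a different population at risk).
dev_y = [0, 0, 0, 0, 1, 1, 1, 1]
dep_y = [0, 0, 0, 0, 0, 1, 0, 1]

for name, y in [("development", dev_y), ("deployment", dep_y)]:
    print(f"{name}: AUC={auc(y, risk):.2f}, O/E={observed_to_expected(y, risk):.2f}")
```

In this toy setup the AUC stays roughly the same in both cohorts (about 0.94 versus 0.92), while the observed-to-expected ratio falls from about 1.0 to 0.5. The model still ranks patients well in the deployment cohort but overstates their risk roughly twofold, a mismatch that a discrimination metric alone would not reveal and that ongoing monitoring is meant to catch.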