Can machine learning help predict opioid addiction?

Sean Shiverick, MS, PhD

Research | Analytics | Consulting

Published: December 19, 2017

Health informatics is generating huge amounts of data at a rapid pace, from electronic medical records (EMRs) and clinical research data to population-level public health data. In 2014, over 2 million Americans abused or were dependent on prescription opioids such as oxycodone or hydrocodone (CDC, 2017), and overdose deaths from prescription opioids have quadrupled since 1999, resulting in more than 180,000 deaths between 1999 and 2015 (NIDA, 2017; Rudd et al., 2016). For millions of people struggling with substance abuse, addiction and relapse are chronic health conditions. What are the risk factors for opioid addiction among patients prescribed pain medications for routine medical procedures? How can the tools and techniques of data science help address the opioid crisis? One approach is to use supervised learning to identify demographic characteristics and other features that are important for predicting prescription pain reliever abuse.

In a recent project, I compared the performance of several supervised learning procedures (e.g., linear models, decision trees, and random forests) on data from the National Survey on Drug Use and Health (NSDUH) for 2015 (https://datafiles.samhsa.gov). The NSDUH is a comprehensive survey covering substance use, misuse, abuse, dependency, and addiction for a wide range of both prescription medications and illicit drugs. It also includes a number of demographic characteristics (e.g., age, education level, employment, marital status), mental health attributes (e.g., adult depression), substance treatment, and mental health treatment. The NSDUH-2015 dataset consists of 57,146 observations with over 2,000 features, many of which are binary (yes/no) items: "Have you used X in the past year?" (e.g., Hydrocodone, Oxycodone, Tramadol, Morphine, Fentanyl, Oxymorphone, Demerol, Hydromorphone). Several aggregated variables were constructed for Any Pain Reliever Use, Pain Reliever Misuse and Abuse (Likert scale, 0-9), Heroin Use, Tranquilizer Use, Sedative Use, Cocaine Use, Amphetamine Use, etc.

First, a lasso regression model (L1 penalty) was fit to the data using the glmnet package in R (Hastie et al., 2009), which automatically computes coefficient estimates over a wide range of lambda values. As lambda grows, the lasso shrinks the coefficients of more and more predictors to exactly zero. The lasso has an advantage over ridge regression in that the resulting coefficient estimates are sparse: only a subset of the predictors is retained in the model. Cross-validation was used to select the optimal value of lambda, and the features with the largest coefficients were, in descending order: Substance treatment, Heroin use, Cocaine use, Amphetamine use, and Tranquilizer use.
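To make the role of lambda concrete, here is a minimal pure-Python sketch of the lasso fit by cyclic coordinate descent, with a simple 5-fold cross-validation over a small grid of lambda values. It is not the glmnet implementation and it does not use the NSDUH data: the data are synthetic (five zero-mean features, of which only the first two affect the outcome), and the function names, the lambda grid, and the fold scheme are all assumptions chosen for illustration.

```python
import random

def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; anything within [-gamma, gamma] becomes exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.
    Assumes the columns of X and the outcome y are roughly centered (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual, ignoring j's own contribution
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta

def cv_mse(X, y, lam, k=5):
    """Mean squared prediction error of the lasso over k interleaved folds."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
        total += sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in test)
    return total / n

# Synthetic data (an assumption, not NSDUH): only features 0 and 1 matter.
random.seed(0)
n, p = 200, 5
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] - 1.5 * row[1] + random.gauss(0, 0.5) for row in X]

lambdas = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]  # hypothetical grid
path = {lam: lasso_cd(X, y, lam) for lam in lambdas}
for lam in lambdas:
    nonzero = sum(1 for b in path[lam] if b != 0.0)
    print(f"lambda={lam:<5} nonzero coefficients={nonzero}")

best_lam = min(lambdas, key=lambda lam: cv_mse(X, y, lam))
print("lambda chosen by cross-validation:", best_lam)
```

On data like these, the irrelevant features drop out once lambda is moderately large, and every coefficient is zero at the largest value in the grid. Cross-validation favors a small lambda because heavy shrinkage biases the two real effects.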

Decision trees are commonly used for classification or regression and provide a solution that is easy to interpret. A decision tree regression model was fit to the data and pre-pruned to a maximum depth of 4. Substance treatment was selected as the root node at the top of the tree, with the left branch (low or no treatment) dividing further by Cocaine use. High scores for cocaine use branched further according to Heroin use, which ended in terminal leaf nodes. This indicates that individuals who reported using illicit drugs such as cocaine and heroin were also likely to abuse prescription pain medication. Following the right branch from the root node, high scores for treatment then divided according to Tranquilizer use, suggesting that for individuals who received treatment, prescription tranquilizer use was associated with abuse of opioid pain relievers.
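The tree-growing procedure described above can be sketched in a few lines of plain Python. This is an illustrative toy, not the model fit in the project: the four binary features, their effect sizes, and the helper names (build_tree, show, and so on) are hypothetical, and the data are simulated so that treatment has the largest effect and therefore ends up at the root. Pre-pruning is implemented by refusing to split once a node reaches max_depth=4.

```python
import random

FEATURES = ["treatment", "cocaine", "heroin", "tranquilizer"]  # hypothetical names

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, targets):
    """Return (score, feature, threshold) minimizing the children's total SSE, or None."""
    best = None
    for j in range(len(rows[0])):
        # Dropping the largest value guarantees both children are non-empty
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def build_tree(rows, targets, depth=0, max_depth=4, min_samples=10):
    """Grow a regression tree, stopping (pre-pruning) at max_depth or small nodes."""
    node = {"value": sum(targets) / len(targets), "n": len(targets)}
    if depth >= max_depth or len(rows) < min_samples:
        return node
    split = best_split(rows, targets)
    if split is None or split[0] >= sse(targets):
        return node  # no split reduces the error
    _, j, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[j] > t]
    node["feature"], node["threshold"] = j, t
    node["left"] = build_tree([r for r, _ in left], [y for _, y in left],
                              depth + 1, max_depth, min_samples)
    node["right"] = build_tree([r for r, _ in right], [y for _, y in right],
                               depth + 1, max_depth, min_samples)
    return node

def predict(node, row):
    """Follow the splits from the root down to a leaf and return its mean."""
    while "feature" in node:
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return node["value"]

def tree_depth(node):
    if "feature" not in node:
        return 0
    return 1 + max(tree_depth(node["left"]), tree_depth(node["right"]))

def show(node, indent=""):
    """Print the tree; the first child listed is the 'low' (left) branch."""
    if "feature" not in node:
        print(f"{indent}predict {node['value']:.2f} (n={node['n']})")
        return
    print(f"{indent}{FEATURES[node['feature']]} <= {node['threshold']}?")
    show(node["left"], indent + "  ")
    show(node["right"], indent + "  ")

# Simulated respondents (an assumption, not NSDUH data)
random.seed(1)
rows = [[int(random.random() < 0.3) for _ in FEATURES] for _ in range(500)]
targets = [3.0 * r[0] + 1.5 * r[1] + 1.0 * r[2] + 1.0 * r[0] * r[3]
           + random.gauss(0, 0.5) for r in rows]

tree = build_tree(rows, targets)
show(tree)
```

Reading the printed tree from the top reproduces the kind of interpretation given above: the root split separates treated from untreated respondents, and each branch is then refined by whichever remaining feature best reduces the squared error within it.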

Random forests are an ensemble method that builds many decorrelated trees and then averages them to reduce variance. The advantage of a random forest is that it usually provides a more accurate model than a single tree, although it can be more difficult to interpret. A random forest regression model was fit on pain reliever misuse and abuse with 500 trees and three variables considered at each split (i.e., mtry = 3). The model accounted for 26 percent of the variance in opioid pain reliever misuse and abuse. Feature importance was measured by the percent increase in MSE and the increase in node purity. The most important features selected for predicting pain reliever medication abuse were Tranquilizers, Treatment, Heroin use, Cocaine use, and Amphetamine use, in order of importance (tranquilizers and treatment had approximately equal ratings).
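The two importance measures can be illustrated with a small random forest written from scratch. The sketch below is a simplified stand-in for what R's randomForest package reports, under several assumptions: the data are simulated with hypothetical feature names and effect sizes, only 30 shallow trees are grown (rather than 500) to keep pure Python fast, the percent increase in MSE is computed as the average rise in out-of-bag error after shuffling one feature, expressed relative to the baseline out-of-bag MSE, and node purity is the average total reduction in squared error from splits on each feature.

```python
import random

def sse(ys):
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def grow(X, y, idx, mtry, rng, purity, depth=0, max_depth=5, min_node=5):
    """Grow one regression tree on rows idx, trying mtry random features per split."""
    ys = [y[i] for i in idx]
    node = {"value": sum(ys) / len(ys)}
    if depth >= max_depth or len(idx) < min_node:
        return node
    parent, best = sse(ys), None
    for j in rng.sample(range(len(X[0])), mtry):
        for t in sorted({X[i][j] for i in idx})[:-1]:
            left = [i for i in idx if X[i][j] <= t]
            right = [i for i in idx if X[i][j] > t]
            gain = parent - sse([y[i] for i in left]) - sse([y[i] for i in right])
            if best is None or gain > best[0]:
                best = (gain, j, t, left, right)
    if best is None or best[0] <= 1e-12:
        return node
    gain, j, t, left, right = best
    purity[j] += gain  # accumulate the drop in squared error credited to feature j
    node.update(feature=j, threshold=t,
                left=grow(X, y, left, mtry, rng, purity, depth + 1, max_depth, min_node),
                right=grow(X, y, right, mtry, rng, purity, depth + 1, max_depth, min_node))
    return node

def predict(node, row):
    while "feature" in node:
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return node["value"]

def forest_importance(X, y, n_trees=30, mtry=3, seed=0):
    """Return (percent increase in OOB MSE, mean node-purity gain) per feature."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    purity, increase, baseline = [0.0] * p, [0.0] * p, 0.0
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]   # out-of-bag rows
        tree = grow(X, y, boot, mtry, rng, purity)
        base = sum((y[i] - predict(tree, X[i])) ** 2 for i in oob) / len(oob)
        baseline += base / n_trees
        for j in range(p):
            shuffled = [X[i][j] for i in oob]
            rng.shuffle(shuffled)                        # break the link between j and y
            perm = sum((y[i] - predict(tree, X[i][:j] + [v] + X[i][j + 1:])) ** 2
                       for i, v in zip(oob, shuffled)) / len(oob)
            increase[j] += (perm - base) / n_trees
    return [100.0 * inc / baseline for inc in increase], [g / n_trees for g in purity]

# Simulated data (an assumption, not NSDUH): the last feature is pure noise.
NAMES = ["tranquilizer", "treatment", "heroin", "cocaine", "amphetamine", "noise"]
data_rng = random.Random(42)
X = [[int(data_rng.random() < 0.3) for _ in NAMES] for _ in range(300)]
y = [2.0 * r[0] + 2.0 * r[1] + 1.0 * r[2] + 0.8 * r[3] + 0.5 * r[4]
     + data_rng.gauss(0, 1.0) for r in X]

pct_inc_mse, node_purity = forest_importance(X, y)
for name, inc, pur in sorted(zip(NAMES, pct_inc_mse, node_purity), key=lambda t: -t[1]):
    print(f"{name:<13} pct_inc_mse={inc:8.1f}  node_purity={pur:8.1f}")
```

Shuffling an informative feature damages the out-of-bag predictions, so its percent increase in MSE is large, while a feature unrelated to the outcome stays near zero on both measures.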

Comparing different supervised learning methods can be useful for deciding which model is the best choice for a given dataset. All of the models considered here selected the same five features as most important for predicting opioid pain reliever abuse; however, the models differed in which feature they ranked as most informative. A silver lining is that people who reported misusing prescription pain relievers were also likely to have received substance treatment. More than any demographic characteristic, the use of prescription tranquilizers and illicit drugs such as heroin, cocaine, or amphetamines was associated with the abuse of pain medications. Although the majority of respondents in the sample (90 percent) had never used pain reliever medication, approximately ten percent of the sample reported misusing opioid pain relievers, and only 1.6 percent reported ever using heroin. The opioid crisis may be driven in part by the widespread availability of pain medications and synthetic opioids. Additional evidence is needed to identify demographic characteristics associated with prescription opioid abuse. Even for people with no previous history of drug use, exposure to highly addictive opioid medications may put them at risk for opioid dependence or addiction.

References

Centers for Disease Control and Prevention (CDC, 2017). Drug overdose deaths in the United States continue to increase in 2015. https://www.cdc.gov/drugoverdose/epidemic/index.html

Trevor Hastie, Robert Tibshirani, and Jerome Friedman. (2009). The Elements of Statistical Learning. Springer. https://web.stanford.edu/~hastie/ElemStatLearn/

Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani. (2013). An Introduction to Statistical Learning. Springer. https://www-bcf.usc.edu/~gareth/ISL/

National Institute on Drug Abuse (NIDA, 2017). Opioid Overdose Crisis. https://www.drugabuse.gov/drugs-abuse/opioids/opioid-overdose-crisis

Rose A. Rudd, Noah Aleshire, Jon E. Zibbell, and R. Matthew Gladden. (2016). Increases in Drug and Opioid Overdose Deaths — United States, 2000–2014. Centers for Disease Control and Prevention (CDC) Morbidity and Mortality Weekly Report (MMWR). January 1, 2016 / 64(50);1378-82. https://www.cdc.gov/mmwr/preview/mmwrhtml/mm6450a3.htm

