See how the right data fuels AI-assisted drug discovery in these 3 case studies

Driven by concerns over the typically long and costly timelines in drug development, pharma companies and many emerging biotechs have endeavored to figure out how AI and machine learning can assist medicinal chemists in drug discovery. Already, some AI-assisted discovery programs are proving successful, cutting the typical 3-5 years needed to identify preclinical candidates down to 12-18 months.

Key to quickening this process is making the time-consuming Design, Make, Test and Analyze (DMTA) cycle more efficient (“cycle less, but better”). The design phase in particular could benefit from AI tools and modeling. How much AI can shorten that cycle, however, depends on the quality of the data used in modeling.

One such source of quality data is Reaxys. Manually curated, Reaxys is an expertly organized medicinal chemistry database containing normalized substance-target affinity data for over 8.4 million unique substances and 39,000 targets, sourced from 770,000 documents and patents. It also includes comprehensive pharmacokinetic, efficacy, toxicity, safety and metabolic profiles, as well as data from in vivo animal studies.

The massive body of bioactivity data in Reaxys is well-suited to train AI-based models that answer questions for compound identification and optimization.

In the following examples, which are featured in this newly released white paper, you can see how Reaxys target and bioactivity data have been used for virtual compound screenings and a priori risk assessment of adverse drug reactions. Both uses decrease DMTA iterations by maximizing the likelihood that selected compounds will succeed before each one is synthesized and tested.

  • A model trained on Reaxys bioactivity data finds matrix metalloprotease inhibitors among a library of natural products in Reaxys [1]

Matrix metalloproteases (MMPs) are responsible for the degradation of extracellular matrix components. Excess expression and activity induced by ultraviolet light contribute to skin aging, which may be ameliorated by an MMP inhibitor. Gimeno, A. et al. developed a virtual screening (VS) workflow to identify candidate compounds that target the conserved catalytic region of binding sites in a set of five MMPs.

The VS included four filtering steps:

(1) A random forest model trained on bioactivity data, such as IC50 and Ki for over 50,000 compounds, from Reaxys and ChEMBL

(2) Protein-ligand docking using structures from the Protein Data Bank

(3) A pharmacophoric filter

(4) An electrostatic similarity analysis

They applied the VS to the Specs compound library (more than 45,711 compounds) and extracted hits identified in two or more VS. Of those, they sourced 20 compounds to validate the VS workflow in vitro. Having validated the method, they ran all natural products in Reaxys with a molecular weight of 300–600 Da through the VS workflow. The screening resulted in 183 identified candidates, of which 49 were hits in three or more VS. That two compounds had already been reported to inhibit MMPs and another two were natural products already used in skin applications underscores the quality of the hits. The authors plan to examine the remaining compounds for possible skin treatments.
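One way to read "hits identified in two or more VS" is that each of the four steps flags compounds independently and a compound is kept when enough steps agree. The following is a minimal sketch of that consensus selection under this assumption; it is not the authors' code, and the compound IDs, step results and the `consensus_hits` helper are hypothetical and purely illustrative.

```python
from collections import Counter


def consensus_hits(step_hits, min_votes=2):
    """Return compound IDs flagged by at least `min_votes` screening steps.

    `step_hits` maps a step name to the set of compound IDs that step flagged.
    """
    votes = Counter()
    for flagged in step_hits.values():
        # Convert to a set so each step votes at most once per compound.
        votes.update(set(flagged))
    return sorted(cid for cid, n in votes.items() if n >= min_votes)


# Hypothetical toy data: IDs and per-step results are made up for illustration.
step_hits = {
    "random_forest": {"c1", "c2", "c3"},
    "docking": {"c2", "c3", "c4"},
    "pharmacophore": {"c3", "c5"},
    "electrostatic_similarity": {"c3", "c4"},
}

two_plus = consensus_hits(step_hits, min_votes=2)
three_plus = consensus_hits(step_hits, min_votes=3)
print(two_plus)
print(three_plus)
```

Raising `min_votes` from 2 to 3 mirrors the stricter cut described for the natural-product screen, trading a smaller candidate list for stronger agreement between steps.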

  • Reaxys structure-activity data train a virtual screening model that improves hit rates for bromodomain inhibitors [2]

Bromodomains are variations on a protein domain that recognize acetylated lysine residues and transduce the corresponding signal into normal or abnormal phenotypes. As such, bromodomain inhibitors are actively pursued as clinical candidates to treat cancer and multiple sclerosis.

Seeking to identify novel binders of the bromodomain BRD4, Casciuc, I. et al. used docking and structure-activity data from 1,221 compounds in Reaxys and 672 compounds in ChEMBL to train automated virtual screening (VS) models. They built several support vector machines (SVMs), generative topographic mapping, and structure pharmacophore models to virtually screen 2 million compounds in a proprietary library from Enamine.

An initial compound selection based on consensus between the different models underwent docking analysis to further reduce the pool to 3,000 molecules that were then tested as ligands of BRD4. Concurrently, 3,000 compounds were randomly screened from the same library for similar testing.

The VS models delivered 29 experimentally confirmed BRD4 ligands, representing a 2.6-fold improved hit rate over the random screening.
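The arithmetic behind a fold improvement in hit rate can be sketched as follows. This assumes the 29 confirmed ligands came from the 3,000 VS-selected compounds that were tested; the article does not give the number of hits in the random arm, so the baseline rate below is back-calculated from the reported 2.6-fold figure rather than taken from the paper. The function names are illustrative.

```python
def hit_rate(confirmed, tested):
    """Fraction of tested compounds that were experimentally confirmed."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return confirmed / tested


def fold_improvement(rate_screened, rate_baseline):
    """Ratio of the screened hit rate to the baseline (random) hit rate."""
    if rate_baseline <= 0:
        raise ValueError("baseline rate must be positive")
    return rate_screened / rate_baseline


# Figures from the article: 29 confirmed ligands, 3,000 compounds tested.
vs_rate = hit_rate(29, 3000)

# Assumption: baseline rate implied by the reported 2.6-fold improvement.
implied_random_rate = vs_rate / 2.6

print(f"VS hit rate: {vs_rate:.2%}")
print(f"Implied random hit rate: {implied_random_rate:.2%}")
```

Under these assumptions the VS hit rate is just under 1%, and the implied random hit rate is roughly 0.4%.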

  • Pharmacological and chemical data from Reaxys reveal patterns to predict adverse drug reactions [3]

Looking to anticipate adverse drug reactions (ADRs), Ferro, C. J. et al. used physicochemical, blood-brain barrier, pharmacokinetic, and pharmacological property data to predict the likelihood of ADRs for each of four commonly used oral anticoagulants: apixaban, dabigatran, edoxaban and rivaroxaban.

They built a predictive model with Reaxys data covering off-target effects, normalized target-affinity data, volume of distribution, plasma protein binding, renal excretion, and blood-brain barrier penetration properties like pKa and clogD7.5. The model highlighted property thresholds predictive of ADR risk. Based on these, the authors made predictions about possible ADRs associated with each of the four anticoagulants and used real-world data from the MHRA Yellow Card database and prescription rates in the UK to confirm or refute the predictions.
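To illustrate how property thresholds can be turned into risk flags, here is a minimal rule-based sketch. It is not the authors' model: the property names, the threshold values and the example compound are all hypothetical placeholders, chosen only to show the mechanics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    prop: str        # property name (hypothetical naming)
    limit: float     # cut-off value (hypothetical, NOT from the paper)
    direction: str   # "above" or "below": which side of the limit is flagged

    def crossed(self, value):
        """True when `value` lies on the flagged side of the limit."""
        if self.direction == "above":
            return value > self.limit
        return value < self.limit


# Hypothetical thresholds for illustration only.
THRESHOLDS = [
    Threshold("volume_of_distribution_l_per_kg", 1.0, "above"),
    Threshold("plasma_protein_binding_pct", 90.0, "above"),
    Threshold("off_target_count", 3, "above"),
]


def risk_flags(profile, thresholds=THRESHOLDS):
    """List the properties in `profile` that cross their threshold.

    Properties missing from the profile are skipped rather than flagged.
    """
    return [
        t.prop
        for t in thresholds
        if t.prop in profile and t.crossed(profile[t.prop])
    ]


# Made-up compound profile, not data for any real anticoagulant.
compound_x = {
    "volume_of_distribution_l_per_kg": 0.3,
    "plasma_protein_binding_pct": 95.0,
    "off_target_count": 5,
}

flags = risk_flags(compound_x)
print(flags)
```

In a setup like this, predictions for each drug would be the set of flags it raises, which can then be compared against real-world reports such as those in the MHRA Yellow Card database.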

In general, the predictions held true. Importantly, the authors predicted that dabigatran would have the least clean off-target profile based on chemical properties related to on-target efficiency, like the degree of nonspecific interacting lipophilic components in a drug. And indeed, dabigatran showed the most overall ADRs and the highest rate of fatalities.

Read the full white paper “Fit-for-purpose data is key to meaningful AI for drug discovery” to learn more.

#Research #DrugDiscovery #WhitePaper


[1] Gimeno, A. et al. 2021. Identification of broad-spectrum MMP inhibitors by virtual screening. Molecules 26: 4553. doi: 10.3390/molecules26154553

[2] Casciuc, I. et al. 2019. Pros and cons of virtual screening based on public “Big Data”: in silico mining for new bromodomain inhibitors. Eur. J. Med. Chem. 165: 258. doi: 10.1016/j.ejmech.2019.01.010

[3] Ferro, C. J. et al. 2020. Relevance of physicochemical properties and functional pharmacology data to predict the clinical safety profile of direct oral anticoagulants. Pharmacol. Res. Perspect. e00603. doi: 10.1002/prp2.603
