AI Deep Learning Accelerates Drug Development

Deep learning is a subset of artificial intelligence (AI) that mimics the neural networks of the human brain to learn from large amounts of data, enabling machines to solve complex problems. It has made significant progress in the biomedical field, where researchers have developed a series of deep learning applications for disease diagnosis, protein design, and medical image recognition. The pharmaceutical industry is also beginning to recognize its importance and hopes to use it to accelerate drug development and reduce costs.

Application of Deep Learning in Drug Development

Previous studies have demonstrated that deep learning offers significant advantages in several key areas of drug development, including optimization of chemical synthesis routes, ADME-Tox prediction, target identification and validation, and generation of novel molecules.


Figure 1. A broad overview of drug development and the place of virtual screening in this process[1].

Virtual Screening: Protein-Ligand Affinity

Deep learning can learn and identify potential binding patterns by comparing known protein-small molecule binding instances. During the training process, the deep learning models continuously optimize their parameters to enhance the accuracy and reliability of their predictions.
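
The idea of a model adjusting its parameters to fit known binding examples can be sketched with a single-layer classifier trained by gradient descent. This is only an illustration, not the DeepChem workflow described in this article: the 4-bit "fingerprints", their labels, and the function names are hypothetical, and a real model would use far richer features and many more layers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Predicted probability that fingerprint x belongs to a binder."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def log_loss(weights, bias, data):
    """Mean binary cross-entropy over the training set."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = predict(weights, bias, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

def train(data, n_features, lr=0.5, epochs=200):
    """Full-batch gradient descent; returns parameters and the loss per epoch."""
    weights = [0.0] * n_features
    bias = 0.0
    history = []
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in data:
            err = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
        history.append(log_loss(weights, bias, data))
    return weights, bias, history

# Hypothetical toy data: (fingerprint bits, label), label 1 = known binder.
TOY_DATA = [
    ([1, 0, 1, 0], 1),
    ([1, 1, 1, 0], 1),
    ([1, 0, 0, 0], 1),
    ([0, 1, 0, 1], 0),
    ([0, 0, 0, 1], 0),
    ([0, 1, 1, 1], 0),
]

weights, bias, history = train(TOY_DATA, n_features=4)
print(f"loss after first epoch: {history[0]:.3f}, after last epoch: {history[-1]:.3f}")
```

The loss falls from epoch to epoch as the weights are updated, which is the same optimization loop that deep models run at much larger scale.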

Yelena Guttman et al. developed a CYP3A4 inhibitor prediction model based on the DeepChem framework. They created a KNIME workflow for data curation and used the DeepChem module in Maestro to build a categorical classifier. This classifier was then used to virtually screen approximately 68,900 compounds from the FooDB database, leading to the identification of two new CYP3A4 inhibitors[2].


Figure 2. Prediction of CYP3A4 Inhibitors Based on DeepChem[2]. A workflow in KNIME analytics platform 4.0.314 was created to prepare and analyze the virtual screening.


ADME-Tox Prediction

Poor pharmacokinetic properties and toxicity issues are considered the main reasons for terminating the development of drug candidates. There is therefore an increasing need for robust screening methods that provide early information on the absorption, distribution, metabolism, excretion, and toxicity (ADME-Tox) properties of compounds. Many studies have shown that, by leveraging extensive ADME-Tox datasets, deep learning models can automatically identify and extract complex relationships between compound features and their corresponding ADME-Tox properties. The trained models can then be used to predict the ADME-Tox properties of new compounds, thereby accelerating drug discovery and development.

Liu et al. used directed message passing neural networks (D-MPNN, Chemprop) to predict dietary-derived Nrf2 agonists and the safety of compounds in the FooDB database. They identified Nicotiflorin, a compound that combines Nrf2 agonistic activity with safety, and validated it in vitro and in vivo[3].


Figure 3. Using Deep-Learning Model D-MPNN to Assess Drug Safety[3].

Optimize Chemical Synthesis Routes

In recent years, artificial intelligence (AI) has begun to bring revolutionary changes to chemical synthesis. However, the lack of suitable ways to represent chemical reactions and the scarcity of reaction data have limited the wider application of AI to reaction prediction. Deep learning is increasingly being applied to chemical synthesis, enabling the automatic identification and extraction of features and patterns from large datasets. This capability improves predictions of the efficiency and selectivity of new synthesis routes, which can accelerate drug development and production.

Li et al. introduced a novel reaction representation, GraphRXN, for reaction prediction. GraphRXN takes the 2D molecular structures of the organic components directly as input, learns task-related representations of the chemical reaction automatically during training, and achieves on-par or slightly better performance than the baseline models[4]. Segler et al. combined Monte Carlo tree search with an expansion policy network that guides the search and a filter network that pre-selects the most promising retrosynthetic steps[5]. These studies have demonstrated that deep learning models can yield moderate to good accuracy in reaction prediction despite the limited size of the datasets and the many complex variables involved.


Figure 4. A deep-learning graph framework, GraphRXN, was proposed to be capable of learning reaction features and predicting reactivity[4].

Drug Screening Based on Deep Learning

The application of deep learning in the field of virtual screening primarily involves using neural networks to predict the activity or properties of compounds, thereby identifying potential candidate drugs or materials in a virtual environment. Commonly used deep learning models include Convolutional Neural Networks (CNN), Graph Neural Networks (GNN), Recurrent Neural Networks (RNN), Generative Adversarial Networks (GAN) and Transformer models.

  • CNNs excel at identifying patterns and features in structured data, such as chemical structures represented as images or graphs. Recent studies have demonstrated their effectiveness in predicting drug-drug interactions and assessing molecular properties by analyzing chemical substructures and other relevant features.
  • GNNs are designed to work directly with graph-structured data, making them particularly suitable for representing molecular structures where atoms are nodes and bonds are edges. They have shown remarkable performance in drug discovery by capturing the complex relationships between molecules and their properties.
  • RNNs are designed to handle sequential data, making them particularly effective for tasks where context from previous inputs is essential.
  • GANs consist of two neural networks—a generator and a discriminator—that work against each other to create new data instances.
  • Transformers have gained popularity for their ability to handle sequential data and capture long-range dependencies, making them suitable for tasks like natural language processing and time-series analysis.
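
To make the graph view of a molecule concrete, here is a minimal sketch of one round of message passing, in which each atom's feature vector is updated with the sum of its bonded neighbours' vectors. It assumes a toy molecule (the heavy atoms of ethanol, C-C-O) with hypothetical one-hot features. A real GNN would also apply learned weight matrices and nonlinearities at each step, which are omitted here.

```python
def message_passing_step(node_features, edges):
    """One round of sum aggregation: new vector = own vector + neighbours' vectors."""
    neighbours = {i: [] for i in range(len(node_features))}
    for a, b in edges:
        # Bonds are undirected, so each atom is a neighbour of the other.
        neighbours[a].append(b)
        neighbours[b].append(a)

    updated = []
    for i, feats in enumerate(node_features):
        aggregated = list(feats)
        for j in neighbours[i]:
            aggregated = [u + v for u, v in zip(aggregated, node_features[j])]
        updated.append(aggregated)
    return updated

def readout(node_features):
    """Sum all atom vectors into a single molecule-level vector."""
    return [sum(column) for column in zip(*node_features)]

# Hypothetical encoding: atoms are nodes with features [is_carbon, is_oxygen],
# bonds are edges given as pairs of atom indices.
ETHANOL_ATOMS = [[1, 0], [1, 0], [0, 1]]   # C, C, O
ETHANOL_BONDS = [(0, 1), (1, 2)]           # C-C, C-O

hidden = message_passing_step(ETHANOL_ATOMS, ETHANOL_BONDS)
molecule_vector = readout(hidden)
print(hidden)           # [[2, 0], [2, 1], [1, 1]]
print(molecule_vector)  # [5, 2]
```

After one step, the middle carbon already "knows" it is bonded to both a carbon and an oxygen. Stacking more steps lets information travel further along the bond network.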


In summary, deep learning is revolutionizing drug development by enhancing efficiency, accuracy, and cost-effectiveness across multiple stages of the process. As technology continues to evolve, its integration into pharmaceutical research is likely to deepen, paving the way for innovative therapeutic solutions.

References:

[1] Rifaioglu AS, et al. Brief Bioinform. 2019 Sep;20(5):1878-1912.

[2] Guttman Y, et al. J Agric Food Chem. 2022 Mar;70(8):2752-2761.

[3] Liu S, et al. J Agric Food Chem. 2023 May;71(21):8038-8049.

[4] Li B, et al. J Cheminform. 2023 Aug;15(1):72.

[5] Segler MHS, et al. Nature. 2018 Mar;555(7698):604-610.


Products:

Virtual Screening

MedChemExpress (MCE) provides a high-quality virtual screening service that helps researchers identify the most promising candidates. Because the service is based on the laws of quantum and molecular physics, it can achieve highly accurate results. Our optimized virtual screening protocol can reduce the size of the chemical library that has to be screened experimentally, increase the likelihood of finding innovative hits faster and at lower cost, and mitigate the risk of failure during lead optimization.


50K Diversity Library

The MCE 50K Diversity Library consists of 50,000 lead-like compounds selected for properties such as good calculated solubility (-3.2 < logP < 5), oral bioavailability (RotB <= 10), and drug transportability (PSA < 120). The compounds were selected by dissimilarity search, with an average Tanimoto coefficient of 0.52. The library contains 36,857 unique scaffolds, each represented by 1 to 7 compounds. In addition, compounds that share a scaffold carry as many different functional groups as possible, which broadens the chemical space covered.
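
The property cutoffs quoted above can be expressed as a simple filter. The sketch below assumes the descriptors (logP, rotatable bond count, polar surface area) have already been calculated by a cheminformatics toolkit. The `Descriptors` class, the compound names, and their values are hypothetical and are not taken from the library itself.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    name: str
    logp: float            # calculated logP
    rotatable_bonds: int   # RotB
    psa: float             # polar surface area

def passes_lead_like_filter(d):
    """Apply the three cutoffs quoted in the text: logP, RotB, and PSA."""
    return (-3.2 < d.logp < 5) and d.rotatable_bonds <= 10 and d.psa < 120

# Hypothetical compounds with made-up descriptor values.
CANDIDATES = [
    Descriptors("cmpd_a", logp=2.1, rotatable_bonds=4, psa=75.0),    # passes all cutoffs
    Descriptors("cmpd_b", logp=5.6, rotatable_bonds=3, psa=60.0),    # logP too high
    Descriptors("cmpd_c", logp=1.0, rotatable_bonds=12, psa=90.0),   # too many rotatable bonds
    Descriptors("cmpd_d", logp=0.5, rotatable_bonds=2, psa=140.0),   # PSA too high
]

kept = [d.name for d in CANDIDATES if passes_lead_like_filter(d)]
print(kept)  # ['cmpd_a']
```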


MegaUni 10M Virtual Diversity Library

With MCE's 40,662 building blocks (BBs), covering around 273 reaction types, more than 40 million molecules were generated. Compounds that comply with the Rule of Five (Ro5) criteria were selected. Inappropriate chemical structures, such as PAINS motifs and structures that are difficult to synthesize, were removed. Molecular clustering analysis was then carried out based on Morgan fingerprints, and molecules close to each cluster center were extracted to form this drug-like and synthesizable diversity library. The selected molecules have 805,822 unique Bemis-Murcko Scaffolds (BMS) and cover a diversified chemical space. This library is highly recommended for AI-based lead discovery, ultra-large virtual screening, and novel lead discovery.


MegaUni 50K Virtual Diversity Library

MegaUni 50K Virtual Diversity Library consists of 50,000 novel, synthetically accessible, lead-like compounds. With MCE's 40,662 Building Blocks, covering around 273 reaction types, more than 40 million molecules were generated. Based on Morgan Fingerprint and Tanimoto Coefficient, molecular clustering analysis was carried out, and molecules closest to each clustering center were extracted to form a drug-like and synthesizable diversity library. The selected 50,000 drug-like molecules have 46,744 unique Bemis-Murcko Scaffolds (BMS), each containing only 1-3 compounds. This diverse library is highly recommended for virtual screening and novel lead discovery.
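
As a rough illustration of fingerprint-based selection, the sketch below computes the Tanimoto coefficient between fingerprints represented as sets of "on" bit indices, then picks the member of a cluster with the highest mean similarity to the others (the medoid). Treating the medoid as the molecule closest to the cluster center is an assumption made for this example, as are the toy fingerprints and names. The clustering procedure actually used to build the library is not described here.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen for this sketch when both fingerprints are empty
    return len(a & b) / union

def medoid(cluster):
    """Return the name of the molecule with the highest mean similarity to the rest."""
    def mean_similarity(name):
        others = [other for other in cluster if other != name]
        if not others:
            return 1.0
        return sum(tanimoto(cluster[name], cluster[o]) for o in others) / len(others)
    return max(cluster, key=mean_similarity)

# Hypothetical cluster: molecule name -> set of on-bit indices in its fingerprint.
CLUSTER = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 3},
    "mol_c": {1, 2, 5},
}

print(tanimoto(CLUSTER["mol_a"], CLUSTER["mol_b"]))  # 0.75
print(medoid(CLUSTER))                               # mol_b
```

A coefficient of 1 means two fingerprints are identical and 0 means they share no bits, so a diversity library favors representatives whose similarity to one another is low.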

