Using AI to Identify the Best Targets for Drug Discovery Programs: A Step-by-Step Guide to PandaOmics 4.0
PandaOmics is an AI tool for target identification, biomarker discovery, and indication prioritization.

Target selection is where the drug discovery process begins, and it is critical to the success of the program. Choose the wrong target and you can waste years of research and millions of dollars, only to have the drug fail in clinical trials.

Just 10% of drugs in development make it from Phase I to approval. Broken down by stage, the probability of success is 63% in Phase I, 31% in Phase II, and 58% in Phase III, with large discrepancies between disease areas: cancer programs have the lowest clinical trial success rate at just 5%, compared with 26% for hematology programs.
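
As a rough check on how these stage-level figures compound, the short Python sketch below multiplies the three rates quoted above. It assumes the phases can be treated as independent steps, which is a simplification. The product comes to about 11%; the gap to the 10% headline figure presumably reflects the regulatory review step after Phase III, which is not broken out here.

```python
# Phase-level success rates as quoted in the text above.
phase_success = {"Phase I": 0.63, "Phase II": 0.31, "Phase III": 0.58}


def cumulative_success(rates):
    """Multiply per-phase success rates, treating the phases as independent."""
    total = 1.0
    for probability in rates.values():
        total *= probability
    return total


overall = cumulative_success(phase_success)
print(f"Phase I through Phase III: {overall:.1%}")  # prints 11.3%
```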

The major reason for these failed clinical trials is inadequate efficacy, which in turn can be linked to poor target selection.

Insilico Medicine’s generative AI target discovery tool PandaOmics, now in its fourth iteration, was designed to directly address the issues that make target selection so challenging: vast quantities of diverse data, the complexity of biology and disease interactions, and the time required to identify patterns and test hypotheses. It does so with an intuitive interface and no coding required.

Below is a step-by-step guide to Insilico’s PandaOmics tool, adapted from a recent webinar by Kyle Tretina, PhD, Alliance Manager of AI Platforms at Insilico Medicine.

How PandaOmics Works

PandaOmics is an AI tool for target identification, biomarker discovery, and indication prioritization. It draws on an enormous database of pre-created datasets and meta-analyses, which is kept up to date by Insilico's team of biologists and bioinformaticians. The database includes thousands of OMICs datasets covering tens of thousands of samples, including the newly supported methylation data type, and hundreds of pre-calculated meta-analyses.

The latest iteration of the app, PandaOmics 4.0, offers a 10-fold increase over the previous version in the number of pre-calculated disease projects available to the user. It is named for American biochemist and pharmacologist Gertrude B. Elion, who won the 1988 Nobel Prize in Physiology or Medicine for her work on rational drug design, an approach based on understanding the target, which led to breakthrough treatments for leukemia, malaria, and other diseases.

“Our app aligns with the pioneering spirit of scientists such as Dr. Elion by offering tools that empower researchers to make extraordinary discoveries into target exploration,” says Petrina Kamya, Ph.D., Head of AI Platforms at Insilico Medicine and President of Insilico Medicine Canada.

The tool uses more than 20 AI and statistical models and lets researchers rank and prioritize targets and indications in a matter of minutes. The rankings are quantitatively informed by both the scientific data and the business priorities of the user.

To show how PandaOmics can help researchers, here’s a case study of how it might be used to identify promising targets for one of the most critical public health issues: Type 2 diabetes.

PandaOmics Case Study: Type 2 Diabetes

Type 2 diabetes is an epidemic and there's a growing need for safer and more effective treatments.

Type 2 diabetes impacts one in 10 Americans, or over 36 million, and rates are rising, particularly among younger people. This epidemic not only exacts a heavy toll on individual health, but also exerts an escalating burden on healthcare systems worldwide.

Diabetes rates among young people are on the rise. (Source: SEARCH for Diabetes in Youth Study.)

While some treatments are available, such as semaglutide (Ozempic), Type 2 diabetes remains an important area for drug discovery research, including the search for treatments that are safer and effective at lower doses.

Diabetes involves a complex interplay of molecular events that disrupts glucose homeostasis across multiple organs, creating a systemic challenge for the body. It includes metabolic disturbances driven by pancreatic beta cell dysfunction, and insulin resistance, which impedes glucose uptake. The liver, meanwhile, is responsible for releasing stored glucose, and diabetes disrupts this process as well, exacerbating glucose dysregulation.

The complexity of this multi-organ disease makes it a perfect challenge for AI. AI can find the patterns behind these processes and identify opportunities to inhibit disease progression with newly designed, tailored treatments.

Finding Targets with PandaOmics

PandaOmics includes Datasets pages, Genes pages, and Diseases pages with pre-created meta-analyses.

Within the PandaOmics tool, there are:

  1. Datasets pages that provide a description of all the associated metadata within the system;
  2. Genes pages that describe the functions of genes and include Insilico's new indication prioritization feature; and
  3. Diseases pages, which are pre-created meta-analyses, hundreds of which have been added in the latest version, PandaOmics 4.0.

Step 1: Meta-analysis Summary

On the Summary page, users can look more deeply into the multi-OMICs datasets that went into the meta-analysis, including RNA sequencing, methylation, microarray, and genetic data. In this case, these represent 21 comparisons from 19 datasets and 349 samples. Users can also browse a summary of related text data, including grants, funding, trials, and publications, along with a variety of scores that track trends within that text data. These scores show how much attention the indication has received over time and how it has been trending over the last five years. In addition, there is a summary of clinical drugs and trials data, as well as a list of other related datasets that users might want to incorporate into their own meta-analysis.

Step 2: Finding the Best Target

With PandaOmics 4.0, users have access to a whole suite of analysis modules, including the Target ID module, the “heart” of PandaOmics. In the heat map, every row is a gene and every column is a score derived from a different type of data (including OMICs and text data) or mathematical approach. These are scored against the disease, in this case diabetes. Each score ranges from zero to one: the higher the score, the more strongly it supports that gene as a good target for the indication of interest.
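
To make the gene-by-score layout concrete, here is a minimal Python sketch of such a table and one way to turn it into a ranking. Everything in it is an assumption for illustration: the column names, the numbers, the placeholder gene `GENE_X`, and above all the use of a plain mean to combine scores. The article does not describe how PandaOmics itself aggregates its scores.

```python
# Illustrative values only: these are NOT real PandaOmics scores.
# Rows are genes, columns are hypothetical score types, each in [0, 1].
scores = {
    "FGFR1": {"omics": 0.82, "text": 0.40, "genetics": 0.55},
    "INSR": {"omics": 0.64, "text": 0.91, "genetics": 0.70},
    "GENE_X": {"omics": 0.30, "text": 0.20, "genetics": 0.10},
}


def rank_genes(table, enabled=None):
    """Rank genes by the mean of the enabled score columns (assumed aggregation)."""
    ranked = []
    for gene, row in table.items():
        columns = [c for c in row if enabled is None or c in enabled]
        if not columns:
            raise ValueError("at least one score column must be enabled")
        ranked.append((gene, sum(row[c] for c in columns) / len(columns)))
    # Highest combined score first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


for gene, combined in rank_genes(scores):
    print(f"{gene}: {combined:.2f}")
```

Passing `enabled={"omics"}` to `rank_genes` restricts the ranking to a single column, which changes the order in this toy table.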

Step 3: Filtering Genes to Fit Your Target Hunting Strategy

In addition to the datasets, PandaOmics shows users the associated genes. Filters allow users to refine these choices further based on, for instance, druggability, specificity of tissue expression, specific gene sets, target families, whether a structure is available, and the level of clinical development. Each individual score can be turned on and off as well.
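
The filtering logic described here can be sketched in a few lines of Python. The record fields, the filter names, and the gene annotations below are all hypothetical and chosen only to mirror the filters named above; they do not reflect PandaOmics' actual data model, and the tool itself requires no coding.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    symbol: str
    druggable: bool
    has_structure: bool
    target_family: str
    max_clinical_phase: int  # 0 means no clinical-stage program is known


# Illustrative annotations only, not taken from PandaOmics.
genes = [
    GeneRecord("GENE_A", True, True, "kinase", 0),
    GeneRecord("GENE_B", True, False, "GPCR", 3),
    GeneRecord("GENE_C", False, True, "transcription factor", 0),
]


def apply_filters(records, druggable_only=False, require_structure=False,
                  families=None, max_phase=None):
    """Keep only the records that pass every filter that is switched on."""
    kept = []
    for rec in records:
        if druggable_only and not rec.druggable:
            continue
        if require_structure and not rec.has_structure:
            continue
        if families is not None and rec.target_family not in families:
            continue
        if max_phase is not None and rec.max_clinical_phase > max_phase:
            continue
        kept.append(rec)
    return kept


# Example strategy: druggable targets with a structure and no clinical programs.
novel = apply_filters(genes, druggable_only=True, require_structure=True, max_phase=0)
print([rec.symbol for rec in novel])
```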

Step 4: Exploring Gene-Disease Associations with Large Language Models

Once the user has identified a potential target of interest (for instance, FGFR1 or the insulin receptor), they can click on that gene’s name in the Target ID module to be redirected to the gene-disease page, where all the data relevant to that gene in the context of diabetes is summarized. With the click of a button, the page automatically generates a gene-disease report with the aid of large language models.

The LLMs establish these gene-disease associations using Insilico’s curated data in combination with user data from the meta-analysis. The resulting gene-disease report contains four different modules: Target Feasibility, OMICs Evidence, Mechanism of Action, and Path to Market.

PandaOmics also summarizes the target's therapeutic molecule development potential, including indications being developed for that target, recent clinical trials, and a list of all the drugs in public databases that currently target that gene.

Step 5: Competitive Analysis

PandaOmics helps users interpret the data that was uploaded into the meta-analysis. It includes differential expression data for methylation and for RNA or protein-level expression, along with an in-depth discussion of the genetic evidence from the meta-analysis.

Ultimately, PandaOmics provides users with a very concise view of all the information they need to determine the business proposition for taking a drug that targets this gene to market.

Step 6: New Genetics Enhancements

PandaOmics 4.0 includes a wealth of genetics-related enhancements, including a dedicated genetic variants tab within every meta-analysis. This tab gives access to both gene-level and variant-level information: users can expand the table for any individual gene and get fully linked annotations for each single nucleotide polymorphism (SNP), including its clinical significance and type of variant. Additional statistics are available here as well and, for indications that are cancers, an annotation indicating whether the gene is a predicted cancer driver.
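
One way to picture the gene-level and variant-level views is as a nested record that can be flattened on demand. The Python sketch below is only a mental model: the class names, field names, and placeholder identifiers (`GENE_A`, `SNP_1`, `SNP_2`) are invented for illustration and are not PandaOmics' schema or real variant IDs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Variant:
    snp_id: str  # placeholder identifier, not a real rsID
    clinical_significance: str
    variant_type: str


@dataclass
class GeneRow:
    symbol: str
    variants: List[Variant] = field(default_factory=list)
    # Assumed to be set only when the indication is a cancer; None otherwise.
    predicted_cancer_driver: Optional[bool] = None


def expand(row):
    """Return one flat record per variant, as if the gene's table row were expanded."""
    return [
        {
            "gene": row.symbol,
            "snp": v.snp_id,
            "significance": v.clinical_significance,
            "type": v.variant_type,
        }
        for v in row.variants
    ]


row = GeneRow("GENE_A", [
    Variant("SNP_1", "uncertain significance", "missense"),
    Variant("SNP_2", "benign", "synonymous"),
])
for record in expand(row):
    print(record)
```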

Step 7: Improved Knowledge Graph

PandaOmics features a transformer-based knowledge graph which provides automated analyses of scientific publications and allows users to easily see the relationships between genes, diseases, compounds, and biological processes.

By diving into this graph, users get insights into the molecular underpinnings of the disease or the gene of interest within the disease context. The knowledge graph helps researchers make informed decisions about drug targets and biomarkers faster.

PandaOmics 4.0 uses GPT-4, the most recent GPT model from OpenAI at the time of writing. The knowledge graph now contains more relationships, and those relationships are more up to date, reflecting the latest publications available.
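
For readers unfamiliar with knowledge graphs, the underlying data structure can be sketched as a set of (subject, relation, object) triples. The Python example below uses hand-written placeholder triples and invented relation names; it does not attempt the transformer-based extraction from publications that the article describes, and it is not PandaOmics' internal representation.

```python
from collections import defaultdict

# Hand-written placeholder triples: (subject, relation, object).
triples = [
    ("GENE_A", "associated_with", "type 2 diabetes"),
    ("COMPOUND_1", "inhibits", "GENE_A"),
    ("GENE_A", "participates_in", "glucose uptake"),
    ("GENE_B", "associated_with", "type 2 diabetes"),
]


def build_graph(edges):
    """Index each triple in both directions so any node can be explored."""
    graph = defaultdict(list)
    for subj, rel, obj in edges:
        graph[subj].append((rel, obj))
        graph[obj].append((f"inverse:{rel}", subj))
    return graph


def neighbors(graph, node, relation=None):
    """List nodes linked to `node`, optionally restricted to one relation."""
    return [other for rel, other in graph.get(node, [])
            if relation is None or rel == relation]


graph = build_graph(triples)
print(neighbors(graph, "GENE_A"))
print(neighbors(graph, "type 2 diabetes", "inverse:associated_with"))
```

Starting from a gene, the first query surfaces the disease, compound, and biological process connected to it; starting from the disease, the second lists the genes linked to it.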

Step 8: Indication Prioritization Tab

The indication prioritization tab is a new module that is drawing a lot of interest from users. In the latest version of PandaOmics, users can quantitatively prioritize diseases for a given target with a layout similar to the Target ID page. In the Target ID module, users start with an indication and try to identify genes that might make good targets for it; here, the process runs in reverse. Users start with a gene of interest, and a drug that targets that gene, and use the indication prioritization tab to find an indication that is appropriate and supported by both the OMICs and text data.

Every row is an indication and every column is a score, similar to the Target ID page, with scores scaled from zero to one. Users can group indications by pathology or tissue, or leave them ungrouped. They can also sort by overall rank or by individual scores.
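
The grouping and sorting options can be illustrated with a small Python sketch. The indication names, tissues, score columns, and values are placeholders, and the "overall" rank is assumed here to be a simple mean of the score columns, which the article does not confirm.

```python
from collections import defaultdict

# Illustrative rows only; indications, tissues, and scores are placeholders.
indications = [
    {"name": "Indication A", "tissue": "pancreas", "omics": 0.8, "text": 0.6},
    {"name": "Indication B", "tissue": "liver", "omics": 0.5, "text": 0.8},
    {"name": "Indication C", "tissue": "pancreas", "omics": 0.4, "text": 0.3},
]
SCORE_COLUMNS = ("omics", "text")


def overall(row):
    """Assumed overall score: the mean of the individual score columns."""
    return sum(row[c] for c in SCORE_COLUMNS) / len(SCORE_COLUMNS)


def prioritize(rows, sort_by=None, group_by=None):
    """Sort by overall score (default) or one column, then optionally group."""
    key = overall if sort_by is None else (lambda r: r[sort_by])
    ordered = sorted(rows, key=key, reverse=True)
    if group_by is None:
        return {"all": [r["name"] for r in ordered]}
    groups = defaultdict(list)
    for r in ordered:
        groups[r[group_by]].append(r["name"])
    return dict(groups)


print(prioritize(indications))
print(prioritize(indications, group_by="tissue"))
print(prioritize(indications, sort_by="text"))
```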

How to License PandaOmics

The default licensing period for PandaOmics is 12 months, and pricing depends on the number of users included in the subscription as well as the size of the organization. Insilico offers special rates for research groups, nonprofit organizations and governmental bodies and recently launched an academic grant application to streamline the process for interested research partners. If you are in a group actively working on a grant application, please reach out to us. We'll provide support with the documentation and we'd love to be your AI partner for your research project.

Insilico also offers a 7-day trial period to give potential clients access to the full version of PandaOmics.

Learn more and request a demo! https://insilico.com/pandaomics
#ai #drugdiscovery #targetidentification #targetid #platform #biotech #pharma #target #diabetes #type2diabetes #bioinformatics #biology #disease


