Genetic mechanisms of disease: a comparison of ClinGen and Gene2Phenotype resources
Understanding the genetic mechanism of disease is at the heart of interpreting genetic variation, including copy number variants (CNVs). Genes that cause disease through loss of function, that is, through decreased or absent gene product, would in theory be sensitive to copy loss variants. Similarly, genes that cause disease through an increase in gene product would be sensitive to full gene copy gains.
Establishing the mechanism of a genetic disease requires gathering evidence, mostly from the published literature, that demonstrates a link between loss or gain of gene product and disease. Once established, this mechanism can be applied to the evaluation of any genetic variant observed in the gene. Access to pre-curated genetic mechanisms of disease for routine gene panel testing can dramatically reduce the time spent interpreting CNVs.
There are currently two public resources of curated genetic disease mechanisms that variant scientists rely on for variant interpretation: the Dosage Sensitivity resource from ClinGen and the Gene2Phenotype (G2P) resource from EMBL-EBI. These resources differ in the number of genes evaluated to date, the age of the curations and the nomenclature used to describe the mechanism of disease. Moreover, the ClinGen resource is used in most U.S. labs, while the G2P resource is commonly used by labs in the U.K.
The ClinGen Dosage Sensitivity database is an expert-curated public resource that records the haploinsufficiency (HI) and triplosensitivity (TS) status of genes. ClinGen has been curating HI/TS status since 2011, and the database currently includes 376 genes with sufficient evidence (score 3) of HI for at least one disorder and only two genes with sufficient evidence (score 3) of TS. Moreover, as shown in the graph, 60% of genes evaluated for HI were last evaluated over seven years ago.
EMBL-EBI's G2P resource has been curating the genetic mechanisms of disease genes since 2019. Data are currently available across six gene panels comprising 4497 disease-gene pairs (2857 genes). This includes Absent or Decreased Gene Product Level for 2968 disease-gene pairs and Increased Gene Product Level for 11 disease-gene pairs.
We compared these two sources of curated gene-based disease mechanisms to determine the scope and concordance of disease mechanisms for a 47-gene cardiovascular disease panel. All disease-gene relationships were confirmed to have at least a Moderate level of clinical validity (using the GenCC database). Because so few genes have established mechanisms relevant to full gene copy gains, we limited our analysis to mechanisms relevant to copy loss variation.
We described the differences in nomenclature for disease mechanism in a previous post. Briefly, where ClinGen uses Haploinsufficiency to describe sensitivity to copy loss, G2P describes Decreased or Absent Gene Product Level. G2P allows for more than one mechanism (functional consequence) for a disease-gene pair. To facilitate comparison with ClinGen HI scores, we simply captured from G2P whether Absent/Decreased level was a mechanism or not.
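The reduction described above, from a list of G2P functional consequences to a single yes/no flag, can be sketched in a few lines of Python. This is a minimal illustration and not the procedure actually used for the analysis: the gene names, the record layout and the exact label spellings are hypothetical, and rolling the flag up from disease-gene pairs to the gene level is an assumption made here for illustration.

```python
# Assumed spellings of the copy-loss-relevant consequence; the actual G2P
# download may label these differently, so adjust to match the real data.
LOSS_LABELS = {"absent gene product", "decreased gene product level"}


def has_absent_decreased(consequences):
    """Return True if any listed consequence is an absent/decreased label."""
    return any(c.strip().lower() in LOSS_LABELS for c in consequences)


# Hypothetical records: (gene, disease, functional consequences for the pair).
records = [
    ("GENE_A", "disease 1", ["Absent gene product", "Altered gene product structure"]),
    ("GENE_B", "disease 2", ["Altered gene product structure"]),
    ("GENE_A", "disease 3", ["Increased gene product level"]),
]

# Assumption: a gene is flagged if at least one of its disease-gene pairs
# lists an absent/decreased consequence.
gene_flag = {}
for gene, _disease, consequences in records:
    gene_flag[gene] = gene_flag.get(gene, False) or has_absent_decreased(consequences)

print(gene_flag)
```

Collapsing to a boolean discards the other mechanisms G2P records for a pair, which is acceptable here only because the comparison with ClinGen HI scores concerns copy loss alone.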
For this 47-gene panel, more genes are classified with Absent/Decreased gene product level in G2P (n=18) than are classified as HI in ClinGen (n=10). In other words, using the G2P assessment of mechanism increased the number of genes that would be considered sensitive to copy loss.
We found no instances where the mechanism of disease was conflicting between ClinGen and G2P.
For the 11 genes lacking an HI mechanism in ClinGen and with unknown G2P status, we manually curated evidence from the literature and found it insufficient to determine the mechanism. With no clear disease mechanism, copy loss variants found in these genes would likely not surpass a classification of variant of uncertain significance (VUS).
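The per-gene comparison behind these counts can be sketched as a simple bucketing step. The sketch below is illustrative only: the gene names, the input dictionaries and the bucket names are hypothetical. Only the "score 3 means sufficient evidence of HI" threshold comes from the text above; `None` is used here to stand for a gene with no curation or an unknown status.

```python
from collections import Counter

# Hypothetical inputs for a small panel (not the real 47-gene panel).
# ClinGen HI score per gene; None means the gene has no HI curation.
clingen_hi_score = {"GENE_A": 3, "GENE_B": 1, "GENE_C": None, "GENE_D": 0}
# G2P absent/decreased flag per gene; None means the status is unknown.
g2p_loss = {"GENE_A": True, "GENE_B": True, "GENE_C": None, "GENE_D": False}


def bucket(gene):
    """Assign a gene to one comparison bucket (bucket names are made up here)."""
    hi = clingen_hi_score.get(gene) == 3  # sufficient evidence of HI
    g2p = g2p_loss.get(gene)
    if hi and g2p:
        return "both"
    if g2p:
        return "g2p_only"
    if hi:
        return "clingen_only"
    if g2p is None:
        return "needs_manual_curation"
    return "no_loss_mechanism_recorded"


panel = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
buckets = {gene: bucket(gene) for gene in panel}
counts = Counter(buckets.values())

print(buckets)
print(counts)
```

Genes landing in the `needs_manual_curation` bucket correspond to the situation described above, where neither resource provides a mechanism and the literature has to be reviewed by hand.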
In conclusion, based on our evaluation, labs should consider using the G2P resource to establish the mechanism of disease for variant interpretation, especially for CNVs. Not only are the curations more current, but they are also available for nearly 8 times as many genes. Having information on the genetic mechanism of disease available before interpreting variants can facilitate semi-automated interpretation of SNVs and CNVs. Moreover, knowing which genes are sensitive to copy loss variation can guide upfront gene panel design decisions about whether to include the evaluation of CNVs in specific genes.