
How to choose Normalization methods (TPM/RPKM/FPKM) for mRNA expression

Novogene

Your Trusted Genomics Partner

Published: May 17, 2024

Why do mRNA expression values need to be normalized?

Making mRNA expression measurements comparable across studies, that is, normalizing mRNA data, is a significant problem in biomedical and life science research. In RNA-seq, transcript abundance is measured digitally as a read count. Raw counts carry technical biases: sequencing depth (deeper sequencing produces more reads for every gene) and gene length (at the same expression level and sequencing depth, a longer gene produces more reads). Normalization of gene expression measurements is required to remove these biases.

Note on read counts: The read count is obtained from the original sequencing data and is the total number of reads mapped to a given gene. During analysis, the short reads are first mapped to the reference genome, and software then counts the reads mapped to each gene, so a read count is always an integer.

RPKM (Reads Per Kilobase per Million mapped reads) was designed for single-end RNA-seq, where every read corresponds to a single sequenced fragment. FPKM (Fragments Per Kilobase per Million mapped fragments) is very similar to RPKM: the number of fragments mapped to a gene is divided by the total sequencing depth (in millions of mapped fragments), and the result is divided by the gene length (in kilobases). Strictly speaking, the gene length here is the total length of the exons of the gene.

The difference between RPKM and FPKM is that R stands for reads and F stands for fragments. In paired-end (PE) sequencing, each fragment yields two reads. FPKM counts a fragment once when both of its reads map to the same transcript, whereas RPKM counts every read that maps to the transcript. In single-end (SE) sequencing, FPKM and RPKM give the same result.

FPKM and RPKM ultimately normalize the abundance of transcripts from different samples (or from the same sample under different conditions) to a standard that allows quantitative comparison, by dividing the count by both L (the transcript length) and N (the total number of mapped reads or fragments).
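The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular quantification tool: the function name `fpkm`, the toy counts, and the gene lengths are all made up, and N is assumed here to be the sum of the per-gene counts in the sample.

```python
def fpkm(counts, lengths_bp):
    """Return FPKM for each gene (RPKM if the counts are reads).

    counts     -- mapped fragments (or reads) per gene
    lengths_bp -- total exon length of each gene, in base pairs
    """
    # N: total mapped fragments in the sample (assumed to be the sum of counts).
    per_million = sum(counts) / 1e6
    # Divide by depth (in millions), then by gene length (in kilobases).
    return [
        (c / per_million) / (length / 1e3)
        for c, length in zip(counts, lengths_bp)
    ]


# Hypothetical toy sample: three genes with made-up counts and lengths.
counts = [10, 20, 30]
lengths_bp = [2000, 4000, 1000]
values = fpkm(counts, lengths_bp)
```

In this toy sample the second gene has twice the count of the first but is also twice as long, so the two genes end up with the same FPKM.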

TPM (Transcripts Per Million) is very much like FPKM and RPKM. The only difference is the order of operations: TPM normalizes for gene length first and for sequencing depth second. The effect of this change is nevertheless profound. With TPM, the sum of all TPM values is the same in every sample (one million), so a TPM value can be read directly as the proportion of a sample's length-normalized reads that map to a gene. This makes comparing that proportion across samples very convenient, and TPM is therefore generally considered the more suitable statistic for comparing gene expression across samples.
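The two-step order of operations for TPM can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `tpm` and the two toy samples are hypothetical, and gene length is taken as total exon length in base pairs.

```python
def tpm(counts, lengths_bp):
    """Return TPM for each gene in one sample."""
    # Step 1: normalize for gene length (reads per kilobase).
    rpk = [c / (length / 1e3) for c, length in zip(counts, lengths_bp)]
    # Step 2: normalize for depth, using the sum of the
    # length-normalized values rather than the raw read total.
    scale = sum(rpk) / 1e6
    return [r / scale for r in rpk]


# Two hypothetical samples with different depths, same three genes.
lengths_bp = [2000, 4000, 1000]
sample_a = tpm([10, 20, 30], lengths_bp)
sample_b = tpm([100, 50, 25], lengths_bp)
```

Although the two toy samples have very different total counts, the TPM values in each of them add up to one million.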

How to choose the normalization method?

With TPM, the sum of the values is guaranteed to be the same in every sample, so a TPM value represents the same share of the total regardless of the sample it comes from. Within a single sample, however, FPKM and TPM differ only by a constant scaling factor, so the relative expression of genes is the same under both measures, and many people still use FPKM or RPKM to compare expression values of the same gene across samples. As with any high-throughput sequencing technology, the analytical method is critical for interpreting the data, and RNA-seq analysis is still evolving. The appropriate method should therefore be selected based on the goals of the research.
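The relationship between the two measures can be made concrete with a short sketch. Because FPKM is already normalized for gene length, rescaling the FPKM values of a sample so that they sum to one million gives TPM. The function name `fpkm_to_tpm` and the FPKM values below are hypothetical and chosen only for illustration.

```python
def fpkm_to_tpm(fpkm_values):
    """Rescale one sample's FPKM values so that they sum to one million."""
    total = sum(fpkm_values)
    return [v / total * 1e6 for v in fpkm_values]


# Made-up FPKM values for the same three genes in two samples.
fpkm_sample_1 = [5.0, 5.0, 30.0]
fpkm_sample_2 = [50.0, 12.5, 25.0]

tpm_sample_1 = fpkm_to_tpm(fpkm_sample_1)
tpm_sample_2 = fpkm_to_tpm(fpkm_sample_2)
```

The FPKM totals of the two toy samples differ (40 versus 87.5), while the TPM totals are identical. Within each sample, the ratio between any two genes is unchanged by the conversion.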


Reference

1. Dillies, Marie-Agnès, et al. “A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis.” Briefings in Bioinformatics 14.6 (2013): 671-683.
2. Fundel, K., et al. “Normalization strategies for mRNA expression data in cartilage research.” Osteoarthritis and Cartilage 16.8 (2008): 947-955.

Enjoyed the article? Share it with those who might also be interested in it!

About Novogene

Novogene specializes in the application of advanced molecular biotechnology and high-performance computing in the research fields of life science and human health. Established in March 2011, Novogene strives to become a global leader in providing genetic science services and technology products. Novogene has set up operations and laboratories in the United States, the United Kingdom, the Netherlands, and Germany, as well as in China, Singapore, and Japan.

Novogene has served over 7,300 global customers, covering 90 countries and regions across 6 continents. It has cooperated extensively with many academic institutions and completed several advanced-level, international genomics research projects. By 2023, Novogene has co-published and/or been acknowledged in more than 22,850 articles in Science Citation Index, with an accumulative impact factor of more than 148,250.

Novogene's partners are worldwide and include more than 4,200 scientific research institutions and universities, more than 680 hospitals, and over 2,400 pharmaceutical and agricultural enterprises. Currently, Novogene has obtained 356 software copyrights and 66 patents.

If you are interested in the sequencing services provided by Novogene and would like to get further information, please reach us here.
