FPKM

Description

Fragments Per Kilobase of transcript per Million mapped reads (FPKM) is a simple method for normalizing expression levels. It normalizes each gene's read count by the length of the gene and by the total number of mapped reads.

Overview

At the GDC, FPKM values are calculated with custom scripts² from the gene-level read counts produced by STAR¹. The formula used to generate FPKM values is as follows:

FPKM = [RM_g * 10⁹] / [RM_t * L]

RM_g: The number of reads mapped to the gene
RM_t: The total number of reads mapped to protein-coding sequences in the alignment
L: The length of the gene in base pairs

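The scalar 10⁹ (10³ × 10⁶) rescales the result to "kilobase" and "million mapped reads": the factor 10³ converts the gene length from base pairs to kilobases, and the factor 10⁶ converts the read total to millions.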
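
As an illustration, the formula above can be written as a short Python function. This is a minimal sketch, not the GDC's own script: the function name `fpkm`, its parameter names, and the input numbers are all made up for the example.

```python
def fpkm(rm_g, rm_t, length_bp):
    """Return FPKM = (RM_g * 10^9) / (RM_t * L).

    rm_g      -- reads mapped to the gene (RM_g)
    rm_t      -- total reads mapped to protein-coding sequences (RM_t)
    length_bp -- gene length in base pairs (L)
    """
    if rm_t <= 0 or length_bp <= 0:
        raise ValueError("rm_t and length_bp must be positive")
    return (rm_g * 10**9) / (rm_t * length_bp)


# Illustrative numbers only: 1,000 reads on a 2,000 bp gene,
# with 20 million reads mapped to protein-coding sequences.
example = fpkm(rm_g=1000, rm_t=20_000_000, length_bp=2000)
print(example)  # 25.0
```

With these inputs the gene is 2 kilobases long and the library has 20 million mapped reads, so the value is 1000 / (2 × 20) = 25.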

FPKM files are available as tab-delimited files with the Ensembl gene IDs in the first column and the expression values in the second. See FPKM-UQ for an alternative method of gene expression level normalization.
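
A file with this two-column layout can be read with Python's standard `csv` module. The sketch below assumes the file has no header row and skips blank lines and lines starting with `#`; the helper name `read_fpkm` and the gene identifiers in the sample are placeholders, not real GDC data.

```python
import csv
import io


def read_fpkm(handle):
    """Map gene ID (column 1) to expression value (column 2)."""
    values = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank, short, or comment lines (an assumption about the file).
        if len(row) < 2 or row[0].startswith("#"):
            continue
        values[row[0]] = float(row[1])
    return values


# In-memory stand-in for a real file; the IDs are placeholders.
sample = io.StringIO("GENE_A\t25.0\nGENE_B\t0.0\n")
fpkm_by_gene = read_fpkm(sample)
print(fpkm_by_gene["GENE_A"])  # 25.0
```

For a file on disk, the same function can be given a handle opened with `open(path, newline="")`.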

References

External Links
