HTSeq-FPKM-UQ

Description

Fragments Per Kilobase of transcript per Million mapped reads upper quartile (FPKM-UQ) is an RNA-Seq-based expression normalization method. FPKM-UQ is based on a modified version of the FPKM normalization method.

Overview

FPKM-UQ values are calculated at the GDC from gene-level read counts produced by HTSeq [1], using custom scripts [2]. The formula used to generate FPKM-UQ values is as follows:

FPKM-UQ = [RMg * 10^9] / [RM75 * L]

  • RMg: The number of reads mapped to the gene.
  • RM75: The number of reads mapped to the 75th percentile gene in the alignment.
  • L: The length of the gene in base pairs.
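
The formula can be illustrated with a minimal Python sketch. This is not the GDC's custom script: the function names, the gene names, and all counts and lengths are made up for illustration. How the 75th percentile gene is selected is not specified here, so the nearest-rank rule in upper_quartile_count is an assumption.

```python
import math


def upper_quartile_count(counts):
    """Return the 75th percentile read count (nearest-rank rule, an assumption)."""
    ordered = sorted(counts)
    if not ordered:
        raise ValueError("at least one read count is required")
    rank = math.ceil(0.75 * len(ordered))  # 1-based rank of the 75th percentile
    return ordered[rank - 1]


def fpkm_uq(reads_gene, reads_75th, gene_length_bp):
    """FPKM-UQ = [RMg * 10^9] / [RM75 * L]."""
    if reads_75th <= 0 or gene_length_bp <= 0:
        raise ValueError("RM75 and L must be positive")
    return reads_gene * 10**9 / (reads_75th * gene_length_bp)


# Hypothetical read counts (RMg) and gene lengths (L, in base pairs).
counts = {"geneA": 100, "geneB": 400, "geneC": 1000, "geneD": 2000}
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 5000, "geneD": 4000}

rm75 = upper_quartile_count(counts.values())
values = {gene: fpkm_uq(counts[gene], rm75, lengths[gene]) for gene in counts}
```

In this toy input RM75 is the count of geneC, so every gene is scaled relative to that single count and its own length.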

Like HTSeq-count files, FPKM-UQ files are available as tab-delimited files with the Ensembl gene IDs in the first column and the expression values in the second.
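
A file with this two-column layout could be read along the following lines. This is a sketch only: read_expression_table is a hypothetical helper, the gene IDs and values in the sample are invented, and the code assumes the file has no header row.

```python
import csv
import io


def read_expression_table(handle):
    """Map each gene ID (column 1) to its expression value (column 2)."""
    table = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:  # skip blank lines
            continue
        gene_id, value = row[0], row[1]
        table[gene_id] = float(value)
    return table


# Invented sample content standing in for an FPKM-UQ file.
sample = io.StringIO("ENSG00000000003.13\t12.5\nENSG00000000005.5\t0.0\n")
table = read_expression_table(sample)
```

For a file on disk, the same function can be given an open file handle, for example one returned by open(path, newline="").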

Notes

  • The scaling factor (10^9) is applied to normalize the values to "kilobase" (10^3) and "million mapped reads" (10^6).
  • FPKM-UQ values tend to be much higher than FPKM values because FPKM divides by the total number of mapped reads in an alignment, whereas FPKM-UQ divides by the number of reads mapped to a single gene, which is far smaller.
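
The size of this difference can be shown with a small worked example. The FPKM formula is not given on this page; the sketch assumes the standard definition, in which the denominator uses the total number of mapped reads (called reads_total here) in place of RM75. All numbers are invented.

```python
def fpkm(reads_gene, reads_total, gene_length_bp):
    """Assumed standard FPKM: [RMg * 10^9] / [total mapped reads * L]."""
    return reads_gene * 10**9 / (reads_total * gene_length_bp)


def fpkm_uq(reads_gene, reads_75th, gene_length_bp):
    """FPKM-UQ: [RMg * 10^9] / [RM75 * L]."""
    return reads_gene * 10**9 / (reads_75th * gene_length_bp)


# Invented values for one gene in one alignment.
reads_gene = 500          # RMg
reads_total = 40_000_000  # total mapped reads in the alignment
reads_75th = 2_000        # RM75
gene_length_bp = 2_000    # L

standard = fpkm(reads_gene, reads_total, gene_length_bp)
upper_quartile = fpkm_uq(reads_gene, reads_75th, gene_length_bp)

# The two values differ by exactly reads_total / reads_75th.
ratio = upper_quartile / standard
```

Under these assumptions the ratio between the two values depends only on the alignment, not on the gene, so it is the same for every gene in the sample.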

Tools

  1. HTSeq Website

References

  1. Anders, S., Pyl, P.T. and Huber, W., 2014. HTSeq-a Python framework to work with high-throughput sequencing data. Bioinformatics, p.btu638.
  2. GDC mRNA-Seq Documentation

Categories: Workflow Type