HTSeq is a Python package used to count the number of reads mapped to each gene.
The first step in generating gene expression values from an RNA-Seq alignment at the GDC is counting the reads mapped to each gene [1]. These counts are produced with HTSeq [2] and are calculated at the gene level. HTSeq-Count files are available in a tab-delimited format with two columns: an Ensembl gene ID and the number of reads mapped to that gene. These files are then processed further with custom scripts to generate FPKM and FPKM-UQ values.
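The two-column layout described above can be read with a few lines of standard-library Python. The following is a minimal sketch and not the GDC's own processing code. It assumes a headerless, uncompressed file, and it assumes that rows whose identifier starts with `__` (such as `__no_feature`) are htseq-count summary rows rather than genes. The function name `read_htseq_counts`, the gene ID versions, and the counts are illustrative only.

```python
import csv
import io


def read_htseq_counts(handle):
    """Split a two-column htseq-count table into gene counts and summary rows."""
    gene_counts = {}
    summary = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        feature_id, count = row[0], int(row[1])
        if feature_id.startswith("__"):
            # Assumption: "__"-prefixed rows are htseq-count summary counters.
            summary[feature_id] = count
        else:
            gene_counts[feature_id] = count
    return gene_counts, summary


# Illustrative data only; real files contain one row per annotated gene.
example = io.StringIO(
    "ENSG00000000003.13\t1520\n"
    "ENSG00000000005.5\t0\n"
    "__no_feature\t2045\n"
    "__ambiguous\t310\n"
)

gene_counts, summary = read_htseq_counts(example)
total_assigned = sum(gene_counts.values())
print(gene_counts)
print(summary)
print(total_assigned)
```

Keeping the summary rows separate from the gene rows means that totals computed for downstream normalization include only reads assigned to genes.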
- [1] GDC mRNA-Seq Documentation
- [2] Anders, S., Pyl, P.T. and Huber, W., 2014. HTSeq - a Python framework to work with high-throughput sequencing data. Bioinformatics, p.btu638.
Categories: Workflow Type