Aligned Reads


A read is a sequence obtained from a single sequencing experiment. An aligned read is a sequence that has been aligned to a common reference genome. Typically, these reads number from hundreds of thousands to tens of millions.


The GDC supports the submission of aligned reads in addition to unaligned reads. A data file containing aligned reads can be used as input for most GDC workflows [1]. During harmonization, reads are aligned to the GRCh38 human reference genome using standardized protocols based on data type [2, 3]. The resulting aligned read files also contain unaligned reads, so that end users can retrieve the raw data.
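Because aligned read files also carry unaligned reads, a consumer often needs to tell the two apart. The sketch below illustrates one way to do this on the text (SAM) representation of alignment records, where bit 0x4 of the FLAG column marks a read as unmapped. The function name `split_by_alignment` and the example records are hypothetical, made up for illustration; this is not a GDC tool or workflow.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def split_by_alignment(sam_lines):
    """Return (aligned_names, unaligned_names) from SAM-formatted text lines."""
    aligned, unaligned = [], []
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & UNMAPPED_FLAG:
            unaligned.append(name)
        else:
            aligned.append(name)
    return aligned, unaligned


# Hypothetical records: read2 has FLAG 4 (unmapped); the others are mapped.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "read3\t16\tchr1\t200\t60\t4M\t*\t0\t0\tGGCA\tIIII",
]
aligned, unaligned = split_by_alignment(example)
print("aligned:", aligned)
print("unaligned:", unaligned)
```

In practice these records would come from a tool that decodes the binary file into SAM text rather than from a hand-written list.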

Aligned reads are available at the GDC Data Portal for Whole Exome Sequencing, Whole Genome Sequencing, Transcriptome Sequencing, Targeted Sequencing, and ATAC-Seq.

Data Formats

Aligned reads are maintained in Binary Alignment Map (BAM) format.
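As a rough illustration of what "binary" means here: a BAM file is a series of gzip-compatible (BGZF) blocks, and the decompressed stream begins with the four bytes `BAM\1`. The sketch below uses only the Python standard library to check for that signature. The function name `looks_like_bam` is hypothetical, and the in-memory payload is a stand-in carrying only the magic bytes, not a valid BAM file. Passing this check is necessary but not sufficient for a file to be well formed.

```python
import gzip
import io
import zlib

BAM_MAGIC = b"BAM\x01"  # first four bytes of the decompressed BAM stream


def looks_like_bam(data: bytes) -> bool:
    """Return True if `data` is gzip-compressed and decompresses to the BAM magic."""
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError, zlib.error):
        # Not gzip data, or truncated/corrupt compressed data.
        return False


# Stand-in payload for demonstration only: magic bytes plus padding.
fake_bam = gzip.compress(BAM_MAGIC + b"\x00" * 8)
print(looks_like_bam(fake_bam))
print(looks_like_bam(b"plain text, not a BAM file"))
```

For a file on disk, the same check could be applied to the bytes read from the start of the file, provided at least the first complete compressed block is included.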


  1. GDC Data Dictionary - Submitted Aligned Reads
  2. DNA-Seq Documentation
  3. RNA-Seq Documentation

Categories: Data Type