Gene and Mutation Summary Pages

Many parts of the GDC website contain links to Gene and Mutation summary pages. These pages display information about specific genes and mutations, along with visualizations and data that show how they relate to the projects and cases within the GDC. The gene and mutation data visualized on these pages are produced from the Open-Access MAF files available for download on the GDC Portal.

Gene Summary Page

Each Gene Summary Page describes a gene that has mutation data and provides results from the analyses performed on that gene.

Summary

The summary section of the gene page contains the following information:

Symbol: The gene symbol
Name: Full name of the gene
Synonyms: Synonyms of the gene name or symbol, if available
Type: A broad classification of the gene
Location: The chromosome on which the gene is located and its coordinates
Strand: Whether the gene is located on the forward (+) or reverse (-) strand
Description: A description of gene function and downstream consequences of gene alteration
Annotation: A notation/link that states whether the gene is part of The Cancer Gene Census

External References

A list of links to external databases with additional information about the gene is displayed here. These databases include Entrez, UniProt, HUGO Gene Nomenclature Committee, Online Mendelian Inheritance in Man, Ensembl, and CIViC.

Cancer Distribution

A table and two bar graphs (one for mutations, one for CNV events) show how many cases are affected by mutations and CNV events within the gene, expressed as a ratio and a percentage. Each row or bar corresponds to one project. The final column in the table lists the number of unique mutations observed in the gene for each project.
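The ratio and percentage reported per project can be reproduced from two counts. The following is a minimal sketch and not the portal's implementation: the function name `affected_fraction`, the project IDs, and the counts are hypothetical values chosen for illustration.

```python
from typing import Dict, Tuple


def affected_fraction(affected_cases: int, total_cases: int) -> Tuple[str, float]:
    """Return the 'affected / total' ratio string and the percentage."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    percentage = 100.0 * affected_cases / total_cases
    return f"{affected_cases} / {total_cases}", round(percentage, 2)


# Hypothetical per-project counts:
# (cases with a mutation in the gene, cases in the project)
project_counts: Dict[str, Tuple[int, int]] = {
    "PROJECT-A": (50, 200),
    "PROJECT-B": (3, 120),
}

# One table row (or bar) per project.
rows = {
    project: affected_fraction(affected, total)
    for project, (affected, total) in project_counts.items()
}

for project, (ratio, pct) in rows.items():
    print(f"{project}: {ratio} ({pct}%)")
```

The same calculation applies to the CNV graph if the first count is replaced by the number of cases with a CNV event in the gene.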

Protein Viewer

Mutations and their frequency across cases are mapped to a graphical visualization of protein-coding regions with a lollipop plot. Pfam domains are highlighted along the x-axis to assign functionality to specific protein-coding regions. The bottom track represents a view of the full gene length. Different transcripts can be selected by using the drop-down menu above the plot.

The panel to the right of the plot allows the plot to be filtered by mutation consequence or impact. The plot changes dynamically as filters are applied. Mutation consequence and impact are denoted in the plot by color.

Note: The impact filter on this panel will not display the annotations for alternate transcripts.

The plot can be viewed at different zoom levels by clicking and dragging across the x-axis, clicking and dragging across the bottom track, or double-clicking the Pfam domain IDs. The Reset button returns the plot to its original zoom level. The plot can also be exported as a PNG image, SVG image, or JSON-formatted text by choosing the Download button above the plot.

Most Frequent Mutations

The 20 most frequent mutations in the gene are displayed as a bar graph that indicates the number of cases that share each mutation.
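The ranking behind such a bar graph can be sketched as a count of distinct cases per mutation, truncated to the top 20. This is an illustration under stated assumptions, not the portal's code: the function name `most_frequent_mutations`, the `(case, mutation)` input shape, and the sample identifiers are all hypothetical.

```python
from collections import Counter


def most_frequent_mutations(observations, limit=20):
    """Count distinct cases per mutation and return the top `limit` entries.

    `observations` is an iterable of (case_id, mutation_id) pairs.
    """
    # Deduplicate so that a case is counted at most once per mutation.
    unique_pairs = set(observations)
    counts = Counter(mutation for _case, mutation in unique_pairs)
    # Sort by descending case count, then by mutation ID for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical input records.
observations = [
    ("case-1", "mut-A"),
    ("case-2", "mut-A"),
    ("case-2", "mut-A"),  # duplicate record, counted once
    ("case-3", "mut-B"),
]

top = most_frequent_mutations(observations)
print(top)
```

Each returned pair corresponds to one bar: the mutation and the number of cases that share it.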

A table is displayed below that lists information about each mutation including:

DNA Change: The chromosome and starting coordinates of the mutation are displayed along with the nucleotide differences between the reference and tumor allele
Type: A general classification of the mutation
Consequences: The effects the mutation has on the gene coding for a protein (e.g., synonymous, missense, non-coding transcript)
# Affected Cases in Gene: The number of cases affected by the mutation, expressed relative to the number of cases affected by any mutation within the gene
# Affected Cases Across GDC: The number of cases affected by the mutation, expressed relative to the number of cases across all projects. Selecting the arrow next to the percentage expands the row with a breakdown by affected project
Impact: A subjective classification of the severity of the variant consequence. This is determined using Ensembl VEP, PolyPhen, and SIFT. The categories are outlined here.

Note: The Mutation UUID can be displayed in this table by selecting it from the drop-down menu represented by three parallel lines.

Clicking the Open in Exploration button opens the Exploration page with the same results that are shown in the table (mutations filtered by the gene).

Mutation Summary Page

The Mutation Summary Page contains information about one somatic mutation and how it affects the associated gene. Each mutation is identified by its chromosomal position and nucleotide-level change.

Summary

ID: A unique identifier (UUID) for this mutation
DNA Change: Denotes the chromosome number, position, and nucleotide change of the mutation
Type: A broad categorization of the mutation
Reference Genome Assembly: The reference genome to which the chromosomal position refers
Allele in the Reference Assembly: The nucleotide(s) that compose the site in the reference assembly
Functional Impact: A subjective classification of the severity of the variant consequence.
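Because the DNA Change field combines chromosome, position, and nucleotide change in one string, it can be useful to split it into parts when working with exported data. The sketch below assumes an HGVS-like form such as `chr1:g.12345A>T` and handles substitutions only; the exact string format, the `DnaChange` type, and the `parse_substitution` helper are assumptions made for illustration, and the example coordinate is invented.

```python
import re
from typing import NamedTuple, Optional


class DnaChange(NamedTuple):
    chromosome: str
    position: int
    reference: str
    alternate: str


# Assumed pattern: chromosome, ":g.", position, reference allele, ">", tumor allele.
_SUBSTITUTION = re.compile(
    r"^(chr(?:[0-9]{1,2}|X|Y|M)):g\.(\d+)([ACGT]+)>([ACGT]+)$"
)


def parse_substitution(text: str) -> Optional[DnaChange]:
    """Parse a substitution string, or return None if it does not match."""
    match = _SUBSTITUTION.match(text.strip())
    if match is None:
        return None
    chromosome, position, reference, alternate = match.groups()
    return DnaChange(chromosome, int(position), reference, alternate)


# Invented example value, not a real variant.
change = parse_substitution("chr1:g.12345A>T")
print(change)
```

Insertions, deletions, and other mutation types use different notation and are deliberately left unparsed here (the function returns `None` for them).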

External References

A separate panel contains links to databases that contain information about the specific mutation. These include dbSNP, COSMIC, and CIViC.

Consequences

The consequences of the mutation are displayed in a table. The consequence terms are defined by the Sequence Ontology.

The fields that describe each consequence are listed below:

Gene: The symbol for the affected gene
AA Change: Details on the amino acid change, including the residues involved and their position, if applicable
Consequence: The biological consequence of each mutation
Coding DNA Change: The specific nucleotide change and position of the mutation within the gene
Strand: Whether the gene is located on the forward (+) or reverse (-) strand
Transcript(s): The transcript(s) affected by the mutation. Each contains a link to the Ensembl entry for the transcript

Cancer Distribution

A table and bar graph show how many cases are affected by the particular mutation. Each row or bar corresponds to one project.

The table contains the following fields:

Project ID: The ID for a specific project
Disease Type: The disease associated with the project
Site: The anatomical site affected by the disease
# Affected Cases: The number of affected cases and total number of cases displayed as a fraction and percentage

Protein Viewer

The protein viewer displays a plot representing the position of mutations along the polypeptide chain. The y-axis represents the number of cases that exhibit each mutation, whereas the x-axis represents the polypeptide chain sequence. Pfam domains found along the polypeptide chain are marked with colored rectangles labeled with Pfam IDs. See the Gene Summary Page for additional details about the protein viewer.
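The data behind a plot of this kind can be sketched as a mapping from amino acid position (x-axis) to the number of distinct cases (y-axis), with an optional consequence filter of the sort the viewer offers. This is a minimal illustration, not the viewer's implementation: the function name `lollipop_heights`, the record layout, and the sample records are hypothetical, while the consequence labels are Sequence Ontology terms.

```python
from collections import defaultdict


def lollipop_heights(records, consequence=None):
    """Map amino acid position -> number of distinct cases.

    `records` is an iterable of (case_id, aa_position, consequence) tuples.
    If `consequence` is given, only matching records are counted.
    """
    cases_by_position = defaultdict(set)
    for case_id, aa_position, record_consequence in records:
        if consequence is not None and record_consequence != consequence:
            continue
        # A set ensures each case is counted once per position.
        cases_by_position[aa_position].add(case_id)
    return {
        position: len(cases)
        for position, cases in sorted(cases_by_position.items())
    }


# Hypothetical records.
records = [
    ("case-1", 100, "missense_variant"),
    ("case-2", 100, "missense_variant"),
    ("case-3", 250, "stop_gained"),
    ("case-3", 100, "missense_variant"),
]

heights = lollipop_heights(records)
missense_only = lollipop_heights(records, consequence="missense_variant")
print(heights, missense_only)
```

Applying the filter removes positions that have no matching records, which mirrors how lollipops disappear from the plot as filters are applied.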

The panel to the right of the plot allows the plot to be filtered by mutation consequence or impact. The plot changes dynamically as filters are applied. Mutation consequence and impact are denoted in the plot by color.

Note: The impact filter on this panel will not display the annotations for alternate transcripts.

The plot can be viewed at different zoom levels by clicking and dragging across the x-axis, clicking and dragging across the bottom track, or double-clicking the Pfam domain IDs. The Reset button returns the plot to its original zoom level. The plot can also be exported as a PNG image, SVG image, or JSON-formatted text by choosing the Download button above the plot.