Before Submitting Data to the GDC Portal
The National Cancer Institute (NCI) Genomic Data Commons (GDC) Data Submission Portal User's Guide is the companion documentation for the GDC Data Submission Portal and provides detailed information and instructions for its use.
Steps to Submit Data to the GDC
The following tasks are required to submit data to the GDC Data Portal.
Complete the GDC Data Submission Request Form. After submission, the request will be reviewed by the GDC Data Submission Review Committee. During this time, create an eRA Commons account if you do not already have one.
If the study is approved, contact a Genomic Program Administrator (GPA) to register the approved study in dbGaP. This includes registering the project as a GDC Trusted Partner study, registering cases, and adding authorized data submitters. For more information, see Data Submission Process.
Contact GDC User Services to create a submission project. The User Services team will require a project ID, a two-part identifier in which the first portion is the Program and the second portion is the Project, separated by a hyphen (-). The project ID must be alphanumeric and all uppercase, for example TCGA-BRCA. You must also provide a project name, which can be longer and has fewer restrictions on length and character usage, for example Breast Invasive Carcinoma.
Familiarize yourself with the Data Model and Data Dictionary to understand how different data relate to each other and what data is permissible. Explore the GDC Metadata Validation Service to relate other vocabularies to permissible properties and values in the GDC Data Dictionary.
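The project ID rule described in the steps above can be checked locally before contacting User Services. The following is a minimal sketch that reads the rule literally (two uppercase alphanumeric portions joined by a single hyphen). The pattern and the function name `split_project_id` are illustrative assumptions, not part of any GDC tool, and the GDC may accept identifiers that this strict reading rejects.

```python
import re
from typing import Tuple

# Assumption: each portion is one or more uppercase letters or digits,
# and the two portions are joined by exactly one hyphen.
PROJECT_ID_PATTERN = re.compile(r"[A-Z0-9]+-[A-Z0-9]+")


def split_project_id(project_id: str) -> Tuple[str, str]:
    """Return (program, project) or raise ValueError if the ID is malformed."""
    if not PROJECT_ID_PATTERN.fullmatch(project_id):
        raise ValueError(
            f"{project_id!r} is not of the form PROGRAM-PROJECT "
            "(uppercase letters and digits only)"
        )
    program, project = project_id.split("-", 1)
    return program, project


if __name__ == "__main__":
    # Example taken from the text above.
    print(split_project_id("TCGA-BRCA"))
```

A lowercase or unhyphenated value such as `tcga-brca` or `TCGABRCA` raises `ValueError` under this reading of the rule.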
The GDC Data Submission Portal is a platform that allows researchers to submit and release data to the GDC. The key features of the GDC Data Submission Portal are:
- Upload and Validate Data: Project data can be uploaded to the GDC project workspace. The GDC will validate the data against the GDC Data Dictionary.
- Browse Data: Data that has been uploaded to the project workspace can be browsed to ensure that the project is ready for processing.
- Download Data: Data that has been uploaded into the project workspace can be downloaded for review or update by using the API or the Data Transfer Tool.
- Review and Submit Data: Prior to submission, data can be reviewed to check for accuracy and completeness. Once the review is complete, the data can be submitted to the GDC for processing through Data Harmonization.
- Release Data: After harmonization, data can be released to the research community for access through the GDC Data Portal and other GDC Data Access Tools.
- Status and Alerts: Visual cues are implemented in the GDC Data Submission Portal Dashboard to easily identify incomplete submissions via panel displays summarizing submitted data and associated data elements.
- Transactions: A list of all actions performed in a project is provided with detailed information for each action.
Sections of the Data Submission Portal Guide
- Data Submission Overview: Graphical explanations of the life cycle of projects and their data.
- Data Submission Process: An overview of the data submission process using the GDC Data Submission Portal.
- Data Submission Walkthrough: Step-by-step instructions for GDC data submission and how each step relates to the GDC Data Model.
- Pre-Release Data Portal: Instructions on how to use the Pre-Release Data Portal for projects that have been harmonized but not released.
To maintain HIPAA compliance, the GDC will not accept any data for patients aged 90 and over, including any follow-up event that occurs after a patient turns 90. To comply with this requirement, data submitters may either omit the data that would violate this rule (entire cases or specific nodes) or obfuscate dates. Please see the Date Obfuscation section for more information.
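As a pre-submission check, the age rule above can be sketched as a simple filter over follow-up events. This is only an illustration under stated assumptions: ages and event offsets are taken to be expressed in days, 90 years is approximated as 90 × 365.25 days, and the names `AGE_LIMIT_DAYS`, `event_violates_age_limit`, and `partition_events` are hypothetical. The Date Obfuscation section remains the authority on the exact threshold and procedure.

```python
from typing import List, Tuple

# Assumption: 90 years approximated in days; the exact GDC threshold
# is defined in the Date Obfuscation section, not here.
AGE_LIMIT_DAYS = int(90 * 365.25)


def event_violates_age_limit(age_at_index_days: int, days_to_event: int) -> bool:
    """True if the patient would be 90 or over when the event occurs."""
    return age_at_index_days + days_to_event >= AGE_LIMIT_DAYS


def partition_events(
    age_at_index_days: int, days_to_events: List[int]
) -> Tuple[List[int], List[int]]:
    """Split event offsets into (acceptable, needs_omission_or_obfuscation)."""
    acceptable, flagged = [], []
    for offset in days_to_events:
        if event_violates_age_limit(age_at_index_days, offset):
            flagged.append(offset)
        else:
            acceptable.append(offset)
    return acceptable, flagged


if __name__ == "__main__":
    # Hypothetical patient aged 32,000 days at the index date.
    print(partition_events(32000, [100, 500, 1000]))
```

Events in the second list would need to be omitted or have their dates obfuscated before submission.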
The Release Notes section of this User's Guide contains details about new features, bug fixes, and known issues.