Clinical Data


Clinical data is a collection of data related to patient diagnosis, demographics, exposures, laboratory tests, and family relationships.


A core focus of the Genomic Data Commons (GDC) is to enable the comparison of cancer genomic data with patient clinical data, which may include diagnosis, exposures, laboratory tests, and family relationships. To allow comparisons across projects, the GDC adheres to a common set of clinical terms.

The clinical data vocabulary used in the GDC is defined in the GDC Data Dictionary [1]. A simple list of all GDC clinical terms can be found on the GDC Website [2]. Whenever possible, each clinical data property is associated with a Common Data Element (CDE) defined in the caDSR II Browser, which is part of the Center for Biomedical Informatics & Information Technology. CDEs and their component definitions in the NCI Thesaurus provide precise descriptions of clinical data, so data consumers and submitters can understand exactly what the data in the GDC represents.

Additional information on the goals and origins of the GDC clinical terminology can be found in this GDC scientific report [3]. Additional information on how clinical data relates to the overall GDC Data Model can be found on the GDC Website [4].


In the GDC, clinical data is searchable through the API or the Data Portal. This only includes data that has been indexed and that aligns with the GDC Data Dictionary [1, 2]. Additional clinical data may be stored in clinical supplement files, whose format depends on the project: for example, XML or Biotab for TCGA, or XLSX for TARGET.
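
As an illustration of searching indexed clinical data through the API, the following is a minimal sketch rather than an official client. It assumes the public `cases` endpoint at `https://api.gdc.cancer.gov/cases`, the `filters`, `expand`, `format`, and `size` query parameters, and the expandable clinical groups `demographic`, `diagnoses`, `exposures`, and `family_histories`. The helper names `build_case_query` and `fetch_cases` are hypothetical and exist only for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed public endpoint for case-level (clinical) records.
GDC_CASES_ENDPOINT = "https://api.gdc.cancer.gov/cases"


def build_case_query(project_id, size=5):
    """Build query parameters that select cases from one project.

    The filter restricts results to the given project, and `expand`
    asks for the nested clinical groups to be included in each hit.
    """
    filters = {
        "op": "in",
        "content": {"field": "project.project_id", "value": [project_id]},
    }
    return {
        "filters": json.dumps(filters),
        "expand": "demographic,diagnoses,exposures,family_histories",
        "format": "JSON",
        "size": str(size),
    }


def fetch_cases(project_id, size=5):
    """Send the query and return the list of case records (needs network access)."""
    query = urllib.parse.urlencode(build_case_query(project_id, size))
    url = GDC_CASES_ENDPOINT + "?" + query
    with urllib.request.urlopen(url, timeout=30) as response:
        payload = json.load(response)
    # Assumption: results are returned under data.hits in the JSON payload.
    return payload["data"]["hits"]
```

Calling `fetch_cases("TCGA-BRCA")` would, under these assumptions, return a handful of case records with their indexed clinical fields. Clinical supplement files are not returned by such a query and have to be downloaded and parsed separately.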


  1. GDC Clinical Data Dictionary Entries
  2. GDC Clinical Data Harmonization
  3. Selecting Common Cross-Study Clinical Data Elements
  4. GDC Data Model
