An entity in the GDC is a unique component of the GDC Data Model.


The GDC Data Model is the primary method of organizing all data within the GDC [1]. More specifically, all the data within the GDC can be thought of as a Directed Acyclic Graph (DAG) composed of interconnected entities. A graphical representation of the GDC Data Model can be found here.

Each entity in the GDC has a set of properties. An example entity is a case (patient). A case is linked to a number of other entities in the data model, such as those that contain Biospecimen and Clinical data. For example, the demographic entity, which is linked to case, contains fields for properties such as ethnicity, race, and gender. The GDC Data Model defines how the entities are connected, and the GDC Data Dictionary defines the entities and the relationships between them [2].
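As a rough illustration of the description above, an entity can be pictured as a typed node with a set of properties and links to other nodes. The sketch below is not the GDC's implementation: the `Entity` class, its field names, and the placeholder property values are all hypothetical, chosen only to mirror the case and demographic example.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical stand-in for a node in the data model graph."""
    entity_type: str
    properties: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # entities this node points to


# A case (patient) entity; no properties are filled in for this sketch.
case = Entity("case")

# A demographic entity linked to the case. The property names come from the
# text above; the values are placeholders, not real data.
demographic = Entity(
    "demographic",
    properties={"ethnicity": "placeholder", "race": "placeholder", "gender": "placeholder"},
    links=[case],
)

# List the entity types that the demographic entity is linked to.
linked_types = [linked.entity_type for linked in demographic.links]
```

Because links only point from one entity toward another and never loop back, a collection of such nodes forms a directed acyclic graph.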

Each entity is assigned a unique identifier in the form of a version 4 UUID.
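For readers unfamiliar with the format, the following minimal sketch uses Python's standard `uuid` module to generate a version 4 UUID and to check whether a string is one in canonical form. The helper name `is_uuid4` is hypothetical, and the check is an assumption about what a client-side sanity test might look like, not a description of how the GDC assigns or validates identifiers.

```python
import uuid


def is_uuid4(value: str) -> bool:
    """Return True if value is a canonically formatted version 4 UUID."""
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        # Not parseable as a UUID at all.
        return False
    # uuid.UUID also accepts braces, a urn: prefix, and missing hyphens,
    # so compare against the canonical lowercase string form as well.
    return parsed.version == 4 and str(parsed) == value.lower()


# Generate a random (version 4) UUID, as a stand-in for an entity identifier.
new_id = str(uuid.uuid4())
valid = is_uuid4(new_id)
```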

Data submitters can create and update submittable entities in the GDC Data Model and upload data files registered in the model using the GDC Data Submission Portal, the GDC API, and the GDC Data Transfer Tool [3].


  1. GDC Data Model
  2. GDC Data Dictionary
  3. GDC Data Submission Portal