Data Dictionary Release Notes

v.1.11

  • GDC Product: GDC Data Dictionary
  • Release Date: January 20, 2018

New Features and Changes

  • Added a link between the sample and analyte entities in the data model
  • Added a link between the sample entity and other sample entities in the data model
  • Created structural_variant_calling_workflow entity
  • Created structural_variation entity
  • Removed clinical_test entity
  • Removed exon_expression entity
  • Modified relationships of entities
    • Changed relationship between submitted_genomic_profile and read_group to one-to-many
    • Changed relationship between projects and masked_somatic_mutations from one-to-one to many_to_one
    • Changed relationship between projects and aggregated_somatic_mutations from one-to-one to many_to_one
    • Changed relationship of rna_expression_workflow to downstream entities from one-to-one to one-to-many
    • Changed relationship of alignment_workflow to downstream entities from many-to-one to many-to-many
  • Modified project entity
    • Added new state
      • processed
    • Added new fields
      • release_requested
      • awg_review
      • is_legacy
  • Modified clinical_supplement entity
    • Added new data_format field
      • CDC JSON
  • Modified case entity
    • Enumerated primary_site field
  • Modified diagnosis entity
    • Enumerated fields
      • primary_diagnosis
      • tissue_or_organ_of_origin
      • site_of_resection_or_biopsy
      • tumor_grade
      • morphology
      • ajcc_clinical_stage
      • ajcc_pathologic_stage
      • laterality
      • method_of_diagnosis
    • Changed type of fields
      • year_of_diagnosis: changed type to int
      • age_at_diagnosis: changed type to int
    • Added new permissible values to fields
      • figo_stage
        • Stage IIC
      • lymphatic_invasion_present
        • Not Reported
      • perineural_invasion_present
        • Not Reported
      • ann_arbor_clinical_stage
        • Not Reported
        • Unknown
      • residual_disease
        • Not Reported
        • Unknown
      • vascular_invasion_type
        • Macro
        • Micro
        • No Vascular Invasion
        • Not Reported
        • Unknown
  • Modified exposure entity
    • Enumerated fields
      • alcohol_intensity
      • alcohol_history
      • cause_of_death
    • Removed 6 as a valid value for tobacco_smoking_status
  • Modified demographic entity
    • Enumerated field
      • comorbidity
    • Added new permissible values to field
      • cause_of_death
        • Infection
        • Toxicity
        • Spinal Muscular Atrophy
        • End-stage Renal Disease
  • Modified family_history entity
    • Enumerated fields
      • relationship_primary_diagnosis
      • relationship_type
  • Modified treatment entity
    • Enumerated fields
      • treatment_anatomic_site
      • treatment_type
    • Changed type of fields
      • days_to_treatment: changed type to int
      • days_to_treatment_end: changed type to int
      • days_to_treatment_start: changed type to int
  • Modified follow_up entity
    • Enumerated field
      • comorbidity
    • Removed None as a valid value for comorbidity
  • Modified demographic entity
    • Added new field
      • age_at_index
  • Modified read_group entity
    • Added new fields
      • fragment_minimum_length
      • fragment_maximum_length
      • fragment_mean_length
      • fragment_standard_deviation_length
  • Modified slide_image entity
    • Added new data_formats
      • JPEG and TIFF
    • Added new data_type
      • Cell Culture
    • Added new experimental strategy
      • Cell Culture
    • Added new property
      • Magnification
    • Added new field
      • date_time
  • Modified sample entity
    • Added new permissible values to fields
      • method_of_sample_procurement
        • Autopsy
      • sample_type
        • 2D Classical Conditionally Reprogrammed Cells
        • 2D Modified Conditionally Reprogrammed Cells
        • 3D Organoid
        • 3D Air-Liquid Interface Organoid
        • 3D Neurosphere
        • Adherent Cell Line
        • Liquid Suspension Cell Line
      • composition
        • 2D Classical Conditionally Reprogrammed Cells
        • 2D Modified Conditionally Reprogrammed Cells
        • 3D Organoid
        • 3D Air-Liquid Interface Organoid
        • 3D Neurosphere
        • Adherent Cell Line
        • Liquid Suspension Cell Line
  • Modified genomic_profile_harmonization_workflow entity
    • Added new permissible values to field
      • workflow_type
        • GENIE Simple Somatic Mutation
        • GENIE Copy Number Variation
  • Modified somatic_mutation_calling_workflow entity
    • Added new permissible values to field
      • workflow_type
        • Pindel
  • Modified rna_expression_workflow entity
    • Added new permissible values to field
      • workflow_type
        • RSEM - Quantification
        • STAR - Counts
        • RNA-SeQC - Counts
        • RNA-SeQC - FPKM
        • Kallisto - Quantification
        • Kallisto - HDF5
  • Modified somatic_aggregation_workflow entity
    • Added new permissible values to field
      • workflow_type
        • Pindel Variant Aggregation and Masking
        • GENIE Variant Aggregation and Masking
  • Modified gene_expression entity
    • Added new data_types
      • Isoform Expression Quantification
      • Exon Expression Quantification
    • Added new data_format
      • HDF5
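
Submitters preparing records for this release may want to check them against the type changes and removed values listed above before upload. The following is a minimal Python sketch, not the GDC's own validation code. The function names coerce_int_fields and find_removed_values are hypothetical, records are assumed to be plain dictionaries keyed by field name, and the field and value lists are copied only from the changes noted above (they are not the complete dictionary).

```python
# Fields whose type changed to int in this release (taken from the notes above).
INT_FIELDS = {
    "diagnosis": ["year_of_diagnosis", "age_at_diagnosis"],
    "treatment": [
        "days_to_treatment",
        "days_to_treatment_end",
        "days_to_treatment_start",
    ],
}

# Values removed in this release (taken from the notes above), stored as strings.
REMOVED_VALUES = {
    ("exposure", "tobacco_smoking_status"): {"6"},
    ("follow_up", "comorbidity"): {"None"},
}


def coerce_int_fields(entity, record):
    """Return a copy of record with this release's int fields converted to int.

    Raises ValueError if a value is not a whole number (for example "12.5").
    Missing or empty values are left untouched.
    """
    result = dict(record)
    for field in INT_FIELDS.get(entity, []):
        value = result.get(field)
        if value is None or value == "":
            continue
        number = float(value)  # accepts "2010", "2010.0", 2010, 2010.0
        if not number.is_integer():
            raise ValueError(
                f"{entity}.{field} must be a whole number, got {value!r}"
            )
        result[field] = int(number)
    return result


def find_removed_values(entity, record):
    """Return the names of fields in record that still use a removed value."""
    flagged = []
    for (ent, field), removed in REMOVED_VALUES.items():
        if ent != entity:
            continue
        value = record.get(field)
        # Skip absent fields so a missing value is not confused with "None".
        if value is not None and str(value) in removed:
            flagged.append(field)
    return flagged


if __name__ == "__main__":
    diagnosis = {"year_of_diagnosis": "2010", "age_at_diagnosis": 20089.0}
    print(coerce_int_fields("diagnosis", diagnosis))
    print(find_removed_values("exposure", {"tobacco_smoking_status": 6}))
```

Comparing values as strings means a 6 read from a TSV as text and a 6 parsed as a number are both flagged. Whether the GDC API itself accepts numeric strings for integer fields is not stated in these notes.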

Bugs Fixed Since Last Release

  • Fixed issue when submitting tobacco_smoking_status via tsv

Known Issues and Workarounds

Release with API v1.10.0

  • GDC Product: GDC Data Dictionary
  • Release Date: August 22, 2017

New Features and Changes

  • Created follow_up entity to support longitudinal clinical data
  • Deprecated clinical_test entity
  • Modified acceptable values for Read Group properties
    • library_selection : "Hybrid Selection, Affinity Enrichment, Poly-T Enrichment, Random, rRNA Depletion, miRNA Size Fractionation, Targeted Sequencing"
    • library_strategy : "Targeted Sequencing"
  • Modified Diagnosis entity
    • Added field iss_stage
    • Added field best_overall_response
    • Added field days_to_best_overall_response
    • Added field progression_free_survival
    • Added field progression_free_survival_event
    • Added field overall_survival
    • Added field days_to_diagnosis
  • Modified Treatment entity
    • Added field regimen_or_line_of_therapy
  • Modified Demographic entity
    • Added field cause_of_death
    • Added field days_to_birth
    • Added field days_to_death
    • Added field vital_status
  • Modified Case entity
    • Added field days_to_lost_to_followup
    • Added field lost_to_followup
    • Added field index_date
  • Added new tumor code, tumor id, and sample types to Sample entity to support OCG
    • tumor_code : "Acute leukemia of Ambiguous Lineage (ALAL), Lymphoid Normal, Tumor Adjacent Normal - Post Neo Adjuvant Therapy"
    • tumor_code_id : "15, 17, 18"
  • Created somatic_mutation_index entity
  • Updated caDSR CDE links in data dictionary
  • Added new sample_type : tumor to sample entity
  • Made classification_of_tumor on diagnosis entity non-required
  • Added support for FM-AD to Genomic Profile Harmonization Workflow entity
  • Added data_type : Gene Level Copy Number Scores to Copy Number Segment entity

Bugs Fixed Since Last Release

None

Known Issues and Workarounds

  • Portion weight property is incorrectly described in the Data Dictionary as the weight of the patient in kg; it should be described as the weight of the portion in mg

Release with API v1.7.1

  • GDC Product: GDC Data Dictionary
  • Release Date: March 16, 2017

New Features and Changes

  • Added "submittable" property to all entities
  • Changed Read Group to category biospecimen
  • Added many new clinical properties available for submission
  • Added sample codes from Office of Cancer Genomics (OCG) to analyte and aliquot
  • Slides can now be attached to sample rather than just portion
  • sample_type_id is no longer required when submitting sample entities
  • analyte_type_id is no longer required when submitting aliquot and analyte entities
  • Clinical Test entity created for storing results of a variety of potential clinical tests related to the diagnosis
  • Genomic Profiling Report entity created for storing particular derived sequencing results
  • Structural Variation entity created
  • Project entity includes new field "Intended Release Date"
  • Project entity includes new field "Releasable"

Bugs Fixed Since Last Release

None

Known Issues and Workarounds

None

Release with API v1.3.1

  • GDC Product: GDC Data Dictionary
  • Release Date: September 7, 2016

New Features and Changes

  • Clinical Supplement entities can have data_format set to OMF.
  • Biospecimen Supplement entities can have data_format set to SSF or PPS.
  • Read group instrument_model can be set to "Illumina HiSeq 4000".
  • Category of Slide entities in the GDC Data Model has changed from data_bundle to biospecimen.

Bugs Fixed Since Last Release

None

Known Issues and Workarounds

None