Data Dictionary Release Notes

v.1.12

  • GDC Product: GDC Data Dictionary
  • Release Date: April 23, 2018

New Features and Changes

  • Updates to data dictionary viewer:
    • Data dictionary viewer displays true/false instead of boolean
    • Updated md5 description in data dictionary viewer to be more complete
  • Creation of new entities:
    • Copy Number Estimate entity for copy number variation data
    • Copy Number Variation Workflow entity for copy number variation pipeline metadata
    • Molecular Test entity
  • Updates to all entities:
    • Updated all entities to include batch_id field for submission
    • Updated all entities to include downloadable field
    • Modified the file_size field on all file entities to be an integer
    • Added support for creation of annotations on all entities in data model
  • Added new links / updated links between entities:
    • Created optional link between follow_up and diagnosis entities
    • Created many-to-many link between genomic_profile_harmonization_workflows and masked_somatic_mutations
    • Removed the restriction that prevented alignment workflows from pointing to both submitted_aligned_reads and submitted_unaligned_reads
  • Updated the BCR XML endpoint to support submission of new entity relationships
  • Fixed description of prior_malignancy on diagnosis entity
  • Modified sample entity
    • Added new fields
      • growth_rate
      • passage_count
      • catalog_reference
      • distributor_reference
      • distance_normal_to_tumor
      • biospecimen_laterality
    • Added new permissible value to composition field
      • Sorted Cells
    • Added new permissible value to sample_type field
      • Next Generation Cancer Model
    • Added new permissible values to method_of_sample_procurement field
      • Pancreatectomy
      • Whipple Procedure
      • Paracentesis
    • Added new permissible values to biospecimen_anatomic_site field
      • Esophageal; Distal
      • Esophageal; Mid
      • Esophageal; Proximal
      • Hepatic Flexure
      • Rectosigmoid Junction
    • Removed permissible values on sample_type
      • 2D Classical Conditionally Reprogrammed Cells
      • 2D Modified Conditionally Reprogrammed Cells
      • 3D Organoid
      • 3D Air-Liquid Interface Organoid
      • 3D Neurosphere
      • Adherent Cell Line
      • Liquid Suspension Cell Line
  • Modified aliquot entity
    • Added new fields
      • no_matched_normal_wxs
      • no_matched_normal_wgs
      • no_matched_normal_targeted_sequencing
      • no_matched_normal_low_pass_wgs
  • Modified read_group entity
    • Updated descriptions on read_group entity
    • Added new field
      • target_capture_kit
    • Made library_selection a required field
    • Removed duplicate validators field
    • Modified permissible values on library_selection field
      • Replaced Affinity_Enrichment with Affinity Enrichment
      • Replaced Poly-T_Enrichment with Poly-T Enrichment
      • Replaced Hybrid_Selection with Hybrid Selection
      • Replaced RNA_Depletion with rRNA Depletion
      • Replaced Targeted Sequencing with Hybrid Selection for FM-AD
    • Added new permissible value on library_strand field
      • Not Applicable
    • Removed permissible values in library_strategy field
      • Amplicon
      • Validation
      • Other
  • Modified diagnosis entity
    • Added new fields
      • ajcc_staging_system_edition
      • anaplasia_present
      • anaplasia_present_type
      • child_pugh_classification
      • cog_neuroblastoma_risk_group
      • cog_rhabdomyosarcoma_risk_group
      • enneking_msts_grade
      • enneking_msts_metastasis
      • enneking_msts_stage
      • enneking_msts_tumor_site
      • esophageal_columnar_dysplasia_degree
      • esophageal_columnar_metaplasia_present
      • first_symptom_prior_to_diagnosis
      • gastric_esophageal_junction_involvement
      • goblet_cells_columnar_mucosa_present
      • inpc_grade
      • inss_stage
      • irs_group
      • ishak_fibrosis_score
      • micropapillary_features
      • meduloblastoma_molecular_classification
      • metastasis_at_diagnosis
      • mitosis_karyorrhexis_index
      • peripancreatic_lymph_nodes_positive
      • peripancreatic_lymph_nodes_tested
      • synchronous_malignancy
      • supratentorial_localization
      • tumor_confined_to_organ_of_origin
    • Added new permissible values to vascular_invasion_present field
      • Extramural
      • Intramural
    • Updated ann_arbor_clinical_stage and ann_arbor_pathologic_stage fields
  • Modified aliquot entity
    • Added new fields
      • selected_normal_wxs
      • selected_normal_wgs
      • selected_normal_targeted_sequencing
      • selected_normal_low_pass_wgs
  • Modified project entity
    • Added new fields
      • request_submission
      • in_review
      • submission_enabled
    • Removed duplicate intended_release_date field
    • Made fields not required
      • disease_type
      • primary_site
  • Modified case entity
    • Updated descriptions
      • disease_type
      • primary_site
  • Modified demographic entity
    • Added new fields
      • premature_at_birth
      • weeks_gestation_at_birth
    • Made submitter_id a required field
    • Changed type of year_of_birth from number to integer
    • Changed type of year_of_death from number to integer
  • Modified treatment entity
    • Updated CDE code for treatment_anatomic_site
    • Modified permissible values on treatment_anatomic_site field
    • Made submitter_id a required field
    • Added new permissible value to treatment_intent_type
      • Neoadjuvant
    • Added new permissible values to treatment_outcome
      • Very Good Partial Response
      • Mixed Response
      • No Response
    • Added new permissible values to treatment_type
      • Brachytherapy, High Dose
      • Brachytherapy, Low Dose
      • Radiation, 2D Conventional
      • Radiation, 3D Conformal
      • Radiation, Intensity-Modulated Radiotherapy
      • Radiation, Proton Beam
      • Radiation, Stereotactic Body
      • Radiation Therapy, NOS
      • Stereotactic Radiosurgery
    • Removed field
      • days_to_treatment
    • Removed permissible values for treatment_type
      • Radiation
      • Radiation Therapy
  • Modified analyte entity
    • Removed duplicate project field
  • Modified follow_up entity
    • Added new permissible values to comorbidity field
    • Added new fields
      • progression_or_recurrence_type
      • diabetes_treatment_type
      • reflux_treatment_type
      • barretts_esophagus_goblet_cells_present
      • karnofsky_performance_status
      • menopause_status
      • viral_hepatitis_serologies
      • reflux_treatment
      • pancreatitis_onset_year
      • comorbidity_method_of_diagnosis
      • risk_factor
    • Removed fields
      • absolute_neutrophil
      • albumin
      • beta_2_microglobulin
      • bun
      • calcium
      • cea_level
      • colon_polyps_history
      • creatinine
      • crp
      • days_to_hiv_diagnosis
      • estrogen_receptor_percent_positive_ihc
      • estrogen_receptor_result_ihc
      • glucose
      • hemoglobin
      • her2_erbb2_percent_positive_ihc
      • her2_erbb2_result_fish
      • her2_erbb2_result_ihc
      • hiv_positive
      • hpv_status
      • iga
      • igg
      • igl_kappa
      • igl_lambda
      • igm
      • ldh_level
      • ldh_normal_range_upper
      • m_protein
      • microsatellite_instability_abnormal
      • platelet_count
      • progesterone_receptor_percent_positive_ihc
      • progesterone_receptor_result_ihc
      • total_protein
      • wbc
  • Modified exposure entity
    • Added new fields
      • alcohol_days_per_week
      • alcohol_drinks_per_day
  • Added molecular_test entity
    • Added new fields
      • gene_symbol
      • second_gene_symbol
      • test_analyte_type
      • test_result
      • molecular_analysis_method
      • variant_type
      • molecular_consequence
      • chromosome
      • cytoband
      • exon
      • transcript
      • locus
      • dna_change
      • aa_change
      • rna_change
      • zygosity
      • histone_family
      • histone_variant
      • copy_number
      • antigen
      • test_value
      • test_units
      • specialized_molecular_test
      • ploidy
      • cell_count
      • loci_count
      • loci_abnormal_count
      • mismatch_repair_mutation
      • blood_test
      • blood_test_normal_range_upper
      • blood_test_normal_range_lower
  • Modified biospecimen_supplement entity
    • Added new permissible values to data_format field
      • TSV
      • BCR Biotab
      • BCR SSF XML
      • BCR PPS XML
      • FoundationOne XML
      • BCR Auxiliary XML
    • Removed permissible values on data_format field
      • SSF
      • PPS
  • Modified clinical_supplement entity
    • Added new permissible values to data_format field
      • TSV
      • BCR OMF XML
      • BCR Biotab
    • Removed field
      • data_format
  • Modified submitted_aligned_reads entity
    • Added new permissible values to experimental_strategy field
      • Targeted Sequencing
      • Bisulfite-Seq
      • ChIP-Seq
      • ATAC-Seq
    • Removed permissible values from experimental_strategy field
      • Validation
      • Total RNA-Seq
  • Modified submitted_unaligned_reads entity
    • Added new field
      • read_pair_number
    • Added new permissible values to experimental_strategy field
      • Targeted Sequencing
      • Bisulfite-Seq
      • ChIP-Seq
      • ATAC-Seq
  • Modified alignment_workflow entity
    • Added new permissible values to workflow_type field
      • BWA with Mark Duplicates and BQSR
      • BWA with BQSR
  • Modified copy_number_segment entity
    • Removed permissible values for data_type field
      • Gene Level Copy Number
      • Gene Level Copy Number Scores
  • Modified submitted_genomic_profile entity
    • Added new permissible value for data_type
      • Raw GCI Variant
    • Added new permissible value for data_category
      • Combined Nucleotide Variation
    • Added new permissible value for experimental_strategy
      • WGS
    • Added new permissible value for data_format
      • VCF
  • Modified genomic_profile_harmonization_workflow entity
    • Added new permissible value for workflow_type
      • VCF LiftOver
  • Modified simple_somatic_mutation entity
    • Added new permissible value for data_type
      • Raw GCI Variant
    • Added new permissible value for data_category
      • Combined Nucleotide Variation
  • Modified masked_somatic_mutations entity
    • Added new permissible value for experimental_strategy
      • Targeted Sequencing
    • Removed permissible value for experimental_strategy
      • Validation
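
The library_selection renames listed above for the read_group entity can be applied to existing records before resubmission. The following is a minimal sketch, not GDC tooling: the function name and the plain-dict record shape are assumptions made for illustration, and the FM-AD-specific replacement of Targeted Sequencing is left out because it applies to one project only.

```python
# Old -> new library_selection values, copied from the v.1.12 notes above.
LIBRARY_SELECTION_RENAMES = {
    "Affinity_Enrichment": "Affinity Enrichment",
    "Poly-T_Enrichment": "Poly-T Enrichment",
    "Hybrid_Selection": "Hybrid Selection",
    "RNA_Depletion": "rRNA Depletion",
}


def migrate_library_selection(record):
    """Return a copy of a read_group record (hypothetical dict shape)
    with any renamed library_selection value replaced by its new spelling."""
    updated = dict(record)
    value = updated.get("library_selection")
    if value in LIBRARY_SELECTION_RENAMES:
        updated["library_selection"] = LIBRARY_SELECTION_RENAMES[value]
    return updated
```

Values that are not in the mapping pass through unchanged, so the function is safe to run over records that already use the new spellings.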

Bugs Fixed Since Last Release

  • Fixed issue when file_size is specified as a float in submitted json file
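
Because file_size is now an integer field, a submitter may want to normalize values such as 1024.0 before building a submission file. The sketch below is a client-side pre-check under stated assumptions: the function name is hypothetical, records are treated as plain dicts parsed from JSON, and it does not describe how the GDC itself handles float values.

```python
import json


def normalize_file_size(record):
    """Return a copy of a file record (hypothetical dict shape) whose
    file_size is an int. Integral floats such as 1024.0 are converted;
    fractional or non-numeric values raise ValueError."""
    updated = dict(record)
    size = updated.get("file_size")
    if size is None or (isinstance(size, int) and not isinstance(size, bool)):
        return updated
    if isinstance(size, float) and size.is_integer():
        updated["file_size"] = int(size)
        return updated
    raise ValueError(f"file_size must be a whole number, got {size!r}")


# Example: a float in the parsed JSON becomes an int.
record = normalize_file_size(json.loads('{"file_size": 1024.0}'))
```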

v.1.11

  • GDC Product: GDC Data Dictionary
  • Release Date: January 20, 2018

New Features and Changes

  • Added a link between the sample and analyte entities in the data model
  • Added a link between the sample entity and other sample entities in the data model
  • Created structural_variant_calling_workflow entity
  • Created structural_variation entity
  • Removed clinical_test entity
  • Removed exon_expression entity
  • Modified relationships of entities
    • Changed relationship between submitted_genomic_profile and read_group to one-to-many
    • Changed relationship between projects and masked_somatic_mutations from one-to-one to many-to-one
    • Changed relationship between projects and aggregated_somatic_mutations from one-to-one to many-to-one
    • Changed relationship of rna_expression_workflow to downstream entities from one-to-one to one-to-many
    • Changed relationship of alignment_workflow to downstream entities from many-to-one to many-to-many
  • Modified project entity
    • Added new state
      • processed
    • Added new fields
      • release_requested
      • awg_review
      • is_legacy
  • Modified clinical_supplement entity
    • Added new permissible value to data_format field
      • CDC JSON
  • Modified case entity
    • Enumerated primary_site field
  • Modified diagnosis entity
    • Enumerated fields
      • primary_diagnosis
      • tissue_or_organ_of_origin
      • site_of_resection_or_biopsy
      • tumor_grade
      • morphology
      • ajcc_clinical_stage
      • ajcc_pathologic_stage
      • laterality
      • method_of_diagnosis
    • Changed type of fields
      • year_of_diagnosis: changed type to int
      • age_at_diagnosis: changed type to int
    • Added new permissible values to fields
      • figo_stage
        • Stage IIC
      • lymphatic_invasion_present
        • Not Reported
      • perineural_invasion_present
        • Not Reported
      • ann_arbor_clinical_stage
        • Not Reported
        • Unknown
      • residual_disease
        • Not Reported
        • Unknown
      • vascular_invasion_type
        • Macro
        • Micro
        • No Vascular Invasion
        • Not Reported
        • Unknown
  • Modified exposure entity
    • Enumerated fields
      • alcohol_intensity
      • alcohol_history
      • cause_of_death
    • Removed 6 as a valid value from tobacco_smoking_status
  • Modified demographic entity
    • Enumerated field
      • comorbidity
    • Added new permissible values to field
      • cause_of_death
        • Infection
        • Toxicity
        • Spinal Muscular Atrophy
        • End-stage Renal Disease
  • Modified family_history entity
    • Enumerated fields
      • relationship_primary_diagnosis
      • relationship_type
  • Modified treatment entity
    • Enumerated fields
      • treatment_anatomic_site
      • treatment_type
    • Changed type of fields
      • days_to_treatment: changed type to int
      • days_to_treatment_end: changed type to int
      • days_to_treatment_start: changed type to int
  • Modified follow_up entity
    • Enumerated field
      • comorbidity
    • Removed None as a valid value from comorbidity
  • Modified demographic entity
    • Added new field
      • age_at_index
  • Modified read_group entity
    • Added new fields
      • fragment_minimum_length
      • fragment_maximum_length
      • fragment_mean_length
      • fragment_standard_deviation_length
  • Modified slide_image entity
    • Added new data_formats
      • JPEG and TIFF
    • Added new data_type
      • Cell Culture
    • Added new experimental strategy
      • Cell Culture
    • Added new property
      • Magnification
    • Added new field
      • date_time
  • Modified sample entity
    • Added new permissible values to fields
      • method_of_sample_procurement
        • Autopsy
      • sample_type
        • 2D Classical Conditionally Reprogrammed Cells
        • 2D Modified Conditionally Reprogrammed Cells
        • 3D Organoid
        • 3D Air-Liquid Interface Organoid
        • 3D Neurosphere
        • Adherent Cell Line
        • Liquid Suspension Cell Line
      • composition
        • 2D Classical Conditionally Reprogrammed Cells
        • 2D Modified Conditionally Reprogrammed Cells
        • 3D Organoid
        • 3D Air-Liquid Interface Organoid
        • 3D Neurosphere
        • Adherent Cell Line
        • Liquid Suspension Cell Line
  • Modified genomic_profile_harmonization_workflow entity
    • Added new permissible values to field
      • workflow_type
        • GENIE Simple Somatic Mutation
        • GENIE Copy Number Variation
  • Modified somatic_mutation_calling_workflow entity
    • Added new permissible values to field
      • workflow_type
        • Pindel
  • Modified rna_expression_workflow entity
    • Added new permissible values to field
      • workflow_type
        • RSEM - Quantification
        • STAR - Counts
        • RNA-SeQC - Counts
        • RNA-SeQC - FPKM
        • Kallisto - Quantification
        • Kallisto - HDF5
  • Modified somatic_aggregation_workflow
    • Added new permissible values to field
      • workflow_type
        • Pindel Variant Aggregation and Masking
        • GENIE Variant Aggregation and Masking
  • Modified gene_expression entity
    • Added new data_types
      • Isoform Expression Quantification
      • Exon Expression Quantification
    • Added new data_format
      • HDF5

Bugs Fixed Since Last Release

  • Fixed issue when submitting tobacco_smoking_status via tsv

Known Issues and Workarounds

Release with API v1.10.0

  • GDC Product: GDC Data Dictionary
  • Release Date: August 22, 2017

New Features and Changes

  • Created follow_up entity to support longitudinal clinical data
  • Deprecated clinical_test entity
  • Modified acceptable values for Read Group properties
    • library_selection : "Hybrid Selection, Affinity Enrichment, Poly-T Enrichment, Random, rRNA Depletion, miRNA Size Fractionation, Targeted Sequencing"
    • library_strategy : "Targeted Sequencing"
  • Modified Diagnosis entity
    • Added field iss_stage
    • Added field best_overall_response
    • Added field days_to_best_overall_response
    • Added field progression_free_survival
    • Added field progression_free_survival_event
    • Added field overall_survival
    • Added field days_to_diagnosis
  • Modified Treatment entity
    • Added field regimen_or_line_of_therapy
  • Modified Demographic entity
    • Added field cause_of_death
    • Added field days_to_birth
    • Added field days_to_death
    • Added field vital_status
  • Modified Case entity
    • Added field days_to_lost_to_followup
    • Added field lost_to_followup
    • Added field index_date
  • Added new tumor code, tumor id, and sample types to Sample entity to support OCG
    • tumor_code : "Acute leukemia of Ambiguous Lineage (ALAL), Lymphoid Normal, Tumor Adjacent Normal - Post Neo Adjuvant Therapy"
    • tumor_code_id : "15, 17, 18"
  • Created somatic_mutation_index entity
  • Updated caDSR CDE links in data dictionary
  • Added new sample_type : tumor to sample entity
  • Made classification_of_tumor on diagnosis entity non-required
  • Added support for FM-AD to Genomic Profile Harmonization Workflow entity
  • Added data_type : Gene Level Copy Number Scores to Copy Number Segment entity

Bugs Fixed Since Last Release

None

Known Issues and Workarounds

  • Portion weight property is incorrectly described in the Data Dictionary as the weight of the patient in kg; it should be described as the weight of the portion in mg

Release with API v1.7.1

  • GDC Product: GDC Data Dictionary
  • Release Date: March 16, 2017

New Features and Changes

  • Added "submittable" property to all entities
  • Changed category of Read Group to biospecimen
  • Added many new clinical properties available for submission
  • Added sample codes from Office of Cancer Genomics (OCG) to analyte and aliquot
  • Slides can now be attached to sample rather than just portion
  • sample_type_id is no longer required when submitting sample entities
  • analyte_type_id is no longer required when submitting aliquot and analyte entities
  • Clinical Test entity created for storing results of a variety of potential clinical tests related to the diagnosis
  • Genomic Profiling Report entity created for storing particular derived sequencing results
  • Structural Variation entity created
  • Project entity includes new field "Intended Release Date"
  • Project entity includes new field "Releasable"

Bugs Fixed Since Last Release

None

Known Issues and Workarounds

None

Release with API v1.3.1

  • GDC Product: GDC Data Dictionary
  • Release Date: September 7, 2016

New Features and Changes

  • Clinical Supplement entities can have data_format set to OMF.
  • Biospecimen Supplement entities can have data_format set to SSF or PPS.
  • Read group instrument_model can be set to "Illumina HiSeq 4000".
  • Category of Slide entities in the GDC Data Model has changed from data_bundle to biospecimen.

Bugs Fixed Since Last Release

None

Known Issues and Workarounds

None