
Data File




File view driving this table: https://www.synapse.org/#!Synapse:syn9630847/tables/

The following is a brief description of the relevant columns in the table below (by column header):

column name: name as it should appear in data portal

current synapse file view column: corresponding name (e.g., for SQL query) in Synapse table view

eventual synapse file view column: name (e.g., for SQL query) in Synapse table view that we will eventually migrate to

difference between current and eventual columns: as we migrate to GDC, we will put new annotation keys in the "eventual" columns. For now, use "current."

facet: true if column_name should be faceted in data portal.

desired for CSBC/PS-ON: whether this column should be included in the CSBC data portal
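
The relationship between portal column names and the current and eventual Synapse columns can be sketched as a small lookup table. This is only an illustration: the `PortalColumn` class, the `build_query` and `facet_names` helpers, and the view ID `syn00000000` are hypothetical names, not part of Synapse or the portal code. The rows are copied from the first columns of the table on this page, and "no?" is treated as "no".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PortalColumn:
    name: str                # "column name": label shown in the data portal
    current: Optional[str]   # current Synapse file view column (None where the table says NA)
    eventual: Optional[str]  # eventual (GDC-style) Synapse file view column
    facet: bool              # whether the portal should facet on this column


# A subset of rows copied from the table on this page.
COLUMNS = [
    PortalColumn("Species", "species", "species", True),
    PortalColumn("Scientific Theme", None, "theme", True),
    PortalColumn("Data Category", "assay", "data_category", True),
    PortalColumn("Data Format", "fileFormat", "data_format", False),  # table says "no?"
    PortalColumn("Experiment Strategy", "assay", "experimental_strategy", True),
]


def build_query(view_id: str, columns: List[PortalColumn], use_eventual: bool = False) -> str:
    """Build a SELECT statement over the current (default) or eventual column names."""
    keys: List[str] = []
    for col in columns:
        key = col.eventual if use_eventual else col.current
        # Skip columns with no key yet (NA) and keys already selected
        # ("assay" currently backs two portal columns).
        if key is not None and key not in keys:
            keys.append(key)
    return "SELECT " + ", ".join(keys) + " FROM " + view_id


def facet_names(columns: List[PortalColumn]) -> List[str]:
    """Return the portal labels of the columns marked as facets."""
    return [col.name for col in columns if col.facet]
```

With the rows above, `build_query("syn00000000", COLUMNS)` selects `species, assay, fileFormat`, while passing `use_eventual=True` selects the GDC-style names. The resulting string could then be passed to a Synapse client call such as `tableQuery`, assuming the real ID of the file view is substituted for the placeholder.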

The following annotations do not exist on file: theme

The following annotations need to be "ported" to GDC: Data Category (currently uses the existing "assay" key, but will eventually use "data_category").

The following have been added to the synapse table, but are blank. They need to be filled in:

column name | current synapse file view column | eventual synapse file view column | facet | concept | example | GDC equivalent | faceted on GDC | facet on CSBC | GDC reference | size | restricted values | comments | in AMP-AD portal | in NF portal
Species | species | species | yes

none
yes





Scientific Theme | NA | theme | yes
tumor-heterogeneity | none
yes





Data Category | assay | data_category | yes | Broad categorization of the contents of the data file.
  • Transcriptome Profiling
data_category | yes | yes


CSBC will need to add values to those in GDC (which only cover sequencing)

Data Type | NA | data_type | no? | Specific content type of the data file.
  • Exon Expression Quantification
  • Gene Expression Quantification
  • Isoform Expression Quantification
  • Splice Junction Quantification
data_type | yes | yes


CSBC will need to add values to those in GDC (which only cover sequencing)

Data Format | fileFormat | data_format | no? | Format of the data files.
  • CSV
  • HDF5
  • TSV
  • TXT
  • SRA XML
  • MAGE-TAB
  • SDRF
  • IDF
  • ADF
data_format | yes | yes





Experiment Strategy | assay | experimental_strategy | yes | The sequencing strategy used to generate the data file. (Note: remove "sequencing" from this definition for CSBC.)
  • RNA-Seq
  • Total RNA-Seq
experimental_strategy | yes | yes


CSBC will need to add values to those in GDC (which only cover sequencing)

file_name

no | The name (or part of a name) of a file (of any type).
file_name | no | no





file_size

no | The size of the data file (object) in bytes.
file_size | no | no





md5sum

no | The 128-bit hash value expressed as a 32-digit hexadecimal number (in lower case) used as a file's digital fingerprint.
md5sum | no | no





Platform

yes

platform | yes | yes





Disease Type

yes | The text term used to describe the type of malignant disease, as categorized by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O).
  • Acinar Cell Neoplasms
  • Adenomas and Adenocarcinomas
  • Adnexal and Skin Appendage Neoplasms
  • Basal Cell Neoplasms
  • Blood Vessel Tumors
case/disease_type | yes | yes | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=case




Site

yes | The text term used to describe the general location of the malignant disease, as categorized by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O).
  • Accessory sinuses
  • Adrenal gland
  • Anus and anal canal
  • Base of tongue
  • Bladder
case/primary_site | yes | yes | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=case




Ethnicity

no | An individual's self-described social and cultural grouping, specifically whether an individual describes themselves as Hispanic or Latino. The provided values are based on the categories defined by the U.S. Office of Management and Budget and used by the U.S. Census Bureau.
  • hispanic or latino
  • not hispanic or latino
  • Unknown
  • not reported
  • not allowed to collect
demographic/ethnicity | yes | yes | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=demographic




Gender

no | Text designations that identify gender. Gender is described as the assemblage of properties that distinguish people on the basis of their societal roles. [Explanatory Comment 1: Identification of gender is based upon self-report and may come from a form, questionnaire, interview, etc.]
  • female
  • male
  • unknown
  • unspecified
  • not reported
demographic/gender | yes | yes | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=demographic




Race


An arbitrary classification of a taxonomic group that is a division of a species. It usually arises as a consequence of geographical isolation within a species and is characterized by shared heredity, physical attributes and behavior, and in the case of humans, by common history, nationality, or geographic distribution. The provided values are based on the categories defined by the U.S. Office of Management and Budget and used by the U.S. Census Bureau.
  • white
  • american indian or alaska native
  • black or african american
  • asian
  • native hawaiian or other pacific
demographic/race | yes | yes | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=demographic




tissue_or_organ_of_origin


The text term used to describe the anatomic site of origin of the patient's malignant disease, as described by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O).
  • Abdomen, NOS
  • Abdominal esophagus
  • Accessory sinus, NOS
  • Acoustic nerve
  • Adrenal gland, NOS
diagnosis/tissue_or_organ_of_origin | yes | yes | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=diagnosis




Age


Age at the time of diagnosis expressed in number of days since birth.
diagnosis/age_at_diagnosis | yes | ?? | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=diagnosis




Primary Diagnosis


Text term used to describe the patient's histologic diagnosis, as described by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O).
  • Abdominal desmoid
  • Abdominal fibromatosis
  • Achromic nevus
  • Acidophil adenocarcinoma
  • Acidophil adenoma
diagnosis/primary_diagnosis | no | no | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=diagnosis




Progression


Yes/No/Unknown indicator to identify whether a patient has had a new tumor event after initial treatment.
  • yes
  • no
  • unknown
  • not reported
  • Not Allowed To Collect
diagnosis/progression_or_recurrence | no | no | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=diagnosis




Vital Status


The survival state of the person registered on the protocol.
  • alive
  • dead
  • lost to follow-up
  • unknown
  • not reported
diagnosis/vital_status | yes | ?? | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=diagnosis




Sample Type


Text term to describe the source of a biospecimen used for a laboratory test.
  • Additional Metastatic
  • Additional - New Primary
  • Blood Derived Cancer - Bone Marrow, Post-treatment
  • Blood Derived Cancer - Peripheral Blood, Post-treatment
  • Blood Derived Normal
  • Bone Marrow Normal
  • Buccal Cell Normal
  • Cell Line Derived Xenograft Tissue
  • Cell Lines
sample/sample_type | no | yes | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=sample




Tissue Type


Text term that represents a description of the kind of tissue collected with respect to disease status or proximity to tumor tissue.
  • Tumor
  • Normal
  • Abnormal
  • Peritumoral
  • Unknown
sample/tissue_type | no | yes | https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=sample
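
The file-level definitions in the table (file_size in bytes, md5sum as a 32-digit lowercase hexadecimal number) and the listed example values lend themselves to a simple consistency check. The following is a minimal sketch, not portal code: `describe_file`, `check_annotations`, and `EXAMPLE_VALUES` are hypothetical names. The value sets are copied from the example column above and may be incomplete relative to the GDC Data Dictionary, so a mismatch should be read as a prompt to check the dictionary rather than as a definitive error.

```python
import hashlib
import os
import re
from typing import Dict, List

# md5sum: exactly 32 lowercase hexadecimal digits.
MD5_PATTERN = re.compile(r"[0-9a-f]{32}")

# Example values copied from the table on this page; the GDC dictionary
# may define additional values, so these sets are illustrative only.
EXAMPLE_VALUES = {
    "tissue_type": {"Tumor", "Normal", "Abnormal", "Peritumoral", "Unknown"},
    "vital_status": {"alive", "dead", "lost to follow-up", "unknown", "not reported"},
}


def describe_file(path: str) -> Dict[str, object]:
    """Compute file_name, file_size (bytes) and md5sum for a local file."""
    digest = hashlib.md5()
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large data files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return {
        "file_name": os.path.basename(path),
        "file_size": size,
        "md5sum": digest.hexdigest(),
    }


def check_annotations(annotations: Dict[str, object]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    md5 = annotations.get("md5sum")
    if md5 is not None and not (isinstance(md5, str) and MD5_PATTERN.fullmatch(md5)):
        problems.append("md5sum is not a 32-digit lowercase hexadecimal string")
    size = annotations.get("file_size")
    if size is not None and (not isinstance(size, int) or size < 0):
        problems.append("file_size is not a non-negative integer number of bytes")
    for key, allowed in EXAMPLE_VALUES.items():
        value = annotations.get(key)
        if value is not None and value not in allowed:
            problems.append(f"{key} value {value!r} is not among the example values")
    return problems
```

For instance, `check_annotations(describe_file(path))` returns an empty list for any readable file, while an annotation such as `{"tissue_type": "tumour"}` yields one reported problem.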




GDC Data Dictionary viewer: https://docs.gdc.cancer.gov/Data_Dictionary/viewer/

GDC Data Dictionary is implemented in YAML files: https://github.com/NCI-GDC/gdcdictionary

GDC submission process (and metadata templates) are described here: https://docs.gdc.cancer.gov/Data_Submission_Portal/Users_Guide/Data_Submission_Overview/

GDC Data Upload Walkthrough: https://docs.gdc.cancer.gov/Data_Submission_Portal/Users_Guide/Data_Submission_Walkthrough/#clinical-data-requirements
