Commit
Merge pull request #79 from icgc-argo/update_metadata_dict
update dict and template
Showing 6 changed files with 36 additions and 34 deletions.
@@ -1,23 +1,25 @@
 Field Attribute Description Permissible Values Note
 type Required table type sequencing_experiment
 submitter_sequencing_experiment_id Required Unique identifier of the sequencing experiment, assigned by the data provider. String values that meet the regular expression ^[a-zA-Z0-9]{1}[a-zA-Z0-9\\-_\\.:']{0,98}[a-zA-Z0-9]{1}$
-program_id Required ARGO Program ID, the unique identifier of your program. If you have logged into the ARGO Data Platform, this is the Program ID that you see in the Program Services area. Must be the same as what is in sample_registration table
-submitter_donor_id Required Unique identifier of the donor, assigned by the data provider. Values must meet the regular expression ^[A-Za-z0-9\-\._]{1,64} Must be the same as what is in sample_registration table
-submitter_specimen_id Required Unique identifier of the specimen, assigned by the data provider. Values must meet the regular expression ^[A-Za-z0-9\-\._]{1,64} Must be the same as what is in sample_registration table
-submitter_sample_id Required Unique identifier of the sample, assigned by the data provider. If submitted along with BAM molecular data, must also be present in header SM. Values must meet the regular expression ^[A-Za-z0-9\-\._]{1,64} Must be the same as what is in sample_registration table
-submitter_matched_normal_sample_id Required Provide the identifier of matched normal sample used for data analysis. Values must meet the regular expression ^[A-Za-z0-9\-\._]{1,64} or null Required for WGS and WXS tumour samples
+program_id Required ARGO Program ID, the unique identifier of your program. If you have logged into the ARGO Data Platform, this is the Program ID that you see in the Program Services area. Must be the same as what are in sample_registration table submitted to ARGO platform.
+submitter_donor_id Required Unique identifier of the donor, assigned by the data provider. Values must meet the regular expression ^[A-Za-z0-9\-\._]{1,64} Must be the same as what are in sample_registration table submitted to ARGO platform.
+submitter_specimen_id Required Unique identifier of the specimen, assigned by the data provider. Values must meet the regular expression ^[A-Za-z0-9\-\._]{1,64} Must be the same as what are in sample_registration table submitted to ARGO platform.
+submitter_sample_id Required Unique identifier of the sample, assigned by the data provider. If submitted along with BAM molecular data, must also be present in header SM. Values must meet the regular expression ^[A-Za-z0-9\-\._]{1,64} Must be the same as what are in sample_registration table submitted to ARGO platform.
+submitter_matched_normal_sample_id Conditional Required Provide the identifier of matched normal sample used for data analysis. Values must meet the regular expression ^[A-Za-z0-9\-\._]{1,64} or empty(null) Required for WGS/WXS tumour samples
 read_group_count Required The number of read groups in the molecular files being submitted. A minimum of 1 is required.
-platform Required The sequencing platform type used in data generation. Can also be specified within Bam header PL. CAPILLARY, LS454, ILLUMINA, SOLID, HELICOS, IONTORRENT, ONT, PACBIO, Nanopore, BGI
+platform Required The sequencing platform type used in data generation. CAPILLARY, LS454, ILLUMINA, SOLID, HELICOS, IONTORRENT, ONT, PACBIO, Nanopore, BGI
 experimental_strategy Required The primary experimental method. For sequencing data it refers to how the sequencing library was made. WGS, WXS, RNA-Seq, Bisulfite-Seq, ChIP-Seq, Targeted-Seq
-sequencing_date Required Date sequencing was performed. datetime format, for example: 2019-06-16 or 2019-06-16T20:20:39+00:00 or null
-platform_model Required The model number of the sequencing machine used in data generation. Can also be specified within Bam header PM. Any string value or null
-sequencing_center Required Data centre sequencing was performed. Can also be specified with Bam header CN. Any string value or null
-target_capture_kit Optional Description that can uniquely identify a target capture kit. xGen Exome Research Panel V1 (IDT), SeqCap EZ MedExome (Roche), SureSelect Human All Exon V6 (Agilent), Human Core Exome Kit + RefSeq V1 (Twist) null
-library_isolation_protocol Optional Provide the protocol used to isolate RNAs TRIzol Reagent (Thermo Fisher), RNeasy kits (QIAGEN), RNase free DNase I (Thermo Fisher), Pico Pure RNA isolation kit (Thermo Fisher), mirVANA microRNA isolation kit (Thermo Fisher), Absolutely Total RNA, miRNA & mRNA Purification Kits (Stratagene, Agilent technologies), SV total RNA isolation kit (Promega), RNAqueous Kit (Thermo Fisher), AllPrep DNA/RNA Micro Kit (QIAGEN), GenElute Mammalian Total RNA Miniprep kit (MilliporeSigma), Spectrum Plant Total RNA kit (MilliporeSigma), peqGOLD Total RNA kits (PeqLab Biotechnologie), RNAlater (Thermo Fisher) null
-library_preparation_kit Optional Provide the kit being used for library construction Ovation SoLo kit (NuGEN), SMARTer Stranded Total RNA-Seq Kit (Takara), TruSeq RNA sample preparation v2 (Illumina), SMART-Seq v4 Ultra Low Input RNA Kit (Takara), Nextera XT DNA Library Preparation Kit (Illumina), NEXTflex kit (Bioo Scientific) null
-library_strandedness Conditional Required Indicate the data strandedness UNSTRANDED, FIRST_READ_SENSE_STRAND, FIRST_READ_ANTISENSE_STRAND null Required for RNA-Seq
-rin Optional RNA integrity number A number between 1 to 10 or null
-dv200 Optional The percentage of RNA fragments that are >200 nucleotides in size A percentage or null
-spike_ins_included Optional Indicate if include spike ins? true, false
-spike_ins_fasta Optional Name of FASTA file that contains the spike-in sequences Any string value or null. Must match a fileName identified in the files section.
-spike_ins_concentration Optional Spike in concentration String or null
+sequencing_date Optional The date of sequencing datetime format, for example: 2019-06-16 or 2019-06-16T20:20:39+00:00 or empty(null)
+platform_model Optional The model number of the sequencing machine used in data generation. Any string value or empty(null)
+sequencing_center Optional Data centre sequencing was performed. Can also be specified with Bam header CN. Any string value or empty(null)
+target_capture_kit Conditional Required Description that can uniquely identify a target capture kit. Suggested value is a combination of vendor, kit name, and kit version. Any string value or empty(null) Required for Targeted-Seq /WXS
+primary_target_regions Conditional Required A bed file which holds the biologically relevant target regions (based on a genome, e.g. GRCh38) to capture by the assay. Customized Enum values which can be mapped to fileName and fileURL Required for Targeted-Seq /WXS
+capture_target_regions Conditional Required A bed file which holds the technically relevant probes region to capture by the assay. Customized Enum values which can be mapped to fileName and fileURL Required for Targeted-Seq /WXS
+number_of_genes Optional Number of genes the assay is targeting Integer with a minimum value of 1 or empty(null). Optional for Targeted-Seq
+gene_padding Optional Number of basepairs to add to exon endpoints for the inBED filter Integer with a minimum value of 0 or empty(null). Optional for Targeted-Seq
+coverage Optional List of coverage Hotspot Regions, Coding Exons, Introns, Promoters, or empty(null) Optional for Targeted-Seq
+library_selection Optional The method used to select and/or enrich the material being sequenced. Affinity Enrichment, Hybrid Selection, miRNA Size Fractionation, PCR-based Enrichment, Poly-T Enrichment, Random, rRNA Depletion, Molecular Inversion Probes, or empty(null) Optional for Targeted-Seq/WXS/RNA-Seq
+library_preparation_kit Optional Provide the kit information being used for library construction. Suggested value is a combination of vendor, kit name, and kit version. Any string value or empty(null)
+library_strandedness Conditional Required Indicate the library strandedness UNSTRANDED, FIRST_READ_SENSE_STRAND, FIRST_READ_ANTISENSE_STRAND, or empty(null) Required for RNA-Seq
+rin Optional A numerical assessment of the integrity of RNA based on the entire electrophoretic trace of the RNA sample including the presence or absence of degradation products. A number between 1 to 10 or empty(null) Optional for RNA-Seq
+dv200 Optional The percentage of RNA fragments that are >200 nucleotides in size A percentage or empty(null) Optional for RNA-Seq
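The updated dictionary rows translate fairly directly into row-level checks. The following is only a minimal sketch, not the ARGO platform's validator: the function `validate_experiment`, the helper `is_empty`, and the error wording are hypothetical names chosen for illustration. The regular expressions and permissible values are copied from the rows above. Note that the submitter ID pattern has no end anchor in the dictionary; the sketch uses `fullmatch`, which assumes whole-string matching was intended. The `submitter_matched_normal_sample_id` rule is left out because tumour/normal status is not a column of this table.

```python
import re

# Patterns copied from the dictionary. SUBMITTER_ID has no "$" in the source;
# applying fullmatch() below is an assumption about the intended behaviour.
SEQ_EXP_ID = re.compile(r"^[a-zA-Z0-9]{1}[a-zA-Z0-9\-_\.:']{0,98}[a-zA-Z0-9]{1}$")
SUBMITTER_ID = re.compile(r"^[A-Za-z0-9\-\._]{1,64}")

PLATFORMS = {"CAPILLARY", "LS454", "ILLUMINA", "SOLID", "HELICOS",
             "IONTORRENT", "ONT", "PACBIO", "Nanopore", "BGI"}
STRATEGIES = {"WGS", "WXS", "RNA-Seq", "Bisulfite-Seq", "ChIP-Seq", "Targeted-Seq"}
STRANDEDNESS = {"UNSTRANDED", "FIRST_READ_SENSE_STRAND", "FIRST_READ_ANTISENSE_STRAND"}

# Field -> experimental strategies for which the updated dictionary requires it.
CONDITIONALLY_REQUIRED = {
    "target_capture_kit": {"Targeted-Seq", "WXS"},
    "primary_target_regions": {"Targeted-Seq", "WXS"},
    "capture_target_regions": {"Targeted-Seq", "WXS"},
    "library_strandedness": {"RNA-Seq"},
}


def is_empty(value):
    """Treat None and blank strings as the dictionary's empty(null)."""
    return value is None or str(value).strip() == ""


def validate_experiment(row):
    """Return a list of problems found in one sequencing_experiment row (a dict)."""
    errors = []
    if row.get("type") != "sequencing_experiment":
        errors.append("type must be sequencing_experiment")

    exp_id = row.get("submitter_sequencing_experiment_id")
    if is_empty(exp_id) or not SEQ_EXP_ID.fullmatch(str(exp_id)):
        errors.append("submitter_sequencing_experiment_id does not match the pattern")

    if is_empty(row.get("program_id")):
        errors.append("program_id is required")

    for field in ("submitter_donor_id", "submitter_specimen_id", "submitter_sample_id"):
        value = row.get(field)
        if is_empty(value) or not SUBMITTER_ID.fullmatch(str(value)):
            errors.append(f"{field} does not match the pattern")

    try:
        if int(row.get("read_group_count")) < 1:
            errors.append("read_group_count must be at least 1")
    except (TypeError, ValueError):
        errors.append("read_group_count must be an integer")

    if row.get("platform") not in PLATFORMS:
        errors.append("platform is not a permissible value")

    strategy = row.get("experimental_strategy")
    if strategy not in STRATEGIES:
        errors.append("experimental_strategy is not a permissible value")

    for field, strategies in CONDITIONALLY_REQUIRED.items():
        if strategy in strategies and is_empty(row.get(field)):
            errors.append(f"{field} is required for {strategy}")

    strandedness = row.get("library_strandedness")
    if not is_empty(strandedness) and strandedness not in STRANDEDNESS:
        errors.append("library_strandedness is not a permissible value")
    return errors
```

Returning a list of messages rather than raising on the first failure lets a submitter see every problem in a row at once. Cross-table checks, such as confirming that IDs match the sample_registration table, would need data that this sketch does not have.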
@@ -1,2 +1,2 @@
-type submitter_sequencing_experiment_id program_id submitter_donor_id submitter_specimen_id submitter_sample_id submitter_matched_normal_sample_id read_group_count platform experimental_strategy sequencing_date platform_model sequencing_center target_capture_kit library_isolation_protocol library_preparation_kit library_strandedness rin dv200 spike_ins_included spike_ins_fasta spike_ins_concentration
-sequencing_experiment
+type submitter_sequencing_experiment_id program_id submitter_donor_id submitter_specimen_id submitter_sample_id submitter_matched_normal_sample_id read_group_count platform experimental_strategy sequencing_date platform_model sequencing_center target_capture_kit primary_target_regions capture_target_regions number_of_genes gene_padding coverage library_selection library_preparation_kit library_strandedness rin dv200
+sequencing_experiment
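To see at a glance which columns the sequencing_experiment template gained and lost, the two header lines in this hunk can be compared programmatically. This is an illustrative sketch only; `header_changes` is a hypothetical helper, and the headers are split on whitespace because the original delimiter is not visible in this view.

```python
# Old and new header lines, copied from the hunk above.
OLD_HEADER = (
    "type submitter_sequencing_experiment_id program_id submitter_donor_id "
    "submitter_specimen_id submitter_sample_id submitter_matched_normal_sample_id "
    "read_group_count platform experimental_strategy sequencing_date platform_model "
    "sequencing_center target_capture_kit library_isolation_protocol "
    "library_preparation_kit library_strandedness rin dv200 spike_ins_included "
    "spike_ins_fasta spike_ins_concentration"
)
NEW_HEADER = (
    "type submitter_sequencing_experiment_id program_id submitter_donor_id "
    "submitter_specimen_id submitter_sample_id submitter_matched_normal_sample_id "
    "read_group_count platform experimental_strategy sequencing_date platform_model "
    "sequencing_center target_capture_kit primary_target_regions "
    "capture_target_regions number_of_genes gene_padding coverage library_selection "
    "library_preparation_kit library_strandedness rin dv200"
)


def header_changes(old, new):
    """Return (removed, added) column names, each in its original order."""
    old_cols, new_cols = old.split(), new.split()
    removed = [col for col in old_cols if col not in new_cols]
    added = [col for col in new_cols if col not in old_cols]
    return removed, added


removed, added = header_changes(OLD_HEADER, NEW_HEADER)
print("removed:", ", ".join(removed))
print("added:  ", ", ".join(added))
```

The removed and added sets line up with the dictionary change: the RNA isolation and spike-in fields are dropped, and the target-region, coverage, and library-selection fields are introduced.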
@@ -1,2 +1,2 @@
 type name format size md5sum path ega_file_id ega_dataset_id ega_experiment_id ega_sample_id ega_study_id ega_run_id ega_policy_id ega_analysis_id ega_submission_id ega_dac_id
-file
+sequencing_file
@@ -1,2 +1,2 @@
-type submitter_sequencing_experiment_id submitter_read_group_id read_group_id_in_bam platform_unit is_paired_end file_r1 file_r2 read_length_r1 read_length_r2 insert_size sample_barcode library_name
+type submitter_sequencing_experiment_id submitter_read_group_id read_group_id_in_bam platform_unit is_paired_end file_r1 file_r2 library_name read_length_r1 read_length_r2 insert_size sample_barcode
 read_group
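The read_group change only moves library_name ahead of the read-length columns, so a file filled in against the old template can be brought into the new order by re-emitting each row by column name. The sketch below assumes the templates are tab-separated, which this view does not confirm; `reorder_read_groups` is a hypothetical helper, not part of the repository.

```python
import csv
import io

# Column order of the updated read_group template, copied from the hunk above.
NEW_COLUMNS = [
    "type", "submitter_sequencing_experiment_id", "submitter_read_group_id",
    "read_group_id_in_bam", "platform_unit", "is_paired_end", "file_r1",
    "file_r2", "library_name", "read_length_r1", "read_length_r2",
    "insert_size", "sample_barcode",
]


def reorder_read_groups(tsv_text):
    """Rewrite old-order read_group TSV text using the new column order.

    Assumes a tab delimiter and that the input header uses the same column
    names as NEW_COLUMNS; DictWriter raises ValueError on unknown columns.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=NEW_COLUMNS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # values are looked up by name, so order follows NEW_COLUMNS
    return out.getvalue()
```

Because rows are written by column name rather than position, the same function leaves a file that is already in the new order unchanged.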