diff --git a/2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.bed b/2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.bed
index 3ed87e6..5f84546 100644
--- a/2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.bed
+++ b/2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.bed
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0c3b4a1dc47ea797bb22ea32883c440d787a89ed2f8f34524eaad4697c23b1b1
+oid sha256:bcce83075b24d5df227fc039e073dd89c12caf0846baf03546ae4a6bee0a6a46
 size 2466029
diff --git a/2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.janno b/2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.janno
index 9394fe7..ceaa79a 100644
--- a/2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.janno
+++ b/2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.janno
@@ -1,7 +1,7 @@
 Poseidon_ID Genetic_Sex Group_Name Alternative_IDs Relation_To Relation_Degree Relation_Type Relation_Note Collection_ID Country Country_ISO Location Site Latitude Longitude Date_Type Date_C14_Labnr Date_C14_Uncal_BP Date_C14_Uncal_BP_Err Date_BC_AD_Start Date_BC_AD_Median Date_BC_AD_Stop Date_Note MT_Haplogroup Y_Haplogroup Source_Tissue Nr_Libraries Library_Names Capture_Type UDG Library_Built Genotype_Ploidy Data_Preparation_Pipeline_URL Endogenous Nr_SNPs Coverage_on_Target_SNPs Damage Contamination Contamination_Err Contamination_Meas Contamination_Note Genetic_Source_Accession_IDs Primary_Contact Publication Note Keywords Eager_ID Main_ID RateErrX RateErrY RateX RateY
-UC12-12_ss_MNT F Peru_Chincha_LH n/a n/a n/a n/a n/a UC12-12 Peru PE SouthCoast n/a -13.476219 -76.016683 contextual n/a n/a n/a 1400 1450 1500 n/a A2+(64)+16129 n/a uc12-12 1 SC79-L1101 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.4.6 64.627524 659142 n/a 2.9552786980429076e-2 0.175194 1.002554e-11 ANGSD[v0.935] Nr Snps (per library): 18180. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421181;ERR4017942 Fehren-Schmitz, Lars BongersPNAS2020 PASS n/a UC12-12_ss UC12-12 0.0036778307278310003 0.000362870433052 0.9378040341855981 0.006242505794216001
-UC12-20_ss_MNT M Peru_Chincha_LH n/a n/a n/a n/a n/a UC12-20 Peru PE SouthCoast n/a -13.476219 -76.016683 contextual n/a n/a n/a 1400 1450 1500 n/a C1b +16311 n/a uc12-20 1 SC79-L1098 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.4.6 53.954231 834193 n/a 2.8999766639241295e-2 0.005742 4.6689380000000005e-14 ANGSD[v0.935] Nr Snps (per library): 12736. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421182;ERR4017943 Fehren-Schmitz, Lars BongersPNAS2020 PASS n/a UC12-20_ss UC12-20 0.002192551730127 0.0025789810956120002 0.49708731471340706 0.455857143945926
-UC12-24_ss_MNT M Peru_Chincha_LH n/a n/a n/a n/a n/a UC12-24 Peru PE SouthCoast n/a -13.476219 -76.016683 contextual n/a n/a n/a 1400 1450 1500 n/a C1b n/a uc12-24 1 SC79-L1096 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.4.6 67.100728 663524 n/a 3.6231935965394695e-2 0.02035 2.147561e-13 ANGSD[v0.935] Nr Snps (per library): 7055. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421183;ERR4017944 Fehren-Schmitz, Lars BongersPNAS2020 PASS n/a UC12-24_ss UC12-24 0.0026191747636510002 0.0030634058675530003 0.48878107807005805 0.44319571820121706
-UC12-25_ss_MNT F Peru_Chincha_LH n/a n/a n/a n/a n/a UC12-25 Peru PE SouthCoast n/a -13.477239 -76.025299 contextual n/a n/a n/a 1400 1450 1500 n/a C1c n/a uc12-25 1 SC79-L1097 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.4.6 77.587999 864369 n/a 2.9834926876643592e-2 0.18969399999999997 1.554692e-11 ANGSD[v0.935] Nr Snps (per library): 32440. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421184;ERR4017945 Fehren-Schmitz, Lars BongersPNAS2020 PASS n/a UC12-25_ss UC12-25 0.002943847456157 0.000312334458473 0.9625981903938851 0.007416751530339
-UC8-8168_ss_MNT F Peru_Chincha_LH n/a n/a n/a n/a n/a UC8-8168 Peru PE SouthCoast n/a -13.476219 -76.016683 contextual n/a n/a n/a 1400 1450 1500 n/a B2b n/a uc8-8168 1 SC79-L1099 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.4.6 41.473265 503053 n/a 4.800801542978008e-2 0.174397 1.745724e-12 ANGSD[v0.935] Nr Snps (per library): 9599. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421185;ERR4017946 Fehren-Schmitz, Lars BongersPNAS2020 PASS n/a UC8-8168_ss UC8-8168 0.004553026203634 0.00044834816249900004 0.929809197736119 0.006163228942574
-UC8-8173_ss_MNT M Peru_Chincha_LH n/a n/a n/a n/a n/a UC8-8173 Peru PE SouthCoast n/a -13.477239 -76.025299 contextual n/a n/a n/a 1400 1450 1500 n/a B2b Q1a2a1a uc8-8173 1 SC79-L1095 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.4.6 63.387807 717533 n/a 4.0810946463713735e-2 0.00394 8.432734e-14 ANGSD[v0.935] Nr Snps (per library): 8928. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421186;ERR4017947 Fehren-Schmitz, Lars BongersPNAS2020 PASS n/a UC8-8173_ss UC8-8173 0.0025831521904110004 0.0030217875354850003 0.5173068863522 0.469441459565386
+UC12-12_ss_MNT F Peru_Chincha_LH n/a n/a n/a n/a n/a UC12-12 Peru PE SouthCoast n/a -13.476219 -76.016683 contextual n/a n/a n/a 1400 1450 1500 n/a A2+(64)+16129 n/a uc12-12 1 SC79-L1101 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.5.1 64.623018 659147 n/a 2.97539993963175e-2 0.175198 4.333311e-12 ANGSD[v0.935] Nr Snps (per library): 18178. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421181;ERR4017942 Fehren-Schmitz, Lars n/a PASS n/a UC12-12_ss UC12-12 0.0036779032724780004 0.000362252944412 0.9378493736652561 0.0062213491921190005
+UC12-20_ss_MNT M Peru_Chincha_LH n/a n/a n/a n/a n/a UC12-20 Peru PE SouthCoast n/a -13.476219 -76.016683 contextual n/a n/a n/a 1400 1450 1500 n/a C1b +16311 n/a uc12-20 1 SC79-L1098 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.5.1 53.951534 834181 n/a 2.8177322422549e-2 0.005742 1.014989e-13 ANGSD[v0.935] Nr Snps (per library): 12735. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421182;ERR4017943 Fehren-Schmitz, Lars n/a PASS n/a UC12-20_ss UC12-20 0.002192540378467 0.0025789927718020004 0.497080083309806 0.45585919471114805
+UC12-24_ss_MNT M Peru_Chincha_LH n/a n/a n/a n/a n/a UC12-24 Peru PE SouthCoast n/a -13.476219 -76.016683 contextual n/a n/a n/a 1400 1450 1500 n/a C1b n/a uc12-24 1 SC79-L1096 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.5.1 67.098032 663531 n/a 3.60829908790217e-2 0.020349 5.973964e-14 ANGSD[v0.935] Nr Snps (per library): 7056. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421183;ERR4017944 Fehren-Schmitz, Lars n/a PASS n/a UC12-24_ss UC12-24 0.002619112155784 0.0030637863334150002 0.488755909875806 0.443302380718023
+UC12-25_ss_MNT F Peru_Chincha_LH n/a n/a n/a n/a n/a UC12-25 Peru PE SouthCoast n/a -13.477239 -76.025299 contextual n/a n/a n/a 1400 1450 1500 n/a C1c n/a uc12-25 1 SC79-L1097 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.5.1 77.580438 864367 n/a 2.98567366717787e-2 0.189742 7.393421e-12 ANGSD[v0.935] Nr Snps (per library): 32439. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421184;ERR4017945 Fehren-Schmitz, Lars n/a PASS n/a UC12-25_ss UC12-25 0.00294390467649 0.000312888103372 0.9626334834610931 0.007443057625777001
+UC8-8168_ss_MNT F Peru_Chincha_LH n/a n/a n/a n/a n/a UC8-8168 Peru PE SouthCoast n/a -13.476219 -76.016683 contextual n/a n/a n/a 1400 1450 1500 n/a B2b n/a uc8-8168 1 SC79-L1099 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.5.1 41.471467 503049 n/a 4.76525779484052e-2 0.174391 9.354036e-13 ANGSD[v0.935] Nr Snps (per library): 9599. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421185;ERR4017946 Fehren-Schmitz, Lars n/a PASS n/a UC8-8168_ss UC8-8168 0.004552921489778 0.000447160690243 0.929767190487595 0.006130624941876
+UC8-8173_ss_MNT M Peru_Chincha_LH n/a n/a n/a n/a n/a UC8-8173 Peru PE SouthCoast n/a -13.477239 -76.025299 contextual n/a n/a n/a 1400 1450 1500 n/a B2b Q1a2a1a uc8-8173 1 SC79-L1095 Shotgun half ss haploid https://github.com/nf-core/eager/releases/tag/2.5.1 63.3829 717519 n/a 4.1235022171903e-2 0.00394 2.7943649999999998e-14 ANGSD[v0.935] Nr Snps (per library): 8927. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB37726;ERS4421186;ERR4017947 Fehren-Schmitz, Lars n/a PASS n/a UC8-8173_ss UC8-8173 0.0025831450742550004 0.003021716653648 0.517305476663749 0.46942098501586604
diff --git a/2020_Bongers_SouthPeru/CHANGELOG.md b/2020_Bongers_SouthPeru/CHANGELOG.md
index 862c74a..c1a62bc 100644
--- a/2020_Bongers_SouthPeru/CHANGELOG.md
+++ b/2020_Bongers_SouthPeru/CHANGELOG.md
@@ -1,3 +1,4 @@
+- V 2.0.0: Updated package contents due to reprocessing of sequence data
 - V 1.0.0: Bump version for release
 - V 0.2.0: Arranged Poseidon_IDs alphabetically
 - V 0.1.5: Fill Lon/Lat, mtDNA/Y haplos, Country_ISO, Publication columns
diff --git a/2020_Bongers_SouthPeru/POSEIDON.yml b/2020_Bongers_SouthPeru/POSEIDON.yml
index 3677acc..0904837 100644
--- a/2020_Bongers_SouthPeru/POSEIDON.yml
+++ b/2020_Bongers_SouthPeru/POSEIDON.yml
@@ -6,19 +6,19 @@ contributor:
 - name: Thiseas C. Lamnidis
   email: thiseas_christos_lamnidis@eva.mpg.de
   orcid: 0000-0003-4485-8570
-packageVersion: 1.0.0
-lastModified: 2024-07-17
+packageVersion: 2.0.0
+lastModified: 2024-11-08
 genotypeData:
   format: PLINK
   genoFile: 2020_Bongers_SouthPeru.bed
-  genoFileChkSum: f80d808156b11be3ff454a2cce6ee2ad
+  genoFileChkSum: 47583daa76e973022c7632899c039a4f
   snpFile: 2020_Bongers_SouthPeru.bim
   snpFileChkSum: 433fa85a23f3123bade02348e4628b75
   indFile: 2020_Bongers_SouthPeru.fam
   indFileChkSum: c92b6bd217075f64d0e1bcde9c499080
   snpSet: 1240K
 jannoFile: 2020_Bongers_SouthPeru.janno
-jannoFileChkSum: 6c1ed98562021eeb751fbc647824b513
+jannoFileChkSum: cc6235e9868ee944cf9d4263c8bd9d1a
 sequencingSourceFile: 2020_Bongers_SouthPeru.ssf
 sequencingSourceFileChkSum: a6b5fd5157b862791c5247ec47975d77
 bibFile: 2020_Bongers_SouthPeru.bib
diff --git a/2020_Bongers_SouthPeru/README.md b/2020_Bongers_SouthPeru/README.md
index cded284..6961523 100644
--- a/2020_Bongers_SouthPeru/README.md
+++ b/2020_Bongers_SouthPeru/README.md
@@ -1,4 +1,79 @@
-# 2020_Bongers_SouthPeru
+# 2020_Bongers_SouthPeru-2.0.0
+This package was updated on 2024-11-05 and was processed using the following versions:
+ - nf-core/eager version: 2.5.1
+ - Minotaur config version: 0.4.0dev
+ - CaptureType profile: 1240K
+ - CaptureType config version: 0.2.2dev
+ - Config template version: 0.3.0dev
+ - Package config version: 0.3.0dev
+ - Minotaur-packager version: 0.4.2dev
+ - populate_janno.py version: 0.4.1dev
+
+## CHANGELOG:
+ - Fixed a problem with the processing of PE sequenced data from ssDNA libraries [minotaur-recipes #53](https://github.com/poseidon-framework/minotaur-recipes/issues/53)
+ - New genotypes were generated.
+ - Data processing statistics were updated in the janno file to match latest processing.
+
+## Fill metadata from the minotaur archive package (2020_Bongers_SouthPeru-1.0.0).
+
+```bash
+package_name="2020_Bongers_SouthPeru"
+fill_from_archive="minotaur-archive"
+fill_from_package="2020_Bongers_SouthPeru-1.0.0"
+## trident v1.5.4.0
+## First fill-in missing metadata from relevant columns (no processing-based info).
+trident jannocoalesce \
+ -s ../../archives/${fill_from_archive}/${fill_from_package}/2020_Bongers_SouthPeru.janno \
+ -t 2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.janno \
+ --stripIdRegex "(_ss_MNT$)|(_MNT$)" \
+ --includeColumns Alternative_IDs,Relation_To,Relation_Degree,Relation_Type,Relation_Note,Collection_ID,Country,Country_ISO,Location,Site,Latitude,Longitude,Date_Type,Date_C14_Labnr,Date_C14_Uncal_BP,Date_C14_Uncal_BP_Err,Date_BC_AD_Start,Date_BC_AD_Median,Date_BC_AD_Stop,Date_Note,MT_Haplogroup,Y_Haplogroup,Source_Tissue,Primary_Contact,Note,Keywords
+
+## Then fill in Group_Name and Genetic_Sex
+trident jannocoalesce \
+ -s ../../archives/${fill_from_archive}/${fill_from_package}/2020_Bongers_SouthPeru.janno \
+ -t 2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.janno \
+ --stripIdRegex "(_ss_MNT$)|(_MNT$)" \
+ --includeColumns Genetic_Sex,Group_Name \
+ --force
+
+## Mirror Sex and Group name info to the fam file.
+paste -d "\t" 2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.fam <(cut -f 1-3 2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.janno |tail -n +2) | \
+ awk '
+ BEGIN{
+ OFS=IFS="\t"
+ }
+ {
+ $1=$9
+ if ($8 == "M") {
+ $5=1
+ } else if ($8 == "F") {
+ $5=2
+ }
+ print $1,$2,$3,$4,$5,$6
+ }
+ ' > tmp.fam
+## Cannot overwrite in the same command that reads in the file, so an extra mv is needed.
+mv tmp.fam 2020_Bongers_SouthPeru/2020_Bongers_SouthPeru.fam
+
+## trident v1.5.4.0
+trident rectify --checksumAll -d 2020_Bongers_SouthPeru/
+
+## Arrange Poseidon_IDs alphabetically
+## First, move the old directory out of the way (avoids all files being renamed by trident forge)
+cd ../ ## Start from the parent directory, since we need to rename the directory we were in.
+mv ${package_name} ${package_name}_old
+
+## qjanno v1.0.0.1
+qjanno "SELECT '<'||Poseidon_ID||'>' FROM d(${package_name}_old) ORDER BY Poseidon_ID" --raw --noOutHeader > desiredOrder.txt
+
+## Rearrange package contents to the desired order.
+## trident v1.5.4.0
+trident forge -d ${package_name}_old --forgeFile desiredOrder.txt -o ${package_name} --ordered --preservePyml
+trident rectify -d ${package_name} --packageVersion Major --logText "Updated package contents due to reprocessing of sequence data" --checksumAll
+## Once the new package has been verified, the '_old' directory can be removed.
+```
+
+# 2020_Bongers_SouthPeru-1.0.0
 This package was created on 2024-05-06 and was processed using the following versions:
  - nf-core/eager version: 2.4.6
  - Minotaur config version: 0.2.1dev
diff --git a/archive.chron b/archive.chron
index 4701aaf..f3dbe47 100644
--- a/archive.chron
+++ b/archive.chron
@@ -1,11 +1,15 @@
 title: Poseidon minotaur-archive chronicle
-chronicleVersion: 1.6.0
-lastModified: 2024-07-17
+chronicleVersion: 1.7.0
+lastModified: 2024-11-08
 packages:
 - title: 2020_Bongers_SouthPeru
   version: 1.0.0
   commit: a3a4f0ee2699238d9abb4bfbda721bddca07303b
   path: 2020_Bongers_SouthPeru
+- title: 2020_Bongers_SouthPeru
+  version: 2.0.0
+  commit: bbdb17e36bdecdf25120c4870aa26d606f6284b2
+  path: 2020_Bongers_SouthPeru
 - title: 2020_Margaryan_Viking
   version: 1.0.0
   commit: 9118d90debbf4ec47ac90de01dfb5c0f31e0b53d