Add 2021_Carlhoff_Nature package #2

Merged: 14 commits, Apr 8, 2024
Changes from 11 commits
3 changes: 3 additions & 0 deletions 2021_CarlhoffNature/2021_CarlhoffNature.bed
Git LFS file not shown
22 changes: 22 additions & 0 deletions 2021_CarlhoffNature/2021_CarlhoffNature.bib
@@ -0,0 +1,22 @@
@article{CarlhoffNature2021,
title = "Genome of a middle Holocene hunter-gatherer from Wallacea",
author = "Carlhoff, Selina and Duli, Akin and N{\"a}gele, Kathrin and Nur,
Muhammad and Skov, Laurits and Sumantri, Iwan and Oktaviana,
Adhi Agus and Hakim, Budianto and Burhan, Basran and Syahdar,
Fardi Ali and McGahan, David P and Bulbeck, David and Perston,
Yinika L and Newman, Kim and Saiful, Andi Muhammad and
Ririmasse, Marlon and Chia, Stephen and {Hasanuddin} and
Pulubuhu, Dwia Aries Tina and {Suryatman} and {Supriadi} and
Jeong, Choongwon and Peter, Benjamin M and Pr{\"u}fer, Kay and
Powell, Adam and Krause, Johannes and Posth, Cosimo and Brumm,
Adam",
journal = "Nature",
publisher = "Springer Science and Business Media LLC",
volume = 596,
number = 7873,
pages = "543--547",
month = aug,
year = 2021,
copyright = "https://creativecommons.org/licenses/by/4.0",
language = "en"
}
3 changes: 3 additions & 0 deletions 2021_CarlhoffNature/2021_CarlhoffNature.bim
Git LFS file not shown
2 changes: 2 additions & 0 deletions 2021_CarlhoffNature/2021_CarlhoffNature.fam
@@ -0,0 +1,2 @@
Leang_Panninge GUP001_MNT 0 0 2 0
Leang_Panninge GUP001_ss_MNT 0 0 2 0
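The fifth column of a PLINK `.fam` file holds the sex code (1 = male, 2 = female, 0 = unknown), so both rows above are recorded as female. A minimal sketch for reading those codes back as letters follows; `decode_fam_sex` is a hypothetical helper name, not part of trident or PLINK, and it assumes a whitespace-delimited six-column `.fam` file.

```bash
# Sketch: print each individual ID with its sex code decoded to M/F/U.
# decode_fam_sex is a hypothetical helper; it assumes a whitespace-delimited
# .fam file with the individual ID in column 2 and the sex code in column 5.
decode_fam_sex() {
  awk '{
    if ($5 == 1)      sex = "M"
    else if ($5 == 2) sex = "F"
    else              sex = "U"   # 0 or anything else is treated as unknown
    print $2 "\t" sex
  }' "$1"
}
```

Called as `decode_fam_sex 2021_CarlhoffNature/2021_CarlhoffNature.fam`, it prints one tab-separated ID and letter per row.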
3 changes: 3 additions & 0 deletions 2021_CarlhoffNature/2021_CarlhoffNature.janno
@@ -0,0 +1,3 @@
Poseidon_ID Genetic_Sex Group_Name Alternative_IDs Relation_To Relation_Degree Relation_Type Relation_Note Collection_ID Country Country_ISO Location Site Latitude Longitude Date_Type Date_C14_Labnr Date_C14_Uncal_BP Date_C14_Uncal_BP_Err Date_BC_AD_Start Date_BC_AD_Median Date_BC_AD_Stop Date_Note MT_Haplogroup Y_Haplogroup Source_Tissue Nr_Libraries Library_Names Capture_Type UDG Library_Built Genotype_Ploidy Data_Preparation_Pipeline_URL Endogenous Nr_SNPs Coverage_on_Target_SNPs Damage Contamination Contamination_Err Contamination_Meas Contamination_Note Genetic_Source_Accession_IDs Primary_Contact Publication Note Keywords Eager_ID Individual_ID RateErrX RateErrY RateX RateY
GUP001_MNT F Leang_Panninge n/a GUP001_ss_MNT identical same_as n/a 30 B. PANNINGE Skull block region 9 petrous Indonesia ID Southern Sulawesi Leang Panninge -4.7741 119.9396 C14 Wk-48639 6317 19 -5314 -5265 -5215 n/a M n/a bone_petrous 1 GUP001.A0101 1240K;Shotgun half ds haploid https://github.com/nf-core/eager/releases/tag/2.4.6 6.762222 272731 n/a 0.4636218846261551 0.11128 3.536035e-14 ANGSD[v0.935] Nr Snps (per library): 691. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB43715;ERS5956814;ERR5490520;ERR5490521;ERR5490522;ERR5490523;ERR5490524;ERR5490525;ERR5490526 Selina Carlhoff CarlhoffNature2021 C14 dates from charcoal and seeds associated with burial; damage for ds SG; mtcontam for ss with AuthentiCT n/a GUP001 GUP001 0.007582631999185001 0.001447150678242 0.8115348979428191 0.020098725775262002
GUP001_ss_MNT F Leang_Panninge n/a GUP001_MNT identical same_as n/a 30 B. PANNINGE Skull block region 9 petrous Indonesia ID Southern Sulawesi Leang Panninge -4.7741 119.9396 C14 Wk-48639 6317 19 -5314 -5265 -5215 n/a M n/a bone_petrous 1 GUP001.A0102 1240K;Shotgun minus ss haploid https://github.com/nf-core/eager/releases/tag/2.4.6 13.339985 134067 n/a 0.5877457008717157 0.097276 3.860919e-15 ANGSD[v0.935] Nr Snps (per library): 345. Estimate and error are weighted means of values per library. Libraries with fewer than 100 SNPs used in contamination estimation were excluded. PRJEB43715;ERS5956814;ERR5490527;ERR5490528;ERR5490529 Selina Carlhoff CarlhoffNature2021 C14 dates from charcoal and seeds associated with burial; damage for ds SG; mtcontam for ss with AuthentiCT n/a GUP001_ss GUP001 0.008514648295196001 0.0016382538785210002 0.7711635308594431 0.019378750752398
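The two rows point at each other through `Relation_To` (`identical`, `same_as`). A small sketch for checking that every `Relation_To` entry names an individual present in the same file is shown below. `check_relations` is a hypothetical helper, and two things are assumptions here: that the janno file is tab-separated with a header row, and that multiple related IDs, if present, are separated by `;`. References to individuals in other packages would be reported as unknown, so treat the output as a hint rather than a validation.

```bash
# Sketch: report Relation_To entries that do not match any Poseidon_ID in the file.
# check_relations is a hypothetical helper, not a trident command.
# Columns are looked up by header name, so their position does not matter.
check_relations() {
  awk -F '\t' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    {
      id = $(col["Poseidon_ID"])
      seen[id] = 1
      rel[id] = $(col["Relation_To"])
    }
    END {
      bad = 0
      for (id in rel) {
        n = split(rel[id], targets, ";")
        for (j = 1; j <= n; j++) {
          if (targets[j] != "n/a" && !(targets[j] in seen)) {
            print id " refers to unknown individual " targets[j]
            bad = 1
          }
        }
      }
      exit bad   # non-zero exit status if anything was reported
    }
  ' "$1"
}
```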
11 changes: 11 additions & 0 deletions 2021_CarlhoffNature/2021_CarlhoffNature.ssf
@@ -0,0 +1,11 @@
sample_accession study_accession run_accession sample_alias poseidon_IDs udg library_built secondary_sample_accession first_public last_updated instrument_model library_layout library_source instrument_platform library_name library_strategy fastq_aspera fastq_bytes fastq_md5 fastq_ftp read_count submitted_ftp
SAMEA8270508 PRJEB43715 ERR5490520 GUP001 GUP001_MNT half ds ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0101 Targeted-Capture fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/000/ERR5490520/ERR5490520.fastq.gz 31485654 842ec55a3a0aff4979dea1b5dc529e7c ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/000/ERR5490520/ERR5490520.fastq.gz 945210 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490520/GUP001.A0101.MT1.1.fastq.truncated.gz
SAMEA8270508 PRJEB43715 ERR5490521 GUP001 GUP001_MNT half ds ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0101 Targeted-Capture fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/001/ERR5490521/ERR5490521.fastq.gz 32643939 e4c9ae3aed81094a383c2c2003dd961f ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/001/ERR5490521/ERR5490521.fastq.gz 1040533 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490521/GUP001.A0101.MT1.2.fastq.truncated.gz
SAMEA8270508 PRJEB43715 ERR5490522 GUP001 GUP001_MNT half ds ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0101 Targeted-Capture fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/002/ERR5490522/ERR5490522.fastq.gz 192243801 a4add9bc886418c913420eab0391297e ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/002/ERR5490522/ERR5490522.fastq.gz 6189559 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490522/GUP001.A0101.MT1.3.fastq.truncated.gz
SAMEA8270508 PRJEB43715 ERR5490523 GUP001 GUP001_MNT half ds ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0101 WGS fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/003/ERR5490523/ERR5490523.fastq.gz 95526960 29a36d22db98919f96d32af631a9422c ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/003/ERR5490523/ERR5490523.fastq.gz 2888875 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490523/GUP001.A0101.SG1.1.fastq.truncated.gz
SAMEA8270508 PRJEB43715 ERR5490524 GUP001 GUP001_MNT half ds ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0101 Targeted-Capture fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/004/ERR5490524/ERR5490524.fastq.gz 562634908 ee58126a3880e12ac25ff3c1adf97aec ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/004/ERR5490524/ERR5490524.fastq.gz 17847250 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490524/GUP001.A0101.TF1.1.fastq.truncated.gz
SAMEA8270508 PRJEB43715 ERR5490525 GUP001 GUP001_MNT half ds ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0101 Targeted-Capture fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/005/ERR5490525/ERR5490525.fastq.gz 1541567827 640a45bc4956b647f5578a66763813d0 ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/005/ERR5490525/ERR5490525.fastq.gz 50108944 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490525/GUP001.A0101.TF1.2.fastq.truncated.gz
SAMEA8270508 PRJEB43715 ERR5490526 GUP001 GUP001_MNT half ds ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0101 Targeted-Capture fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/006/ERR5490526/ERR5490526.fastq.gz 674880719 e58345ddc808d879d255f8d68065050d ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/006/ERR5490526/ERR5490526.fastq.gz 23288459 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490526/GUP001.A0101.TF1.3.fastq.truncated.gz
SAMEA8270508 PRJEB43715 ERR5490527 GUP001 GUP001_ss_MNT minus ss ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0102 Targeted-Capture fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/007/ERR5490527/ERR5490527.fastq.gz 656969967 d16a8f18feef912878f0b56218581717 ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/007/ERR5490527/ERR5490527.fastq.gz 21959304 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490527/GUP001.A0102.AA1.1.fastq.truncated.gz
SAMEA8270508 PRJEB43715 ERR5490528 GUP001 GUP001_ss_MNT minus ss ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0102 WGS fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/008/ERR5490528/ERR5490528.fastq.gz 145599926 e2be513bf56c01591ecdb7dfbe55c5cb ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/008/ERR5490528/ERR5490528.fastq.gz 5086983 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490528/GUP001.A0102.SG1.1.fastq.truncated.gz
SAMEA8270508 PRJEB43715 ERR5490529 GUP001 GUP001_ss_MNT minus ss ERS5956814 2021-06-18 2021-06-18 Illumina HiSeq 4000 SINGLE METAGENOMIC ILLUMINA GUP001.A0102 Targeted-Capture fasp.sra.ebi.ac.uk:/vol1/fastq/ERR549/009/ERR5490529/ERR5490529.fastq.gz 674885210 189d55160fdfec278ddc8fba5296ad18 ftp.sra.ebi.ac.uk/vol1/fastq/ERR549/009/ERR5490529/ERR5490529.fastq.gz 23288459 ftp.sra.ebi.ac.uk/vol1/run/ERR549/ERR5490529/GUP001.A0102.TF1.1.fastq.truncated.gz
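Each row of the `.ssf` file is one sequencing run, and `poseidon_IDs` ties the run to a janno entry. As a rough sketch, the runs can be totalled per Poseidon ID by looking columns up by header name. `sum_reads_per_id` is a hypothetical helper; it assumes a tab-separated file with a header row and a single ID per `poseidon_IDs` cell, as in the rows above.

```bash
# Sketch: sum read_count per Poseidon ID in a sequencing source file.
# sum_reads_per_id is a hypothetical helper, not part of trident.
sum_reads_per_id() {
  awk -F '\t' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    { reads[$(col["poseidon_IDs"])] += $(col["read_count"]) }
    END { for (id in reads) print id "\t" reads[id] }
  ' "$1" | sort
}
```

For the ten runs listed above, adding up the `read_count` values gives 102308830 reads for `GUP001_MNT` (seven runs) and 50334746 for `GUP001_ss_MNT` (three runs).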
6 changes: 6 additions & 0 deletions 2021_CarlhoffNature/CHANGELOG.md
@@ -0,0 +1,6 @@
- V 0.1.6: Add contributor and description
- V 0.1.5: Remove abstract from .bib file
- V 0.1.4: Add Country_ISO info
- V 0.1.3: Fill-in metadata from community-archive: 2021_CarlhoffNature-2.2.0
- V 0.1.2: Add bibtex. Update Genetic_Sex. Add identical individual to Relation_* columns.
- V 0.1.1: Automatic update of janno file from Minotaur processing.
26 changes: 26 additions & 0 deletions 2021_CarlhoffNature/POSEIDON.yml
@@ -0,0 +1,26 @@
poseidonVersion: 2.7.1
title: 2021_CarlhoffNature
description: Genome of a Middle Holocene hunter-gatherer from Wallacea
contributor:
- name: Thiseas C. Lamnidis
email: [email protected]
orcid: 0000-0003-4485-8570
packageVersion: 0.1.6
lastModified: 2024-04-04
genotypeData:
format: PLINK
genoFile: 2021_CarlhoffNature.bed
genoFileChkSum: 079d15ad25a6db3f7e3d1da9e105abc3
snpFile: 2021_CarlhoffNature.bim
snpFileChkSum: 433fa85a23f3123bade02348e4628b75
indFile: 2021_CarlhoffNature.fam
indFileChkSum: d5982f4cd410501dd26c0b2f2b95a83b
snpSet: 1240K
jannoFile: 2021_CarlhoffNature.janno
jannoFileChkSum: 9b1fc8f11211767fbb5868128327a6cd
sequencingSourceFile: 2021_CarlhoffNature.ssf
sequencingSourceFileChkSum: ac4cec11100b5ab343548313c86c68b2
bibFile: 2021_CarlhoffNature.bib
bibFileChkSum: 6a742a8973cab71d1676cef91071395c
readmeFile: README.md
changelogFile: CHANGELOG.md
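The `*ChkSum` fields pair each listed file with a 32-character hex digest. A minimal sketch for comparing them against the files on disk follows. It assumes the digests are MD5 sums, that GNU `md5sum` is available, and that the YAML uses plain `key: value` lines like the ones above; `verify_package` is a hypothetical helper, not a trident command.

```bash
# Sketch: compare every <name>FileChkSum in POSEIDON.yml with the MD5 of <name>File.
# verify_package is a hypothetical helper. Assumes MD5 digests and GNU md5sum.
# Usage: verify_package path/to/package_dir
verify_package() {
  awk '
    $1 ~ /FileChkSum:$/ { key = $1; sub(/FileChkSum:$/, "", key); sum[key] = $2; next }
    $1 ~ /File:$/       { key = $1; sub(/File:$/, "", key); name[key] = $2 }
    END { for (k in sum) if (k in name) print name[k], sum[k] }
  ' "$1/POSEIDON.yml" | (
    status=0
    while read -r file expected; do
      actual=$(md5sum "$1/$file" | cut -d ' ' -f 1)
      if [ "$actual" = "$expected" ]; then
        echo "OK       $file"
      else
        echo "MISMATCH $file"
        status=1
      fi
    done
    exit "$status"   # exit status of the subshell becomes the function result
  )
}
```

Files without a checksum field, such as `readmeFile` and `changelogFile`, are skipped. Git LFS pointer files would not match, so the check only makes sense after the real `.bed` and `.bim` files have been pulled.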
53 changes: 53 additions & 0 deletions 2021_CarlhoffNature/README.md
@@ -0,0 +1,53 @@
# 2021_CarlhoffNature

This package was created on 2024-04-03 and was processed using the following versions:

- nf-core/eager version: 2.4.6
- Minotaur config version: 0.2.1dev
- CaptureType profile: 1240K
- CaptureType config version: 0.2.2dev
- Config template version: 0.3.0dev
- Package config version: 0.3.0dev
- Minotaur-packager version: 0.3.0dev
- populate_janno.py version: 0.3.2dev

## Fill relevant columns from the community archive package.

```bash
## First, fill in missing metadata in the relevant columns (no processing-based info).
trident jannocoalesce \
-s ../../archives/community-archive/2021_CarlhoffNature-2.2.0/2021_CarlhoffNature.janno \
-t 2021_CarlhoffNature/2021_CarlhoffNature.janno \
--stripIdRegex "(_ss_MNT$)|(_MNT$)" \
--includeColumns Alternative_IDs,Relation_To,Relation_Degree,Relation_Type,Relation_Note,Collection_ID,Country,Country_ISO,Location,Site,Latitude,Longitude,Date_Type,Date_C14_Labnr,Date_C14_Uncal_BP,Date_C14_Uncal_BP_Err,Date_BC_AD_Start,Date_BC_AD_Median,Date_BC_AD_Stop,Date_Note,MT_Haplogroup,Y_Haplogroup,Source_Tissue,Primary_Contact,Note,Keywords

## Then fill in Group_Name and Genetic_Sex
trident jannocoalesce \
-s ../../archives/community-archive/2021_CarlhoffNature-2.2.0/2021_CarlhoffNature.janno \
-t 2021_CarlhoffNature/2021_CarlhoffNature.janno \
--stripIdRegex "(_ss_MNT$)|(_MNT$)" \
--includeColumns Genetic_Sex,Group_Name \
--force

## Mirror Sex and Group name info to the fam file.
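paste -d "\t" 2021_CarlhoffNature/2021_CarlhoffNature.fam <(cut -f 1-3 2021_CarlhoffNature/2021_CarlhoffNature.janno | tail -n +2) | \
  awk '
    BEGIN {
      ## awk has no IFS variable, so only OFS is set. Input is split on any
      ## whitespace (the default FS), so Group_Name must not contain spaces.
      OFS = "\t"
    }
    {
      ## Fields 1-6 come from the .fam file; fields 7-9 are Poseidon_ID,
      ## Genetic_Sex and Group_Name from the janno file.
      $1 = $9
      if ($8 == "M") {
        $5 = 1
      } else if ($8 == "F") {
        $5 = 2
      }
      print $1, $2, $3, $4, $5, $6
    }
  ' > tmp.fam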
## Cannot overwrite in the same command that reads in the file, so an extra mv is needed.
mv tmp.fam 2021_CarlhoffNature/2021_CarlhoffNature.fam

## trident version: 1.4.1.0
trident rectify --packageVersion Patch --logText "Fill-in metadata from community-archive: 2021_CarlhoffNature-2.2.0" --checksumAll -d .
```
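The `--stripIdRegex` value above is what lets the Minotaur IDs (`GUP001_MNT`, `GUP001_ss_MNT`) line up with the community-archive ID. A rough illustration of its effect, assuming the option removes the matched part of each ID before source and target IDs are compared, can be made with `sed -E` and the same extended regular expression. `strip_minotaur_suffix` is a hypothetical helper used only for this demonstration.

```bash
# Sketch: emulate the effect of --stripIdRegex "(_ss_MNT$)|(_MNT$)" on an ID.
# strip_minotaur_suffix is a hypothetical helper, not part of trident.
strip_minotaur_suffix() {
  printf '%s\n' "$1" | sed -E 's/(_ss_MNT$)|(_MNT$)//'
}

for id in GUP001_MNT GUP001_ss_MNT; do
  printf '%s -> %s\n' "$id" "$(strip_minotaur_suffix "$id")"
done
```

Both IDs reduce to `GUP001`, which is why a single community-archive row can fill the metadata of both the double-stranded and the single-stranded entry.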