Sequence identifiers containing both GRCh38 and CHM13 were found in the alignment of the GRCh38 reference genome #31

Open
yangyaxi4444 opened this issue Dec 19, 2024 · 2 comments

Comments

@yangyaxi4444

When I examine the BAM files produced by vg giraffe for the GRCh38 and CHM13 mappings, the header of each BAM file unexpectedly contains sequence entries from both reference genomes. For example:
In GRCh38-mapped BAM:
@SQ SN:GRCh38#0#chr11 LN:135086622
@SQ SN:GRCh38#0#chr10 LN:133797422
@SQ SN:CHM13#0#chrY LN:26682553
@SQ SN:CHM13#0#chrX LN:154259564
And similarly in the CHM13-mapped BAM:

- Headers contain sequences from both GRCh38 and CHM13
- This occurs despite using separate reference graphs
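
To see how many header entries come from each assembly, one option is to tally the `@SQ` lines by the prefix of their `SN` value. This is only a minimal sketch: it assumes the names follow the `assembly#haplotype#contig` pattern shown in the excerpt above, and the function name `count_sq_by_assembly` is made up for illustration. The header lines could come from something like `samtools view -H`.

```python
from collections import Counter

def count_sq_by_assembly(header_lines):
    """Count @SQ records per assembly.

    Assumption: the assembly name is the part of the SN value before
    the first '#', as in 'GRCh38#0#chr11'.
    """
    counts = Counter()
    for line in header_lines:
        fields = line.split()
        if not fields or fields[0] != "@SQ":
            continue  # skip @HD, @PG, and other header records
        for field in fields[1:]:
            if field.startswith("SN:"):
                counts[field[3:].split("#", 1)[0]] += 1
                break
    return counts

# The four header lines quoted in this issue
header = [
    "@SQ SN:GRCh38#0#chr11 LN:135086622",
    "@SQ SN:GRCh38#0#chr10 LN:133797422",
    "@SQ SN:CHM13#0#chrY LN:26682553",
    "@SQ SN:CHM13#0#chrX LN:154259564",
]
print(dict(count_sq_by_assembly(header)))  # {'GRCh38': 2, 'CHM13': 2}
```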

Would this affect the quantification step?

@yangyaxi4444
Author

Hi, could anyone help me with this?

@glennhickey
Collaborator

Which BAM file are you talking about? By default, vg surject will use all reference paths, which include both GRCh38 and CHM13 in both graphs (CHM13-referenced and GRCh38-referenced), unless otherwise specified.
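
As a rough illustration of "otherwise specified", the path names for a single assembly can be pulled out of an existing header and written one per line, which is the kind of list a path-restricting option would take. The helper name `paths_for_reference` is hypothetical, and the naming pattern `assembly#haplotype#contig` is assumed from the header excerpt in this issue. I believe vg surject accepts such a list through `--into-paths`, but treat that as an assumption and check `vg surject --help` for your version.

```python
def paths_for_reference(header_lines, assembly):
    """Return the SN names from @SQ lines that belong to one assembly.

    Assumption: names look like 'GRCh38#0#chr11', so a name belongs to
    an assembly when it starts with '<assembly>#'.
    """
    prefix = "SN:" + assembly + "#"
    names = []
    for line in header_lines:
        fields = line.split()
        if fields and fields[0] == "@SQ":
            names.extend(f[3:] for f in fields[1:] if f.startswith(prefix))
    return names

# Header lines as quoted earlier in this thread
header_lines = [
    "@SQ SN:GRCh38#0#chr11 LN:135086622",
    "@SQ SN:GRCh38#0#chr10 LN:133797422",
    "@SQ SN:CHM13#0#chrY LN:26682553",
    "@SQ SN:CHM13#0#chrX LN:154259564",
]

# One path name per line, GRCh38 only
print("\n".join(paths_for_reference(header_lines, "GRCh38")))
```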
