golden BAM? #106

xmy1990 · 2024-08-08T15:49:49Z

Hi,

Could you explain in plain terms how the golden BAM is generated? I've noticed that the BAM file I get by aligning the FASTQ files with BWA differs significantly from the golden BAM.

Thanks

lmainzer · 2024-08-08T17:45:23Z

Xmy:

The golden BAM is generated at the time of the simulation. It records where on the chromosomes each read was taken from and in which orientation.

The aligner will not output the same BAM, ever. First, in regions with redundant sequences the aligner cannot resolve the redundancy; it places the reads according to the settings you choose, frequently taking the top alignment in the list of equivalent alignments. Second, the aligner uses a random seed during its process unless you choose the seed yourself, so the alignment output will differ from one run to another. Third, depending on the quality of sequencing you chose to simulate (the sequencing error rate), the aligner may or may not be able to place the reads correctly.
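To make the first point concrete, here is a toy sketch of why a read from a redundant region cannot be placed with certainty. The reference string, the reads, and the helper `find_exact_hits` are all made up for illustration; a real aligner scores inexact matches as well, but the ambiguity is the same.

```python
from typing import List


def find_exact_hits(reference: str, read: str) -> List[int]:
    """Return every 0-based start position where `read` matches `reference` exactly."""
    hits = []
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        # Step forward by one base so overlapping matches are also found.
        start = reference.find(read, start + 1)
    return hits


# Hypothetical toy reference: the same 8-base unit appears at two loci.
repeat = "ACGTTGCA"
reference = "GGGG" + repeat + "TTTTCCCC" + repeat + "AAAA"

unique_read = "GGGGACGT"  # overlaps the left flank, so it has a single placement
repeat_read = repeat      # lies entirely inside the repeated unit

print(find_exact_hits(reference, unique_read))  # one candidate position
print(find_exact_hits(reference, repeat_read))  # two equally good candidates
```

The golden BAM knows which of the two loci the repeat read was sampled from; the aligner only sees two equivalent candidates and has to pick one.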

So, many factors are involved.

If you want to figure out how the simulator works and validate that it is appropriate for you, I suggest you start with a small area of the genome of interest that does not contain redundant regions. Then simulate with a low number of mutations and a low sequencing error rate. That will make validation easier. Finally, make sure you adjust the aligner's parameters so that you can control the randomness in the alignment process.
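One possible way to do that validation is to measure how many reads the aligner put back where the golden BAM says they came from. The sketch below is not part of the simulator. It assumes both BAMs have been dumped to SAM text (for example with `samtools view`) and that read names are identical in the two files. The function names `parse_sam` and `concordance` and the sample records are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# (reference name, 1-based leftmost position, is_reverse_strand)
Placement = Tuple[str, int, bool]
ReadKey = Tuple[str, int]  # (read name, mate number)


def parse_sam(lines: Iterable[str]) -> Dict[ReadKey, Placement]:
    """Collect the primary placement of every mapped read from SAM text lines."""
    placements: Dict[ReadKey, Placement] = {}
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue  # skip unmapped, secondary and supplementary records
        mate = 2 if flag & 0x80 else 1
        placements[(name, mate)] = (rname, pos, bool(flag & 0x10))
    return placements


def concordance(golden: Dict[ReadKey, Placement],
                aligned: Dict[ReadKey, Placement],
                tolerance: int = 0) -> float:
    """Fraction of golden reads aligned to the same contig and strand within `tolerance` bases."""
    if not golden:
        return 0.0
    correct = 0
    for key, (g_ref, g_pos, g_rev) in golden.items():
        hit = aligned.get(key)
        if hit and hit[0] == g_ref and hit[2] == g_rev and abs(hit[1] - g_pos) <= tolerance:
            correct += 1
    return correct / len(golden)


# Made-up records: r1 is placed exactly, r2 is 5 bases off, r3 is unmapped.
golden_lines = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\t*",
    "r2\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\t*",
    "r3\t0\tchr1\t300\t60\t4M\t*\t0\t0\tACGT\t*",
]
aligned_lines = [
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\t*",
    "r2\t16\tchr1\t205\t60\t4M\t*\t0\t0\tACGT\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
]

golden = parse_sam(golden_lines)
aligned = parse_sam(aligned_lines)
print(concordance(golden, aligned))               # exact-position agreement
print(concordance(golden, aligned, tolerance=5))  # allow a small offset
```

A small `tolerance` is useful because soft clipping can shift the reported start position by a few bases even when the read is placed at the right locus. On a small, non-redundant region with low error rates the concordance should be close to 1; reads that disagree are the ones worth inspecting.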

Thanks.
