Many samples in the bacterial_training_data_ont dataset in https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Doughty_Sandbox/data fail raw read screening under default TheiaProk_ONT settings. NB: the results currently in the data table were generated with modified parameters in order to obtain assemblies.
Under default settings, the genome sizes of many of the samples are massively over-estimated, with estimates much larger than the biggest known bacterial genome. From conversations with a couple of people, the likely cause is the lower read accuracy of ONT data: sequencing errors produce many spurious k-mers, and the genome size estimation does not account for this.
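To illustrate the suspected mechanism, here is a minimal Python sketch. It is not the estimator TheiaProk_ONT uses; it is a toy simulation under stated assumptions: a random 20 kb "genome", forward-strand reads only, substitution errors only (real ONT errors also include indels), and a naive estimate that equates genome size with the number of distinct k-mers. All names and parameter values (`simulate_reads`, `distinct_kmers`, `k = 21`, the 5% error rate) are hypothetical and chosen for illustration.

```python
import random


def distinct_kmers(reads, k):
    """Count distinct k-mers across all reads (forward strand only)."""
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            seen.add(read[i:i + k])
    return len(seen)


def simulate_reads(genome, n_reads, read_len, error_rate, rng):
    """Sample reads uniformly and apply substitution errors at a per-base rate."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        bases = list(genome[start:start + read_len])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Replace the base with one of the three other bases.
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append("".join(bases))
    return reads


rng = random.Random(0)  # fixed seed so the run is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(20_000))
k = 21
true_kmers = len(genome) - k + 1  # upper bound on distinct k-mers in the genome

# Roughly 20x coverage in both cases; only the error rate differs.
clean = simulate_reads(genome, 400, 1_000, 0.0, rng)
noisy = simulate_reads(genome, 400, 1_000, 0.05, rng)

clean_count = distinct_kmers(clean, k)
noisy_count = distinct_kmers(noisy, k)

print("k-mers in genome (upper bound):", true_kmers)
print("distinct k-mers, error-free reads:", clean_count)
print("distinct k-mers, 5% error reads:", noisy_count)
```

With error-free reads the distinct k-mer count cannot exceed the number of k-mers in the genome, whereas with errors most k-mers that overlap a miscalled base are new, so the count grows with sequencing depth rather than with genome size. An estimator that does not filter out low-abundance k-mers would inflate in the same way.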
Suggested fixes: either remove or replace the genome size estimation tool so that good-quality data passes screening under default settings, or give users recommendations on sequencing chemistry and basecallers that allow TheiaProk_ONT to run successfully under default settings.