blobtools v0.9.19
Release notes
- general refactoring
- version number now printed in all output files
blobtools create
BAM parsing
- BAM parsing now fast(er)
- BAM is parsed with samtools view -F 4 -F 1028 -F 256 (only mapped, no optical-dupes, no 2nd-ary)
- Support for M/X/= in CIGAR string (previously only M's)
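As an illustration of the filter and CIGAR handling listed above, here is a minimal Python sketch (not the blobtools implementation). It assumes the three -F values act together as a union of excluded flag bits, which is what the parenthetical note describes, and that aligned bases are counted by summing the lengths of M, = and X operations. The names EXCLUDE_MASK, passes_filter and aligned_bases are hypothetical.

```python
import re

# SAM flag bits as defined in the SAM specification
UNMAPPED = 0x4       # 4
SECONDARY = 0x100    # 256
DUPLICATE = 0x400    # 1024 (PCR or optical duplicate)

# Assumption: the -F values 4, 1028 (= 4 + 1024) and 256 combine as a union
EXCLUDE_MASK = 4 | 1028 | 256

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def passes_filter(flag):
    """Return True if a read with this FLAG has none of the excluded bits set."""
    return (flag & EXCLUDE_MASK) == 0


def aligned_bases(cigar):
    """Sum the lengths of M, = and X operations in a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "M=X")
```

For example, a properly paired, mapped read with FLAG 99 passes the filter, while FLAG 1024 (duplicate) does not.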
BAM/SAM/CAS parsing
- when parsing mapping files, blobtools automatically writes COV files
Taxonomy calculation
- if two highest scoring taxonomies for a given contig have EQUAL scores, taxonomy gets set to "unresolved"
- during plotting, the group "unresolved" gets treated as any other group
- new flag [--tax_collision_random] allows legacy behaviour (random selection of taxonomy)
- new argument [--min_diff FLOAT] sets minimal score difference between two highest scoring taxonomies of a contig in order for the contig to not be called "unresolved"
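The tie-handling rules above can be sketched roughly as follows. This is an illustrative Python sketch rather than the blobtools code: the function name resolve_taxonomy, the dict-based input, the strict "<" comparison against min_diff, and the way the random mode interacts with min_diff are all assumptions.

```python
import random


def resolve_taxonomy(scores, min_diff=0.0, tax_collision_random=False, rng=random):
    """Pick a taxonomy for one contig from a dict of {taxon: score}.

    Hypothetical helper: ties between the top scores yield "unresolved",
    unless tax_collision_random is set (legacy random selection).
    """
    if not scores:
        raise ValueError("no scores given")
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_taxon, best_score = ranked[0]
    if len(ranked) == 1:
        return best_taxon
    # All taxa sharing the highest score
    tied = [taxon for taxon, score in ranked if score == best_score]
    if len(tied) > 1:
        return rng.choice(tied) if tax_collision_random else "unresolved"
    # Assumption: a difference smaller than min_diff counts as unresolved
    second_score = ranked[1][1]
    if best_score - second_score < min_diff:
        return "unresolved"
    return best_taxon
```

With scores of 200 for both Nematoda and Proteobacteria the contig is "unresolved"; with 200 versus 150 it is assigned to Nematoda unless min_diff exceeds 50.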
blobtools view
- refactored and is now fast(er)
supported views
- default view: table output, taxonomy columns are now numbered (1-based, for cut'ing). Can be turned off with [--notable]
- [--concoct]: outputs taxonomy and coverage information necessary to run concoct
- [--cov]: outputs coverage files (for using them in covplots)
blobtools map2cov
- previously bam2cov
- now supports BAM, SAM and CAS files (uses parsers in BtIO)
- implemented so that multiple mapping files can be converted to the lightweight COV format (e.g. using GNU parallel)
blobtools blobplot
Flags and arguments
- changed behaviour of [--multiplot]. Plots each group separately, then final plot of all groups.
- new flag [--cumulative]. Analogous to [--multiplot] behaviour in previous versions: incremental addition of groups
- new flag [--legend]. Plots legend in a separate figure, useful for slides.
- new argument [--lib LIBNAME]. Allows selecting particular covlib(s) to plot; lists available libraries when no argument is specified.
- new flag [--filelabel]: labels coverage axis based on coverage file
- fixed [--exclude], now removes group(s) from plotting and stats
- changed behaviour of [--colours FILE]. All groups not named in FILE are set to WHITE (for example, see test_files/colours.txt)
- added example file for [--refcov FILE] (test_files/refcov.txt)
Scaling
- changed scaling of blobs in plots
  - previously, blobs had an area of 65*(length/1,000) pixels^2. But if an assembly had a few very big contigs, these covered the smaller ones.
  - now the scaling is done in the following way:
    - the biggest contig gets plotted as a blob of area 12,500 pixels^2
    - all other contigs get scaled proportionally
    - reference scale shows area of 5%, 10% and 25% length of longest contig
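To make the old and new scaling concrete, here is a small Python sketch. It is not taken from the blobtools source: the function names are hypothetical, and "scaled proportionally" is assumed to mean that blob area is linear in contig length, as it was in the previous formula.

```python
MAX_AREA = 12500.0  # area (pixels^2) of the blob for the longest contig


def old_area(length):
    """Previous scaling: 65 pixels^2 per 1,000 bases of contig length."""
    return 65 * (length / 1000)


def new_areas(lengths, max_area=MAX_AREA):
    """New scaling: longest contig gets max_area, others scale with length."""
    longest = max(lengths)
    return [max_area * length / longest for length in lengths]


def reference_scale(longest, percents=(5, 10, 25), max_area=MAX_AREA):
    """(length, area) pairs for reference blobs at given % of the longest contig."""
    return [(longest * p / 100, max_area * p / 100) for p in percents]
```

For contigs of 1,000, 5,000 and 10,000 bases, new_areas gives 1,250, 6,250 and 12,500 pixels^2, and the largest blob stays at 12,500 pixels^2 however long the longest contig is.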
Misc
- added x-axis minor ticklabels, all ticklabels facing outwards
- fixed bug causing different plots to share filenames, causing overwriting
blobtools covplot
- same underlying library as blobplot.
- coverage axes are labelled based on coverage file
- axes can be relabelled using [--xlabel XLABEL] and [--ylabel YLABEL]
- maximal value for x/y-axes can be set using [--max FLOAT]
blobtools sumcov
- removed
blobtools taxify
- allows annotating similarity search results (from BLAST or Diamond) with TaxIDs using an ID-mapping-file, or a single TaxID
blobtools seqfilter
- allows filtering a FASTA file based on a list of headers
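The idea behind header-based filtering can be sketched in a few lines of Python. This is only an illustration of the technique, not the seqfilter implementation: the function name filter_fasta is hypothetical, and matching on the first whitespace-delimited token of each header line is an assumption.

```python
def filter_fasta(fasta_lines, wanted_headers):
    """Yield the lines of FASTA records whose header is in wanted_headers.

    Assumption: a record's header is the first whitespace-delimited token
    after the leading '>'.
    """
    wanted = set(wanted_headers)
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            header = tokens[0] if tokens else ""
            keep = header in wanted
        if keep:
            yield line
```

Because it works on any iterable of lines, the function can be fed an open file handle directly, and the output can be written with writelines.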