salmon v0.11.3

@rob-p rob-p released this 30 Aug 01:56

Note: Although this is not technically a bug fix or improvement in salmon itself, we were able to build salmon in bioconda under OSX again (previously, boost/icu-related linking issues prevented this, and only Linux builds were available). This means you can upgrade to this version of salmon in bioconda under OSX.

Deprecation note: With the upgrade of bioconda to conda-build 3, all compilers used to test salmon builds (on our internal machines, on Travis CI, on our internal CI, in Docker images, and in bioconda) are now invoked in C++14 mode. Moving forward, the minimum supported version of GCC is 5.2, and C++14 support will be assumed. Technically, this version of salmon can still be compiled with C++11 (GCC >= 4.8.2) if the appropriate flags are passed and the preprocessor directive SALMON_DEPRECATED_COMPILER is defined. However, the guarded code will be removed in the next release, and compilation with C++11 will no longer be possible.

Bug fixes & Improvements:

This release implements the following bug fixes:

  1. Fixed a bug that caused salmon to infer the library type as unstranded (U) when run in alignment-based mode with single-end reads. Note: This bug occurred only in alignment-based mode and only with single-end reads, but as a result such libraries were always detected as unstranded. Single-end libraries in alignment-based mode are now properly detected (at least as accurately as is possible using the same heuristic as for all other types of automated detection).

  2. Potential improvement to the accuracy of NumReads when salmon is run in alignment-based mode. Previously, alignment-based mode performed an extra extrapolation of the estimated number of reads from the TPM and total library size, which could slightly reduce precision. This fix avoids that extra step.

Special thanks to Jeremy Simon and Travis Ptacek (UNC Neuroscience Center Bioinformatics Core) for bringing the above to our attention, for providing test data to help reproduce the issues, and for verifying the fixes.
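To illustrate why the NumReads item above matters, here is a minimal sketch of a round trip from read counts to TPM and back. The release notes do not give the formula salmon used, so this assumes the standard TPM definition; the function names and sample values are hypothetical. The point is only that re-deriving counts from TPM and library size adds floating-point steps that a directly reported estimate avoids.

```python
def tpm_from_counts(counts, eff_lengths):
    """Standard TPM definition: length-normalized rates scaled to sum to 1e6."""
    rates = [n / l for n, l in zip(counts, eff_lengths)]
    total_rate = sum(rates)
    return [1e6 * r / total_rate for r in rates]


def counts_from_tpm(tpm, eff_lengths, total_reads):
    """Extrapolate per-transcript read counts back from TPM and library size."""
    weights = [t * l for t, l in zip(tpm, eff_lengths)]
    total_weight = sum(weights)
    return [total_reads * w / total_weight for w in weights]


# Hypothetical estimated read counts and effective lengths for three transcripts.
counts = [1200.0, 35.5, 0.25]
eff_lengths = [1500.0, 820.0, 210.0]

tpm = tpm_from_counts(counts, eff_lengths)
round_trip = counts_from_tpm(tpm, eff_lengths, sum(counts))

# Any nonzero difference here is rounding error introduced by the round trip.
for direct, derived in zip(counts, round_trip):
    print(direct, derived, abs(direct - derived))
```

The differences are tiny in double precision, which is consistent with the notes describing the effect as slight.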

  3. Avoid writing NaN to lib_format_counts.json. Previously, strand_mapping_bias was written as NaN for paired-end libraries where reads mapped but none mapped concordantly. That is, when all of the mapped reads were orphans, strand_mapping_bias evaluated to NaN, which is not valid JSON. This case is now reported as 0.0. This fix addresses #279; thanks to @kurtwheeler.

  4. Fixed a minor thread synchronization bug that could cause the number of skipped cell barcodes to be reported incorrectly (off by a small number) in Alevin's log.
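The NaN fix above can be illustrated with a short sketch of why a NaN value breaks strict JSON consumers. This uses Python's standard json module, not salmon's own code; the helper names strict_loads and sanitize_bias are hypothetical, and only the key strand_mapping_bias and the replacement value 0.0 come from the notes.

```python
import json
import math


def reject_constant(name):
    # json.loads calls this hook for the tokens NaN, Infinity and -Infinity,
    # none of which are part of the JSON grammar.
    raise ValueError(f"non-standard JSON constant: {name}")


def strict_loads(text):
    """Parse JSON, refusing the non-standard constants Python accepts by default."""
    return json.loads(text, parse_constant=reject_constant)


def sanitize_bias(value):
    """Report an undefined bias as 0.0, mirroring the behavior described above."""
    return 0.0 if math.isnan(value) else value


# Python's default serializer emits the bare token NaN, which strict parsers reject.
bad = json.dumps({"strand_mapping_bias": float("nan")})

# allow_nan=False makes the serializer raise if a NaN ever slips through.
good = json.dumps(
    {"strand_mapping_bias": sanitize_bias(float("nan"))}, allow_nan=False
)

print(bad)
print(good)
```

A downstream tool that must also read files written by older releases could apply a similar strict parse and handle the failure explicitly.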

and the following improvements:

  1. Read libraries are now written in lib_format_counts.json as a proper JSON list, rather than as a single string in a custom format. Thanks to @Miserlou for this suggestion.

  2. Numerous improvements were made to the TravisCI configuration (and associated parts of the build system). Salmon is now compiled in Travis CI under both GCC 5 and 7 to improve testing of compatibility. Many thanks to @junaruga for the pull requests implementing these improvements!

  3. Bumped included versions of fmt and spdlog, and improved pretty printing of log and console messages in a number of places.

New feature:

This release also introduces a new "debug" flag for Alevin (activated by --debug). Thanks to @habilzare and @patrickvdb for the suggestions made in issue #253. When debug mode is activated, Alevin runs in a relatively lenient manner: it relaxes the following assumptions and does not raise an error when they are violated. This allows the pipeline to run to completion even when a violation occurs (though warnings are issued):

  • All externally provided whitelisted barcodes, if given through the --whitelist flag, must have some reads in the FASTQ file assigned to them.
  • All the "High Confidence" cells must have at least 10 reads confidently mapped to them.
  • All the whitelisted cellular barcodes must have at least one deduplicated UMI.

That is, Alevin will normally terminate if any of the above conditions occur. With the --debug flag, it will instead run to completion and report the above as warnings.
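The strict versus lenient behavior described above can be sketched as follows. This is an illustrative pattern only, not Alevin's implementation: the function check_barcodes, the exception type, and the per-barcode dictionary layout are all hypothetical, and a single "whitelisted" flag stands in for both the externally provided and the final whitelist. Only the three conditions and the threshold of 10 reads come from the notes.

```python
import warnings


class BarcodeCheckError(RuntimeError):
    """Raised in strict (non-debug) mode when an assumption is violated."""


def check_barcodes(cells, min_reads=10, debug=False):
    """Validate per-barcode statistics; raise normally, warn when debug=True.

    `cells` maps a barcode to a dict with the hypothetical keys
    'whitelisted', 'high_confidence', 'reads', 'mapped_reads', 'dedup_umis'.
    """
    problems = []
    for barcode, stats in cells.items():
        if stats["whitelisted"] and stats["reads"] == 0:
            problems.append(f"{barcode}: whitelisted barcode has no reads")
        if stats["high_confidence"] and stats["mapped_reads"] < min_reads:
            problems.append(f"{barcode}: fewer than {min_reads} mapped reads")
        if stats["whitelisted"] and stats["dedup_umis"] < 1:
            problems.append(f"{barcode}: no deduplicated UMI")

    for problem in problems:
        if not debug:
            # Strict mode: stop at the first violated assumption.
            raise BarcodeCheckError(problem)
        # Debug mode: report the violation and keep going.
        warnings.warn(problem)
    return problems


cells = {
    "AAACCTGA": {"whitelisted": True, "high_confidence": True,
                 "reads": 0, "mapped_reads": 0, "dedup_umis": 0},
    "TTTGGCAT": {"whitelisted": True, "high_confidence": True,
                 "reads": 250, "mapped_reads": 180, "dedup_umis": 95},
}

# With debug=True the run completes and the violations are returned as warnings.
print(check_barcodes(cells, debug=True))
```

Calling check_barcodes(cells) without debug=True raises BarcodeCheckError on the first violation instead.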