RLS Version 1.4.0
Long reads binning!

Full ChangeLog

* Provide binning algorithm for assemblies from long reads
* Add `--allow-missing-mmseqs2` flag to `check_install` subcommand
* Run Prodigal in multiple jobs without multiprocessing (#106)
* Better command line arguments
* Better error checking
luispedro committed Dec 15, 2022
1 parent 8bb066e commit 87340d3
Showing 2 changed files with 27 additions and 5 deletions.
5 changes: 4 additions & 1 deletion ChangeLog
@@ -1,9 +1,12 @@
Version 1.4.0 Dec 2022 by BigDataBiology
* Provide binning algorithm for assemblies from long reads
* Add `--allow-missing-mmseqs2` flag to `check_install` subcommand
* Run Prodigal in multiple jobs without multiprocessing (#106)
* Better command line arguments
* Better error checking

Version 1.3.1 Dec 9 2022 by BigDataBiology
* Make `--training-type` argument optional
* Add `--allow-missing-mmseqs2` flag to `check_install` subcommand

Version 1.3.0 Nov 4 2022 by BigDataBiology
* Add self-supervised learning
27 changes: 23 additions & 4 deletions docs/whatsnew.md
@@ -1,12 +1,31 @@
# What's New

## Version 1.4
## Version 1.4.0: long reads binning!

*Release December , 2022*
*Released December 15, 2022*

### User visible improvements
The big change is the new binning algorithm for assemblies from long-read datasets.

The overall structure of the pipeline is still similar to what was described in the [manuscript](https://www.nature.com/articles/s41467-022-29843-y), but the clustering step no longer uses Infomap. Instead, it uses another procedure (an iterative version of DBSCAN).

Use the flag `--sequencing-type=long_read` to enable an alternative clustering that works better with long reads.

### Other user-visible improvements

- Better error checking at multiple steps in the pipeline so that processes that will crash are caught as early as possible
- Add `--allow-missing-mmseqs2` flag to `check_install` subcommand (eventually, self-supervision will be the default and mmseqs2 will be an optional dependency)

### Command line parameter deprecations

The previous arguments should continue to work, but going forward, the newer arguments are probably a better API.

- Selecting self-supervised learning is now done with the `--self-supervised` flag (instead of `--training-type=self`)
- Training from multiple samples is now enabled with the `--train-from-many` flag (instead of `--mode=several`)
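
As an illustration only (this is not code from the commit), here is a minimal Python sketch of how a command-line tool can keep an older argument working while steering users to a newer flag. Only the flag names `--self-supervised` and `--training-type` come from the notes above. The parser, the `example-binner` program name, the `parse` helper, and the assumption that `--training-type` accepts `semi` or `self` are hypothetical.

```python
import argparse
import warnings


def build_parser():
    # Hypothetical parser: only the two flag names are taken from the release notes.
    parser = argparse.ArgumentParser(prog="example-binner")
    parser.add_argument(
        "--self-supervised",
        action="store_true",
        help="Select self-supervised learning (newer spelling)",
    )
    parser.add_argument(
        "--training-type",
        choices=["semi", "self"],  # assumed set of values
        default=None,
        help="Older spelling, kept so existing scripts continue to work",
    )
    return parser


def parse(argv):
    """Parse argv and fold the older argument into the newer flag."""
    args = build_parser().parse_args(argv)
    if args.training_type is not None:
        warnings.warn(
            "--training-type is deprecated; prefer --self-supervised",
            DeprecationWarning,
        )
        if args.training_type == "self":
            args.self_supervised = True
    return args


if __name__ == "__main__":
    # Both spellings end up selecting self-supervised learning.
    print(parse(["--self-supervised"]).self_supervised)
    print(parse(["--training-type=self"]).self_supervised)
```

In this sketch the rest of the program only ever inspects `args.self_supervised`, so the older spelling can later be removed in one place.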

### Bugfixes

- Added binning algorithm for assemblies from long-read datasets.
- The output table sometimes had the wrong path in `v1.3`. This has been fixed.
- Prodigal is now run in a more robust manner when using multiple threads ([#106](https://github.com/BigDataBiology/SemiBin/issues/106))

## Version 1.3.1
