SIDR - Sequence Identification with Decision tRees

SIDR (pronounced: cider) is a tool to filter Next Generation Sequencing (NGS) data based on a chosen target organism. SIDR uses data from BLAST (or similar classifiers) to train a decision tree model that classifies sequence data as belonging either to the target organism or to something else. This classification can then be used to filter the data before assembly.

Note: SIDR is alpha software. Features are currently incomplete and subject to major change.

Installation

To install SIDR, clone this repository and run setup.py, or install it with pip:

pip install sidr

See the documentation for more details.

Usage

SIDR has two main modes. Default mode takes several bioinformatics files as input, and computes a decision tree based on percentage GC content and per-base sequencing coverage. To run it, use:

sidr default -d [taxdump path] -b [bamfile] -f [assembly FASTA] -r [BLAST results] -k tokeep.contigids -x toremove.contigids -t [target phylum]
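To illustrate one of the two variables mentioned above, here is a minimal Python sketch of how percentage GC content could be computed per contig from an assembly FASTA. This is not SIDR's own code: the function names `gc_percent` and `read_fasta` are hypothetical, and skipping ambiguous bases such as N is an assumption made for this example.

```python
def gc_percent(sequence):
    """Return the percentage of G and C among the A/C/G/T bases of a sequence."""
    seq = sequence.upper()
    # Assumption: ambiguous bases (e.g. N) are ignored rather than counted.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


def read_fasta(lines):
    """Yield (contig_id, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # The contig ID is taken as the first whitespace-delimited
            # token of the header line.
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


if __name__ == "__main__":
    example = [">contig_1 example", "ATGC", "GGCC", ">contig_2", "ATAT"]
    for contig_id, seq in read_fasta(example):
        print(contig_id, gc_percent(seq))
```

Coverage would come from the BAM file and is not shown here, since reading BAM needs a third-party library.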

Runfile mode takes a tab-delimited file of contigs, variables, and classification as input. To run it, use:

sidr runfile -i [runfile] -k tokeep.contigids -x toremove.contigids -t [target phylum]

See the documentation for more details.
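Once a run has produced tokeep.contigids, the kept contigs can be pulled out of the assembly for later steps. The sketch below assumes the file holds one contig ID per line and that each ID matches the first token of a FASTA header. Neither point is confirmed here, so check the documentation first. The helper names `load_ids` and `filter_fasta` are hypothetical.

```python
def load_ids(lines):
    """Collect contig IDs, assuming one ID per line; blank lines are skipped."""
    return {line.strip() for line in lines if line.strip()}


def filter_fasta(fasta_lines, keep_ids):
    """Yield only the FASTA lines that belong to contigs listed in keep_ids."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # Compare the first whitespace-delimited token of the header.
            header = line[1:].split()
            keep = bool(header) and header[0] in keep_ids
        if keep:
            yield line


if __name__ == "__main__":
    ids = load_ids(["contig_1\n", "contig_3\n"])
    fasta = [">contig_1\n", "ACGT\n", ">contig_2\n", "GGGG\n", ">contig_3\n", "TTTT\n"]
    print("".join(filter_fasta(fasta, ids)), end="")
```

For real files, open tokeep.contigids and the assembly FASTA, pass the file objects to these functions, and write the yielded lines to a new FASTA file.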

TODO

More complete documentation
More unit tests
