[Feature]: Use standard in/out #23

Open
MillironX opened this issue Jan 25, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@MillironX
Member

Have each of the haplink commands read from standard input and write to standard output unless the corresponding file flags are passed.

Additional Removed Arguments

haplink variants

  • --bam
    • Can be replaced by stdin in SAM format
  • --output
    • Can be replaced by stdout in VCF format

haplink haplotypes

  • --variants
    • Can be replaced by stdin in VCF format
  • --output
    • Can be replaced by stdout in YAML format

haplink sequences

  • --haplotypes
    • Can be replaced by stdin in YAML format
  • --output
    • Can be replaced by stdout in FASTA format

Example Usage

Command

minimap2 -ax map-ont --MD example/reference.fasta reads.fastq \
  | tee output.sam \
  | haplink variants --reference example/reference.fasta \
  | tee output.vcf \
  | haplink haplotypes --bam output.sam \
  | haplink sequences --reference example/reference.fasta \
  > output.fasta

Context

Most bioinformatics tools read from standard input and write to standard output. It would be convenient if HapLink did this too, since it could then be chained into CLI pipelines without writing intermediate files.

Possible Implementation

Each of the stdin/stdout parameters will need to be made optional, with manual checks for stdin. Also, read/write operations will need to open stdin/out instead of files.
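
The optional-parameter pattern described above could look roughly like the following. This is a language-neutral sketch in Python, not HapLink's code: `open_input` and `open_output` are hypothetical helper names, and treating an interactive terminal as "no input given" is one assumed way to do the manual stdin check.

```python
import contextlib
import sys


@contextlib.contextmanager
def open_input(path=None, stdin=None):
    """Yield a readable text stream: the file at `path`, else standard input."""
    if path is not None:
        with open(path, "r") as handle:
            yield handle
        return
    stream = sys.stdin if stdin is None else stdin
    # Manual stdin check (assumed policy): an interactive terminal means
    # nothing was piped in, so fail early instead of waiting forever.
    if stream.isatty():
        raise SystemExit("error: no input file given and nothing piped to stdin")
    yield stream


@contextlib.contextmanager
def open_output(path=None, stdout=None):
    """Yield a writable text stream: the file at `path`, else standard output."""
    if path is not None:
        with open(path, "w") as handle:
            yield handle
        return
    # Standard output is owned by the process, so it is not closed here.
    yield sys.stdout if stdout is None else stdout
```

A command body can then be written once against the two streams, for example `with open_input(args.variants) as src, open_output(args.output) as dst: ...`, where `args` stands for whatever the argument parser returns with both options defaulting to `None`.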

The big problem here is that https://github.com/BioJulia/XAM.jl does not support a unified API for SAM and BAM records (see BioJulia/XAM.jl#25). Although https://github.com/samtools/htslib and related tools do not seem to care which format they receive, https://github.com/genome/bam-readcount does require an indexed BAM file, so there will need to be a way to create, sort, and index a BAM file if SAM input is given. BAM input arriving via stdin will still need to be sorted and indexed as well.
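
One way the create, sort, and index step might be handled is to spool stdin through samtools into a temporary BAM file before calling bam-readcount. The sketch below assumes samtools is installed and on the `PATH`, which this issue does not establish as a dependency; `indexed_bam_commands` and `stdin_to_indexed_bam` are hypothetical names. Because `samtools sort` detects its input format, the same path would cover both SAM and BAM arriving on stdin.

```python
import subprocess
import sys


def indexed_bam_commands(bam_path):
    """Return the samtools invocations (as argument lists) for sort + index."""
    # "-" tells samtools sort to read alignments from standard input.
    sort_cmd = ["samtools", "sort", "-O", "bam", "-o", bam_path, "-"]
    index_cmd = ["samtools", "index", bam_path]
    return [sort_cmd, index_cmd]


def stdin_to_indexed_bam(bam_path, stdin=None):
    """Spool alignments from stdin into a sorted, indexed BAM at `bam_path`.

    Assumes samtools is available; raises CalledProcessError if a step fails.
    """
    sort_cmd, index_cmd = indexed_bam_commands(bam_path)
    source = sys.stdin if stdin is None else stdin
    subprocess.run(sort_cmd, stdin=source, check=True)
    subprocess.run(index_cmd, check=True)
    return bam_path
```

Coordinate sorting needs the whole input before it can emit anything, so this stage cannot stream; a temporary file (or the `tee output.sam` approach from the example command) seems unavoidable wherever an indexed BAM is required.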

@MillironX MillironX added the enhancement New feature or request label Jan 25, 2022
@MillironX
Member Author

Partially implemented in #35. Output now goes to stdout unless an output parameter is passed, but stdin is trickier.
