Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extended standalone #63

Open
wants to merge 99 commits into
base: postprocess_dedup_filter
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
99 commits
Select commit Hold shift + click to select a range
9e14ec5
added new scripts for extending standalone features
Mar 6, 2020
8720927
fix for dbsnp
Mar 9, 2020
046b123
Merge branch 'postprocess_dedup_filter' into extended_standalone
Mar 11, 2020
9cb9960
fix_extract_ensemble
Mar 12, 2020
19b8512
Merge branch 'postprocess_dedup_filter' into extended_standalone
Mar 12, 2020
58ccf62
fix dirnames
Mar 12, 2020
db2312e
Merge branch 'postprocess_dedup_filter' into extended_standalone
Mar 12, 2020
cab5ff7
small fix
Mar 14, 2020
c53f74e
Merge branch 'postprocess_dedup_filter' into extended_standalone
Mar 18, 2020
efa4190
fix features
Mar 19, 2020
0ed0e83
Merge branch 'postprocess_dedup_filter' into extended_standalone
Apr 10, 2020
e4be780
fix ensemble
Apr 10, 2020
8fc8583
backward compatiblity for call.py
Apr 22, 2020
7514b31
Merge branch 'postprocess_dedup_filter' into extended_standalone
Apr 28, 2020
8e8ec69
few fixes
Apr 29, 2020
2a84509
fix format
Apr 29, 2020
5581ba4
small fix
Apr 29, 2020
115d814
fix for training loss
Apr 29, 2020
c848028
Merge branch 'postprocess_dedup_filter' into extended_standalone
Apr 29, 2020
a3c4f3e
fix for backward compatibility
Apr 29, 2020
55a498e
Merge branch 'postprocess_dedup_filter' into extended_standalone
Apr 29, 2020
f9ee725
fix train loss
Apr 30, 2020
b09f5f6
Merge branch 'postprocess_dedup_filter' into extended_standalone
Apr 30, 2020
26c4ca4
fix extend_features
May 3, 2020
757dfab
improve efficiency in extend_features
May 3, 2020
776ddce
fix for extend_features
May 4, 2020
15ff4a8
added seq_complexity
May 5, 2020
2c4f45d
fix ensemble
May 6, 2020
e913e83
more efficient LC
May 6, 2020
2be51cc
small fix
May 6, 2020
e9b83da
filter duplicate by default
May 6, 2020
a54be69
switched fisher test
May 7, 2020
4f6bf57
fix seq_complexity
May 9, 2020
dbe9c4a
fix num fields
May 9, 2020
cb64681
zero anns columns added
May 10, 2020
97d16f6
fix bug in read_info_extractor.py as in somaticseq
May 10, 2020
018e87a
cluster variants for feature extraction
May 12, 2020
0e394ff
small fix
May 12, 2020
3c061b3
record aligned_pairs
May 12, 2020
709b64b
small fix
May 12, 2020
12167e1
more efficient read/ref pos match search
May 13, 2020
ff00be3
input num_splits
May 13, 2020
366964e
max_cluster size added
May 13, 2020
839cc63
better memory management for feature extraction
May 13, 2020
0501883
not to store aligned_pairs
May 13, 2020
90a68da
small fix
May 13, 2020
f83b6b5
enable custom header
May 15, 2020
49c809f
fixed a bug
May 15, 2020
2f9905b
fix bug in region splitting
May 15, 2020
c4f24ac
small fix
May 15, 2020
1ce3935
small_fix
May 15, 2020
8b29757
small fix
May 15, 2020
654b90a
Merge branch 'extended_standalone' into custom_header
May 15, 2020
7b0cb75
small fix
May 15, 2020
7706c3b
enable custom heading
May 16, 2020
27f98c8
small fix
May 16, 2020
0bc4655
small fix
May 17, 2020
7deb7d6
small fix
May 23, 2020
7897df8
small fix
May 28, 2020
8ac67a1
fix ann
May 31, 2020
607abcd
small fix
Jun 5, 2020
867ba5f
added callers vcf to tsv
Jun 18, 2020
635341e
merge regions for scanning
Jun 18, 2020
a52d0c2
bug fixes for call/post
Jun 19, 2020
2094b3e
fix_bugs
Jun 19, 2020
30ac6cf
updated versions to 0.3.0
Jun 19, 2020
c2deecb
small fix
Jun 19, 2020
bad8004
small fix
Jun 19, 2020
97e4864
fix test
Jun 20, 2020
fb6ea21
fix in resolve variants
Jun 24, 2020
1bf40b4
ensemble with internal features
Jun 25, 2020
2471f8e
fix resolve
Jun 25, 2020
544e2f2
small fix
Jul 2, 2020
96d2091
small fix
Jul 2, 2020
fd2a2f7
repeat extension
Jul 19, 2020
cca94c8
small fix
Jul 24, 2020
b4d2bdf
improve cpu multi-thread call.py
Jul 29, 2020
a359327
small fix
Jul 30, 2020
f6d3174
updated dockerfile
Jul 30, 2020
accb759
updated docker test
Jul 30, 2020
9f5d876
fix build
Sep 19, 2020
e500bf7
small_fix
Oct 6, 2020
e48e15d
force cov_thr
Nov 7, 2020
a96a236
fix max_cov
Nov 9, 2020
d804742
fixed matrices gradual delete
Dec 6, 2020
45fbcd6
fix generate_dataset
Dec 29, 2020
cff3653
fix ensemble rounding
Jan 22, 2021
603d582
reduce disc I/O while calling
Jan 23, 2021
d8738f0
Updated README
Jan 26, 2021
e8f9f04
fix README
Jan 26, 2021
088c845
added uint16 as an option for input matrices
Mar 5, 2021
47ac4da
fixed uint16
Mar 5, 2021
d948d08
added report_all_alleles
Mar 8, 2021
798c880
added strict_labeling
Mar 8, 2021
cdfc062
fixed strict_labeling
Mar 8, 2021
74a27df
fixed strict_labeling
Mar 8, 2021
a2eed5e
small fix
Mar 10, 2021
399b15c
fixed strict_labeling
Mar 10, 2021
13c45b7
small fix for strict_labeling
Mar 11, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 19 additions & 17 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,27 +40,29 @@ NeuSomatic first scans the genome to identify candidate variants and extract ali
The binary for this step can be obtained at `neusomatic/bin` folder by running `./build.sh` (which requires cmake 3.13.2 and g++ 5.4.0).

Python 3.7 and the following Python packages must be installed:
* pytorch 1.1.0
* torchvision 0.3.0
* pybedtools 0.8.0
* pysam 0.15.2
* pytorch 1.6.0
* torchvision 0.7.0
* pysam 0.16.0.1
* zlib 1.2.11
* numpy 1.15.4
* scipy 1.2.0
* imageio 2.5.0
* biopython 1.73
* numpy 1.18.1
* scipy 1.4.1
* pillow 7.2.0
* imageio 2.8.0
* biopython 1.77
* fisher 0.1.9

It also depends on the following packages:
* cudatoolkit 9.0 (if you want to use GPU)
* cudatoolkit 10.1 (if you want to use GPU)
* tabix 0.2.6
* bedtools 2.27.1
* bedtools 2.29.2
* samtools 1.9

You can install these packages using [anaconda](https://www.anaconda.com/download)/[miniconda](https://conda.io/miniconda.html) :
You can install these packages using [anaconda](https://www.anaconda.com/download)/[miniconda](https://conda.io/miniconda.html) (for Python 3.7 on miniconda you can use [this link](https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.3-Linux-x86_64.sh)):
```
conda install zlib=1.2.11 numpy=1.15.4 scipy=1.2.0 cmake=3.13.2 imageio=2.5.0
conda install pysam=0.15.2 pybedtools=0.8.0 samtools=1.9 tabix=0.2.6 bedtools=2.27.1 biopython=1.73 -c bioconda
conda install pytorch=1.1.0 torchvision=0.3.0 cudatoolkit=9.0 -c pytorch
conda install zlib=1.2.11 numpy=1.18.1 scipy=1.4.1 pillow=7.2.0 cmake=3.17.0 imageio=2.8.0
conda install pysam=0.16.0.1 samtools=1.9 tabix=0.2.6 bedtools=2.29.2 biopython=1.77 -c bioconda
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.1 -c pytorch
conda install -c conda-forge fisher=0.1.9
```
Then you can export the conda paths as:
```
Expand Down Expand Up @@ -123,7 +125,7 @@ python preprocess.py \
--work work_train \
--truth_vcf truth.vcf \
--min_mapq 10 \
--number_threads 10 \
--num_threads 10 \
--scan_alignments_binary ../bin/scan_alignments
```
2. Train network
Expand All @@ -147,7 +149,7 @@ python preprocess.py \
--normal_bam normal.bam \
--work work_call \
--min_mapq 10 \
--number_threads 10 \
--num_threads 10 \
--scan_alignments_binary ../bin/scan_alignments
```
2. Call variants
Expand Down Expand Up @@ -278,7 +280,7 @@ do
--reference GRCh38.fa --tumor_bam tumor.bam --normal_bam normal.bam \
--region_bed work/splits/region_${i}.bed \
--work work/work_${i} \
--min_mapq 10 --number_threads 24 \
--min_mapq 10 --num_threads 24 \
--scan_alignments_binary ../bin/scan_alignments"
done
```
Expand Down
4 changes: 4 additions & 0 deletions build.sh
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,9 @@
set -e
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd -P)"

if [ -d ${DIR}/neusomatic/build ]; then
rm -rf ${DIR}/neusomatic/build
fi
rm -rf $DIR/third_party/SeqLib/ $DIR/third_party/seqan/
pushd $DIR/neusomatic
mkdir build
Expand All @@ -10,3 +13,4 @@ pushd $DIR/neusomatic
make
popd
popd
rm -rf $DIR/third_party/SeqLib/ $DIR/third_party/seqan/
45 changes: 23 additions & 22 deletions docker/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,38 +1,38 @@
FROM ubuntu:16.04
FROM ubuntu:18.04

ENV NEUSOMATIC_VERSION 0.2.1
ENV NEUSOMATIC_VERSION 0.3.0
ENV ZLIB_VERSION 1.2.11
ENV NUMPY_VERSION 1.15.4
ENV SCIPY_VERSION 1.2.0
ENV IMAGEIO_VERSION 2.5.0
ENV PYTORCH_VERSION 1.1.0
ENV TORCHVISION_VERSION 0.3.0
ENV CUDATOOLKIT_VERSION 9.0
ENV CMAKE_VERSION 3.13.2
ENV PYBEDTOOLS_VERSION 0.8.0
ENV PYSAM_VERSION 0.15.2
ENV NUMPY_VERSION 1.18.5
ENV SCIPY_VERSION 1.5.0
ENV IMAGEIO_VERSION 2.9.0
ENV PILLOW_VERSION 7.2.0
ENV PYTORCH_VERSION 1.6.0
ENV TORCHVISION_VERSION 0.7.0
ENV CUDATOOLKIT_VERSION 10.1
ENV CMAKE_VERSION 3.14.0
ENV PYSAM_VERSION 0.15.3
ENV SAMTOOLS_VERSION 1.9
ENV TABIX_VERSION 0.2.6
ENV BEDTOOLS_VERSION 2.27.1
ENV BIOPYTHON_VERSION 1.72
ENV BEDTOOLS_VERSION 2.29.2
ENV BIOPYTHON_VERSION 1.77
ENV FISHER_VERSION 0.1.9
ENV GCC_VERSION 5

RUN apt-get update && apt-get install -y --fix-missing \
build-essential zlib1g-dev curl less vim bzip2
RUN apt-get install -y --fix-missing git wget

RUN curl -LO http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
RUN bash Miniconda3-latest-Linux-x86_64.sh -p /miniconda -b
RUN rm Miniconda3-latest-Linux-x86_64.sh
RUN curl -LO https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.3-Linux-x86_64.sh
RUN bash Miniconda3-py37_4.8.3-Linux-x86_64.sh -p /miniconda -b
RUN rm Miniconda3-py37_4.8.3-Linux-x86_64.sh
ENV PATH=/miniconda/bin:${PATH}
ENV LD_LIBRARY_PATH=/miniconda/lib:${LD_LIBRARY_PATH}
RUN conda update -y conda


RUN conda install -y zlib=${ZLIB_VERSION} numpy=${NUMPY_VERSION} scipy=${SCIPY_VERSION} \
imageio=${IMAGEIO_VERSION} && conda clean -a
RUN conda install -y cmake=${CMAKE_VERSION} -c conda-forge && conda clean -a
RUN conda install -y pysam=${PYSAM_VERSION} pybedtools=${PYBEDTOOLS_VERSION} \
pillow=${PILLOW_VERSION} cmake=${CMAKE_VERSION} imageio=${IMAGEIO_VERSION} && conda clean -a
RUN conda install -y fisher=${FISHER_VERSION} -c conda-forge && conda clean -a
RUN conda install -y pysam=${PYSAM_VERSION} \
samtools=${SAMTOOLS_VERSION} tabix=${TABIX_VERSION} \
bedtools=${BEDTOOLS_VERSION} \
biopython=${BIOPYTHON_VERSION} -c bioconda && conda clean -a
Expand All @@ -42,7 +42,8 @@ RUN conda install -y pytorch=${PYTORCH_VERSION} \

RUN apt-get install -y --fix-missing gcc-${GCC_VERSION} g++-${GCC_VERSION}

ADD https://github.com/bioinform/neusomatic/archive/v${NEUSOMATIC_VERSION}.tar.gz /opt/v${NEUSOMATIC_VERSION}.tar.gz
RUN cd /opt/ && tar -xzvf v${NEUSOMATIC_VERSION}.tar.gz && mv neusomatic-${NEUSOMATIC_VERSION} neusomatic && rm /opt/v${NEUSOMATIC_VERSION}.tar.gz

ADD https://github.com/bioinform/neusomatic/archive/extended_standalone.tar.gz /opt/extended_standalone.tar.gz
RUN cd /opt/ && tar -xzvf extended_standalone.tar.gz && mv neusomatic-extended_standalone neusomatic && rm /opt/extended_standalone.tar.gz
RUN cd /opt/neusomatic/ && ./build.sh
ENV PATH=/opt/neusomatic/neusomatic/bin:/opt/neusomatic/neusomatic/python/:${PATH}
157 changes: 92 additions & 65 deletions neusomatic/cpp/scan_alignments.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@ int main(int argc, char **argv) {
float del_min_af=min_af;
float snp_min_af=min_af;
const bool calculate_qual_stat = opts.calculate_qual_stat();
const bool report_all_alleles = opts.report_all_alleles();

//const std::map<char, int> empty_pileup_counts = {{'-', 0}, {'A', 0}, {'C', 0}, {'G', 0}, {'T', 0}};
static const std::vector<char> nuc_code_char = {'A', 'C', 'G', 'T', '-', 'N'};
Expand Down Expand Up @@ -206,83 +207,109 @@ int main(int argc, char **argv) {
<<":"<<pileup_counts[3]<<std::endl;
}

int major = -1;
int major_count = 0;
int minor = -1;
int minor_count = 0;
int minor2 = -1;
int minor2_count = 0;

for (int row = 0; row < cols[i].base_freq_.size(); ++row) {
if (cols[i].base_freq_[row] > major_count) {
minor2 = minor;
minor2_count = minor_count;
minor_count = major_count;
minor = major;
major_count = cols[i].base_freq_[row];
major = row;
} else if (cols[i].base_freq_[row] > minor_count) {
minor2 = minor;
minor2_count = minor_count;
minor_count = cols[i].base_freq_[row];
minor = row;
} else if (cols[i].base_freq_[row] > minor2_count) {
minor2_count = cols[i].base_freq_[row];
minor2 = row;
std::map<int, int> alt_counts;
auto ref_count = cols[i].base_freq_[ref_code];
auto var_code = ref_code;
int var_count = 0;
int dp = ref_count;
if (report_all_alleles and ref_base != '-'){
for (int row = 0; row < cols[i].base_freq_.size(); ++row) {
auto alt_cnt = cols[i].base_freq_[row];
if (( row != ref_code) and (alt_cnt > 0)){
auto af = alt_cnt/float(alt_cnt+ref_count);
if ((alt_cnt >= ref_code) or ((row == 4 and af > del_min_af ) or
(row != 4 and ref_base != '-' and af > snp_min_af ) or
(ref_base =='-' and af > ins_min_af))){
alt_counts.insert(std::pair<int, int>(row, alt_cnt));
dp += alt_cnt;
}
}
}
}else{
int major = -1;
int major_count = 0;
int minor = -1;
int minor_count = 0;
int minor2 = -1;
int minor2_count = 0;

for (int row = 0; row < cols[i].base_freq_.size(); ++row) {
if (cols[i].base_freq_[row] > major_count) {
minor2 = minor;
minor2_count = minor_count;
minor_count = major_count;
minor = major;
major_count = cols[i].base_freq_[row];
major = row;
} else if (cols[i].base_freq_[row] > minor_count) {
minor2 = minor;
minor2_count = minor_count;
minor_count = cols[i].base_freq_[row];
minor = row;
} else if (cols[i].base_freq_[row] > minor2_count) {
minor2_count = cols[i].base_freq_[row];
minor2 = row;
}
}
}

if (minor != -1 and major != -1){
if (minor2 != -1 and ref_code == major and minor == 4 and ref_code != 4 ){
if (minor2_count>0.5*minor_count){
minor = minor2;
minor_count = minor2_count;
if (minor != -1 and major != -1){
if (minor2 != -1 and ref_code == major and minor == 4 and ref_code != 4 ){
if (minor2_count>0.5*minor_count){
minor = minor2;
minor_count = minor2_count;
}
}
}
auto af = minor_count/float(major_count+minor_count);
if (major != ref_code){
var_code = major;
var_count = major_count;
} else if (minor != ref_code and ( (minor == 4 and af > del_min_af ) or
(minor != 4 and ref_base != '-' and af > snp_min_af ) or
(ref_base =='-' and af > ins_min_af))){
var_code = minor;
var_count = minor_count;
}
if (var_count > 0) {
alt_counts.insert(std::pair<int, int>(var_code,var_count));
dp += var_count;
}
}
auto ref_count = cols[i].base_freq_[ref_code];
auto var_code = ref_code;
int var_count = 0;
auto af = minor_count/float(major_count+minor_count);
if (major != ref_code){
var_code = major;
var_count = major_count;
} else if (minor != ref_code and ( (minor == 4 and af > del_min_af ) or
(minor != 4 and ref_base != '-' and af > snp_min_af ) or
(ref_base =='-' and af > ins_min_af))){
var_code = minor;
var_count = minor_count;
}

if (var_count > 0) {

auto record_info = "AF="+std::to_string((var_count)/float(var_count+ref_count))+";DP="+std::to_string(nrow)+";RO="+std::to_string(ref_count)+";AO="+std::to_string(var_count);
auto gtinfo = "0/1:"+std::to_string(nrow)+":"+std::to_string(ref_count)+":"+std::to_string(var_count);
// for(auto it = alt_counts.cbegin(); it != alt_counts.cend(); ++it)
// {
// std::cout << it->first << " " << it->second << std::endl;
// }
for(auto it = alt_counts.cbegin(); it != alt_counts.cend(); ++it)
{
auto var_code_ = it->first;
auto var_count_ = it->second;
auto record_info = "AF="+std::to_string((var_count_)/float(dp))+";DP="+std::to_string(nrow)+";RO="+std::to_string(ref_count)+";AO="+std::to_string(var_count_);
auto gtinfo = "0/1:"+std::to_string(nrow)+":"+std::to_string(ref_count)+":"+std::to_string(var_count_);
if (calculate_qual_stat){
record_info += ";ST="+std::to_string(int(round(ref_count*(cols_strand[i].strand_mean[ref_code]/100))))+ \
","+std::to_string(int(round(var_count*(cols_strand[i].strand_mean[var_code]/100))))+ \
","+std::to_string(int(round(var_count_*(cols_strand[i].strand_mean[var_code_]/100))))+ \
";LS="+std::to_string(lsc_counts)+\
";RS="+std::to_string(rsc_counts)+\
";NM="+std::to_string(int(round(cols_tag[i].tag_mean[var_code][0])))+\
";AS="+std::to_string(int(round(cols_tag[i].tag_mean[var_code][1])))+ \
";XS="+std::to_string(int(round(cols_tag[i].tag_mean[var_code][2])))+ \
";PR="+std::to_string(int(round(cols_tag[i].tag_mean[var_code][3])))+ \
";CL="+std::to_string(int(round(cols_tag[i].tag_mean[var_code][4])))+ \
";MQ="+std::to_string(int(round(cols_mqual[i].mqual_mean[var_code])))+ \
";BQ="+std::to_string(int(round(cols[i].bqual_mean[var_code])));
";NM="+std::to_string(int(round(cols_tag[i].tag_mean[var_code_][0])))+\
";AS="+std::to_string(int(round(cols_tag[i].tag_mean[var_code_][1])))+ \
";XS="+std::to_string(int(round(cols_tag[i].tag_mean[var_code_][2])))+ \
";PR="+std::to_string(int(round(cols_tag[i].tag_mean[var_code_][3])))+ \
";CL="+std::to_string(int(round(cols_tag[i].tag_mean[var_code_][4])))+ \
";MQ="+std::to_string(int(round(cols_mqual[i].mqual_mean[var_code_])))+ \
";BQ="+std::to_string(int(round(cols[i].bqual_mean[var_code_])));
gtinfo += ":"+std::to_string(int(round(ref_count*(cols_strand[i].strand_mean[ref_code]/100))))+","+ \
std::to_string(int(round(var_count*(cols_strand[i].strand_mean[var_code]/100))))+":"+\
std::to_string(int(round(var_count_*(cols_strand[i].strand_mean[var_code_]/100))))+":"+\
std::to_string(lsc_counts)+":"+\
std::to_string(rsc_counts)+":"+\
std::to_string(int(round(cols_tag[i].tag_mean[var_code][0])))+":"+\
std::to_string(int(round(cols_tag[i].tag_mean[var_code][1])))+":"+\
std::to_string(int(round(cols_tag[i].tag_mean[var_code][2])))+":"+\
std::to_string(int(round(cols_tag[i].tag_mean[var_code][3])))+":"+\
std::to_string(int(round(cols_tag[i].tag_mean[var_code][4])))+":"+\
std::to_string(int(round(cols_mqual[i].mqual_mean[var_code])))+":"+\
std::to_string(int(round(cols[i].bqual_mean[var_code])));
std::to_string(int(round(cols_tag[i].tag_mean[var_code_][0])))+":"+\
std::to_string(int(round(cols_tag[i].tag_mean[var_code_][1])))+":"+\
std::to_string(int(round(cols_tag[i].tag_mean[var_code_][2])))+":"+\
std::to_string(int(round(cols_tag[i].tag_mean[var_code_][3])))+":"+\
std::to_string(int(round(cols_tag[i].tag_mean[var_code_][4])))+":"+\
std::to_string(int(round(cols_mqual[i].mqual_mean[var_code_])))+":"+\
std::to_string(int(round(cols[i].bqual_mean[var_code_])));
}
auto var_base = nuc_code_char[var_code];
auto var_base = nuc_code_char[var_code_];
if (ref_base == '-') {ref_base = 'N';}
if (var_base == '-') {var_base = 'N';}
auto var_ref_pos=ginv.left() + cc.UngapPos(i);
Expand All @@ -304,7 +331,7 @@ int main(int argc, char **argv) {
appendValue(record.genotypeInfos, gtinfo);
vcf_writer.Write(record);
if (opts.verbosity()>0){
std::cout<<"var: " << i << "," << var_ref_pos << ","<< ref_base << "," << var_base<<","<<nrow<<":"<<ref_count<<":"<<var_count<<std::endl;
std::cout<<"var: " << i << "," << var_ref_pos << ","<< ref_base << "," << var_base<<","<<nrow<<":"<<ref_count<<":"<<var_count_<<std::endl;
std::cout<<"col "<<i<<": ";
std::cout<<"(ref= "<< ref_base << ") ";
for (size_t row = 0; row < cols[i].base_freq_.size(); ++row) {
Expand Down
14 changes: 14 additions & 0 deletions neusomatic/include/Options.h
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ namespace neusomatic {
{"filter_improper_pair", no_argument, 0, 'E'},
{"filter_mate_unmapped", no_argument, 0, 'F'},
{"filter_improper_orientation", no_argument, 0, 'G'},
{"report_all_alleles", no_argument, 0, 'A'},
{0, 0, 0, 0} // terminator
};

Expand Down Expand Up @@ -58,6 +59,7 @@ namespace neusomatic {
std::cerr<< "-E/--filter_improper_pair, filter improper pairs if the flag is set, default is False.\n";
std::cerr<< "-F/--filter_mate_unmapped, filter reads with unmapeed mates if the flag is set, default is False.\n";
std::cerr<< "-G/--filter_improper_orientation, filter reads with improper orientation (not FR) or different chrom, default is False.\n";
std::cerr<< "-A/--report_all_alleles, report all alleles per column, default is False.\n";
}

int parseInt(const char* optarg, int lower, const char *errmsg, void (*print_help)()) {
Expand Down Expand Up @@ -163,6 +165,9 @@ namespace neusomatic {
case 'G':
opt.filter_improper_orientation() = true;
break;
case 'A':
opt.report_all_alleles() = true;
break;
case 'a':
opt.min_allele_freq() = parseFloat(optarg, 0.0, 1.0, "-a/--min_af must be between 0 and 1", print_help);
break;
Expand Down Expand Up @@ -329,6 +334,14 @@ struct Options {
return (filter_improper_orientation_);
}

decltype(auto) report_all_alleles() const {
return (report_all_alleles_);
}

decltype(auto) report_all_alleles() {
return (report_all_alleles_);
}

private:
unsigned verbosity_ = 0;
std::string bam_in_;
Expand All @@ -350,6 +363,7 @@ struct Options {
bool filter_improper_pair_ = false;
bool filter_mate_unmapped_ = false;
bool filter_improper_orientation_ = false;
bool report_all_alleles_ = false;
};
}//namespace neusomatic

Expand Down
2 changes: 1 addition & 1 deletion neusomatic/python/_version.py
Original file line number Diff line number Diff line change
@@ -1 +1 @@
__version__ = "0.2.1"
__version__ = "0.3.0"
Loading