01 Feb 14:23

FelixKrueger

db275a3

Diverse fixes and improvements - last version supporting Bowtie (1)

This is an early notice that this will be the last release of Bismark that supports the use of `Bowtie 1`. We have added warning statements to both the genome preparation and alignment steps to warn users that Bowtie 1 is now deprecated. All Bowtie 1 functionality and support will disappear in a future release. Please shout now if you think this will be a disaster for you...

bismark

Added check to prevent users from inadvertently specifying the very same file as both R1 and R2
Added a check for file truncation, or more generally the same number of reads between R1 and R2 for paired-end FastQ files (directional, non-directional and PBAT mode).
Added Travis CI testing for most Bismark modules and commands. This should help spot problems early, e.g. if I release a new version right before the Christmas holidays ...
Changed error message for failed fork command in --parallel mode to [FATAL ERROR]: ... to alert users that something isn't working as intended.
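
The two new input checks (same file given as R1 and R2, and differing read counts between R1 and R2) can be illustrated with a minimal Python sketch. Bismark itself is written in Perl, so this is not its implementation; the function names `count_fastq_records` and `check_pair` are hypothetical, and the sketch assumes plain four-line FastQ records.

```python
import gzip
import os


def count_fastq_records(path):
    """Count records in a FastQ file, assuming exactly four lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # A line count that is not a multiple of 4 suggests a truncated file
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def check_pair(r1_path, r2_path):
    """Raise ValueError if R1 and R2 are the same file or differ in read count."""
    if os.path.realpath(r1_path) == os.path.realpath(r2_path):
        raise ValueError("R1 and R2 point to the very same file")
    n1 = count_fastq_records(r1_path)
    n2 = count_fastq_records(r2_path)
    if n1 != n2:
        raise ValueError(f"read count mismatch: R1 has {n1}, R2 has {n2}")
    return n1
```

Calling `check_pair("sample_R1.fastq.gz", "sample_R2.fastq.gz")` (hypothetical file names) would return the shared read count or raise before any alignment starts.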

bismark_genome_preparation

Added multi-threading to the Bowtie2-based genome preparation (thanks to Rahul Karnik)
Added test to see whether specified files exist, or die otherwise

bismark2summary

Fixed division by zero errors when a C-context was not covered by any reads. This will now use values of 0/0 for the context plots, which looks a bit odd, but at least it still works.
Detects if (non-deduplicated) RRBS and WGBS samples are mixed together, and bails with a meaningful error message.
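
As an illustration of the division-by-zero fix, here is a minimal Python sketch of a guarded percentage calculation. It is not the bismark2summary code (which is Perl); the function name is hypothetical, and returning 0.0 for an uncovered context is an assumption made for the sketch.

```python
def percent_methylated(meth, unmeth):
    """Return percent methylation, or 0.0 when the context has no coverage."""
    total = meth + unmeth
    if total == 0:
        # Context not covered by any reads: avoid dividing by zero
        return 0.0
    return 100.0 * meth / total
```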

bam2nuc

Changed samtools to $samtools_path during single-end/paired-end file testing.

bismark_methylation_extractor

Changed the order in which --ample_mem and --buffer_size are checked.


26 Apr 15:23

FelixKrueger

0.20.0

d38910c

v0.20.0

bismark_methylation_extractor

The methylation extractor now creates output directories if they don't exist already.
The options --ample_mem and --buffer_size <string> are now mutually exclusive.
Changed the directory being passed on when --cytosine_report is specified from 'parent directory' to 'output directory'.
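
What mutually exclusive options mean in practice can be sketched in Python with `argparse`. This only illustrates the behaviour; the methylation extractor is a Perl script and its option handling is not shown here. The program name `extractor_sketch` and the function `build_parser` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser in which --ample_mem and --buffer_size cannot be combined."""
    parser = argparse.ArgumentParser(prog="extractor_sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--ample_mem", action="store_true")
    group.add_argument("--buffer_size", metavar="STRING")
    return parser
```

Passing either option alone parses normally, while passing both makes `argparse` print an error and exit.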

bismark2report

Major rewrite of bismark2report: HTML files are now rendered using Plotly.js [plotly.js v1.39.4], which is completely open source and free to use. Highcharts and JQuery were dropped, as was raised here: #177.
The files bioinfo.logo, bismark.logo, plot.ly and plotly_template.tpl are read in dynamically from a new folder plotly. bismark_sitrep and all its contents no longer ship with Bismark. The Bismark HTML reports should be completely self-contained, here is an example paired-end Bismark report.

bismark2summary

Major rewrite of bismark2summary: HTML files are now rendered using Plotly.js [plotly.js v1.39.4], which is completely open source and free to use. Highcharts and JQuery were dropped, as was raised here: #177. The files bioinfo.logo, bismark.logo, plot.ly and plotly_template.tpl are read in dynamically from a new folder plotly. bismark_sitrep and all its contents no longer ship with Bismark. The Bismark HTML Summary reports should be completely self-contained, here is an example of a percent alignment plot for a single cell experiment.

And finally, here are some examples for a WGBS summary report, an RRBS report (no deduplication), and the full scBS-seq report and scBS-seq data file.


13 Oct 13:59

FelixKrueger

0.19.0

f20b2ec

v0.19.0 - Various fixes and improvements

Bismark

Changed the methylation call behaviour so that insertions in a read (which are filled in with X for the methylation call) are also considered as Unknown context for the methylation call. Here is issue #135.

filter_non_conversion

Added new options --percentage_cutoff [int] and --minimum_count [int] to allow filtering reads for non-bisulfite conversion using an overall methylation percentage and count cutoff. Here is issue #122.
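
The idea of a percentage-based filter can be sketched in Python as follows. This is a sketch under stated assumptions, not the script's actual logic: it assumes the Bismark methylation call string convention (X/x for CHG, H/h for CHH, upper case meaning methylated), assumes that the minimum count refers to the number of non-CG cytosines in the read, and assumes a read is filtered when the percentage reaches the cutoff. The function name is hypothetical.

```python
def fails_percentage_filter(call_string, percentage_cutoff, minimum_count):
    """Return True if non-CG methylation in a read reaches the cutoff."""
    meth = sum(call_string.count(c) for c in "XH")    # methylated CHG/CHH
    unmeth = sum(call_string.count(c) for c in "xh")  # unmethylated CHG/CHH
    total = meth + unmeth
    if total < minimum_count:
        # Too few non-CG cytosines to judge the read (assumed semantics)
        return False
    return 100.0 * meth / total >= percentage_cutoff
```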

deduplicate_bismark

Added option --multiple to the deduplicator to treat several input SAM/BAM files as the same sample. Here is issue #107.
Added option --output_dir to deduplicate_bismark so that it can be used in the Google cloud. Here is issue #123

coverage2cytosine

Output files are now handled better and more consistently. Default processing now produces the following output files (with --gzip):

CpG_report.txt(.gz) or
CX_report.txt(.gz)

The option --NOMe-Seq now produces four output files (with --gzip):

NOMe.CpG_report.txt(.gz)
NOMe.CpG.cov(.gz)
NOMe.GpC_report.txt(.gz)
NOMe.GpC.cov(.gz)

The option --split_by_chromosome should work in either default, --gc or --NOMe-seq mode.

NOMe-Seq processing is now ignoring positions that were not covered by any reads.
Improved handling of the --output_dir, i.e. the folder will be created if it doesn't exist already, and the path is made absolute.
Added new option --discordance <int> to allow filtering for discordance of top and bottom strand when in --merge_CpG mode. CpG positions for which either the top or bottom strand was not measured at all will not be assessed for discordance and hence appear in the regular 'merged_CpG_evidence.cov' file. More details in issue #91.
Fixed context extraction for Gs at positions 1 and 2 of a chromosome/contig. Also, last cytosine positions of not covered chromosomes are now ignored in the same way as for covered chromosomes (issue #127).
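
A strand discordance check for a single CpG might look like the Python sketch below. The assumptions are loud ones: discordance is taken to be the absolute difference in percent methylation between the two strands, and a position counts as discordant when that difference is strictly greater than the threshold. The release note confirms only that positions with an unmeasured strand are not assessed. The function name and the tuple layout are hypothetical.

```python
def is_discordant(top, bottom, max_difference):
    """top and bottom are (methylated, unmethylated) read counts per strand."""
    if sum(top) == 0 or sum(bottom) == 0:
        # One strand was not measured at all: not assessed for discordance
        return False
    pct_top = 100.0 * top[0] / sum(top)
    pct_bottom = 100.0 * bottom[0] / sum(bottom)
    return abs(pct_top - pct_bottom) > max_difference
```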

copy_files_for_release

Is now working from any location.


28 Jun 08:45

FelixKrueger

0.18.2

f5e0f33

v0.18.2 - Hotfix release for ambiguous alignments

Bismark

Changed the timing of when ambiguous within same thread alignments are reset. Previously some alignments were incorrectly considered ambiguous (see here). This affected Bowtie 2 alignments only.

bismark2bedGraph

The option --ample_mem is now mutually exclusive with specifying memory for the UNIX sort command via the option --buffer_size.


22 May 16:24

FelixKrueger

0.18.1

75940b1

v0.18.1

Bismark

Commented out warning messages for certain ambiguous alignments for paired-end alignments.


15 May 10:32

FelixKrueger

0.18.0

bc3f7f1

v0.18.0 - further NOMe-Seq support and bug fixes

Release Notes for Bismark v0.18.0

Changed FindBin qw($Bin) to FindBin qw($RealBin) for bismark, bismark_methylation_extractor, bismark2report and bismark2summary so that symlinks are resolved before calling different modules.

Bismark

Fixed the behaviour of (very rare) ambiguous corner cases where a sequence had a perfect sequence duplication within the valid paired-end distance.

Methylation Extractor

Added new option --yacht (for Yet Another Context Hunting Tool) that writes out additional information about the read a methylation call belongs to, and its output is meant to be fed into the NOMe_filtering script (see below). This option writes out a single 'any_C_context' file that contains all methylation calls for a read consecutively. Its intended use is single-cell NOMe-Seq data, so it only works in single-end mode (paired-end reads often suffer from chimaera problems...)

--yacht adds three additional columns to the standard methylation call files:

<read start> <read end> <read orientation>

For forward reads (+ orientation) the start position is the left-most position, whereas for reverse reads (- orientation) it is the right-most position.

Changed FindBin qw($Bin) to FindBin qw($RealBin) so that symlinks are resolved before calling different modules.

NOMe_filtering

This script reads in methylation call files from the Bismark methylation extractor that contain additional information about the reads that methylation calls belonged to. It processes entire (single-end) reads and then filters calls for NOMe-Seq positions (nucleosome occupancy and methylome sequencing) where accessible DNA gets methylated in a GpC context:

 (i) filters CpGs to only output cytosines in A-CG and T-CG context
(ii) filters GC context to only report cytosines in GC-A, GC-C and GC-T context

Both of these measures aim to reduce unwanted biases, i.e. the influence of G-CG (intended) and C-CG (off-target) on endogenous CpG methylation, and the influence of CpG methylation on (the NOMe-Seq specific) GC context methylation.
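
The two context filters can be expressed as simple lookups on a three-base context. The Python sketch below is illustrative only (the NOMe_filtering script is Perl); the function names are hypothetical, and the caller is assumed to supply the upstream base plus CG for CpGs, and GC plus the downstream base for the GC context.

```python
CPG_CONTEXTS_KEPT = {"ACG", "TCG"}         # A-CG and T-CG
GPC_CONTEXTS_KEPT = {"GCA", "GCC", "GCT"}  # GC-A, GC-C and GC-T


def keep_cpg(context):
    """context = upstream base + CG; GCG and CCG are dropped."""
    return context.upper() in CPG_CONTEXTS_KEPT


def keep_gpc(context):
    """context = GC + downstream base; GCG (C also in CpG context) is dropped."""
    return context.upper() in GPC_CONTEXTS_KEPT
```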

The NOMe-Seq filtering output reports cytosines in CpG context only if they are in A-CG or T-CG context,
and cytosines in GC context only when the C is not in CpG context. The output file is tab-delimited and in
the following format (1-based coords):

<readID>  <chromosome>  <read start>  <read end>  <count methylated CpG>  <count non-methylated CpG>  <count methylated GC>  <count non-methylated GC>
HWI-D00436:298:C9KY4ANXX:1:1101:2035:2000_1:N:0:_ACAGTGGT 10 8517979 8518098 0 1 0 1
HWI-D00436:298:C9KY4ANXX:1:1101:5072:1993_1:N:0:_ACAGTGGT 8 9476630 9476748 0 0 0 2
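
For downstream processing, a line of this output can be read into a typed record. The following Python sketch assumes eight whitespace-separated fields in the order given above and no whitespace inside any field; the class and function names are hypothetical.

```python
from typing import NamedTuple


class NomeRecord(NamedTuple):
    read_id: str
    chromosome: str
    read_start: int
    read_end: int
    meth_cpg: int
    unmeth_cpg: int
    meth_gc: int
    unmeth_gc: int


def parse_nome_line(line):
    """Parse one line of the NOMe filtering output into a NomeRecord."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, found {len(fields)}")
    read_id, chromosome = fields[:2]
    numbers = [int(value) for value in fields[2:]]
    return NomeRecord(read_id, chromosome, *numbers)
```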

coverage2cytosine

Fixed an issue in --merge_CpG mode caused by chromosomes ending in CG.
Fixed an issue caused by specifying --zero as well as --merge_CpG.

bam2nuc

Fixed an issue where the option --output_dir had been ignored.

filter_non_conversion

Removed help text indicating that this script also did the deduplication.


18 Jan 14:40

FelixKrueger

0.17.0

2b4d2f5

v0.17.0 - Filter non-conversion, Documentation and convenience updates

Bismark

The option --dovetail is now the default behaviour for paired-end Bowtie2 libraries to assist with
alignments that have undergone 5'-end trimming. Can be disabled using the new option --no_dovetail.
Added time stamp to the Bismark run.
Chromosome names with leading spaces now cause Bismark to bail.
Fixed path handling for --multicore mode when --prefix had been specified as well.
Bismark now quits if the Bowties could not be executed properly.
Bails if supplied filenames do not exist.

Documentation

Added Overview of different library types and kits to the Bismark User Guide.
Also added documentation for Bismark modules bam2nuc, bismark2report, bismark2summary and filter_non_conversion.
Added a Markdown to HTML converter (make_docs.pl; thanks to Phil Ewels).

filter_non_conversion

Added a new script that allows filtering out of reads or read-pairs if the apparent non-CG methylation exceeds a certain threshold (3 by default). Optionally, the non-CG count may be forced to occur on consecutive non-CGs using the option --consecutive.
Added time stamp to filtering step.
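
The count-based filter, with and without the consecutive requirement, can be sketched in Python. Several things here are assumptions rather than documented behaviour: the call string convention (X/H methylated and x/h unmethylated non-CG calls), that reaching the threshold is enough to filter a read, and that only an unmethylated non-CG call breaks a consecutive run. The function names are hypothetical.

```python
def non_cg_methylation_count(call_string, consecutive=False):
    """Count methylated non-CG calls, or the longest consecutive run of them."""
    if not consecutive:
        return sum(1 for c in call_string if c in "XH")
    longest = run = 0
    for c in call_string:
        if c in "XH":
            run += 1
            longest = max(longest, run)
        elif c in "xh":
            run = 0  # an unmethylated non-CG call ends the run
        # any other character (e.g. '.', 'Z', 'z') leaves the run untouched
    return longest


def is_filtered(call_string, threshold=3, consecutive=False):
    """Return True if the read would be removed under the assumed rule."""
    return non_cg_methylation_count(call_string, consecutive) >= threshold
```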

bismark2bedGraph

For the creation of temporary files, we are now replacing / characters in the chromosome names with _ (underscores), similar to | (pipe) characters, as these / would attempt to write files to non-existing directories.
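
The replacement amounts to the following, shown here as a small Python sketch with a hypothetical function name (bismark2bedGraph itself is a Perl script):

```python
def temp_file_safe(chromosome_name):
    """Replace '/' and '|' with '_' so the name is safe in a temporary file name."""
    return chromosome_name.replace("/", "_").replace("|", "_")
```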

deduplicate_bismark

Single-/paired-end detection now also accepts --1 or --2.
Added EOF or truncation detection, causing the deduplicator to die.

bismark_methylation_extractor

Single-/paired-end detection now also accepts --1 or --2.
Added EOF or truncation detection, causing the methylation extractor to die.
Added fatal ID1/ID2 check to paired-end extraction so that files which went out-of-sync at a later stage do not complete silently (but incorrectly!)

bismark2report

Major refactoring of bismark2report, the output should look the same though. Massive thanks to Phil Ewels for this.

coverage2cytosine

Added a new option --NOMe-seq to filter nucleosome occupancy and methylome sequencing (NOMe-Seq) data where accessible DNA gets enzymatically methylated in a GpC context. The option --NOMe-seq:

     i) filters the genome-wide CpG-report to only output cytosines in ACG and TCG context
    ii) filters the GC context output to only report cytosines in GCA, GCC and GCT context

Both of these measures aim to reduce unwanted biases, namely the influence of GCG and CCG on endogenous CpG methylation, and the influence of CpG methylation on (the NOMe-Seq specific) GC context methylation. PLEASE NOTE that NOMe-Seq data requires a .cov.gz file as input which has been generated in non-CG mode (--CX).

bismark_genome_preparation

Fixed a bug that arose when --genomic_composition was specified (now moving back to the genome directory for in silico conversion).


25 Jul 15:16

FelixKrueger

0.16.3

966f3e3

0.16.3 - Additional bug fix for ambiguous Bowtie 2 alignments

Bismark

Essential: Fixed another bug where a subset of ambiguous Bowtie 2 alignments were considered unique even though
they had been ambiguous in a different thread before, e.g.:

Read 1: AS:i:0 XS:i:0
Read 2: AS:i:0

In such cases the 'ambiguous within thread' variable is now only reset if the second alignment is truly better. This also affects the ambig.bam output.

Added support for large Bowtie (1) index files ending in .ebwtl which had been added in Bowtie v1.1.0.


19 Jul 13:23

FelixKrueger

v0.16.2

a831ce7

v0.16.2 - Includes essential bug fix for Bowtie 2 alignments

Changed the Shebang in all scripts of the Bismark suite to #!/usr/bin/env perl instead of
#!/usr/bin/perl

Bismark

Essential: Fixed a bug for Bowtie 2 alignments where reads that should be considered ambiguous were incorrectly assigned to the first alignment thread. (This error had crept in during the 'changing the behavior of
corner cases' in v0.16.0). Thanks to John Gaspar for spotting this!

deduplicate_bismark

Now bails with a useful error message when the input files are empty.

bismark_genome_preparation

Added new option --genomic_composition so that the genomic composition can be calculated and written right at the genome preparation stage rather than by using bam2nuc

bam2nuc

Now also calculates a fold coverage for the various (di-)nucleotides. The changes in the nucleotide_stats text file are also picked up and plotted by bismark2report
Added a new option --genomic_composition_only to just process the genomic sequence without requiring any data files

bismark2summary

Added option -o/--basename <filename> to specify a certain filename. If not specified the name will
remain bismark_summary_report.txt/html
Added documentation and the options --help and --version to be consistent with the rest of Bismark
Added option --title <string> to give the HTML report a different title


25 Apr 08:40

FelixKrueger

0.16.1

d32b68c

0.16.1

Bismark

Removed a rogue warn/sleep statement to check the resetting of best alignment scores for paired-end/Bowtie2 alignments which would obviously slow alignments down massively. Sorry for that.


Releases: FelixKrueger/Bismark
