
v0.17.0 - Filter non-conversion, Documentation and convenience updates

@FelixKrueger FelixKrueger released this 18 Jan 14:40

Bismark

  • The option --dovetail is now the default behaviour for paired-end Bowtie2 libraries to assist with
    alignments that have undergone 5'-end trimming. Can be disabled using the new option --no_dovetail.
  • Added time stamp to the Bismark run.
  • Chromosome names with leading spaces now cause Bismark to bail.
  • Fixed path handling for --multicore mode when --prefix had been specified as well.
  • Bismark now quits if the Bowties could not be executed properly.
  • Bismark now bails if supplied input files do not exist.

Documentation

  • Added Overview of different library types and kits to the Bismark User Guide.
  • Also added documentation for Bismark modules bam2nuc, bismark2report, bismark2summary and filter_non_conversion.
  • Added a Markdown to HTML converter (make_docs.pl; thanks to Phil Ewels).

filter_non_conversion

  • Added a new script that allows filtering out of reads or read-pairs if the apparent non-CG methylation exceeds a certain threshold (3 by default). Optionally, the non-CG count may be forced to occur on consecutive non-CGs using the option --consecutive.
  • Added time stamp to filtering step.
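
The filtering rule described above can be sketched in a few lines of Python. This is an illustration only, not the code of filter_non_conversion itself: the function names are hypothetical, the input is assumed to be a Bismark methylation call string (where X and H mark methylated cytosines in CHG and CHH context, and x and h their unmethylated counterparts), and two details are assumptions: that a read is removed once the count reaches the threshold, and that "consecutive" means an uninterrupted run among non-CG calls, ignoring CpG calls and non-cytosine positions.

```python
# Methylated non-CG calls in a Bismark methylation call string:
# 'X' = methylated CHG, 'H' = methylated CHH.
METHYLATED_NON_CG = set("XH")
# All non-CG calls, methylated or unmethylated.
NON_CG_CALLS = set("XxHh")


def count_non_cg_methylation(calls: str) -> int:
    """Total number of methylated non-CG calls in the string."""
    return sum(1 for c in calls if c in METHYLATED_NON_CG)


def longest_consecutive_non_cg(calls: str) -> int:
    """Longest run of methylated non-CG calls.

    Assumption: only non-CG calls are considered, so CpG calls and
    non-cytosine positions ('.') neither extend nor break a run.
    """
    longest = run = 0
    for c in calls:
        if c not in NON_CG_CALLS:
            continue
        if c in METHYLATED_NON_CG:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest


def should_filter(calls: str, threshold: int = 3, consecutive: bool = False) -> bool:
    """Decide whether a read looks non-converted.

    Assumption: the read is removed when the count reaches the threshold.
    """
    if consecutive:
        n = longest_consecutive_non_cg(calls)
    else:
        n = count_non_cg_methylation(calls)
    return n >= threshold
```

For a read pair, the same check could be applied to each mate, removing the pair if either mate is flagged; that, too, is an assumption rather than documented behaviour.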

bismark2bedGraph

  • When creating temporary files, / characters in chromosome names are now replaced with _ (underscores), as was already done for | (pipe) characters. A / in a file name is interpreted as a path separator, so the files would otherwise be written to non-existent directories.
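
As a minimal sketch of this kind of sanitising (in Python rather than the Perl that Bismark is written in, and with a hypothetical function name), both characters can be replaced in one pass:

```python
import re


def safe_chromosome_filename(chromosome: str) -> str:
    """Replace '/' and '|' with '_' so the name is usable as a file name.

    Hypothetical helper for illustration; not part of bismark2bedGraph.
    """
    return re.sub(r"[/|]", "_", chromosome)
```

With this helper, a chromosome called chr1/alt|v2 would map to chr1_alt_v2, so the temporary file ends up in the current directory instead of a non-existent chr1 subdirectory.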

deduplicate_bismark

  • Single-/paired-end detection now also accepts --1 or --2.
  • Added EOF or truncation detection, causing the deduplicator to die.

bismark_methylation_extractor

  • Single-/paired-end detection now also accepts --1 or --2.
  • Added EOF or truncation detection, causing the methylation extractor to die.
  • Added a fatal ID1/ID2 check to paired-end extraction so that files which went out of sync at a later stage no longer complete silently (but incorrectly!).
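
The idea behind the ID1/ID2 check can be illustrated with a short Python sketch. It is not the extractor's actual code: the function names are hypothetical, and stripping a trailing /1 or /2 mate suffix before comparing is an assumption about how read IDs may be formatted.

```python
def strip_mate_suffix(read_id: str) -> str:
    """Drop a trailing '/1' or '/2' (assumed mate suffix), if present."""
    if read_id.endswith(("/1", "/2")):
        return read_id[:-2]
    return read_id


def check_pairs_in_sync(pairs) -> int:
    """Raise on the first pair whose read IDs differ.

    `pairs` is any iterable of (id1, id2) tuples in file order.
    Returns the number of pairs checked when all of them match.
    """
    checked = 0
    for id1, id2 in pairs:
        if strip_mate_suffix(id1) != strip_mate_suffix(id2):
            raise ValueError(
                f"Read pair {checked + 1} is out of sync: {id1!r} vs {id2!r}"
            )
        checked += 1
    return checked
```

Failing on the first mismatch, instead of skipping it, is what prevents a file that drifted out of sync from producing a complete but incorrect result.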

bismark2report

  • Major refactoring of bismark2report; the output should look the same, though. Massive thanks to Phil Ewels for this.

coverage2cytosine

  • Added a new option --NOMe-seq to filter nucleosome occupancy and methylome sequencing (NOMe-Seq) data where accessible DNA gets enzymatically methylated in a GpC context. The option --NOMe-seq:
     i) filters the genome-wide CpG-report to only output cytosines in ACG and TCG context
    ii) filters the GC context output to only report cytosines in GCA, GCC and GCT context

Both of these measures aim to reduce unwanted biases, namely the influence of GCG and CCG on endogenous CpG methylation, and the influence of CpG methylation on (the NOMe-Seq specific) GC context methylation. PLEASE NOTE that NOMe-Seq data requires a .cov.gz file as input which has been generated in non-CG mode (--CX).
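
The two context filters can be written down as a small Python sketch. The function names are hypothetical, and the trinucleotide is assumed to be the cytosine together with its upstream and downstream neighbour, with the cytosine in the middle (so ACG is a CpG preceded by A, and GCA is a GpC followed by A).

```python
# Contexts kept in the genome-wide CpG report: CpGs not preceded by G or C.
CPG_CONTEXTS_KEPT = {"ACG", "TCG"}
# Contexts kept in the GC context output: GpCs not followed by G.
GC_CONTEXTS_KEPT = {"GCA", "GCC", "GCT"}


def keep_in_cpg_report(trinucleotide: str) -> bool:
    """True if a CpG cytosine in this context is reported under --NOMe-seq."""
    return trinucleotide.upper() in CPG_CONTEXTS_KEPT


def keep_in_gc_report(trinucleotide: str) -> bool:
    """True if a GpC cytosine in this context is reported under --NOMe-seq."""
    return trinucleotide.upper() in GC_CONTEXTS_KEPT
```

GCG is excluded from both outputs because its cytosine is in GpC and CpG context at the same time, and CCG is excluded from the CpG report, matching the biases named above.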

bismark_genome_preparation

  • Fixed a bug that arose when --genomic_composition was specified (now moving back to the genome directory for in silico conversion).