Draft: Fix memory problem with alt hla #87

Open
wants to merge 2 commits into
base: ReleaseBranch_1.2.73-2
Binary file modified AlignmentAndQCWorkflows.jar
Binary file not shown.
20 changes: 20 additions & 0 deletions AlignmentAndQCWorkflows_1.2.73-205.iml
@@ -0,0 +1,20 @@
<?xml version="1.0" encoding="UTF-8"?>
<module type="JAVA_MODULE" version="4">
<component name="NewModuleRootManager" inherit-compiler-output="true">
<exclude-output />
<content url="file://$MODULE_DIR$">
<sourceFolder url="file://$MODULE_DIR$/docs" type="java-resource" />
<sourceFolder url="file://$MODULE_DIR$/resources/analysisTools" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/resources/configurationFiles" type="java-resource" />
<sourceFolder url="file://$MODULE_DIR$/resources/tests" isTestSource="true" />
<sourceFolder url="file://$MODULE_DIR$/src" isTestSource="false" />
</content>
<orderEntry type="inheritedJdk" />
<orderEntry type="sourceFolder" forTests="false" />
<orderEntry type="module" module-name="Roddy.main" />
<orderEntry type="library" name="Gradle: org.codehaus.groovy:groovy-all:2.4.21" level="project" />
<orderEntry type="module" module-name="BatchEuphoria.main" />
<orderEntry type="module" module-name="RoddyToolLib.main" />
<orderEntry type="module" module-name="COWorkflowsBasePlugin_1.4.2" />
</component>
</module>
36 changes: 0 additions & 36 deletions AlignmentAndQCWorkflows_1.2.73.iml

This file was deleted.

11 changes: 11 additions & 0 deletions README.md
@@ -16,6 +16,17 @@ The original script with a documentation of the underlying ideas can be found [h

## Change Logs

+ * 1.2.73-205 (branch-specific change)
+ - minor: Added `bwaPostAltJsK8Options` to allow passing options to the K8 interpreter that runs `bwa-postalt.js`.
+ - minor: Added `SAMPESORT_MEMSIZE` to allow reducing memory for SAM sorting. Default: 2000000000 bytes (about 2 GB).
+ - minor: Upgrade from COWorkflows 1.2.76 to COWorkflowsBasePlugin 1.4.2.
+ - minor: Old `FLAG_USE_EXISTING_PAIRED_BAMS` was renamed to `FLAG_USE_ONLY_EXISTING_PAIRED_BAMS`.
+ - minor: Removed `FLAG_RUN_SLIM_WORKFLOW`. The corresponding code has been unused for years.
+ - patch: Minor refactorings and groovification of old Java code.
+
+ * 1.2.73-204 (branch-specific change)
+ - minor: Separate BWA from BWAKIT version. Default `BWAKIT_VERSION` to `BWA_VERSION`. Independently set `K8_VERSION` (default 0.2.5). Changed associated module-loading code in environment setup file `tbi-lsf-cluster.sh`.

* 1.2.73-203 (branch-specific change)
- minor: Optional ALT-chromosome processing via bwa.kit's `bwa-postalt.js`.
* Set `runBwaPostAltJs=true` to activate the ALT chromosome processing. Default: `false`.
4 changes: 2 additions & 2 deletions buildinfo.txt
@@ -1,4 +1,4 @@
- dependson=COWorkflows:1.2.76
+ dependson=COWorkflowsBasePlugin:1.4.2
JDKVersion=1.8
GroovyVersion=2.4
- RoddyAPIVersion=3.0
+ RoddyAPIVersion=3.7
16 changes: 8 additions & 8 deletions resources/analysisTools/qcPipeline/flags_isizes_PEaberrations.pl
@@ -119,13 +119,13 @@
if ($_ =~ /^\@/) # there might be a SAM header
{next;}
$all++;
- @help = split ("\t", $_);
- $flag = $help[1];
+ my ($qname, $flag, $rname, $pos, $mapq, $cigar, $rnext, $pnext, $tlen, $seq, $qual) =
+ split ("\t", $_);
# not unmapped, no duplicate and no secondary/supplementary alignment, and mapqual >= X
if (!($flag & 4) && !($flag & 1024) && !($flag & 256) && !($flag & 2048))
{
$uniq++;
- if ($help[4] >= $minmapq)
+ if ($mapq >= $minmapq)
{
# of these, with mapqual >= X
$minmapuniq++;
@@ -135,30 +135,30 @@
next;
}
# is the read itself on a wanted chromosome?
- if (defined $chroms{$help[2]})
+ if (defined $chroms{$rname})
{
$onchr++;
# and the mate on a wanted chromosome
- if (defined $chroms{$help[6]} || $help[6] eq "=") # same chrom is usually indicated by "=" instead of repeating the name
+ if (defined $chroms{$rnext} || $rnext eq "=") # same chrom is usually indicated by "=" instead of repeating the name
{
$both++;
# paired end aberration: mate also has to be mapped, on a different chrom
- if (!($flag & 8) && $help[6] ne "=" && ($help[2] ne $help[6]))
+ if (!($flag & 8) && $rnext ne "=" && ($rname ne $rnext))
# keep matrix symmetrical to see whether there is a bias, e.g. more 1->10 than 10->1
{
$aberrant++;
# only use read1 info, since read2 might have mapq 0, and the info of having bias w.r.t. which read
# is more interesting
if ($flag & 64)
{
- $chrompairs{$help[2]}{$help[6]}++;
+ $chrompairs{$rname}{$rnext}++;
}
}
# for insert sizes, take first read of a proper pair (-f 67 = 64 (first in pair) + 2 (proper pair) + 1 (paired));
# discarding duplicates (-F 1024) is already done further up
if ($flag & 64 && $flag & 2 && $flag & 1)
{
- $entry = abs($help[8]); # insert size
+ $entry = abs($tlen); # insert size
if ($entry < $min)
{
$min = $entry;
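
Review note: the hunks above replace positional `$help[...]` lookups with named SAM fields and keep the bitwise flag tests unchanged. The same filter logic can be sketched in shell as follows. This is not part of the PR; the function names `is_countable` and `is_first_of_proper_pair` and the sample record are made up for illustration.

```shell
#!/usr/bin/env bash
# Illustrative only: mirrors the flag tests in flags_isizes_PEaberrations.pl.

# Succeeds if the read is mapped (4 unset) and is not a duplicate (1024),
# secondary (256), or supplementary (2048) alignment.
is_countable() {
    local flag="$1"
    (( (flag & 4) == 0 && (flag & 1024) == 0 && (flag & 256) == 0 && (flag & 2048) == 0 ))
}

# Succeeds if the read is paired (1), in a proper pair (2), and first in pair (64).
is_first_of_proper_pair() {
    local flag="$1"
    (( (flag & 64) != 0 && (flag & 2) != 0 && (flag & 1) != 0 ))
}

# Split one tab-separated SAM record into the eleven mandatory fields.
# The mandatory fields are never empty, so splitting on tabs with read is safe here.
sam_line=$'read1\t99\tchr1\t100\t60\t4M\t=\t300\t250\tACGT\tFFFF'   # made-up record
IFS=$'\t' read -r qname flag rname pos mapq cigar rnext pnext tlen seq qual <<< "$sam_line"

if is_countable "$flag" && is_first_of_proper_pair "$flag"; then
    echo "insert size: ${tlen#-}"   # strip a leading minus sign, like abs()
fi
```

Naming the fields once at the split, as the PR does in Perl, keeps each later test readable without having to remember which column index means what.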
3 changes: 2 additions & 1 deletion resources/analysisTools/qcPipeline/workflowLib.sh
@@ -272,11 +272,12 @@ ALT_FILE="${ALT_FILE:-$INDEX_PREFIX.alt}"
# By default assume that bwa-postalt.js and k8 are located besides bwa (like in bwakit).
bwaPostAltJsPath="${bwaPostAltJsPath:-"$(dirname "$(which bwa)")"/bwa-postalt.js}"
K8_BINARY="${K8_BINARY:-"$(dirname "$(which bwa)")"/k8}"
+ declare -a bwaPostAltJsK8Options="$bwaPostAltJsK8Options"
optionalBwaPostAltJs() {
local hlaPrefix="${1:-}"
local minPaRatio="${2:-}"
if [[ "$runBwaPostAltJs" == "true" ]]; then
- $K8_BINARY "$bwaPostAltJsPath" ${hlaPrefix:+-p "$hlaPrefix"} ${minPaRatio:+-r "$minPaRatio"} "$ALT_FILE"
+ $K8_BINARY ${bwaPostAltJsK8Options[@]} "$bwaPostAltJsPath" ${hlaPrefix:+-p "$hlaPrefix"} ${minPaRatio:+-r "$minPaRatio"} "$ALT_FILE"
else
cat -
fi
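
Review note: `bwaPostAltJsK8Options` is expanded before the script path, so an empty value should contribute no arguments and a multi-word value should become several. A minimal sketch of that behaviour follows. It builds the array explicitly with `read -r -a`, which is an alternative to the `declare -a` line in the diff and not the PR's code; `count_args`, the option strings, and the file names are hypothetical placeholders, not real k8 flags.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the k8 call: reports how many arguments it received.
count_args() { echo "$#"; }

# A whitespace-separated option string becomes one array element per word.
opts_string="--placeholder-a --placeholder-b=1"
read -r -a k8_opts <<< "$opts_string"
with_opts=$(count_args "${k8_opts[@]}" script.js input.alt)

# An empty option string yields an empty array, which expands to no arguments at all.
read -r -a no_opts <<< ""
without_opts=$(count_args "${no_opts[@]}" script.js input.alt)

echo "with options: $with_opts, without: $without_opts"
```

With a quoted `"${array[@]}"` expansion each option stays a single word, which matters once an option value contains spaces or glob characters.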
5 changes: 5 additions & 0 deletions resources/configurationFiles/analysisQC.xml
@@ -36,6 +36,8 @@
description="The bwakit module contains the bwa-postalt.js script. Used if runBwaPostAltJs=true"/>
<cvalue name="K8_VERSION" value="0.2.5" type="string"
description="Used for bwa-postalt.js if runBwaPostAltJs=true"/>
+ <cvalue name="bwaPostAltJsK8Options" value="" type="string"
+ description="List of parameters for the K8 JS-interpreter used for bwa-postalt.js"/>

<cvalue name="workflowEnvironmentScript" value="workflowEnvironment_tbiLsf" type="string"/>

@@ -81,6 +83,9 @@
<cvalue name='markDuplicatesVariant' value='' type="string"
description="Allowed values: biobambam, picard, sambamba. Default: empty. If set, this option takes precedence over the older useBioBamBamMarkDuplicates option."/>

+ <cvalue name="SAMPESORT_MEMSIZE" value="2000000000" type="string"
+ description="Memory for sorting the SAM stream. Format depends on sorter. E.g. the default could be written '2G' for samtools sort."/>

<cvalue name='SAMBAMBA_MARKDUP_OPTS' value='"-t 1 -l 0 --hash-table-size=2000000 --overflow-list-size=1000000 --io-buffer-size=64"'
description="Please use -l 0, the workflow unpacks the BAM directly with samtools. Compression is faster and more stable with samtools."/>
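
Review note: the default `SAMPESORT_MEMSIZE` above is a plain byte count, and its description says the accepted format depends on the sorter. A small sketch of the unit arithmetic is below; the function and variable names are illustrative, and how the workflow hands the value to the sorter is not shown in this diff.

```shell
#!/usr/bin/env bash
# Default taken from analysisQC.xml in this PR.
sampesort_memsize=2000000000

# Integer conversions from bytes to decimal megabytes and binary mebibytes.
bytes_to_mb()  { echo $(( $1 / 1000 / 1000 )); }
bytes_to_mib() { echo $(( $1 / 1024 / 1024 )); }

echo "$(bytes_to_mb "$sampesort_memsize") MB, $(bytes_to_mib "$sampesort_memsize") MiB"
# prints: 2000 MB, 1907 MiB
```

The default is therefore exactly 2 GB in decimal units and slightly below 2 GiB.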

2 changes: 1 addition & 1 deletion src/de/dkfz/b080/co/QualityControlWorkflowPlugin.java
@@ -9,7 +9,7 @@
public class QualityControlWorkflowPlugin extends BasePlugin {

public static final String CURRENT_VERSION_STRING = "1.2.73";
- public static final String CURRENT_VERSION_BUILD_DATE = "Wed Feb 21 15:47:18 CET 2018";
+ public static final String CURRENT_VERSION_BUILD_DATE = "Wed Apr 12 10:50:50 CEST 2023";

@Override
public String getVersionInfo() {
67 changes: 52 additions & 15 deletions src/de/dkfz/b080/co/common/AlignmentAndQCConfig.groovy
@@ -20,11 +20,19 @@ class AlignmentAndQCConfig extends COConfig {
public static final String CVALUE_RUN_FINGERPRINTING = "runFingerprinting"
public static final String CVALUE_FINGERPRINTING_SITES_FILE="fingerprintingSitesFile"

- public AlignmentAndQCConfig(ExecutionContext context) {
+ AlignmentAndQCConfig(ExecutionContext context) {
super(context)
}

- public String getSingleBamParameter() {
+ void setUseOnlyExistingTargetBam(boolean value = true) {
+ setConfig("useOnlyExistingTargetBam", value.toString(), "boolean")
+ }
+
+ void setExtractSamplesFromOutputFiles(boolean value = true) {
+ setConfig("extractSamplesFromOutputFiles", value.toString(), "boolean")
+ }
+
+ String getSingleBamParameter() {
return configValues.get("bam", "");
}

@@ -33,52 +41,81 @@ class AlignmentAndQCConfig extends COConfig {
}

boolean getUseExistingLaneBams() {
- return configValues.getBoolean(COConstants.FLAG_USE_EXISTING_PAIRED_BAMS, false)
+ return configValues.getBoolean("useExistingLaneBams", false)
}

- public String getIndexPrefix() {
+ boolean getUseOnlyExistingPairedBams() {
+ return configValues.getBoolean("useExistingPairedBams", false)
+ }
+
+ String getIndexPrefix() {
return configValues.getString(CVALUE_INDEX_PREFIX, "")
}

- public File getChromosomeSizesFile() {
+ File getChromosomeSizesFile() {
return new File (configValues.getString(CVALUE_CHROMOSOME_SIZES_FILE, ""))
}

- public File getTargetRegionsFile() {
+ File getTargetRegionsFile() {
return new File (configValues.getString(CVALUE_TARGET_REGIONS_FILE, ""))
}

- public Integer getTargetSize() {
+ Integer getTargetSize() {
Integer returnValue = configValues.getString(CVALUE_TARGET_SIZE, null) as Integer
if (null == returnValue) {
returnValue = configValues.getString(CVALUE_TARGETSIZE, null) as Integer
}
return returnValue
}

- public boolean getRunExomeAnalysis() {
- return configValues.getBoolean(COConstants.FLAG_RUN_EXOME_ANALYSIS)
+ boolean getRunExomeAnalysis() {
+ return configValues.getBoolean("runExomeAnalysis")
}

- public File getCytosinePositionIndex() {
+ File getCytosinePositionIndex() {
return new File(configValues.getString(CVALUE_CYTOSINE_POSITIONS_INDEX))
}

- public File getClipIndex() {
+ File getClipIndex() {
return new File(configValues.getString(CVALUE_CLIP_INDEX))
}

- public boolean getRunFingerprinting() {
+ boolean getRunFingerprinting() {
return configValues.getBoolean(CVALUE_RUN_FINGERPRINTING, true)
}

- public File getFingerprintingSitesFile() {
+ File getFingerprintingSitesFile() {
return new File(configValues.getString(CVALUE_FINGERPRINTING_SITES_FILE))
}

- public Boolean getUseOnlyExistingPairedBams() {
- return configValues.getBoolean(COConstants.FLAG_USE_EXISTING_PAIRED_BAMS, false);
+ boolean getRunFastqcOnly() {
+ return configValues.getBoolean("runFastQCOnly", false)
}

+ boolean getRunFastqc() {
+ return configValues.getBoolean("runFastQC", true)
+ }
+
+ boolean getRunAlignmentOnly() {
+ return configValues.getBoolean("runAlignmentOnly", false)
+ }
+
+ boolean getRunCoveragePlots() {
+ return configValues.getBoolean("runCoveragePlots", true)
+ }
+
+ boolean getRunCollectBamFileMetrics() {
+ return configValues.getBoolean("runCollectBamFileMetrics", false)
+ }
+
+ @Deprecated
+ boolean getUseCombinedAlignAndSampe() {
+ return true
+ }
+
+ @Deprecated
+ boolean getRunSlimWorkflow() {
+ return true
+ }

}
38 changes: 28 additions & 10 deletions src/de/dkfz/b080/co/common/COProjectsRuntimeService.groovy
@@ -12,14 +12,13 @@ import de.dkfz.roddy.execution.io.ExecutionResult;
import de.dkfz.roddy.execution.io.ExecutionService;
import de.dkfz.roddy.execution.io.fs.FileSystemAccessProvider
import de.dkfz.roddy.tools.LoggerWrapper
+ import groovy.transform.CompileStatic

import java.util.function.Consumer;

/**
* Created by heinold on 15.01.16.
*/
- @groovy.transform.CompileStatic
- public class COProjectsRuntimeService extends BasicCOProjectsRuntimeService {
-
+ @CompileStatic
+ class COProjectsRuntimeService extends BasicCOProjectsRuntimeService {
private static LoggerWrapper logger = LoggerWrapper.getLogger(BasicCOProjectsRuntimeService.class.getName());

protected static void getFileCompression(ExecutionContext run, List<LaneFile> allLaneFiles) {
@@ -174,6 +173,14 @@ public class COProjectsRuntimeService extends BasicCOProjectsRuntimeService {
return laneFiles
}

+ protected static int indexOfPathElement(String pathnamePattern, String element) {
+ int index = pathnamePattern.split(StringConstants.SPLIT_SLASH).findIndexOf { it -> it == element }
+ if (index < 0) {
+ throw new RuntimeException("Couldn't match '${element}' in '${pathnamePattern}'")
+ }
+ return index
+ }
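
Review note: `indexOfPathElement` splits a path pattern on `/` and returns the position of one element, failing loudly when the element is absent. The same idea as a shell sketch, for illustration only: the function name `index_of_path_element` and the pattern below are invented, and the Groovy code above remains the reference.

```shell
#!/usr/bin/env bash
# Print the zero-based index of ELEMENT among the '/'-separated parts of PATTERN.
# A leading '/' produces an empty first part, so it counts as index 0.
index_of_path_element() {
    local pattern="$1" element="$2"
    local -a parts
    local i
    IFS='/' read -r -a parts <<< "$pattern"
    for i in "${!parts[@]}"; do
        if [[ "${parts[$i]}" == "$element" ]]; then
            echo "$i"
            return 0
        fi
    done
    echo "Couldn't match '$element' in '$pattern'" >&2
    return 1
}

# Single quotes keep the placeholders literal instead of expanding them.
index_of_path_element '/data/${pid}/${sample}/fastq' '${sample}'   # prints 3
```

Returning a non-zero status on a miss plays the role of the `RuntimeException` in the Groovy version: a pattern that lacks the expected placeholder is treated as a configuration error rather than silently yielding `-1`.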

public List<LaneFileGroup> getLaneFileGroupsFromFastqList(ExecutionContext context, Sample sample, String libraryID) {
COConfig coConfig = new COConfig(context)
List<File> fastqFiles = coConfig.getFastqList().collect { String it -> new File(it); }
@@ -238,11 +245,22 @@ public class COProjectsRuntimeService extends BasicCOProjectsRuntimeService {
runIndex = 2;
sampleName = split[0..1].join(StringConstants.UNDERSCORE);
}
- String run = split[runIndex..-2].join(StringConstants.UNDERSCORE);
- String lane = String.format("L%03d", laneID);
-
-
- BamFile bamFile = COBaseFile.constructSourceFile(BamFile, f, context, new COFileStageSettings(lane, run, sample, context.getDataSet())) as BamFile
+ RunID run = new RunID(split[runIndex..-2].join(StringConstants.UNDERSCORE))
+ LaneID lane = new LaneID(String.format("L%03d", laneID))
+
+
+ BamFile bamFile =
+ COBaseFile.constructSourceFile(
+ BamFile,
+ f,
+ context,
+ new COFileStageSettings(
+ lane,
+ run,
+ (LibraryID) null,
+ sample,
+ context.dataSet)
+ ) as BamFile
return bamFile;
})
BamFileGroup bamFileGroup = new BamFileGroup(bamFiles);