Cromwell workflow engine support #825
Hi @tom-dyar, yes, that's exactly what we do too. It seems that paths for sub-workflows often have shifting interpretations, and I think the DNAnexus parser has changed its behavior here at one point as well. For now, I think the …
Great, thanks! I am hoping that's all there is regarding compatibility, so it's great that you have it on your radar.
@dpark01 -- I am now trying to get this running on Google Compute Engine from Cromwell -- do you have configuration files (machine requirements and reference file locations) for Google Cloud, similar to those dx-**.json files in pipes/WDL? I bumped up the local disk to 2TB due to a large run I tried, but Kraken still never finished after 24 hours, probably due to a RAM issue with my NextSeq 500 run... Thanks,
@tom-dyar here is a json config file that we use (though see #843 for some caveats about it). The machine requirement specs should be fully derivable from the WDL task runtimes (in fact they were primarily designed with GCP instances in mind, with the …). As for default databases, I don't have them all linked in properly, and they might not be the latest versions, but see …
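A minimal sketch of how a run like this can be launched against Google Cloud, assuming you have a Google/PAPI backend configuration file and a workflow inputs JSON of the kind discussed above (google-papi.conf and gcp-inputs.json are placeholder names, not files shipped with the repo, and the WDL path should be adjusted to your checkout):

```bash
# Launch Cromwell (v34 here) with a Google-backend config and a workflow inputs JSON.
# File names are placeholders for files you provide yourself.
java -Dconfig.file=google-papi.conf \
     -jar cromwell-34.jar run \
     pipes/WDL/workflows/demux_plus.wdl \
     -i gcp-inputs.json
```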
OK, not sure if I should submit a new ticket or not... SamToFastq is "hanging" when running demux_plus.wdl, so kraken.py never completes. I have a couple of 5-7 GB BAM files, so it is failing on one of them. I am using the new Google Pipelines API version v2alpha1 and Cromwell version 34. I have bumped up the disk sizes, so I have two local disks of 500GB each and the boot disk is 100GB. Below is my configuration file. I wonder how I should debug this, since there is no output in the log files; perhaps there is a picard VERBOSITY option I could set, but it seems I would have to update the container to put that in. Thanks for any help!
[configuration file omitted]
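On the VERBOSITY idea: Picard tools do accept a VERBOSITY argument, so one option is to rerun the conversion by hand inside the container with debug-level logging rather than rebuilding the image. A sketch, assuming the bioconda `picard` wrapper is on PATH in the container and with all paths as placeholders:

```bash
# Rerun the conversion step by hand with debug logging; all paths are placeholders.
# Within the pipeline this step is driven by the viral-ngs Python wrappers, so this
# is only a manual approximation of what the task does.
picard SamToFastq \
    INPUT=/data/flowcell_sample.bam \
    FASTQ=/data/reads_1.fastq \
    SECOND_END_FASTQ=/data/reads_2.fastq \
    VALIDATION_STRINGENCY=SILENT \
    VERBOSITY=DEBUG
```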
Hi Tom, interesting... you should at least be able to deduce from the stdout/stderr log files (that Cromwell normally produces) for the kraken task which bam file it was processing at the time. And given that you have the input bam files, perhaps you could try reproducing the effect by spinning up a GCE VM manually, pulling the docker image, and running it interactively (…). If it's reproducible and your data isn't sensitive, we'd be happy to look at an example bam file.
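A rough sketch of that manual reproduction, where the machine type, disk size, image name, and paths are all assumptions to adjust to your own workflow:

```bash
# Spin up a scratch VM roughly matching the task's resources (names/sizes are assumptions).
gcloud compute instances create kraken-debug \
    --machine-type=n1-highmem-16 \
    --boot-disk-size=500GB \
    --image-family=cos-stable --image-project=cos-cloud

gcloud compute ssh kraken-debug

# On the VM (Container-Optimized OS ships with Docker): pull the pipeline image used by
# the workflow. The image name/tag here is an assumption; use the one from your WDL runtime.
docker pull quay.io/broadinstitute/viral-ngs
docker run -it --rm -v /var/tmp:/data quay.io/broadinstitute/viral-ngs bash

# Inside the container: copy the suspect bam from the failed call's inputs into /data
# (e.g. with gsutil, if available), rerun the hanging step by hand, and watch memory
# and disk with `top` / `df -h` in a second shell.
```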
Thanks @dpark01 -- good tips, and I will try to reproduce. Nothing particularly sensitive; here is the path to my logs (I tried to make my buckets publicly readable): gs://atvir-cromwell/cromwell-execution/demux_plus/2021156e-a3c3-45b1-9eb3-9171f70595f4/call-kraken
The current WDL files were developed with DNAnexus in mind. I am trying to modify and use them in Cromwell, and it seems there are differences in the way paths for sub-workflows are handled compared to DNAnexus. I got it to "work" by putting all the tasks and workflows into a single directory.
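One alternative to flattening everything into a single directory is to keep the repo layout and hand Cromwell a zip of the task/sub-workflow WDLs via its imports option. A sketch, assuming the pipes/WDL layout referenced in this thread:

```bash
# Bundle the task WDLs so Cromwell can resolve the workflow's imports.
# Paths are assumptions based on the pipes/WDL layout; depending on how the
# import statements are written, you may or may not want -j (which strips
# directory prefixes inside the zip).
zip -j imports.zip pipes/WDL/workflows/tasks/*.wdl

java -jar cromwell-34.jar run \
    pipes/WDL/workflows/demux_plus.wdl \
    -i inputs.json \
    --imports imports.zip
```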