[Do not merge!] Pseudo PR for first release #69

jfy133 · 2023-01-23T07:04:56Z

Do not merge! This is a PR of dev compared to first release for whole-pipeline reviewing purposes.

Comments can be copied over to #66 for safety if necessary and changes made there. Changes should be made to dev and this PR should not be merged into first-commit-for-pseudo-pr!

jfy133

Overall looks pretty good, main things are:

Add license to ALL scripts in bin/ (particularly for the package)
Make sure version numbers are defined for all tools in the conda declaration of local modules with mulled containers, and that these are exported.
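
A quick way to audit the first point could be a small script like the one below. It is only a sketch: the marker string (`SPDX-License-Identifier`), the number of lines inspected and the function name `scripts_missing_license` are assumptions for illustration, not something the pipeline or nf-core tooling defines.

```python
from pathlib import Path


def scripts_missing_license(bin_dir, marker="SPDX-License-Identifier", head_lines=20):
    """Return sorted names of files in bin_dir whose first head_lines lines lack marker."""
    missing = []
    for path in sorted(Path(bin_dir).iterdir()):
        if not path.is_file():
            continue
        # Only inspect the top of the file, where a license header would live.
        with path.open(encoding="utf-8", errors="replace") as handle:
            head = [line for _, line in zip(range(head_lines), handle)]
        if not any(marker in line for line in head):
            missing.append(path.name)
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the repository root; does nothing otherwise.
    if Path("bin").is_dir():
        for name in scripts_missing_license("bin"):
            print(f"missing license header: {name}")
```

Swap the marker for whatever header text is agreed on (for example a line naming the license) before relying on the result.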

.github/workflows/ci.yml

CHANGELOG.md

CITATIONS.md

subworkflows/local/prepare_genome.nf

workflows/circrna.nf

jfy133 · 2023-01-23T10:55:38Z

workflows/circrna.nf

+    ch_multiqc_files = ch_multiqc_files.mix(ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml'))
+    ch_multiqc_files = ch_multiqc_files.mix(ch_methods_description.collectFile(name: 'methods_description_mqc.yaml'))
+    ch_multiqc_files = ch_multiqc_files.mix(CUSTOM_DUMPSOFTWAREVERSIONS.out.mqc_yml.collect())
+    ch_multiqc_files = ch_multiqc_files.mix(FASTQC_TRIMGALORE.out.fastqc_zip.collect{it[1]}.ifEmpty([]))


Is it correct that only FastQC goes into MultiQC?

I'll look into generating BAM stats

FriederikeHanssen · 2023-01-27T15:38:45Z

README.md


-**workflow for the quantification, differential expression analysis and miRNA target prediction analysis of circRNAs in RNA-Seq data**.
+[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/circrna/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX)


@jfy133 @apeltzer can we figure this out before the release? Otherwise, another Zenodo ID is coming ;) Barry, the docs for this were recently updated. Shout if you need help to add the ID after the release. Someone from core (Alex) needs to generate this beforehand for you.

FriederikeHanssen · 2023-01-27T15:39:42Z

README.md

+4. circRNA annotation
+5. Export mature spliced length as FASTA file
+6. Annotate parent gene, underlying transcripts.
+7. circRNA count matrix


Are these the custom R scripts? Otherwise maybe also link the tools here in a similar fashion to above

Nope, vanilla python and bash.

FriederikeHanssen · 2023-01-27T15:43:50Z

assets/methods_description_template.yml

+section_name: "nf-core/circrna Methods Description"
+section_href: "https://github.com/nf-core/circrna"
+plot_type: "html"
+## TODO nf-core: Update the HTML below to your prefered methods description, e.g. add publication citation for this pipeline


you might want to add your paper here

I'll give it a go! I went to the Sarek one to copy but... ;)

bin/check_samplesheet.py

bin/circRNA_counts_matrix.py

bin/circ_test.R

FriederikeHanssen · 2023-01-27T15:54:54Z

bin/ensembl_database_map.txt

@@ -0,0 +1,18 @@
+species	command
+cel	useMart(biomart = "ensembl", dataset = "celegans_gene_ensembl", host="https://www.ensembl.org", archive=FALSE)


This fetches the newest version every time, right? I am a bit worried about reproducibility and compatibility in the long run. Would there be a way to cache it?

Think I'll deprecate this code, it's only going to cause trouble

bin/targetscan_format.sh

conf/modules.config

FriederikeHanssen · 2023-01-27T16:01:43Z

conf/modules.config

+    }
+}
+
+    // PREPARE GENOME


I suspect you will get WARN messages if the indices already exist

I'll test this. I have a hard time resolving WARN messages

FriederikeHanssen · 2023-01-27T16:02:29Z

conf/modules.config

+
+    // PREPARE GENOME
+    withName: BOWTIE_BUILD {
+        ext.when = { params.fasta && !params.bowtie && params.tool.split(',').contains('mapsplice') && params.module.split(',').contains('circrna_discovery') }


what happens if params.tool doesn't exist? Or does it always have to have a value?

similar for params.module

must have a value, a la --tool 'mutect2', --step 'mapping' ..
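
For context on the question above: in Groovy, calling `.split(',')` on a null parameter throws, so the `ext.when` condition only works as long as both parameters carry a value. If that ever changes, the check would need a guard (in Groovy, the safe-navigation operator `?.` is one way to write it). The Python sketch below only illustrates the logic of such a guard; `is_selected` is a hypothetical helper, not pipeline code, and unlike the original condition it also trims whitespace around each entry.

```python
def is_selected(param_value, name):
    """Return True if name appears in a comma-separated parameter value.

    A missing (None) or empty value selects nothing instead of raising.
    """
    if not param_value:
        return False
    return name in (item.strip() for item in param_value.split(","))


# Mirrors the tool/module part of the BOWTIE_BUILD condition
# (the fasta and bowtie checks are left out of this sketch).
run_bowtie_build = (
    is_selected("ciriquant,mapsplice", "mapsplice")
    and is_selected("circrna_discovery", "circrna_discovery")
)
```

Matching whole entries rather than substrings means a value such as `mapsplice2` would not count as `mapsplice`.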

conf/test.config

FriederikeHanssen · 2023-01-27T16:09:37Z

conf/test_full.config

+    // Input data for full size test
+    // TODO nf-core: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
+    // TODO nf-core: Give any required params for the test so that command line flags are not needed
+    input = 'https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_full_illumina_amplicon.csv'


is this the right input?

nope, I was hoping for some help in getting my full test dataset up and running.

FriederikeHanssen · 2023-01-27T16:12:40Z

docs/output.md

@@ -1,63 +1,590 @@
 # nf-core/circrna: Output

-## :warning: Please read this documentation on the nf-core website: [https://nf-co.re/circrna/output](https://nf-co.re/circrna/output)


is this line removed in the template @jfy133 ?

docs/usage.md

modules/local/annotation/full_annotation/main.nf

modules/local/circexplorer2/filter/main.nf

jfy133 marked this pull request as draft January 23, 2023 07:05

jfy133 assigned FriederikeHanssen and jfy133 Jan 23, 2023
