--bwa_index: string [database/indexes/GRCm38/BWA/genome.bwt] does not match pattern ^\S+\.\{amb,ann,bwt,pac,sa\}$ (database/indexes/GRCm38/BWA/genome.bwt) #68

Open
alexmascension opened this issue Jan 31, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@alexmascension

Description of the bug

Hi! I'm using nf-core/circdna with the following configuration:

nextflow run nf-core/circdna \
-r 1.0.4 \
-profile docker \
-resume \
--max_cpus 9 \
--max_memory 21.GB \
--max_time 500.h \
--circle_identifier circle_map_realign,circle_map_repeats,circle_finder,circexplorer2,ampliconarchitect \
--input work/test_mouse/samplesheets/CIRCDNA.csv \
--outdir results/test_mouse/CIRCDNA \
--genome GRCm38 \
--bwa_index database/indexes/GRCm38/BWA/genome.bwt \
--reference_build mm10 \
--mosek_license_dir src/others \
--fasta database/genomes/GRCm38/genome.fasta \
--aa_data_repo database/indexes/GRCm38/aa_data_repo

And it throws this error:

ERROR ~ ERROR: Validation of pipeline parameters failed!

 -- Check '.nextflow.log' file for details
ERROR ~ * --bwa_index: string [database/indexes/GRCm38/BWA/genome.bwt] does not match pattern ^\S+\.\{amb,ann,bwt,pac,sa\}$ (database/indexes/GRCm38/BWA/genome.bwt)

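After some searching, I think the regex pattern is wrong: the escaped braces \{ and \} match literal brace characters, so an ordinary path like genome.bwt cannot satisfy it. The intended pattern is probably ^\S+\.(amb|ann|bwt|pac|sa)$ (at least according to ChatGPT, and I tested it on https://regex101.com/).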
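To make the difference between the two patterns concrete, here is a minimal sketch using Python's `re` module as a stand-in. It assumes the pipeline's validator treats escaped braces as literal characters the way Python does; the variable names are illustrative only and are not taken from the pipeline.

```python
import re

# Pattern copied from the error message: the escaped braces match literal "{" and "}".
reported = re.compile(r"^\S+\.\{amb,ann,bwt,pac,sa\}$")
# Pattern proposed in this issue: alternation over the five index extensions.
proposed = re.compile(r"^\S+\.(amb|ann|bwt|pac|sa)$")

path = "database/indexes/GRCm38/BWA/genome.bwt"

reported_matches_path = reported.match(path) is not None
proposed_matches_path = proposed.match(path) is not None
# The reported pattern only accepts a name that literally ends in the brace text.
reported_matches_literal = reported.match("genome.{amb,ann,bwt,pac,sa}") is not None

print(reported_matches_path, proposed_matches_path, reported_matches_literal)
```

Under that assumption, the reported pattern rejects the real path and accepts only the odd literal-brace name, while the proposed pattern accepts the real path.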

Command used and terminal output

No response

Relevant files

No response

System information

No response

@alexmascension alexmascension added the bug Something isn't working label Jan 31, 2024
@DSchreyer
Collaborator

I'll have a look. In the meantime, could you run it without the --bwa_index parameter? The index will be generated in the pipeline either way. I am considering removing the parameter altogether to simplify things, but this is still open to debate.

@DSchreyer
Collaborator

Hey, I just fixed your bug in a way that should improve the user experience in the future. Now --bwa_index only accepts directory paths. You need to pass the directory that contains all the BWA index files.

Is this acceptable for your use case?
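Before passing a directory to --bwa_index, it can help to confirm that it really holds one file per index extension. The sketch below is only an illustration: the helper name is hypothetical, the extension list is taken from the pattern in the original error message, and the temporary directory stands in for a real index directory.

```python
from pathlib import Path
import tempfile

# Extensions listed in the validation pattern from the error message.
BWA_INDEX_EXTENSIONS = ("amb", "ann", "bwt", "pac", "sa")


def missing_bwa_index_extensions(index_dir):
    """Return the extensions for which index_dir contains no matching file."""
    directory = Path(index_dir)
    return [ext for ext in BWA_INDEX_EXTENSIONS if not any(directory.glob(f"*.{ext}"))]


# Demonstration on a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    for ext in ("amb", "ann", "bwt"):
        (Path(tmp) / f"genome.{ext}").touch()
    incomplete = missing_bwa_index_extensions(tmp)

    for ext in ("pac", "sa"):
        (Path(tmp) / f"genome.{ext}").touch()
    complete = missing_bwa_index_extensions(tmp)

print(incomplete)
print(complete)
```

An empty list means every expected extension is present. It does not check that the files share one prefix or that they are valid index files.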

@alexmascension
Author

Hi! Yes! It works fine. Thanks!

@alexmascension
Author

Hi! I've run the pipeline a second time, on the dev branch, with this config:

nextflow run nf-core/circdna \
-r dev \
-profile docker \
-resume \
--max_cpus 9 \
--max_memory 53.GB \
--max_time 500.h \
--circle_identifier circle_map_realign,circle_map_repeats,circle_finder,circexplorer2,ampliconarchitect \
--input work/test_human/samplesheets/CIRCDNA.csv \
--outdir results/test_human/CIRCDNA \
--genome GRCh38 \
--bwa_index database/indexes/GRCh38/BWA \
--input_format FASTQ \
--reference_build GRCh38 \
--mosek_license_dir src/others \
--fasta database/genomes/GRCh38/genome.fasta \
--aa_data_repo $(pwd)/database/indexes/GRCh38/aa_data_repo

And it fails with this error:

ERROR ~ Error executing process > 'NFCORE_CIRCDNA:CIRCDNA:BWA_MEM (CDNA_2)'

Caused by:
  Process `NFCORE_CIRCDNA:CIRCDNA:BWA_MEM (CDNA_2)` terminated with an error exit status (1)

Command executed:

  INDEX=`find -L ./ -name "*.amb" | sed 's/\.amb$//'`
  
  bwa mem \
       \
      -t 9 \
      $INDEX \
      CDNA_2.trimmed_1_val_1.fq.gz CDNA_2.trimmed_2_val_2.fq.gz \
      | samtools sort  --threads 9 -o CDNA_2.bam -
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CIRCDNA:CIRCDNA:BWA_MEM":
      bwa: $(echo $(bwa 2>&1) | sed 's/^.*Version: //; s/Contact:.*$//')
      samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  [E::bwa_idx_load_from_disk] fail to locate the index files
  samtools sort: failed to read header from "-"

Work dir:
  /data/Proyectos/NGS_pipeline/work/62/016f2b2f4d03ae096654b3b1b7598f

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details

It also failed with --bwa_index database/indexes/GRCh38/BWA/genome; the index files are named genome.bwt, etc.
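The `INDEX=` line in the failing command searches below the process work directory for a `*.amb` file and strips the extension to obtain the index prefix. A rough Python rendering of that lookup (the function name is made up for illustration) shows that the result is empty when no `.amb` file has been staged. That would be consistent with the `fail to locate the index files` message, though it is not proof of the cause.

```python
import os
import tempfile


def find_index_prefixes(root):
    # Rough equivalent of: find -L root -name "*.amb" | sed 's/\.amb$//'
    prefixes = []
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        for name in filenames:
            if name.endswith(".amb"):
                prefixes.append(os.path.join(dirpath, name[: -len(".amb")]))
    return sorted(prefixes)


# Throwaway directory standing in for the process work dir.
with tempfile.TemporaryDirectory() as work_dir:
    # Nothing staged: the lookup returns nothing, so $INDEX would be empty.
    empty_result = find_index_prefixes(work_dir)

    # Stage a hypothetical index file in a subdirectory and search again.
    os.makedirs(os.path.join(work_dir, "BWA"))
    open(os.path.join(work_dir, "BWA", "genome.amb"), "w").close()
    staged_result = [os.path.relpath(p, work_dir) for p in find_index_prefixes(work_dir)]

print(empty_result)
print(staged_result)
```

Listing the work directory shown in the error (for example with `ls -la`) would show whether the index files were actually staged there.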

@alexmascension alexmascension reopened this Feb 7, 2024