SRA Explorer not returning results #19

Closed
tamuanand opened this issue May 9, 2020 · 4 comments

Comments

tamuanand commented May 9, 2020

Hi @ewels

https://sra-explorer.info/# is not returning any results (09-May-2020, 8:22 PM London time).

The EBI/ENA FTP site is probably down.

@tamuanand
Author

An update on the above: if you already know the ascp command line for a particular record, the Aspera download still seems to work.

ascp -QT -l 300m -P33001 -i <path>/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/ERR036/ERR036000/ERR036000_1.fastq.gz .

@ewels
Owner

ewels commented May 10, 2020

Tested ERR036000 just now and it seemed to work fine, so I guess that this was just a temporary glitch in the matrix..

Let me know if it keeps happening 🤞

@ewels ewels closed this as completed May 10, 2020
@tamuanand
Author

Thanks @ewels - yes, it was a temporary glitch.

I was using the fromSRA channel factory and I believe it is based off your code - https://www.nextflow.io/blog/2019/release-19.03.0-edge.html

One question and one suggestion:

  • question: Does SRA Explorer query NCBI or EBI to get the individual fastq runs? I believe NCBI, judging by the fromSRA error messages.

  • suggestion: fromSRA was returning error messages like "can't do nulls on uids". It would be nice to see a similar error reported on SRA Explorer when someone searches for an SRA ID (or anything else) and the underlying system (NCBI or EBI) has a glitch. In my case, I kept hitting submit with an ID and the bottom of the page never changed, so I was worried that something was wrong with my browser.

Again, just a suggestion.

Needless to say, it is a great great tool.

On a side note, I have suggested to Paolo/Evan that NF should develop a method to return ascp compatible urls when querying for SRA.

Right now I use fromSRA and then an ugly chain of perl regexes to build an Aspera-compatible download command, which I then pipe to bash. I would like to know your thoughts/ideas on the below:

echo "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR279/SRR279588/SRR279588_1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR279/SRR279588/SRR279588_2.fastq.gz" \
| perl -pe 's#.gz#.gz .#g' | perl -pe 's#.gz .#.gz . &&  #' \
| perl -pe 's#ftp://ftp.sra.ebi.ac.uk/vol#ascp -QT -l 300m -P33001 -i <path_to>/asperaweb_id_dsa.openssh  era-fasp\x40fasp.sra.ebi.ac.uk:vol#g'  > SRR279588.txt

cat SRR279588.txt | bash

What that ultimately translates to is this command on the shell

ascp -QT -l 300m -P33001 -i <path_to>/asperaweb_id_dsa.openssh  era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR279/SRR279588/SRR279588_1.fastq.gz .
&&
ascp -QT -l 300m -P33001 -i <path_to>/asperaweb_id_dsa.openssh  era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR279/SRR279588/SRR279588_2.fastq.gz .

Hence, it would be nice to have a new channel factory or a method to get aspera compatible urls with NF.
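
The perl chain above could probably be collapsed into one small shell function. This is only a sketch, assuming the ENA FTP prefix and the ascp options quoted in this comment; `ftp_to_ascp` and the `ASPERA_KEY` variable are hypothetical names made up for illustration and are not part of Nextflow or SRA Explorer.

```shell
# Hypothetical helper: read ENA FTP URLs (one per line) on stdin and
# print one ascp command per URL on stdout.
# ASPERA_KEY is an assumed variable holding the path to asperaweb_id_dsa.openssh.
ftp_to_ascp() {
  while IFS= read -r url || [ -n "$url" ]; do
    case "$url" in
      '')
        # Ignore blank lines.
        continue
        ;;
      ftp://ftp.sra.ebi.ac.uk/vol*)
        # Drop the FTP scheme and host, keeping the vol1/fastq/... path.
        remote_path=${url#ftp://ftp.sra.ebi.ac.uk/}
        printf 'ascp -QT -l 300m -P33001 -i %s era-fasp@fasp.sra.ebi.ac.uk:%s .\n' \
          "$ASPERA_KEY" "$remote_path"
        ;;
      *)
        # Report anything that is not an ENA FTP URL and skip it.
        printf 'skipping unrecognised URL: %s\n' "$url" >&2
        ;;
    esac
  done
}
```

Something like `ftp_to_ascp < urls.txt > SRR279588.txt` would then write one command per line. Running that file with `bash -e SRR279588.txt` stops at the first failed download, which covers what the `&&` was for.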

@ewels
Owner

ewels commented May 10, 2020

Yeah, I know I should catch errors. Kind of mentioned in #7 (comment) and it's been in the back of my mind for a while. It's a bit crap to just silently die when it hits unexpected errors.

This tool needs quite a lot of work at the moment though, as the SRA is totally replacing their infrastructure so all of the SRA links are stopping working. Unfortunately it's a fairly low priority project for me so it'll probably take me a while until I can find time to invest here.

Does SRA Explorer query NCBI or EBI to get the individual fastq runs?

It queries NCBI first to find the runs and get SRA accessions. Once it has these for individual runs, it queries the EBI to get the FastQ download paths.
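
As a rough illustration of that two-step lookup, the sketch below only builds the request URLs. The endpoints are assumptions (the public NCBI E-utilities `esearch` service and the ENA Portal API `filereport` service), not taken from SRA Explorer's source, and the function names are hypothetical.

```shell
# Assumed endpoints; SRA Explorer itself may use different ones.
EUTILS="https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
ENA_FILEREPORT="https://www.ebi.ac.uk/ena/portal/api/filereport"

# Step 1 (NCBI): URL for a search of the SRA database.
# The search returns record UIDs; resolving them to run accessions
# needs a further E-utilities call that is not shown here.
# The term is not URL-encoded, so this only suits simple accessions.
ncbi_search_url() {
  printf '%s/esearch.fcgi?db=sra&term=%s&retmode=json\n' "$EUTILS" "$1"
}

# Step 2 (EBI): URL for an ENA file report listing the FastQ FTP
# paths of a single run accession.
ena_fastq_url() {
  printf '%s?accession=%s&result=read_run&fields=run_accession,fastq_ftp\n' \
    "$ENA_FILEREPORT" "$1"
}
```

Either URL can be fetched with, for example, `curl -s "$(ena_fastq_url SRR279588)"`.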

The ascp nextflow factory sounds like a sensible idea.. It might complicate things as it requires custom software though, whereas the simple URLs presumably work by default with Nextflow's built-in staging mechanisms (but this is a topic for the nextflow repo, not here 😉 )

Phil
