---
layout: page
title: "Data Access"
permalink: /data/
order: 10
---

# IGSR - 1000 Genomes Sample Data

The International Genome Sample Resource (IGSR) makes all of its data on the 1000 Genomes samples available through the 1000 Genomes FTP site and its mirror site hosted at the NCBI. The FTP site currently contains the three main releases of the 1000 Genomes Project.

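<table>
<thead>
<tr><th>1000 Genomes Release</th><th>Variants</th><th>Individuals</th><th>Populations</th><th>VCF</th><th>Sequence and Alignments</th><th>Supporting Data</th></tr>
</thead>
<tbody>
{% for release in site.data.phasedata %}
<tr><td>{{release.release}}</td><td>{{release.variants}}</td><td>{{release.individuals}}</td><td>{{release.populations}}</td><td>VCF</td><td>Alignments</td><td>{% if release.supporting %}Supporting Data{% else %} - {% endif %}</td></tr>
{% endfor %}
</tbody>
</table>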

## 1000 Genomes Data Access Tools

The FTP site is accessible over both FTP and [HTTP]({{site.ebi_ftp| replace:'ftp://', 'http://'}}){:target="_blank"}. It is also accessible via two fast download tools, Aspera and Globus Grid FTP.
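Because the HTTP address is just the FTP address with the scheme swapped, a download URL can be derived with a small helper. The following is a minimal sketch, not an official tool: the function name `ftp_to_http` is hypothetical, and the base URL `ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp` is an assumption, since this page only refers to it through the `site.ebi_ftp` setting. The file path is the one used in the Aspera example on this page.

```shell
# Assumed base URL of the FTP site (this page only names it as site.ebi_ftp).
FTP_BASE="ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp"

# Hypothetical helper: swap a leading ftp:// for http://.
# URLs that do not start with ftp:// are printed unchanged.
ftp_to_http() {
  case "$1" in
    ftp://*) printf 'http://%s\n' "${1#ftp://}" ;;
    *)       printf '%s\n' "$1" ;;
  esac
}

# File path taken from the Aspera example on this page.
file="release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz"
http_url="$(ftp_to_http "${FTP_BASE}/${file}")"
echo "$http_url"
```

The printed URL can then be passed to any HTTP client, for example `curl -O "$http_url"`.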

### Aspera

The data is also available via an Aspera server from both sites. To use this service you need to download the Aspera Connect software. This provides both a browser plug-in for downloading data and a bulk download client called ascp. An example command line to get a file using ascp looks like:

{% highlight bash %}
ascp -i bin/aspera/etc/asperaweb_id_dsa.putty -Tr -Q -l 100M \
  -L- [email protected]:vol1/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz ./
{% endhighlight %}

The path passed to -i depends on where the Aspera software was installed. With this key you should not be asked for a password: IGSR data is freely available, and a download should never request one.
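Since the key path varies between installations, it can help to check it before starting a transfer. The sketch below is illustrative only: the helper names `ascp_command` and `ascp_command_checked` are hypothetical, `USER@HOST` is a placeholder for the remote account and server, and the flags are simply those from the example command above. It prints the command for review rather than running it.

```shell
# Hypothetical helper: print (not run) an ascp command with the flags used above.
# $1: path to the Aspera key file
# $2: remote source in user@host:path form
# $3: local destination directory
# The output is for display; arguments containing spaces are not re-quoted.
ascp_command() {
  printf 'ascp -i %s -Tr -Q -l 100M -L- %s %s\n' "$1" "$2" "$3"
}

# Hypothetical helper: refuse to build the command when the key file is
# missing, because its location depends on the local installation.
ascp_command_checked() {
  if [ ! -f "$1" ]; then
    echo "key file not found: $1" >&2
    return 1
  fi
  ascp_command "$1" "$2" "$3"
}

# USER@HOST is a placeholder, not a real account or server name.
ascp_command "bin/aspera/etc/asperaweb_id_dsa.putty" \
  "USER@HOST:vol1/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz" ./
```

Using `ascp_command_checked` in place of `ascp_command` reports a missing key file up front instead of leaving ascp to fail later.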

### Globus Grid FTP

The 1000 Genomes FTP site is available as an endpoint in the Globus Online system. To access the data you need to sign up for a Globus account via their signup page. You must also install the Globus Connect Personal software and set up a personal endpoint to download the data to.

The 1000 Genomes endpoint is one of several EMBL-EBI hosted endpoints and is called ebi#1000genomes. Once you have set up your personal endpoint, you should be able to start a transfer using the Globus web front end.

![Globus screenshot]({{ '/images/globus_1000genomes.png' | prepend: site.baseurl }}){:class="center-block"}

The Globus website has support pages for setting up accounts and installing the Globus Connect Personal software.