⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️
Icgc-get is no longer supported. Please use the score-client directly to download files from the ICGC Data Portal. Instructions to download and use the score-client with various repositories can be found at https://docs.icgc.org/pcawg/data/. If you experience issues while downloading with the score-client, please contact us with details at [email protected].
⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️
This is the icgc-get utility, a universal download client for accessing ICGC data residing in various data repositories.
The data for ICGC resides in many data repositories around the world. These repositories
each have their own environment (public cloud, private cloud, on-premise file systems, etc.),
access controls (DACO, OAuth, asymmetric keys, IP filtering), download clients and configuration mechanisms.
Thus, there is much for a user to learn and perform before actually acquiring the data.
This is compounded by the fact that the number of environments is increasing over time
and their characteristics change frequently. A coordinated mechanism to bootstrap and
streamline this process is highly desirable. This is the problem the icgc-get tool helps to solve.
To install icgc-get on your local machine, first download the icgc-get package, then unzip the executable.
unzip icgc-get_linux_v0.3.13_x64.zip
Once the installation is complete, icgc-get can be invoked with the path to the icgc-get executable.
To make the executable callable from anywhere, either move it to a folder on your PATH
or add the folder containing the executable to your PATH.
You can find out which directories are on your PATH with echo $PATH on Mac and Linux or path on Windows.
You can add folders to your PATH with export PATH=$PATH:/folder on Mac and Linux or set PATH=%PATH%;/folder on Windows.
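The PATH step above can be sketched for Mac and Linux as follows. This is a minimal illustration, not part of the official instructions: the directory in ICGC_GET_DIR is a hypothetical location and should be replaced with the folder where you unzipped the executable.

```shell
# Hypothetical install folder (an assumption); point this at the directory
# that actually contains the unzipped icgc-get executable.
ICGC_GET_DIR="$HOME/icgc-get"

# Append the folder to PATH only if it is not already listed.
case ":$PATH:" in
  *":$ICGC_GET_DIR:"*) ;;                       # already on PATH, nothing to do
  *) export PATH="$PATH:$ICGC_GET_DIR" ;;
esac

# Report whether the shell can now resolve the executable by name.
command -v icgc-get || echo "icgc-get not found on PATH"
```

The export only affects the current shell session. To make it permanent you would typically add the same line to your shell startup file.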
icgc-get is capable of interfacing with the ICGC storage client, Genetorrent,
the GDC data transfer tool, the EGA download client
and the Amazon Web Service command line interface.
If you do not have any of the download clients installed locally, icgc-get is capable of running them through
the icgc-get Docker container. Running any of the clients through the Docker container will prevent issues from arising related to conflicting
software requirements for the data download clients. To enable this functionality, first install
Docker. Make sure to create a Docker group
when running on a Linux machine to ensure that Docker can be run without root permissions.
This tool requires at least one download client or Docker to be installed in order to function.
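One way to check this requirement before running the tool is to look for the clients on your PATH. The sketch below is illustrative only: the executable names in the loop are assumptions about how each client is commonly installed and may differ on your system, and the EGA download client is left out because its launcher name is not given here.

```shell
# Return success if the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Candidate executable names (assumptions; adjust to match your installs).
found=0
for client in icgc-storage-client gtdownload gdc-client aws docker; do
  if have_cmd "$client"; then
    echo "found:   $client"
    found=$((found + 1))
  else
    echo "missing: $client"
  fi
done

# On Linux, Docker's post-install steps for running without root are:
#   sudo groupadd docker
#   sudo usermod -aG docker "$USER"

if [ "$found" -eq 0 ]; then
  echo "No download client or Docker found; install at least one." >&2
fi
```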
After installing icgc-get, you will need to configure some of the essential usage parameters,
such as your access credentials. Enter ./icgc-get configure and follow the prompts.
To keep the default values for the parameters, press enter.
For further information, please view our documentation here.
We depend on PyInstaller for building our binaries. To ensure correct behaviour from icgc-get on termination, it is recommended that you build a PyInstaller release from source, because the prebuilt distributions available through PyPI and elsewhere have historically been inconsistent. This includes building PyInstaller's C libraries, so ensure you have the correct build tools for your platform.
wget https://github.com/pyinstaller/pyinstaller/releases/download/v3.2/PyInstaller-3.2.tar.gz
tar zxvf PyInstaller-3.2.tar.gz
cd PyInstaller-3.2/bootloader
./waf all
cd ..
python setup.py install
First run sudo pip install -r ./requirements.txt to ensure that all necessary packages have been installed. Then run:
pyinstaller --clean icgc-get-data.spec
The executable icgc-get will be in a folder named dist in your current directory. Compress it into a zip file, with the naming convention of icgc-get_v$VERSION_$OS_x64.zip, and deploy to artifactory under dcc-binaries.
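The compression step above can be sketched as follows. The VERSION and OS values are example assumptions (the version matches the download filename used earlier), and the braces matter: written literally as $VERSION_, the shell would look up a variable named VERSION_ rather than VERSION.

```shell
# Example values (assumptions); set these to the release you are building.
VERSION="0.3.13"
OS="linux"

# Braces keep the underscore from being read as part of the variable name.
ARCHIVE="icgc-get_v${VERSION}_${OS}_x64.zip"
echo "archive name: $ARCHIVE"

# Zip the built executable if it exists and zip is available.
# -j stores the file without its dist/ directory prefix.
if [ -f dist/icgc-get ] && command -v zip >/dev/null 2>&1; then
  zip -j "$ARCHIVE" dist/icgc-get
else
  echo "dist/icgc-get or zip not found; skipping compression"
fi
```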
As an easy way to build a Linux version of icgc-get, you can package it inside the Docker container described in the icgc-get Dockerfile. First rebuild the container to make sure all of the latest updates to the code are copied inside the container. This command must be run from the root directory of the icgc-get project.
docker build -t icgc/icgc-get:$VERSION .
Then run the container in interactive mode. You will need to mount a directory as a data volume to transfer the packaged icgc-get out of the Docker container.
docker run -it -v ~/mnt:/icgc/mnt icgc/icgc-get:$VERSION
Once inside, navigate to /icgc/mnt, and run the following version of the pyinstaller call:
python /icgc/pyinstaller/pyinstaller-pyinstaller-1804636/pyinstaller.py --clean --onefile -n icgc-get --additional-hooks-dir /icgc/icgcget/bin /icgc/icgcget/icgcget/cli.py
Then, exit Docker. Your executable will be present in the mounted directory. The Docker container does not natively have the ability to zip files, so compress the executable on the host.