Client software for downloading datasets from QBiC's data management system openBIS (https://wiki-bsse.ethz.ch/display/bis/Home).
We are making use of the V3 API of openBIS (https://wiki-bsse.ethz.ch/display/openBISDoc1605/openBIS+V3+API) in order to interact with the data management system from the command line, in order to provide a quick data retrieval on the server or cluster resources, where the download via the qPortal is impractical.
You can download the compiled and executable Java binaries as JAR of postman from the GitHub release page: https://github.com/qbicsoftware/postman-cli/releases.
If you want to build from source, check out the commit of your choice and execute mvn clean package
. We only recommend this if you are familiar with Java build systems though, as we cannot give you support here. In the normal case, the binary of a stable release is sufficient.
You need to have Java JRE or JDK installed (openJDK is fine), at least version 17. And the client's host must have allowance to connect to the server, which is determined by our firewall settings. If you are unsure, if your client is allowed to connect, contact us at [email protected].
Postman uses picocli file arguments. Therefore, a config file has to be passed with the '@' prefix to its path:
Example:
java -jar postman.jar list -u <user> <sample> @path/to/config.txt
The structure of the configuration file is:
[-cliOption] [value]
For example: To set the ApplicationServerURL to another URL we have to use: -as [URL]
To use our openBIS URL we write the following lines in the config file:
(Anything beginning with '#' is a comment)
# Config file for postman-cli
# Replace the values defined after the respective CLI parameters (e.g. -u)
# with your value of choice (e.g. qbcab01 as value for -u)
--suffix .txt,.fastq,.fastq.gz
-u qbc001
--password:env MY_PASSWORD
A default file is provided here: default-config.
Just execute the following command to get an overview of the available options.
java -jar postman-cli.jar -h
Usage: postman-cli [-hV] COMMAND
Description:
A software client for downloading data from QBiC.
Options:
-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Commands:
download Download data from QBiC.
list lists all the datasets found for the given identifiers
Optional: specify a config file by running postman with '@/path/to/config.txt'.
A detailed documentation can be found at
https://github.com/qbicsoftware/postman-cli#readme.
File sizes printed by postman Kilobyte, Megabyte, Gigabyte, Terabyte and Petabyte use base 2.
By default, postman will create log files in a directory ./logs/
in your working directory.
You can specify where logs are written by setting the system property log.path
to the desired directory.
After gaining access by applying through [email protected], you can log in using your credentials. The username is provided by us or if you have a uni-tuebingen account by the university of Tuebingen.
Provide your username:
Use the option -u / --user
to provide us with your username.
Provide your password:
Please never send your password over email. We will never ask you for it!
When using the application, you can either:
- enter your password interactively
--password
- enter the name of a system property containing your password
--password:prop my.awesome.property
java -jar -Dmy.awesome.property=ABCDEFG postman.jar -u qbc001a --password:prop my.awesome.property @path/to/config.txt
- enter the name of an environment variable containing your password
--password:env MY_PASSWORD
. Make sure to not use the$
sign before the environment variable (as in bash variables) otherwise the password is not recognized (--password:env $MY_PASSWORD
will fail)
MY_PASSWORD=ABCDEFG java -jar postman.jar -u qbc001a --password:env MY_PASSWORD @path/to/config.txt
To specify which data you want to list or download, you need to provide us with QBiC identifiers.
A QBiC project identifier begins with Q
followed by four characters (QTEST
). QBiC sample identifiers contain their project identifier.
You can provide identifiers either using the command line directly:
java -jar postman.jar QSTTSABCD01 QSTTSABCD02 QSTTSABCDE4 NGSQSTTS0012 "QTEST*"
Please note: When you use the *
character to search for all files in a project, escape your identifier using quotation marks.
Provide identifiers using a file:
You can provide identifiers using a file containing the identifiers. Lines starting with #
are ignored.
# all files for the project
QTEST*
# all files associated with these samples
QSTTS001AB
QSTTS002BC
java -jar postman.jar -f myids.txt @path/to/config.txt
Both the download
and the list
command allow you to filter files by their suffix. The suffixes provided are not case-sensitive.
.TXT
and .txt
will have the same effect.
Multiple suffixes can be provided separated by a comma. A suffix does not have to be the file ending but can be any substring from the end of a file's name.
If you only want to download fastq
and fastq.gz
files you can run postman with
java -jar postman.jar -s .fastq,.fastq.gz @path/to/config.txt
Both the download
and the list
command allow you to filter files using a regular expression.
When filtering with a regular expression pattern, the path of the file as well as the filename are matched against the pattern. Only matching files are included in the output.
If you only want to download files containing test
in their path or in their name, you can run postman with
java -jar postman.jar --pattern ".*test.*" @path/to/config.txt
Although this can be used to filter for suffixes as well, please use the --suffix
option.
Instead of
java -jar postman.jar --pattern".*\.fastq\.gz" @path/to/config.txt
do
java -jar postman.jar -s .fastq.gz @path/to/config.txt
Usage: postman-cli list [-hV] [--exact-filesize] [--with-checksum]
[--without-header] [--format=<format>]
[--pattern=<regex_pattern>] -u=<user> [-s=<suffix>[,
<suffix>...]]... (--password:env=<environment-variable>
| --password:prop=<system-property> | --password)
(-f=<filePath> | SAMPLE_IDENTIFIER...)
Description:
lists all the datasets found for the given identifiers
Parameters:
SAMPLE_IDENTIFIER... one or more QBiC sample identifiers
Options:
--password:env=<environment-variable>
provide the name of an environment variable to
read in your password from
--password:prop=<system-property>
provide the name of a system property to read in
your password from
--password please provide your password
-u, --user=<user> openBIS user name
-f, --file=<filePath> a file with line-separated list of QBiC sample ids
-s, --suffix=<suffix>[,<suffix>...]
only include files ending with one of these
(case-insensitive) suffixes
--pattern=<regex_pattern>
only include files with paths matching this pattern
--with-checksum list the crc32 checksum for each file
--exact-filesize use exact byte count instead of unit suffixes:
Byte, Kilobyte, Megabyte, Gigabyte, Terabyte and
Petabyte
--format=<format> The format to list files in. Case-insensitive.
Possible values: LEGACY, TSV
Default: LEGACY
--without-header remove the header line from the output. Only takes
effect for tabular output formats.
-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Optional: specify a config file by running postman with '@/path/to/config.txt'.
A detailed documentation can be found at
https://github.com/qbicsoftware/postman-cli#readme.
The list
command comes with some special options. You can change how your output looks by
- listing the exact number of bytes for each file
--exact-filesize
- removing the header from the tabular output
--without-header
- listing the crc32 checksum for every file
--with-checksum
- specifying the output format.
Dataset Source Registration Size File
NGSQSTTS015A0 (20211026111452695-847006) NGSQSTTS015A0 2021-10-26T09:14:53.143812Z 1,00 B QSTTS015A0_awesome_genome.fastq/my_genome.fastq
Dataset Source Registration Size CRC32 File
NGSQSTTS015A0 (20211026111452695-847006) NGSQSTTS015A0 2021-10-26T09:14:53.143812Z 1,00 B fa0a3f23 QSTTS015A0_awesome_genome.fastq/my_genome.fastq
# Dataset NGSQSTTS015A0 (20211026111452695-847006)
# Source NGSQSTTS015A0
# Registration 2021-10-26T09:14:53.143812Z
# Size 1,00 B
1,00 B my_genome.fastq
# Dataset NGSQSTTS015A0 (20211026111452695-847006)
# Source NGSQSTTS015A0
# Registration 2021-10-26T09:14:53.143812Z
# Size 1,00 B
1,00 B fa0a3f23 my_genome.fastq
Usage: postman-cli download [-hV] [--ignore-subdirectories] [-o=<outputPath>]
[--pattern=<regex_pattern>] -u=<user> [-s=<suffix>[,
<suffix>...]]... (--password:
env=<environment-variable> | --password:
prop=<system-property> | --password) (-f=<filePath>
| SAMPLE_IDENTIFIER...)
Description:
Download data from QBiC.
Parameters:
SAMPLE_IDENTIFIER... one or more QBiC sample identifiers
Options:
--password:env=<environment-variable>
provide the name of an environment variable to
read in your password from
--password:prop=<system-property>
provide the name of a system property to read in
your password from
--password please provide your password
-u, --user=<user> openBIS user name
-f, --file=<filePath> a file with line-separated list of QBiC sample ids
-s, --suffix=<suffix>[,<suffix>...]
only include files ending with one of these
(case-insensitive) suffixes
--pattern=<regex_pattern>
only include files with paths matching this pattern
-o, --output-dir=<outputPath>
specify where to write the downloaded data
--ignore-subdirectories
put all files into one directory regardless of the
directory structure on the server; conflicts
with files with equal names are not addressed
-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Optional: specify a config file by running postman with '@/path/to/config.txt'.
A detailed documentation can be found at https://github.com/qbicsoftware/postman-cli#readme.
The download
command allows you to download data from our storage to your machine.
Use the --output-dir
option to specify a location on your client the location will be interpreted relative to your working directory.
By using the --ignore-subdirectories
flag, you signal qpostman that you are not interested in the directory structure on the server.
qpostman will thus put all files it finds into the same top-level folder.
default download behaviour
my/awesome/path/file1.fastq.gz -> QABCD/my/awesome/path/file1.fastq.gz
my/awesome/other/path/file2.fastq.gz -> QABCD/my/awesome/other/path/file2.fastq.gz
my/awesome/additional/path/file3.fastq.gz -> QABCD/my/awesome/additional/path/file3.fastq.gz
with --ignore-subdirectories
my/awesome/path/file1.fastq.gz -> QABCD/file1.fastq.gz
my/awesome/other/path/file2.fastq.gz -> QABCD/file2.fastq.gz
my/awesome/additional/path/file3.fastq.gz -> QABCD/file3.fastq.gz
Postman computes the CRC32 checksum for all input streams using the native Java utility class CRC32. Postman favours CheckedInputStream
over the traditional InputStream, and promotes the CRC checksum computation.
Computed CRC32 checksums are compared with CRC32 checksums stored on our servers.
When the checksum does not match, then the download failed. Each download is attempted multiple times.
When all attempts fail, the mismatching checksum is recorded in your log directory in the checksum-mismatch.log
file.
The checksum-mismatch.log
file contains one line for each file that was not downloaded.
// values are tab separated
<expected checksum> <computed checksum> <absolute file path>
In addition, Postman writes the CRC32 checksum in an additional file <file-name-of-checked-file>.crc32
and stores it together with the according file.
-Dlog.path
: provide the log directory-Dlog.level
: provide the log level to use for logging--source-sample-type <sample-type>
: specify which sample type to consider as source sample type.--server-timeout <millis>
: the server timeout in milliseconds
--download-attempts <download-attempts>
provide the maximal amount attempted downloads--buffer-size <buffer-size>
provide a custom buffer size. Please only specify values that are a multiple of1024
.