% GTRANSFER(1) gtransfer 0.8.0 | User Commands % Frank Scheiner % Mar 23, 2017
gtransfer - The GridFTP data transfer script
{gtransfer|gt} [--source|-s sourceUrl] [--destination|-d destinationUrl] [--transfer-list|-f transferList] [--auto-optimize|-o transferMode] [--recursive|-r] [--checksum-data-channel|-c] [--encrypt-data-channel|-e] [--sync-level syncLevel] [--no-sync] [--guc-max-retries gucMaxRetries] [--gt-max-retries gtMaxRetries] [--gt-progress-indicator indicatorCharacter] [--verbose|-v] [--metric|-m dataPathMetric] [--logfile|-l logfile] [--auto-clean|-a] [--configfile configurationFile] [-- gucParameters]
gtransfer or short gt is a wrapper script for the tgftp(1) tool and provides an advanced command line interface for performing GridFTP data transfers.
gt has the following features:
gtransfer can transfer files along predefined paths by using transit sites and can therefore bridge different network domains. See dpath(5) for more details.
gtransfer can distribute a data transfer over multiple paths. This way users can benefit from the combined bandwidth of multiple paths. See option -m for usage details.
gtransfer supports usage of pre-optimized data transfer parameters for specific connections. See dparam(5) for more details. In addition gt can also automatically optimize a data transfer depending on the size of the files.
gtransfer supports interruption and continuation of transfers. You can
interrupt a transfer by hitting CTRL+C
. To continue an interrupted transfer
simply issue the very same command, gt will then continue the transfer where
it was interrupted. The same procedure works for a failed transfer.
gtransfer supports automatic retries of failed transfer steps. The number of retries is configurable.
gtransfer makes use of bash completion to ease usage. This supports
completion of options and URLs. URL completion also expands (remote) paths. Just
hit the TAB
key to see what's possible.
gtransfer can use host aliases as alternatives to host addresses. E.g. a user can use "myGridFTP:" and "gsiftp://host1.domain.tld:2811" synonymically.
gtransfer can use persistent identifiers (PIDs) as used by EUDAT and provided by EPIC as source of a data transfer.
The options are as follows:
Set the source URL for the transfer.
Possible URL examples:
- {[gsi]ftp|http[s]}://FQDN[:PORT]/path/to/file
- [file://]/path/to/file
"FQDN" is the fully qualified domain name.
Set the destination URL for the transfer.
Possible URL examples:
- [gsi]ftp://FQDN[:PORT]/path/to/file
- [file://]/path/to/file
"FQDN" is the fully qualified domain name.
NOTICE: Special chars in URLs (e.g. German umlauts, etc.) need to be given percent or URL encoded! You can use the following table for reference:
German umlaut | Percent/URL encoded |
---|---|
ä | %C3%A4 |
ö | %C3%B6 |
ü | %C3%BC |
Ä | %C3%84 |
Ö | %C3%96 |
Ü | %C3%9C |
ß | %C3%9F |
" " (space) | %20 |
A full table is available on http://www.w3schools.com/tags/ref_urlencode.asp
As alternative to providing source and destination URLs on the command line, one can also provide a list of source and destination URLs in a transfer list.
The format of each line of the transfer list file is as follows (including the double quotes!):
"<PROTOCOL>://<FQDN1>:<PORT>/path/to/file" "<PROTOCOL>://<FQDN2>:<PORT>/path/to/file[s/]"
Throughout all lines the source URL host part (e.g. "<PROTOCOL>://<FQDN1>:<PORT>") has to be identical. This is also required for the destination URL host part.
This option activates an automatic optimization of transfers depending on the size of files to be transferred. If less than 100 files are going to be transferred, gtransfer will fall back to list transfer. The transferMode controls how files of different size classes are transferred. Currently "seq[uential]" (different size classes are transferred sequentially) is possible. To define different file size classes use the file [...]/chunkConfig. See FILES section below for more details.
Transfer files recursively.
NOTICE: globus-url-copy(1) (even with option -cd and sync options) and therefore also gt will not create directories on the destination side that are empty on the source side!
Enable checksumming on the data channel. The option -c cannot be used in conjunction with -e!
Enable encryption on the data channel. The option -e cannot be used in conjunction with -c!
Set the sync level that should be used for the transfer. This is a globus-url-copy(1) option and the following sync levels are available:
-
Level 0 will only transfer if the destination does not exist.
-
Level 1 will transfer if the size of the destination does not match the size of the source.
-
Level 2 will transfer if the time stamp of the destination is older than the time stamp of the source.
-
Level 3 will perform a checksum of the source and destination and transfer if the checksums do not match.
By default gt transfers files conditionally and uses sync level 1. The option --sync-level cannot be used in conjunction with --no-sync!
NOTICE: globus-url-copy(1) (even with option -cd and sync options) and therefore also gt will not create directories on the destination side that are empty on the source side!
Disable sync(hronization) for the transfer. Gt then transfers files unconditionally. The option --no-sync cannot be used in conjunction with --sync-level!
This option sets the maximum number of retries globus-url-copy(1) will do for a transfer of a single file. By default this is set to 1, which means that globus-url-copy(1) will tolerate at max. one transfer error per file and retry the transfer once. Alternatively this option can also be set with the environment variable GUC_MAX_RETRIES.
This option sets the maximum number of retries gt will do for a single transfer step. By default this is set to 3, which means that gt will try to finish a single transfer step three times or fail. Alternatively this option can also be set with the environment variable GT_MAX_RETRIES.
Be verbose.
Set the metric to select the corresponding path of a data path. To enable multipathing, use either the keyword "all" to transfer data using all available paths or use a comma separated list with the metric values of the paths that should be used (e.g. "0,1,2"). You can also use metric values multiple times (e.g. "0,0").
Set the name for the logfile, tgftp(1) will generate for each transfer. If specified with ".log" as extension, gt will insert a "__step_#" string to the name of the logfile ("#" is the number of the transfer step performed). If omitted gt will automatically generate a name for the logfile(s).
Remove logfiles automatically after the transfer completed.
Set the name of the configuration file for gt. If not set, this defaults to:
- "/etc/gtransfer/gtransfer.conf" or
- "<GTRANSFER_BASE_PATH>/etc/gtransfer.conf" or
- "/etc/opt/gtransfer/gtransfer.conf" or
- "$HOME/.gtransfer/gtransfer.conf" or
- "$( dirname $BASH_SOURCE )/../etc/gtransfer/gtransfer.conf" in this order.
Set the globus-url-copy(1) parameters that should be used for all transfer steps. Notice the space between "--" and the actual parameters. This overwrites any available dparams and is not recommended for regular usage. There exists one exception for the -len|-partial-length X option. If this is provided, it will only be added to the transfer parameters from a dparam for a connection or, if no dparam is available, to the builtin default transfer parameters.
NOTICE: If specified, this option must be the last one in a gt command line.
General options:
Prints out a help message.
Prints out version information.
See option --guc-max-retries for details.
See option --gt-max-retries for details.
If set to 1, gt will keep its used temporary directory below ~/.gtransfer/tmp for inspection when exiting.
If set to 1, gt will not make use of the reliabilty functionality of globus-url-copy(1). This means that transfers always start from the beginning. I.e. transfers cannot be interrupted and later continued from where they were interrupted and transfers that failed temporarily will also start from the beginning, when retried.
The gt configuration file.
The chunk configuration file. In this file you can define the different file size classes for the auto-optimization. Practically the file is a table with three columns: MIN_SIZE_IN_MB, MAX_SIZE_IN_MB and GUC_PARAMETERS separated by a semicolon.
Each line defines a size class. The value for MIN_SIZE_IN_MB is not included in the class. The value for MAX_SIZE_IN_MB is included in the class. Use the keyword "min" in the column MIN_SIZE_IN_MB to default to the size of the smallest file available in a transfer list. Files of this size will be included in this class then. Use the keyword "max" in the column MAX_SIZE_IN_MB to default to the size of the biggest file available in a transfer list. The third column GUC_PARAMETERS defines the transfer parameters to use for the specific file size class.
Example:
#MIN_SIZE_IN_MB;MAX_SIZE_IN_MB;GUC_PARAMETERS
min;50;-cc 16 -tcp-bs 4M -stripe -sbs 4M -cd
50;250;-cc 8 -tcp-bs 8M -stripe -sbs 4M -cd
250;max;-cc 6 -p 4 -tcp-bs 8M -stripe -sbs 8M -g2 -cd
This directory contains the system dpaths usable by gt and is configurable.
This directory contains the system dparams usable by gt and is configurable.
This directory contains the user dpaths usable by gt. Can be created with dpath(1). If existing, dpaths in this directory have precedence.
This directory contains the user dparams usable by gt. Can be created with dparam(1). If existing, dparams in this directory have precedence.
dparam(1), dparam(5), dpath(1), dpath(5), halias(1), globus-url-copy(1), tgftp(1), uberftp(1C)