Concurrent, parallel transfer between S3, EC2, and local.

deaconjs/s3turbo

s3turbo

A Python command-line program for parallel, concurrent transfer of large files to, from, and between S3 buckets, EC2 instances, and the local filesystem.

Install

To use, first install boto and filechunkio.

$ pip install boto filechunkio

Configure a .boto credentials file as described in the boto documentation, then clone the s3turbo repo.

Usage

  • single-line transfer:

    to download - s3turbo.py s3://bucket_name/path/key_name local:///full_path/filename

    to upload - s3turbo.py local:///full_path/filename s3://bucket_name/path/key_name

    to copy - s3turbo.py s3://bucket1/path/key_name s3://bucket2/path/key_name

  • OR key-name file input:

    s3turbo.py key_name_file

  • OR rsync functionality (end both args with slashes):

    s3turbo.py (s3|local):path/ (s3|local):path/ [include include_string] [exclude exclude_string] [remove_prefix prefix]

    e.g. s3turbo.py local:///home/username/path/ s3://owner.run.etc/etc_dir/ include .py exclude .pyc remove_prefix /home/username
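The effect of the include, exclude, and remove_prefix arguments in the rsync example can be illustrated with a short Python sketch. This is not s3turbo's own code: the function name `select_files` is hypothetical, and the sketch assumes that include and exclude are plain substring matches and that remove_prefix is stripped from the front of the local path before it is used in the key name. The README does not state these semantics, so treat them as assumptions.

```python
def select_files(paths, include=None, exclude=None, remove_prefix=None):
    """Return (local_path, key_suffix) pairs for the paths that pass the filters.

    Assumption: include/exclude are substring tests; s3turbo may differ.
    """
    selected = []
    for path in paths:
        # Keep only paths containing the include string, if one was given.
        if include is not None and include not in path:
            continue
        # Drop paths containing the exclude string, if one was given.
        if exclude is not None and exclude in path:
            continue
        key = path
        # Strip the leading prefix so it does not end up in the key name.
        if remove_prefix and key.startswith(remove_prefix):
            key = key[len(remove_prefix):]
        selected.append((path, key.lstrip("/")))
    return selected


paths = [
    "/home/username/path/run.py",
    "/home/username/path/run.pyc",
    "/home/username/path/notes.txt",
]
print(select_files(paths, include=".py", exclude=".pyc",
                   remove_prefix="/home/username"))
```

Under these assumptions, only run.py is selected: run.pyc also contains ".py", which is why the example pairs include .py with exclude .pyc.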

The key_name_file format follows the same conventions as single-line format, with one line per file to transfer. If an input file is used, the file list can contain a mixture of download, copy, and upload commands, in any order.
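As a rough illustration of the one-line-per-transfer format, the sketch below classifies each line as a download, upload, or copy from its two URL schemes. The function name `classify_transfer` is hypothetical and this is not how s3turbo itself parses the file; it also assumes that neither path contains whitespace.

```python
def classify_transfer(line):
    """Split a 'source destination' line and label the kind of transfer."""
    source, destination = line.split()
    src_is_s3 = source.startswith("s3://")
    dst_is_s3 = destination.startswith("s3://")
    if src_is_s3 and dst_is_s3:
        kind = "copy"
    elif src_is_s3:
        kind = "download"
    elif dst_is_s3:
        kind = "upload"
    else:
        raise ValueError("at least one side must be an s3:// location")
    return kind, source, destination


# A mixed file list, using the same example paths as the usage section.
key_name_lines = [
    "s3://bucket_name/path/key_name local:///full_path/filename",
    "local:///full_path/filename s3://bucket_name/path/key_name",
    "s3://bucket1/path/key_name s3://bucket2/path/key_name",
]
for line in key_name_lines:
    print(classify_transfer(line)[0])
```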

By default, files are not overwritten, so it is safe to restart a multi-file transfer that was interrupted. A download skips an existing local file of the same name only if it is also the same size. Copies and uploads check file names but do not yet check file sizes.
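The download skip rule can be sketched in a few lines of Python. The helper name `should_skip_download` and its `remote_size` parameter are hypothetical, chosen for illustration; the actual check inside s3turbo may be structured differently.

```python
import os
import tempfile


def should_skip_download(local_path, remote_size):
    """Skip only when a local file with this name exists and has the same size."""
    return os.path.isfile(local_path) and os.path.getsize(local_path) == remote_size


# Small demonstration with a temporary 5-byte file.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"12345")
demo_path = handle.name

print(should_skip_download(demo_path, 5))   # same name, same size
print(should_skip_download(demo_path, 6))   # same name, different size
os.remove(demo_path)
```

A partially written file left behind by an interrupted run would normally differ in size from the remote key, so under this rule it would be downloaded again rather than skipped.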

  • Optional dryrun flag

    s3turbo.py args [dryrun] [args]

  • Optional reduced_redundancy flag

    s3turbo.py args [reduced_redundancy] [args]

The dryrun flag prints the files that would be transferred, without transferring any. The output uses the standard input format (one transfer per line, as in key_name_file). The reduced_redundancy flag stores uploads in the AWS reduced-redundancy storage class. This saves some money but has slightly higher odds of data loss.
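Because the usage lines show the flags as bare words mixed in with the other arguments, one way to picture the handling is to pull the known flag words out of the argument list first. The sketch below is an assumption about the approach, not s3turbo's actual parser, and the name `split_flags` is hypothetical.

```python
KNOWN_FLAGS = ("dryrun", "reduced_redundancy")


def split_flags(argv):
    """Separate bare-word flags from the remaining positional arguments."""
    flags = {name: name in argv for name in KNOWN_FLAGS}
    remaining = [arg for arg in argv if arg not in KNOWN_FLAGS]
    return flags, remaining


flags, remaining = split_flags([
    "local:///full_path/filename",
    "dryrun",
    "s3://bucket_name/path/key_name",
])
print(flags)
print(remaining)
```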
