Listing Only Some Files
S3 buckets can hold millions of objects. S3 returns at most 1,000 objects per query, but "aws ls" automatically issues multiple calls to retrieve the entire list. If the bucket contains many keys, the command can take a long time and use a lot of memory. Instead, you can reduce the size of the result in several ways.
The command is:
aws ls BUCKET/PREFIX
You can specify a prefix after the bucket name. Only objects whose keys start with the prefix are listed. For example:
$ aws ls -1 test682
%gconf.xml
1GBfile
MEDVADEV0_09.DMP.tgz
Sabin Backfiles/hello.txt
US TEXT_DATA_1976_2000_Bills_BLue_Book/hello.txt
US TEXT_DATA_2000-2010_Grants_and_Apps_(for_Cloud)/hello.txt
a b/hello.txt
aws
big
h.txt
hello.txt
hello2.txt
m
mputest
test/big
x/y/hello.txt
$ aws ls -1 test682/he
hello.txt
hello2.txt
$ aws ls -1 test682/x/y
x/y/hello.txt
Keep in mind that S3 does not have subdirectories, so a slash (/) is just another character. The last example returns x/y/hello.txt because the key starts with x/y, not because x/y is interpreted in any special way.
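To make the "just a string prefix" point concrete, here is a minimal Python sketch that models prefix matching locally. It does not call S3 or the aws tool. The function name list_with_prefix is hypothetical, and the keys are a subset of the test682 listing above.

```python
# Local model of prefix filtering; nothing here talks to S3.
# KEYS is a subset of the test682 listing shown above.
KEYS = [
    "%gconf.xml", "1GBfile", "a b/hello.txt", "aws", "big", "h.txt",
    "hello.txt", "hello2.txt", "m", "mputest", "test/big", "x/y/hello.txt",
]


def list_with_prefix(keys, prefix=""):
    """Return the keys that start with prefix, in sorted order.

    The prefix is compared character by character; "/" gets no
    special treatment.
    """
    return sorted(k for k in keys if k.startswith(prefix))


print(list_with_prefix(KEYS, "he"))   # ['hello.txt', 'hello2.txt']
print(list_with_prefix(KEYS, "x/y"))  # ['x/y/hello.txt']
print(list_with_prefix(KEYS, "x"))    # ['x/y/hello.txt']
```

The prefix "x" matches the same key as "x/y", which shows that the slash plays no role in the match.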
You can specify a delimiter to tell S3 to parse the file names as though there were subdirectories. The most common usage is --delimiter=/, which can also be abbreviated -d.
$ aws ls -1d test682
%gconf.xml
1GBfile
MEDVADEV0_09.DMP.tgz
aws
big
h.txt
hello.txt
hello2.txt
m
mputest
Sabin Backfiles/
US TEXT_DATA_1976_2000_Bills_BLue_Book/
US TEXT_DATA_2000-2010_Grants_and_Apps_(for_Cloud)/
a b/
test/
x/
With -d (which sets the delimiter to slash), S3 lists each key only up to the next occurrence of the delimiter after the prefix, so all keys that share that leading portion appear as a single entry.
You can combine the use of delimiter and prefix:
$ aws ls -1d test682/x/
y/
$ aws ls -1d test682/x/y/
hello.txt
$ aws ls -1 --delimiter=e test682/h
.txt
e
The last example is included to reinforce the point that S3 does not have subdirectories.
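The delimiter behavior in these examples can be modeled with a few lines of Python. This is a sketch of the grouping rule only, not the tool's implementation: the function name list_with_delimiter is hypothetical, the keys are a subset of the test682 listing, and the function returns full keys and full common prefixes, whereas the listings above show names with the leading prefix removed.

```python
# Local model of a delimited listing; nothing here talks to S3.
KEYS = ["aws", "h.txt", "hello.txt", "hello2.txt", "test/big", "x/y/hello.txt"]


def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) for keys that start with prefix.

    A key with no delimiter after the prefix is listed as an object.
    Otherwise the key is cut just after the first delimiter that follows
    the prefix, and keys that share the same cut collapse into one entry.
    """
    objects, common = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            objects.append(key)
        else:
            common.add(prefix + rest[:cut + len(delimiter)])
    return objects, sorted(common)


print(list_with_delimiter(KEYS))            # (['aws', 'h.txt', 'hello.txt', 'hello2.txt'], ['test/', 'x/'])
print(list_with_delimiter(KEYS, "x/"))      # ([], ['x/y/'])
print(list_with_delimiter(KEYS, "h", "e"))  # (['h.txt'], ['he'])
```

In the last call, hello.txt and hello2.txt both collapse into the single entry "he", which corresponds to the line "e" in the listing above once the prefix "h" is removed.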
Because S3 returns at most 1,000 objects per call, "aws ls" normally makes multiple calls to retrieve a list of all objects in the bucket. You can disable this automatic iteration with --batch=X, which causes "aws" to return the results of a single call to S3, and therefore at most 1,000 results.
The value X is the "marker", which defines where the list starts. S3 begins with the first object whose key is larger than X. To start with the first key in the bucket, use "--batch=" with an empty string for X. (The equal sign is required. Do not use "--batch" alone, because that sets X to 1.)
You can combine --batch=X with --max-keys to get the list of objects back in batches of any size.
$ aws ls -l test681 --max-keys=10 --batch=
-rw------- 1 timkay681 6 2008-12-29 15:24:26 /tmp/subd/hello.txt
-rw------- 1 timkay681 30836 2008-10-20 21:52:27 /usr/bin/aws
-rw------- 1 timkay681 10 2013-10-02 22:04:24 1000000/9590534/13846970/Kenny's closet card-front.pdf
-rw------- 1 timkay681 13 2013-03-06 20:23:03 FolderSpade/SandCastle.bin
-rw------- 1 timkay681 15560494 2009-04-13 01:47:43 afl.zip
-rw------- 1 timkay681 1 2009-04-18 23:27:07 ascii/char_01_?.txt
-rw------- 1 timkay681 1 2009-04-18 23:27:07 ascii/char_02_?.txt
-rw------- 1 timkay681 1 2009-04-18 23:27:08 ascii/char_03_?.txt
-rw------- 1 timkay681 1 2009-04-18 23:27:08 ascii/char_04_?.txt
-rw------- 1 timkay681 1 2009-04-18 23:27:09 ascii/char_05_?.txt
$ aws ls -l test681 --max-keys=10 --batch=ascii/char_05
-rw------- 1 timkay681 1 2009-04-18 23:27:09 ascii/char_05_?.txt
-rw------- 1 timkay681 1 2009-04-18 23:27:09 ascii/char_06_?.txt
-rw------- 1 timkay681 1 2009-04-18 23:27:09 ascii/char_07_?.txt
-rw------- 1 timkay681 1 2009-04-18 23:27:09 ascii/char_08_?.txt
-rw------- 1 timkay681 1 2009-04-18 23:27:10 ascii/char_09_?.txt
-rw------- 1 timkay681 2 2009-04-19 00:18:46 ascii/char_0a_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:10 ascii/char_0b_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:10 ascii/char_0c_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:11 ascii/char_0d_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:11 ascii/char_0e_?.txt
$ aws ls -l test681 --max-keys=10 --batch=ascii/char_0e
-rw------- 1 timkay681 2 2009-04-18 23:27:11 ascii/char_0e_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:11 ascii/char_0f_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:12 ascii/char_10_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:12 ascii/char_11_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:12 ascii/char_12_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:13 ascii/char_13_?.txt
-rw------- 1 timkay681 2 2009-04-18 23:27:14 ascii/char_14_?.txt
-rw------- 1 timkay681 2 2009-04-19 00:56:26 ascii/char_15_?.txt
-rw------- 1 timkay681 2 2009-04-19 00:56:26 ascii/char_16_?.txt
-rw------- 1 timkay681 2 2009-04-19 00:56:26 ascii/char_17_?.txt
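Note that the --batch=X value does not have to be an existing key; it only has to sort before the next key that I want. The _?.txt suffix contains non-printing characters, so I omitted it, and the marker still worked as desired. Because each marker I used sorts before the last key of the previous batch, that key appears again as the first line of the next batch.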
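The marker mechanism can also be modeled locally. The Python sketch below is an illustration under stated assumptions, not the tool's code: list_batch and iterate_all are hypothetical names, the keys are simplified stand-ins for the ascii/char_NN_?.txt objects above, and nothing is sent to S3. It uses the last key of each batch as the next marker, which avoids the repeated line seen in the transcript.

```python
# Local model of marker-based batching; nothing here talks to S3.
# Simplified, hypothetical stand-ins for the 23 ascii/char_NN_?.txt keys.
KEYS = [f"ascii/char_{i:02x}.txt" for i in range(1, 24)]


def list_batch(keys, marker="", max_keys=1000):
    """Return up to max_keys keys that sort strictly after marker."""
    after = [k for k in sorted(keys) if k > marker]
    return after[:max_keys]


def iterate_all(keys, max_keys=1000):
    """Yield successive batches, using the last key seen as the next marker."""
    marker = ""
    while True:
        batch = list_batch(keys, marker, max_keys)
        if not batch:
            break
        yield batch
        marker = batch[-1]


# A marker that is not itself a key still works: listing starts at the
# first key that sorts after it.
print(list_batch(KEYS, "ascii/char_05", 3))
# ['ascii/char_05.txt', 'ascii/char_06.txt', 'ascii/char_07.txt']

print([len(b) for b in iterate_all(KEYS, max_keys=10)])  # [10, 10, 3]
```

Using the full last key as the marker means each key is returned exactly once, at the cost of having to pass the complete key, including any non-printing characters, on the next call.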