I am using:
aws s3api list-objects --endpoint-url https://my.end.point/ --bucket my.bucket.name --query 'Contents[].Key' --output text
to get the list of files in a bucket.
The aws s3api list-objects documentation page says that this command returns only up to 1000 objects. However, I noticed that in my case it returns the names of all files in my bucket. For example, when I run the following command:
aws s3api list-objects --endpoint-url https://my.end.point/ --bucket my.bucket.name --query 'Contents[].Key' --output text | tr "\t" "\n" | wc -l
I get 13512 displayed, meaning that more than 13 thousand file names were returned.
Am I missing something?
I use the following AWS CLI version:
aws-cli/1.10.57 Python/2.7.3 Linux/3.2.0-4-amd64 botocore/1.4.47
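I think that the part "(up to 1000)" in the documentation's description [1] is highly misleading. It refers to the maximum page size per underlying HTTP request sent by the CLI, not to the total number of results. The documentation of the --page-size option makes this clear: it sets the size of each page fetched in a service call and does not affect the number of items returned in the command's output. It gets even clearer when reading the AWS documentation about pagination [2], which describes how the CLI by default makes multiple calls and retrieves all available items.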
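One way to check this is to count the returned keys at two different page sizes. The sketch below reuses the endpoint and bucket placeholders from the question; count_keys is a hypothetical helper name, not part of the AWS CLI. If --page-size only controls the size of each underlying request, both calls should report the same total.

```shell
# Placeholders taken from the question; replace with real values.
ENDPOINT="https://my.end.point/"
BUCKET="my.bucket.name"

# Hypothetical helper: count the keys returned for a given page size.
count_keys() {
  # $1: number of items requested per underlying HTTP call
  aws s3api list-objects \
    --endpoint-url "$ENDPOINT" \
    --bucket "$BUCKET" \
    --page-size "$1" \
    --query 'Contents[].Key' \
    --output text | tr '\t' '\n' | wc -l
}

# Usage: both calls are expected to print the same total.
#   count_keys 1000
#   count_keys 100
```

A smaller page size means more requests to the service, so the second call may simply take longer.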
As Ankit already stated correctly, using the --max-items option is the correct way to limit the result and stop the automatic pagination.
References
[1] https://docs.aws.amazon.com/cli/latest/reference/s3api/list-objects.html
[2] https://docs.aws.amazon.com/cli/latest/userguide/cli-usage-pagination.html
Try using --max-items with the command. The documentation says that a NextToken is returned in the output when the number of items is greater than --max-items. You can pass it as --starting-token in the next call to achieve pagination.
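A minimal sketch of that manual pagination loop follows. It assumes jq is installed for reading the JSON output, and that the token appears as a top-level NextToken field as described in the CLI pagination documentation. The endpoint and bucket are the placeholders from the question, and list_keys_in_chunks is a hypothetical helper name.

```shell
# Placeholders taken from the question; replace with real values.
ENDPOINT="https://my.end.point/"
BUCKET="my.bucket.name"

# Hypothetical helper: print all keys, fetching at most $1 items per CLI call.
list_keys_in_chunks() {
  chunk_size="$1"
  token=""
  while :; do
    if [ -n "$token" ]; then
      # Resume where the previous call stopped.
      page=$(aws s3api list-objects \
        --endpoint-url "$ENDPOINT" --bucket "$BUCKET" \
        --max-items "$chunk_size" --starting-token "$token" \
        --output json)
    else
      # First call: no token yet.
      page=$(aws s3api list-objects \
        --endpoint-url "$ENDPOINT" --bucket "$BUCKET" \
        --max-items "$chunk_size" \
        --output json)
    fi

    # Print the keys of this chunk (jq is an assumed dependency).
    printf '%s\n' "$page" | jq -r '.Contents[]?.Key'

    # Stop once the output no longer carries a NextToken.
    token=$(printf '%s\n' "$page" | jq -r '.NextToken // empty')
    [ -n "$token" ] || break
  done
}

# Usage:
#   list_keys_in_chunks 500
```

Each iteration is a separate CLI invocation limited by --max-items, so the loop ends as soon as a response comes back without a token.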