I currently use TransferManager to download all files in an S3 bucket from a Lambda function:
// Initialize
TransferManagerBuilder txBuilder = TransferManagerBuilder.standard();
// txBuilder.setExecutorFactory(() -> Executors.newFixedThreadPool(50));
TransferManager tx = txBuilder.build();
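// The prefix must be a simple name without path separators; the directory is created under java.io.tmpdir
final Path tmpDir = Files.createTempDirectory("s3_download");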
// Download
MultipleFileDownload download = tx.downloadDirectory(bucketName,
        bucketKey,
        tmpDir.toFile());
download.waitForCompletion();
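// Files.list keeps the directory open until the stream is closed
try (Stream<Path> files = Files.list(tmpDir.resolve(bucketKey))) {
    return files.collect(Collectors.toList());
}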
It takes around 300 seconds to download 10,000 files (~20 KB each), which works out to a transfer rate of about 666 KB/s.
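To show where that figure comes from, here is a minimal sketch of the arithmetic. The class and method names are hypothetical, and it assumes exactly 10,000 files of exactly 20 KB each, which is only an approximation of the real bucket contents.

```java
import java.util.Locale;

public class TransferRate {
    // Aggregate throughput in KB/s: total payload divided by wall-clock time.
    static double kilobytesPerSecond(long fileCount, double fileSizeKb, double seconds) {
        return fileCount * fileSizeKb / seconds;
    }

    // Average wall-clock time per file in milliseconds.
    static double millisPerFile(long fileCount, double seconds) {
        return seconds * 1000.0 / fileCount;
    }

    public static void main(String[] args) {
        long fileCount = 10_000;   // approximate number of objects (assumption: exact)
        double fileSizeKb = 20.0;  // approximate object size (assumption: uniform)
        double seconds = 300.0;    // observed download time

        // prints "666.7 KB/s"
        System.out.println(String.format(Locale.ROOT, "%.1f KB/s",
                kilobytesPerSecond(fileCount, fileSizeKb, seconds)));
        // prints "30.0 ms per file"
        System.out.println(String.format(Locale.ROOT, "%.1f ms per file",
                millisPerFile(fileCount, seconds)));
    }
}
```

The same numbers give an average of about 30 ms of wall-clock time per file, which may be a more useful figure than KB/s when the objects are this small.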
Increasing the thread pool size doesn't seem to affect the transfer rate at all.
The S3 endpoint and the Lambda function are in the same AWS region and the same AWS account.
How can I optimize the S3 downloads?