Tried this:
import boto3
from boto3.s3.transfer import TransferConfig, S3Transfer
path = "/temp/"
fileName = "bigFile.gz" # this happens to be a 5.9 Gig file
client = boto3.client('s3', region)
config = TransferConfig(
    multipart_threshold=4*1024,  # number of bytes
    max_concurrency=10,
    num_download_attempts=10,
)
transfer = S3Transfer(client, config)
transfer.upload_file(path+fileName, 'bucket', 'key')
Result: 5.9 gig file on s3. Doesn't seem to contain multiple parts.
I found this example, but part is not defined.
import boto3
bucket = 'bucket'
path = "/temp/"
fileName = "bigFile.gz"
key = 'key'
s3 = boto3.client('s3')
# Initiate the multipart upload and send the part(s)
mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
with open(path+fileName, 'rb') as data:
    part1 = s3.upload_part(Bucket=bucket,
                           Key=key,
                           PartNumber=1,
                           UploadId=mpu['UploadId'],
                           Body=data)
# Next, we need to gather information about each part to complete
# the upload. Needed are the part number and ETag.
part_info = {
    'Parts': [
        {
            'PartNumber': 1,
            'ETag': part['ETag']
        }
    ]
}
# Now the upload works!
s3.complete_multipart_upload(Bucket=bucket,
                             Key=key,
                             UploadId=mpu['UploadId'],
                             MultipartUpload=part_info)
Question: Does anyone know how to use the multipart upload with boto3?
Your code was already correct. Indeed, a minimal example of a multipart upload just looks like this:
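# a minimal sketch, reusing the path, bucket name and key from the question
import boto3
s3 = boto3.client('s3')
s3.upload_file('/temp/bigFile.gz', 'bucket', 'key')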
You don't need to explicitly ask for a multipart upload, or use any of the lower-level functions in boto3 that relate to multipart uploads. Just call upload_file, and boto3 will automatically use a multipart upload if your file size is above a certain threshold (which defaults to 8 MB).
You seem to have been confused by the fact that the end result in S3 wasn't visibly made up of multiple parts ("Result: 5.9 gig file on s3. Doesn't seem to contain multiple parts"), but this is the expected outcome. The whole point of the multipart upload API is to let you upload a single file over multiple HTTP requests and end up with a single object in S3.
I would advise you to use boto3.s3.transfer for this purpose. Here is an example:
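A sketch along these lines, with an arbitrary 25 MB threshold and chunk size and the file, bucket and key names taken from the question:
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client('s3')
config = TransferConfig(
    multipart_threshold=25 * 1024 * 1024,  # switch to multipart above 25 MB
    multipart_chunksize=25 * 1024 * 1024,  # 25 MB per part
    max_concurrency=10,
    use_threads=True,
)
s3.upload_file('/temp/bigFile.gz', 'bucket', 'key', Config=config)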
Change part to part1: the variable returned by upload_part in your snippet is named part1, but the ETag lookup in part_info uses part, which is never defined.
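With that one change, the part_info dictionary from the question becomes:
part_info = {
    'Parts': [
        {
            'PartNumber': 1,
            'ETag': part1['ETag']  # was part['ETag']; part is never defined
        }
    ]
}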
Why not use just the copy option in boto3?
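A sketch of what that might look like; the bucket and key names below are placeholders, and copy() is the managed transfer call, which handles multipart transfers for large objects automatically:
import boto3

s3 = boto3.resource('s3')
copy_source = {'Bucket': 'source-bucket', 'Key': 'bigFile.gz'}
s3.meta.client.copy(copy_source, 'target-bucket', 'bigFile.gz')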
There are details on how to initialise the s3 object, and the further options available for the call, in the boto3 docs.
In your code snippet, part should clearly be part1 in the dictionary. Typically, you would have several parts (otherwise why use a multipart upload at all), and the 'Parts' list would contain an element for each part, as in the sketch below.
You may also be interested in the new pythonic interface to dealing with S3: http://s3fs.readthedocs.org/en/latest/
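For illustration, a sketch of uploading several parts in a loop; the chunk size and the bucket, key and file names are placeholders:
import boto3

bucket = 'bucket'
key = 'key'
path = "/temp/bigFile.gz"
chunk_size = 50 * 1024 * 1024  # every part except the last must be at least 5 MB

s3 = boto3.client('s3')
mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)

parts = []
part_number = 1
with open(path, 'rb') as data:
    while True:
        chunk = data.read(chunk_size)
        if not chunk:
            break
        part = s3.upload_part(Bucket=bucket, Key=key,
                              PartNumber=part_number,
                              UploadId=mpu['UploadId'],
                              Body=chunk)
        # record the part number and ETag for the completion call
        parts.append({'PartNumber': part_number, 'ETag': part['ETag']})
        part_number += 1

s3.complete_multipart_upload(Bucket=bucket, Key=key,
                             UploadId=mpu['UploadId'],
                             MultipartUpload={'Parts': parts})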