I want to create a large (multi-GB) file in an AWS S3 bucket from an ASP.NET Core Web API. The file is sufficiently large that I don't want to load the Stream into memory before uploading it to AWS S3.
Using PutObjectAsync() I'm forced to pre-populate the Stream before passing it on to the AWS SDK, as illustrated below:
var putObjectRequest = new PutObjectRequest
{
    BucketName = "my-s3-bucket",
    Key = "my-file-name.txt",
    InputStream = stream
};

var putObjectResponse = await amazonS3Client.PutObjectAsync(putObjectRequest);
My ideal pattern would involve the AWS SDK returning a StreamWriter (of sorts) that I could Write() to many times and then Finalise() when I'm done.
Two questions concerning my challenge:

- Am I misinformed about having to pre-populate the Stream prior to calling PutObjectAsync()?
- How should I go about uploading my large (multi-GB) file?
For such situations the AWS docs provide two options:
The high-level API simply suggests that you create a TransferUtilityUploadRequest with a PartSize specified, so the class itself can carry out the multipart upload without you having to manage it yourself. In this case you can track the progress of the multipart upload by subscribing to the StreamTransferProgress event. You can upload a file, a stream, or a directory.
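A minimal sketch of the high-level approach, assuming an existing IAmazonS3 client (amazonS3Client) and a readable sourceStream; the bucket name, key, and part size are placeholders, and the progress event shown (UploadProgressEvent) is the TransferUtility-level one:

using System;
using Amazon.S3.Transfer;

var transferUtility = new TransferUtility(amazonS3Client);

var uploadRequest = new TransferUtilityUploadRequest
{
    BucketName = "my-s3-bucket",      // placeholder
    Key = "my-file-name.txt",         // placeholder
    InputStream = sourceStream,       // any readable Stream
    PartSize = 16 * 1024 * 1024       // 16 MB parts; tune to your workload
};

// Report progress as parts are transferred.
uploadRequest.UploadProgressEvent += (sender, args) =>
    Console.WriteLine($"{args.TransferredBytes}/{args.TotalBytes} bytes uploaded");

await transferUtility.UploadAsync(uploadRequest);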
The low-level API is obviously more complicated, but more flexible: you initiate the upload yourself, then upload the parts of the file one by one in a loop, and complete the upload once every part has gone through. The documentation includes sample code for this flow.
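A minimal sketch of that low-level flow, again assuming an existing IAmazonS3 client (amazonS3Client); this is not the documentation's exact sample, and the bucket name, key, file path, and part size are placeholders:

using System;
using System.Collections.Generic;
using System.IO;
using Amazon.S3.Model;

var bucketName = "my-s3-bucket";               // placeholder
var key = "my-file-name.txt";                  // placeholder
var filePath = @"C:\temp\my-file-name.txt";    // placeholder
long partSize = 16 * 1024 * 1024;              // every part except the last must be at least 5 MB

// 1. Initiate the multipart upload and keep the upload id.
var initResponse = await amazonS3Client.InitiateMultipartUploadAsync(
    new InitiateMultipartUploadRequest { BucketName = bucketName, Key = key });

var partETags = new List<PartETag>();
long contentLength = new FileInfo(filePath).Length;
long filePosition = 0;
var partNumber = 1;

try
{
    // 2. Upload the file part by part, collecting each part's ETag.
    while (filePosition < contentLength)
    {
        var partResponse = await amazonS3Client.UploadPartAsync(new UploadPartRequest
        {
            BucketName = bucketName,
            Key = key,
            UploadId = initResponse.UploadId,
            PartNumber = partNumber,
            PartSize = Math.Min(partSize, contentLength - filePosition),
            FilePosition = filePosition,
            FilePath = filePath               // an InputStream can be supplied per part instead
        });
        partETags.Add(new PartETag(partNumber, partResponse.ETag));

        filePosition += partSize;
        partNumber++;
    }

    // 3. Complete the upload by handing back the collected part ETags.
    await amazonS3Client.CompleteMultipartUploadAsync(new CompleteMultipartUploadRequest
    {
        BucketName = bucketName,
        Key = key,
        UploadId = initResponse.UploadId,
        PartETags = partETags
    });
}
catch
{
    // Abort on failure so S3 does not keep (and charge for) the orphaned parts.
    await amazonS3Client.AbortMultipartUploadAsync(new AbortMultipartUploadRequest
    {
        BucketName = bucketName,
        Key = key,
        UploadId = initResponse.UploadId
    });
    throw;
}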
An asynchronous version of UploadPart is available too (UploadPartAsync, as used in the sketch above), so that is the path to investigate if you need full control over your uploads.