Append data to an S3 object

Published 2020-01-30 11:42

Question:

Let's say that I have a machine that I want to be able to write to a certain log file stored in an S3 bucket.

So, the machine needs write access to that bucket, but I don't want it to be able to overwrite or delete any files in that bucket (including the one I want it to write to).

So basically, I want my machine to be able only to append data to that log file, without overwriting it or downloading it.

Is there a way to configure S3 to work like that? Maybe there's some IAM policy I can attach so it will work the way I want?

Answer 1:

Unfortunately, you can't.

S3 doesn't have an "append" operation.* Once an object has been uploaded, there is no way to modify it in place; your only option is to upload a new object to replace it, which doesn't meet your requirements.

*: Yes, I know this post is a couple of years old. It's still accurate, though.



Answer 2:

As the accepted answer states, you can't. The best solution I'm aware of is to use:

AWS Kinesis Firehose

https://aws.amazon.com/kinesis/firehose/

Their code sample looks complicated, but yours can be really simple. You keep performing PUT (or BATCH PUT) operations onto a Kinesis Firehose delivery stream in your application (using the AWS SDK), and you configure the delivery stream to send your streamed data to an S3 bucket of your choice (in the AWS Kinesis Firehose console).
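For reference, here is a minimal sketch of the batched variant (BATCH PUT) using the AWS SDK for JavaScript (v2). The delivery stream name 'my-log-stream' is a made-up example; the stream itself is pointed at your S3 bucket in the Firehose console.

var AWS = require('aws-sdk');
var firehose = new AWS.Firehose();

// A handful of log lines to deliver in one call.
var lines = ['first log line', 'second log line', 'third log line'];

var params = {
  DeliveryStreamName: 'my-log-stream', // hypothetical stream name
  Records: lines.map(function(line) {
    // Firehose concatenates record payloads as-is, so add the newline yourself.
    return { Data: line + '\n' };
  })
};

firehose.putRecordBatch(params, function(err, data) {
  if (err) console.log(err, err.stack);                         // an error occurred
  else     console.log('Failed records:', data.FailedPutCount); // check for partial failures
});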

It's still not as convenient as >> on the Linux command line, because once a file has been created on S3 you again have to deal with downloading, appending, and uploading the new file. But you only have to do that once per batch of lines rather than for every line of data, so you don't need to worry about huge charges from the volume of append operations. Maybe it can be done, but I can't see how to do it from the console.



Answer 3:

Objects on S3 are not appendable. You have two solutions in this case:

  1. Copy all the existing S3 object's data, append the new content, and write the result back to S3:
// Requires the AWS SDK for JavaScript (v2) and an S3 client.
var AWS = require('aws-sdk');
var s3 = new AWS.S3();

function writeToS3(input) {
    var content;
    var getParams = {
        Bucket: 'myBucket',
        Key: "myKey"
    };

    // Read the whole existing object...
    s3.getObject(getParams, function(err, data) {
        if (err) console.log(err, err.stack);
        else {
            content = Buffer.from(data.Body).toString("utf8");
            // ...append the new line in memory...
            content = content + '\n' + new Date() + '\t' + input;
            var putParams = {
                Body: content,
                Bucket: 'myBucket',
                Key: "myKey",
                ACL: "public-read"
            };

            // ...and write the whole object back, replacing the original.
            s3.putObject(putParams, function(err, data) {
                if (err) console.log(err, err.stack); // an error occurred
                else     console.log(data);           // successful response
            });
        }
    });
}
  2. Use Kinesis Firehose. This is fairly straightforward: you create your Firehose delivery stream and link its destination to your S3 bucket. That's it!
// Requires the AWS SDK for JavaScript (v2) and a Firehose client.
var AWS = require('aws-sdk');
var firehose = new AWS.Firehose();

function writeToS3(input) {
    var content = "\n" + new Date() + "\t" + input;
    var params = {
      DeliveryStreamName: 'myDeliveryStream', /* required */
      Record: { /* required */
        Data: Buffer.from(content) /* strings and Buffers are Base64-encoded by the SDK */
      }
    };

    firehose.putRecord(params, function(err, data) {
      if (err) console.log(err, err.stack); // an error occurred
      else     console.log(data);           // successful response
    });
}


Answer 4:

I had a similar issue, and this is what I had asked:

how to Append data in file using AWS Lambda

Here's what I came up with to solve the above problem:

Use getObject to retrieve the data from the existing file:

   // This snippet runs inside the Lambda handler, so `event` (and `callback` used
   // below) come from the handler's arguments; `bucketPath` holds the bucket name.
   var AWS = require('aws-sdk');
   var s3 = new AWS.S3();
   var projects = [];
   var getParams = {
       Bucket: bucketPath,
       Key: "projects.json"
   };

   s3.getObject(getParams, function(err, data) {
       if (err) console.log(err, err.stack); // an error occurred
       else {
           console.log(data);           // successful response
           var s3Projects = JSON.parse(data.Body);
           console.log('s3 data==>', s3Projects);
           if (s3Projects.length > 0) {
               projects = s3Projects;   // start from what is already stored
           }
       }
       projects.push(event);            // append the new record
       writeToS3();                     // calling function to append the data
   });

Write a function to append to the file:

   function writeToS3() {
       var putParams = {
           Body: JSON.stringify(projects),
           Bucket: bucketPath,
           Key: "projects.json",
           ACL: "public-read"
       };

       s3.putObject(putParams, function(err, data) {
           if (err) console.log(err, err.stack); // an error occurred
           else     console.log(data);           // successful response
           callback(null, 'Hello from Lambda');  // finish the Lambda invocation either way
       });
   }

Hope this helps!



Answer 5:

As others have stated previously, S3 objects are not appendable.
However, another solution would be to write out to CloudWatch Logs and then export the logs you want to S3. This would also prevent any attackers who access your server from deleting from your S3 bucket, since Lambda wouldn't require any S3 permissions.
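For example, a Lambda function can simply console.log its data (which lands in CloudWatch Logs), and a separate job can later export a log group to S3 with the CreateExportTask API. Below is a minimal sketch using the AWS SDK for JavaScript (v2); the log group, bucket, and task names are made-up examples, and the destination bucket needs a policy that allows CloudWatch Logs to write to it.

var AWS = require('aws-sdk');
var cloudwatchlogs = new AWS.CloudWatchLogs();

var params = {
  taskName: 'export-last-24h',              // hypothetical task name
  logGroupName: '/my-app/logs',             // hypothetical log group
  from: Date.now() - 24 * 60 * 60 * 1000,   // start of the export window (ms since epoch)
  to: Date.now(),                           // end of the export window
  destination: 'my-log-archive',            // hypothetical destination S3 bucket
  destinationPrefix: 'exported-logs'        // optional key prefix inside the bucket
};

cloudwatchlogs.createExportTask(params, function(err, data) {
  if (err) console.log(err, err.stack); // an error occurred
  else     console.log(data);           // contains the taskId of the export task
});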



Answer 6:

In case anyone wants to append data to an object with an S3-like service, Alibaba Cloud OSS (Object Storage Service) supports this natively.

OSS provides append upload (through the AppendObject API), which allows you to directly append content to the end of an object. Objects uploaded by using this method are appendable objects, whereas objects uploaded by using other methods are normal objects. The appended data is instantly readable.
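
For illustration, here is a minimal sketch of appending with the ali-oss Node.js SDK. The region, bucket, and object names are made-up examples, and the credentials are assumed to be available as environment variables.

const OSS = require('ali-oss');

const client = new OSS({
  region: 'oss-cn-hangzhou',                      // hypothetical region
  accessKeyId: process.env.OSS_ACCESS_KEY_ID,
  accessKeySecret: process.env.OSS_ACCESS_KEY_SECRET,
  bucket: 'my-log-bucket'                         // hypothetical bucket
});

async function demo() {
  // The first append (at position 0) creates the appendable object.
  let result = await client.append('machine.log', Buffer.from('first line\n'));

  // Later appends must pass the position returned by the previous call.
  result = await client.append('machine.log', Buffer.from('second line\n'), {
    position: result.nextAppendPosition
  });
  console.log('next append position:', result.nextAppendPosition);
}

demo().catch(console.error);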