I have a scenario where we have many clients uploading to s3.
- What is the best approach to knowing that there is a new file?
- Is it realistic, or even a good idea, for me to poll the bucket every few seconds?
UPDATE:
Since November 2014, S3 supports the following event notifications:
- s3:ObjectCreated:Put – An object was created by an HTTP PUT operation.
- s3:ObjectCreated:Post – An object was created by an HTTP POST operation.
- s3:ObjectCreated:Copy – An object was created by an S3 copy operation.
- s3:ObjectCreated:CompleteMultipartUpload – An object was created by the completion of an S3 multi-part upload.
- s3:ObjectCreated:* – An object was created by one of the event types listed above or by a similar object creation event added in the future.
- s3:ReducedRedundancyLostObject – An S3 object stored with Reduced Redundancy has been lost.
These notifications can be issued to Amazon SNS, SQS or Lambda. Check out the blog post linked in Alan's answer for more information on these new notifications.
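As a rough illustration of how these event types are wired to a destination, here is a minimal sketch using boto3's `put_bucket_notification_configuration`. The helper names, the bucket name and the queue ARN are placeholders of mine, not something from the blog post, and the sketch assumes the SQS queue's access policy already allows S3 to send messages to it.

```python
def build_queue_notification(queue_arn, events=("s3:ObjectCreated:*",)):
    """Build a NotificationConfiguration dict that routes the given events to SQS."""
    return {
        "QueueConfigurations": [
            {"QueueArn": queue_arn, "Events": list(events)}
        ]
    }


def enable_notifications(s3_client, bucket, queue_arn):
    """Attach the configuration to the bucket.

    Note: this call replaces whatever notification configuration
    the bucket already had.
    """
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_queue_notification(queue_arn),
    )
```

With boto3 installed and credentials configured, usage would look something like `enable_notifications(boto3.client("s3"), "my-upload-bucket", "arn:aws:sqs:us-east-1:123456789012:new-uploads")`, where both the bucket name and the ARN are made up for the example.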
Original Answer:
Although Amazon S3 has a bucket notification system in place, it does not support notifications for anything but the s3:ReducedRedundancyLostObject event (see the GET Bucket notification section in their API).
Currently the only way to check for new objects is to poll the bucket at a preset time interval or build your own notification logic in the upload clients (possibly based on Amazon SNS).
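For reference, the polling approach could look roughly like the sketch below. It assumes a boto3 S3 client is passed in. The function names, the `on_new` callback and the `max_polls` parameter are my own illustration. Every poll lists the whole bucket (or prefix), so the cost grows with the number of objects, which is why polling every few seconds tends to scale poorly.

```python
import time


def list_keys(s3_client, bucket, prefix=""):
    """Return the set of all object keys under prefix, following pagination."""
    keys = set()
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            keys.add(obj["Key"])
    return keys


def poll_for_new_keys(s3_client, bucket, on_new, interval_seconds=30, max_polls=None):
    """Call on_new(key) for each key that appears between two listings.

    max_polls=None loops forever; a number stops after that many polls.
    """
    seen = list_keys(s3_client, bucket)  # baseline: objects already present
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval_seconds)
        current = list_keys(s3_client, bucket)
        for key in sorted(current - seen):
            on_new(key)
        # Track only the latest listing, so a key that is deleted and
        # later re-uploaded is reported again.
        seen = current
        polls += 1
```

Note that this only detects new keys. An overwrite of an existing key would go unnoticed unless you also compare `LastModified` or `ETag` from the listing.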
Push notifications are now built into S3:
http://aws.amazon.com/blogs/aws/s3-event-notification/
You can send notifications to SQS or SNS when an object is created via PUT or POST, or when a multi-part upload is finished.
Your best option nowadays is the AWS Lambda service. You can write a Lambda function in Node.js (JavaScript), Java or Python (more options will probably be added in time). The Lambda service lets you write functions that respond to events from S3, such as a file upload. It is cost-effective, scalable and easy to use.
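To give an idea of what such a function looks like, here is a minimal Python sketch of a handler for S3 upload events. It assumes the standard S3 event message layout (`Records`, each with `eventName` and an `s3` section). The helper name `extract_uploads` and the return value are my own choices, and the `print` is a stand-in for whatever processing you actually need.

```python
from urllib.parse import unquote_plus


def extract_uploads(event):
    """Return (bucket, key) pairs for every ObjectCreated record in an S3 event."""
    uploads = []
    for record in event.get("Records", []):
        # Skip anything that is not an object-creation event.
        if "ObjectCreated:" not in record.get("eventName", ""):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the event payload (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        uploads.append((bucket, key))
    return uploads


def lambda_handler(event, context):
    """Entry point Lambda invokes; logs each newly uploaded object."""
    uploads = extract_uploads(event)
    for bucket, key in uploads:
        print(f"New object: s3://{bucket}/{key}")
    return {"processed": len(uploads)}
```

The bucket still has to be configured to send its ObjectCreated events to the function. The handler itself only reads the event it is given.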