I need to put all the records from various servers into Kinesis and output the data into multiple S3 files. I have been trying to do this with ShardID, but I have not been able to make it work.
Could you please help?
Python or Java would be fine.
ShardID is not that important.
Shards are just about capacity: your data is spread across them, and they do not affect what you put in or what you get out. (They also affect parallelization through the hashed partition key, but that is a separate topic, and I am leaving it out to avoid confusion.)
What you should care about is the "put_record" or "put_records" method on the producer (i.e. input) side, and the records emitted on the consumer (i.e. output) side. Do not worry about which shard a record passed through; just take the record on the consumer side and process it according to your business needs.
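As a rough illustration of the producer side, here is a minimal Python sketch built around boto3's `put_records`. The helper names (`to_entries`, `send_events`), the `server` field used as the partition key, and the JSON encoding are assumptions made for this example, not anything Kinesis requires. The client is passed in, so the functions themselves do not create any AWS connection.

```python
import json

MAX_BATCH = 500  # PutRecords accepts at most 500 records per request


def to_entries(events, key_field="server"):
    """Turn dicts into PutRecords entries.

    key_field is a hypothetical field name; any string works as a partition key.
    """
    return [
        {"Data": json.dumps(e).encode("utf-8"), "PartitionKey": str(e[key_field])}
        for e in events
    ]


def send_events(kinesis, stream_name, events, key_field="server"):
    """Send events in batches and return the entries reported as failed."""
    entries = to_entries(events, key_field)
    failed = []
    for start in range(0, len(entries), MAX_BATCH):
        batch = entries[start:start + MAX_BATCH]
        response = kinesis.put_records(StreamName=stream_name, Records=batch)
        # The response holds one result per entry, in request order;
        # entries that were not accepted carry an ErrorCode.
        for entry, result in zip(batch, response["Records"]):
            if "ErrorCode" in result:
                failed.append(entry)
    return failed
```

On each server you would call something like `send_events(boto3.client("kinesis"), "my-stream", events)`, where `my-stream` is a placeholder name, and retry whatever comes back in the returned list. Note that no shard ID appears anywhere; the partition key alone decides where a record lands.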
The Kinesis Client Library ( https://github.com/awslabs/amazon-kinesis-client ) is the best way to get this abstraction.
There is also a sample project on GitHub, Amazon Kinesis Connectors ( https://github.com/awslabs/amazon-kinesis-connectors ), that consumes data and uploads it to S3.
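If you would rather write the S3 part yourself, the idea on the consumer side can be sketched roughly as follows in Python. This is not how the Connectors project does it; it is only a simplified illustration. It assumes the record payloads are JSON objects with a `server` field, and the helper names, the key layout (`<server>/<batch_id>.jsonl`) and the `batch_id` argument are all made up for the example. The `s3` argument is expected to be a boto3 S3 client, and `put_object` is the only call used on it.

```python
import json
from collections import defaultdict


def group_by_field(payloads, field="server"):
    """Group decoded JSON payloads by a field (hypothetical: 'server')."""
    groups = defaultdict(list)
    for raw in payloads:
        event = json.loads(raw)  # raw is the record's Data (bytes or str)
        groups[str(event[field])].append(event)
    return dict(groups)


def write_groups_to_s3(s3, bucket, groups, batch_id):
    """Write one newline-delimited JSON object per group; return the keys."""
    keys = []
    for name, events in sorted(groups.items()):
        key = f"{name}/{batch_id}.jsonl"  # assumed key layout
        body = "\n".join(json.dumps(e) for e in events).encode("utf-8")
        s3.put_object(Bucket=bucket, Key=key, Body=body)
        keys.append(key)
    return keys
```

Inside your record processor you would collect the `Data` payloads of a batch, call `group_by_field`, and then `write_groups_to_s3(boto3.client("s3"), "my-bucket", groups, batch_id)`, with the bucket name and batch identifier being your own choice. Which file a record ends up in is decided by the record's content, not by the shard it travelled through.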