DynamoDB InputFormat for Hadoop

Posted 2019-04-12 07:49

I have to process some data that is stored in Amazon DynamoDB using Hadoop MapReduce.

I searched the internet for a Hadoop InputFormat for DynamoDB and couldn't find one. I'm not familiar with DynamoDB, so I'm guessing there is some trick to using it with Hadoop? If an implementation of such an InputFormat exists anywhere, could you please share it?

Tags: hadoop amazon-web-services mapreduce amazon-dynamodb elastic-map-reduce

2 answers

爱情/是我丢掉的垃圾

#2 · 2019-04-12 08:37

After a lot of searching I found DynamoDBInputFormat and DynamoDBOutputFormat in one of Amazon's libraries.

On Amazon Elastic MapReduce there is a library called hive-bigbird-handler, which contains an input format and an output format for DynamoDB. The full class names are org.apache.hadoop.hive.dynamodb.read.DynamoDBInputFormat and org.apache.hadoop.hive.dynamodb.write.DynamoDBOutputFormat.

I hope these classes will be useful to the community.
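Before wiring these classes into a job, it can help to confirm that they are actually visible on the classpath of the machine you are running on. The following is only a minimal sketch under that assumption: the two class names are the ones quoted above, while the `DynamoDBFormatCheck` class and its methods are hypothetical helpers written for illustration. It uses nothing beyond the Java standard library and does not load or configure Hadoop itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reports whether the DynamoDB formats named in the
// answer can be found on the current classpath.
public class DynamoDBFormatCheck {
    // Class names exactly as quoted in the answer.
    static final String INPUT_FORMAT =
            "org.apache.hadoop.hive.dynamodb.read.DynamoDBInputFormat";
    static final String OUTPUT_FORMAT =
            "org.apache.hadoop.hive.dynamodb.write.DynamoDBOutputFormat";

    // Returns true if the named class can be located without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false,
                    DynamoDBFormatCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a dependency of it that is missing.
            return false;
        }
    }

    // Checks both formats and keeps them in a stable order for printing.
    static Map<String, Boolean> checkFormats() {
        Map<String, Boolean> result = new LinkedHashMap<>();
        result.put(INPUT_FORMAT, isOnClasspath(INPUT_FORMAT));
        result.put(OUTPUT_FORMAT, isOnClasspath(OUTPUT_FORMAT));
        return result;
    }

    public static void main(String[] args) {
        checkFormats().forEach((name, found) ->
                System.out.println((found ? "found   " : "missing ") + name));
    }
}
```

If both lines print "missing", the library's jar is probably not on the classpath of that JVM. Where the jar lives on a given cluster is not stated in this thread, so that part is left to the reader.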


别忘想泡老子

#3 · 2019-04-12 08:40

I couldn't find an InputFormat that you can use directly in MapReduce. However, the article "AWS HowTo: Using Amazon Elastic MapReduce with DynamoDB (Guest Post)" shows how to run MapReduce jobs using Hive.

