Custom InputFormat to process protobufs in Hadoop

Posted 2019-08-07 10:43

冷血范

Question:

I'd like to process protobufs using Hadoop, but I'm unsure where to start. I don't care about splitting large files. The protobufs are stored as binary data. Which class should I extend to make this easier?

Answer 1:

elephant-bird can process protobufs in Hadoop. The framework generates Hadoop I/O classes alongside the regular protobuf classes. It uses LZO compression.
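
If elephant-bird is more than you need, the usual plain-Hadoop route is to extend FileInputFormat, override isSplitable to return false (since splitting doesn't matter here), and return a custom RecordReader that pulls one message at a time out of the file. The hard part of that RecordReader is knowing where one message ends and the next begins. The sketch below is not from the answer above and uses only the JDK. It assumes each record is stored as a base-128 varint length followed by that many payload bytes, which is the framing protobuf's writeDelimitedTo produces. If your files are laid out differently, the loop will differ. The class name DelimitedFrames and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reads and writes varint-length-prefixed records. */
class DelimitedFrames {

    /** Writes one record as a varint length followed by the payload bytes. */
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        int len = payload.length;
        // Base-128 varint: 7 bits per byte, high bit set while more bytes follow.
        while ((len & ~0x7F) != 0) {
            out.write((len & 0x7F) | 0x80);
            len >>>= 7;
        }
        out.write(len);
        out.write(payload);
    }

    /** Returns the next record's payload, or null at a clean end of stream. */
    static byte[] readFrame(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return null; // no more records
        }
        int len = b & 0x7F;
        int shift = 7;
        while ((b & 0x80) != 0) {
            if (shift > 28) {
                throw new IOException("length prefix longer than 5 bytes");
            }
            b = in.read();
            if (b == -1) {
                throw new EOFException("truncated length prefix");
            }
            len |= (b & 0x7F) << shift;
            shift += 7;
        }
        if (len < 0) {
            throw new IOException("negative record length");
        }
        byte[] payload = new byte[len];
        int off = 0;
        // read() may return fewer bytes than requested, so loop until full.
        while (off < len) {
            int n = in.read(payload, off, len - off);
            if (n == -1) {
                throw new EOFException("truncated record");
            }
            off += n;
        }
        return payload;
    }

    /** Reads every record until end of stream. */
    static List<byte[]> readAll(InputStream in) throws IOException {
        List<byte[]> frames = new ArrayList<>();
        byte[] frame;
        while ((frame = readFrame(in)) != null) {
            frames.add(frame);
        }
        return frames;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeFrame(out, "hello".getBytes("UTF-8"));
        writeFrame(out, new byte[300]);
        List<byte[]> frames = readAll(new ByteArrayInputStream(out.toByteArray()));
        System.out.println("records read: " + frames.size());
    }
}
```

In a real job, the body of readFrame would sit inside the RecordReader's nextKeyValue method, reading from the stream opened on the file split, and each payload would then be handed to the generated message class's parseFrom to obtain the value.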