Finding hostname of slave nodes in hadoop during e

Posted 2019-08-16 10:24

I want to know how to run MapReduce code on a Hadoop 2.9.0 multi-node cluster, and to understand which node processes which input. Specifically, how can I find out which mapper processes each part of the input data? I ran the following Python code on the master:

import sys
import socket

for line in sys.stdin:
    line = line.strip()
    words = line.split()
    for word in words:
        print('%s\t%s\t%s' % (word, 1, socket.gethostname()))

I used socket.gethostname() to find the hostname of each node. I expected the output of this mapper to be, for example:

Bye     1   hadoopmaster
Goodbye 1   hadoopmaster
Hadoop  1   hadoopmaster
Hadoop  1   hadoopslave1
Hello   1   hadoopmaster
Hello   1   hadoopslave2

But the actual output is:

Bye     1   hadoopmaster
Goodbye 1   hadoopmaster
Hadoop  1   hadoopmaster
Hadoop  1   hadoopmaster
Hello   1   hadoopmaster
Hello   1   hadoopmaster

Is the code not running on the slave nodes?
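
One way to check this is to have the mapper report not only the hostname but also which input file and task attempt it belongs to. The sketch below assumes the job runs through Hadoop Streaming, which exports job configuration to the task environment with dots replaced by underscores (so mapreduce.map.input.file should appear as mapreduce_map_input_file). The helper name tag_words and the 'unknown' fallback are my own, for illustration only.

```python
import os
import socket
import sys


def tag_words(lines, hostname, input_file, attempt_id):
    """Yield one tab-separated record per word, tagged with where it was processed."""
    for line in lines:
        for word in line.split():
            yield '%s\t%s\t%s\t%s\t%s' % (word, 1, hostname, input_file, attempt_id)


def main():
    # Assumption: Hadoop Streaming passes job configuration as environment
    # variables with '.' replaced by '_'. Outside Hadoop these are not set,
    # so fall back to 'unknown'.
    input_file = os.environ.get('mapreduce_map_input_file', 'unknown')
    attempt_id = os.environ.get('mapreduce_task_attempt_id', 'unknown')
    for record in tag_words(sys.stdin, socket.gethostname(), input_file, attempt_id):
        print(record)


if __name__ == '__main__':
    main()
```

If every record shows the same task attempt ID, only one map task ran, which would be expected when the input is small enough to fit in a single input split. If the attempt ID is 'unknown' or looks like a local job, it may be worth checking whether mapreduce.framework.name is set to yarn, since otherwise the job may be running locally on the machine that submitted it.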

Tags: python hadoop mapreduce hostname hadoop-streaming

0 answers
