Python Hadoop streaming on windows, Script not a v

2020-03-04 02:05发布

问题:

I have a problem to execute mapreduce python files on Hadoop by using Hadoop streaming.jar.

I use: Windows 10 64bit Python 3.6 and my IDE is spyder 3.2.6, Hadoop 2.3.0 jdk1.8.0_161

I can get answer while my maperducec code is written on java language, but my problem is when I want to mingle python libraries such as tensorflow or other useful machine learning libs on my data.

Installing hadoop 2.3.0: 1. hadoop-env export JAVA_HOME=C:\Java\jdk1.8.0_161 2. I created data -> dfs in hadoop folder

  1. For environment User Variable

    Hadoop_Home = D:\hadoop Java_Home = C:\Java\jdk1.8.0_161 M2_HOME = C:\apache-maven-3.5.2\apache-maven-3.5.2-bin\Maven-3.5.2 Platform = x64

System Varibales: Edit Path as:

D:\hadoop\bin
C:\java\jdk1.8.0_161\bin
C:\ProgramData\Anaconda3

My MapReduce Python code: D:\digit\wordcount-mapper.py

#!/usr/bin/python3
import sys
for line in sys.stdin:    
    line = line.strip()    # remove leading and trailing whitespace
    words = line.split()   # split the line into words
    for word in words:   
        print( '%s\t%s' % (word, 1))

D:\digit\wordcount-reducer.py

#!/usr/bin/python3
from operator import itemgetter
import sys
current_word = None
current_count = 0
word = None
for line in sys.stdin:    
    line = line.strip()   
    word, count = line.split('\t', 1)  
    try:    
        count = int(count)   
    except ValueError:
        continue       
    if current_word == word:    
        current_count += count
    else:
        if current_word:
            print( '%s\t%s' % (current_word, current_count))
        current_count = count
        current_word = word
if current_word == word:    
    print( '%s\t%s' % (current_word, current_count))

When I run my command prompt as administrator:

D:\hadoop\bin> hadoop namenode -format
D:\hadoop\sbin>start-dfs.cmd
D:\hadoop\sbin>start-yarn.cmd

I checked : localhost:8088/ and http://localhost:50070 all is ok.

Then when I enter:

D:\hadoop\sbin>hadoop fs -mkdir -p /input
D:\hadoop\sbin>hadoop fs -copyFromLocal D:\digit\mahsa.txt /input
D:\hadoop\sbin>D:\hadoop\bin\hadoop jar D:\hadoop\share\hadoop\tools\lib\hadoop-streaming-2.3.0.jar -file D:\digit\wordcount-mapper.py -mapper D:\digit\wordcount-mapper.py -file D:\digit\wordcount-reducer.py -reducer D:\digit\wordcount-reducer.py -input /input/mahsa.txt/ -output /output/

I have this error:

18/02/21 21:49:24 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
packageJobJar: [D:\digit\wordcount-mapper.py, D:\digit\wordcount-reducer.py, /D:/tmp/hadoop-Mahsa/hadoop-unjar7054071292684552905/] [] C:\Users\Mahsa\AppData\Local\Temp\streamjob2327207111481875361.jar tmpDir=null
18/02/21 21:49:25 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/02/21 21:49:25 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/02/21 21:49:28 INFO mapred.FileInputFormat: Total input paths to process : 1
18/02/21 21:49:28 INFO mapreduce.JobSubmitter: number of splits:2
18/02/21 21:49:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1519235874088_0003
18/02/21 21:49:29 INFO impl.YarnClientImpl: Submitted application application_1519235874088_0003
18/02/21 21:49:29 INFO mapreduce.Job: The url to track the job: http://Mahsa:8088/proxy/application_1519235874088_0003/
18/02/21 21:49:29 INFO mapreduce.Job: Running job: job_1519235874088_0003
18/02/21 21:49:42 INFO mapreduce.Job: Job job_1519235874088_0003 running in uber mode : false
18/02/21 21:49:42 INFO mapreduce.Job:  map 0% reduce 0%
18/02/21 21:49:52 INFO mapreduce.Job: Task Id : attempt_1519235874088_0003_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "D:\tmp\hadoop-Mahsa\nm-local-dir\usercache\Mahsa\appcache\application_1519235874088_0003\container_1519235874088_0003_01_000003\.\wordcount-mapper.py": CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:386)
        at java.lang.ProcessImpl.start(ProcessImpl.java:137)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        ... 24 more

18/02/21 21:49:52 INFO mapreduce.Job: Task Id : attempt_1519235874088_0003_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "D:\tmp\hadoop-Mahsa\nm-local-dir\usercache\Mahsa\appcache\application_1519235874088_0003\container_1519235874088_0003_01_000002\.\wordcount-mapper.py": CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:386)
        at java.lang.ProcessImpl.start(ProcessImpl.java:137)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        ... 24 more

18/02/21 21:50:02 INFO mapreduce.Job: Task Id : attempt_1519235874088_0003_m_000001_1, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "D:\tmp\hadoop-Mahsa\nm-local-dir\usercache\Mahsa\appcache\application_1519235874088_0003\container_1519235874088_0003_01_000004\.\wordcount-mapper.py": CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:386)
        at java.lang.ProcessImpl.start(ProcessImpl.java:137)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        ... 24 more

18/02/21 21:50:03 INFO mapreduce.Job: Task Id : attempt_1519235874088_0003_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "D:\tmp\hadoop-Mahsa\nm-local-dir\usercache\Mahsa\appcache\application_1519235874088_0003\container_1519235874088_0003_01_000005\.\wordcount-mapper.py": CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:386)
        at java.lang.ProcessImpl.start(ProcessImpl.java:137)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        ... 24 more

18/02/21 21:50:13 INFO mapreduce.Job: Task Id : attempt_1519235874088_0003_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "D:\tmp\hadoop-Mahsa\nm-local-dir\usercache\Mahsa\appcache\application_1519235874088_0003\container_1519235874088_0003_01_000007\.\wordcount-mapper.py": CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:386)
        at java.lang.ProcessImpl.start(ProcessImpl.java:137)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        ... 24 more

18/02/21 21:50:14 INFO mapreduce.Job: Task Id : attempt_1519235874088_0003_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "D:\tmp\hadoop-Mahsa\nm-local-dir\usercache\Mahsa\appcache\application_1519235874088_0003\container_1519235874088_0003_01_000008\.\wordcount-mapper.py": CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:386)
        at java.lang.ProcessImpl.start(ProcessImpl.java:137)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        ... 24 more

18/02/21 21:50:24 INFO mapreduce.Job:  map 100% reduce 100%
18/02/21 21:50:34 INFO mapreduce.Job: Job job_1519235874088_0003 failed with state FAILED due to: Task failed task_1519235874088_0003_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

18/02/21 21:50:34 INFO mapreduce.Job: Counters: 13
        Job Counters
                Failed map tasks=7
                Killed map tasks=1
                Launched map tasks=8
                Other local map tasks=6
                Data-local map tasks=2
                Total time spent by all maps in occupied slots (ms)=66573
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=66573
                Total vcore-seconds taken by all map tasks=66573
                Total megabyte-seconds taken by all map tasks=68170752
        Map-Reduce Framework
                CPU time spent (ms)=0
                Physical memory (bytes) snapshot=0
                Virtual memory (bytes) snapshot=0
18/02/21 21:50:34 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!

I really donot know what is problem it tool my time a lot. Thank you in advanced for your help or any idea?

回答1:

Solution:

  1. I used Hadoop version 2.7.2 with almost same configuration for *.xml.
  2. I removed #!/usr/bin/python3 from top of my python code.

I changed my command as:

D:\hadoop\bin\hadoop jar
D:\hadoop\share\hadoop\tools\lib\hadoop-streaming-2.7.2.jar
-file /in/wordcount-mapper.py -mapper "python wordcount-mapper.py"
-file /in/wordcount-reducer.py -reducer "python wordcount-reducer.py"
-input /in/mahsa.txt -output /output

Therefore I could get result.

hadoop fs -cat /output/part-00000