Hive job fails with MapReduce error: Call From hmaster

Published 2019-08-29 11:28

Question:

When I run this query in the Hive command line:

hive > select count(*) from alogs;

The terminal shows the following:

Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1417084377943_0009, Tracking URL = http://localhost:8088/proxy/application_1417084377943_0009/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1417084377943_0009
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-12-02 17:59:44,068 Stage-1 map = 0%,  reduce = 0%
Ended Job = job_1417084377943_0009 with errors
Error during job, obtaining debugging information...
**FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask**
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

Then I used the ResourceManager web UI to see the error details:

Application application_1417084377943_0009 failed 2 times due to Error launching appattempt_1417084377943_0009_000002. Got exception: **java.net.ConnectException: Call From hmaster/127.0.0.1 to localhost:44849 failed on connection exception: java.net.ConnectException: Connection refused;** For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
    at org.apache.hadoop.ipc.Client.call(Client.java:1415)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy32.startContainers(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:119)
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:254)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
    Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:712)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:606)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:700)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)
    at org.apache.hadoop.ipc.Client.call(Client.java:1382)
    ... 9 more
    . Failing the application. 

Though the error message is detailed enough, I don't know where the address 'localhost:44849' is configured, or what 'Call From hmaster/127.0.0.1 to localhost:44849 failed on connection exception' actually means.
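To answer the second part of the question: "Call From A to B failed on connection exception" means the RPC client running on host A tried to open a TCP connection to address B and nothing was listening there (here, the ResourceManager on hmaster tried to reach a NodeManager it believed was at localhost:44849). You can reproduce the "Connection refused" condition with a plain TCP probe; this is a generic sketch using bash's /dev/tcp, with the host and port taken from the log above (substitute your own):

```shell
# Probe the address from the error message. A refused connection means
# no process is listening on that host:port, which is exactly what the
# ResourceManager hit when launching the application master.
host=localhost
port=44849
if timeout 2 bash -c "echo > /dev/tcp/${host}/${port}" 2>/dev/null; then
  echo "port ${port} on ${host} is open"
else
  echo "connection to ${host}:${port} refused or timed out"
fi
```

If the probe fails on the NodeManager's advertised address, the fix is usually in `/etc/hosts` or `yarn-site.xml`, as the answers below describe.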

Answer 1:

I had the same problem when I ran an application written with Spring YARN. I found a solution, tested the YARN application several times, and didn't hit this error again.

First, modify /etc/hosts on all the servers and list all the slaves in the file, like:

192.168.0.101 slave1
192.168.0.102 slave2
...

Second, modify yarn-site.xml in /home/user/hadoop/etc/hadoop/ on all the servers and add a property like:

  <property>
    <name>yarn.nodemanager.address</name>
    <value>slave1:57799</value>
  </property>

Note that the hostname in the value must match the server the file is on, and the port can be an arbitrary number such as 57799. The port number must be consistent across all the yarn-site.xml files.

Third, restart resourcemanager and all nodemanagers.

I hope that may help you.

I also suspect this problem may have been caused by my not listing the slaves in the file

/home/user/hadoop/etc/hadoop/slaves 

but I haven't tested this.
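For reference, the slaves file is just a plain list of worker hostnames, one per line; assuming the hostnames from the /etc/hosts example above, it would look like:

```
slave1
slave2
```

The start-dfs.sh and start-yarn.sh scripts read this file to decide where to launch DataNodes and NodeManagers.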



Answer 2:

If your Hadoop installation has a config file "..../hadoop-2.8.1/etc/hadoop/mapred-site.xml" and you haven't started YARN, Hive tasks may throw a "Retrying connect to server: 0.0.0.0/0.0.0.0:8032" exception. (You may find that `select *` works while `select sum()` fails, because only the latter launches a MapReduce job.)
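The reason is typically that mapred-site.xml tells Hadoop to run jobs on YARN, so Hive submits the job to the ResourceManager (port 8032 by default), which refuses connections if YARN isn't running. A typical fragment looks like this (sketch; your file may differ):

```xml
<!-- mapred-site.xml: with this set, MapReduce jobs go to YARN,
     so the ResourceManager and NodeManagers must be running -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```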

You can execute "jps" to check whether YARN is running.

If YARN is not running, the output looks something like:

[cc@localhost conf]$ jps
36721 Jps
8402 DataNode
35458 RunJar
8659 SecondaryNameNode
8270 NameNode

If YARN is running, the output looks something like:

[cc@localhost sbin]$ jps
13237 Jps
9767 DataNode
9975 SecondaryNameNode
12651 ResourceManager (new when YARN is running)
12956 NodeManager (new when YARN is running)
9581 NameNode
13135 JobHistoryServer

There are two solutions:

1. Rename the mapred-site.xml file with the Linux command "mv mapred-site.xml mapred-site.xml.template" (or delete it), then restart Hadoop.

2. Run YARN: adjust the Hadoop config if needed and use "start-yarn.sh" to start it.