I am setting up a Hadoop YARN cluster in which one machine acts as both master and slave. When I start YARN with the following command, the NodeManager starts on the slaves but not on the master node.
sbin/yarn-daemons.sh start nodemanager
The master also acts as a slave, and there are two other slaves in the cluster. The NodeManagers on those two slaves start properly.
The error I get:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:8040] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
Output of some commands:
cat /etc/services | grep 8040
ampify 8040/tcp # Ampify Messaging Protocol
ampify 8040/udp # Ampify Messaging Protocol
lsof -i tcp:8040
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 28021 df 195u IPv6 3580602 0t0 TCP server1.mydomain.com:ampify (LISTEN)
Under the default configuration that Hadoop ships, port 8040 is the port that the NodeManager uses for the localizer. This is basically a server endpoint responsible for bringing the files required to run a container onto the local node. (For example, this can be a MapReduce job's jar file or distributed cache files.)
Assuming that there is another server on the machine (here shown as Ampify) legitimately bound to port 8040, and you don't want to stop that service, it is possible to reconfigure the port used by the NodeManager for the localizer. Set the property yarn.nodemanager.localizer.address in your yarn-site.xml file. This is documented here:
http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
Pulling that from the XML source in the Hadoop tree, here is the documentation for the property:
<property>
  <description>Address where the localizer IPC is.</description>
  <name>yarn.nodemanager.localizer.address</name>
  <value>${yarn.nodemanager.hostname}:8040</value>
</property>
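As a rough sketch of the override, the entry in yarn-site.xml on the affected node might look like the following. The port 8140 is an arbitrary choice for illustration and is not taken from the Hadoop documentation; any port that is free on that machine should work.

```xml
<!-- yarn-site.xml on the node where 8040 is taken.
     8140 is a hypothetical replacement port; pick any free port. -->
<property>
  <name>yarn.nodemanager.localizer.address</name>
  <value>${yarn.nodemanager.hostname}:8140</value>
</property>
```

The NodeManager reads this file at startup, so it has to be started again after the change.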
The error above means you are trying to start a process that binds to port 8040, which is already in use by another process.
To get rid of this error, kill the process that is currently listening on port 8040. Your lsof output shows a java process with PID 28021. Kill it with the following command, then start the NodeManager again:
kill -9 28021
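Before killing anything, it may be worth confirming which process actually holds the port. The sketch below is only an illustration: listening_pids is a hypothetical helper name, and it assumes the lsof column layout shown in the question (header line first, PID in the second column).

```shell
# Hypothetical helper: print the unique PIDs found in `lsof -i` style
# output read from stdin. Skips the header line; column 2 is the PID.
listening_pids() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }' | sort -u
}

# Sample input copied from the lsof output in the question.
sample='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 28021 df 195u IPv6 3580602 0t0 TCP server1.mydomain.com:ampify (LISTEN)'

printf '%s\n' "$sample" | listening_pids   # prints 28021
```

On the live machine the same helper could be fed directly, as in lsof -i tcp:8040 | listening_pids, and ps -p 28021 -o pid,user,args would then show what that java process is. If it turns out to be a leftover NodeManager from an earlier start, a plain kill 28021 (SIGTERM) is usually worth trying before kill -9.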