I am new to hadoop.
I'm trying to install hadoop in my laptop in Pseudo-Distributed mode.
I am running it with root user, but I'm getting the error below.
root@debdutta-Lenovo-G50-80:~# $HADOOP_PREFIX/sbin/start-dfs.sh
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
Starting namenodes on [localhost]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined.
Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined.
Aborting operation.
Starting secondary namenodes [debdutta-Lenovo-G50-80]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
Also, I have to run hadoop in root user as hadoop is not able to access ssh service with other user.
How to fix the same?
just do what it asks you:
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
Based on on the first warning, HADOOP_PREFIX
, sounds like you've not defined HADOOP_HOME
correctly.
This would be done in your /etc/profile.d
.
hadoop-env.sh
is where the remainder of those variables are are defined.
Please refer to the UNIX Shell Guide
hadoop is not able to access ssh service with other user
This has nothing to do with Hadoop itself. It's basic SSH account management. You need to
- Make the hadoop (and other, like yarn) accounts on all machines of a cluster (see
adduser
command documentation)
- Copy a passwordless SSH key using
ssh-copy-id hadoop@localhost
, for example
If you don't need distributed mode and just want to use Hadoop locally, you can use a Mini Cluster.
The documentation also recommends making a single node installation before continuing to pseudo distributed
The root cause of this problem,
- hadoop install for different user and you start yarn service for different user.
OR
- in hadoop config's hadoop-env.sh specified HDFS_NAMENODE_USER and HDFS_DATANODE_USER user is something else.
Hence we need to correct and make it consistent at every place. So a simple solution of this problem is to edit your hadoop-env.sh
file and add the user-name for which you want to start the yarn service. So go ahead and edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh
by adding the following lines
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
Now save and start yarn, hdfs service and check that it works.
Vim ${HADOOP_HOME}sbin/start-dfs.sh
& ${HADOOP_HOME}sbin/stop-dfs.sh
, then add:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root