HBase error: ZooKeeper exists failed after 3 retries

Posted 2019-02-16 16:21

Question:

I am using HBase 0.94.8 in standalone mode on Ubuntu. It works fine, and I am able to do every operation in the HBase shell. But after I logged off my system, it gives the following error:

15/07/28 15:10:30 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
15/07/28 15:10:30 WARN zookeeper.ZKUtil: hconnection-0x14ed40513350009 Unable to set watcher on znode (/hbase)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:450)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
    at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:77)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:885)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:998)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:896)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:998)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:900)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:857)

Yes, I have searched a lot and found some information about "ZooKeeper exists failed after 3 retries". Maybe this error occurs because ZooKeeper has stopped, but I don't know how to restart it. I tried starting HBase and Thrift again, but the issue persists.

The command ps axww | grep QuorumPeerMain gives me the following output:

 6162 pts/2    S+     0:00 grep --color=auto QuorumPeerMain

HBase starts working again if I restart my system, but I want a proper solution.


Temporary solution

With the following command I find the HBase processes:

ps -fe | grep hbase

and then kill all HBase processes:

kill -9 4555  # assuming 4555 is the process ID of HBase

Then I restarted HBase (with sudo) and Thrift, and it started working. But I want a permanent solution, because if I am running HBase on a server (not a local machine), I can't restart HBase every time.

Answer 1:

Issue:

The error "ZooKeeper exists failed after 3 retries" clearly indicates that the ZooKeeper quorum is not running. The most probable cause is an inconsistency in the ZooKeeper settings in conf/hbase-site.xml. The minimal configuration is:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/testuser/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/testuser/zookeeper</value>
  </property>
</configuration>
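
To confirm what a given hbase-site.xml actually sets, here is a minimal shell sketch. The helper name get_hbase_prop is hypothetical (it is not part of HBase), and it assumes each <name> and <value> element sits on a single line, as in the configuration above. It does not understand XML comments, so a commented-out property would be matched too.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in an hbase-site.xml style file. Assumes one element per line.
get_hbase_prop() {
  prop_name="$1"
  prop_file="$2"
  awk -v want="$prop_name" '
    # Remember the most recent property name seen.
    /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
    # When a value line follows the wanted name, print it and stop.
    /<value>/ { if (n == want) { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v); print v; exit } }
  ' "$prop_file"
}

# Example usage (the path is an assumption about your layout):
#   get_hbase_prop hbase.zookeeper.property.dataDir conf/hbase-site.xml
```

An empty result means the property is not set in that file, in which case HBase falls back to its built-in default.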

The next section briefly explains why ZooKeeper is required and how to verify that it is running.


An overview:

Judging from your text (a standalone setup), you're mixing things up. Put simply, ZooKeeper coordinates HBase and is a hard requirement.

By default HBase itself handles ZooKeeper setup and start/stop (though this can be changed). To verify, look in the file conf/hbase-env.sh (in your HBase directory); there should be a line:

export HBASE_MANAGES_ZK=true

This tells HBase whether it should manage its own ZooKeeper instance. If it is set to false, change it to true.
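
A quick way to see which value is in effect, sketched in shell. The function name manages_zk is hypothetical. It only looks at uncommented export lines in the file you pass, and it assumes that an unset variable means true, which is my understanding of how the HBase scripts behave.

```shell
# Hypothetical helper: report the effective HBASE_MANAGES_ZK setting
# found in an hbase-env.sh style file.
manages_zk() {
  env_file="$1"
  # Take the last uncommented "export HBASE_MANAGES_ZK=..." line, if any.
  zk_val=$(sed -n 's/^[[:space:]]*export[[:space:]][[:space:]]*HBASE_MANAGES_ZK="\{0,1\}\([A-Za-z]*\).*/\1/p' "$env_file" | tail -n 1)
  # Assumption: when the variable is not exported, HBase defaults to true.
  echo "${zk_val:-true}"
}

# Example usage (the path is an assumption about your layout):
#   manages_zk conf/hbase-env.sh
```

If this prints false, HBase expects an externally managed ZooKeeper to be running already, which matches the symptom in the question.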

Now for verification there's a helpful command (forget about the ps and then grep):

$ jps

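This command lists all the Java processes on the machine (HBase is itself a Java application). The expected output looks something like this when the daemons run as separate processes (in a true standalone setup everything runs inside a single JVM, so only HMaster is listed):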

62019 Jps
61098 HMaster        
61233 HRegionServer     
61003 HQuorumPeer
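
To turn that visual check into something scriptable, here is a small sketch. The function name missing_daemons is hypothetical. It reads jps-style output ("<pid> <name>" per line) on standard input and prints each expected daemon that is absent; which names to expect depends on your setup.

```shell
# Hypothetical helper: read jps output on stdin and print every daemon
# name given as an argument that does not appear in it.
missing_daemons() {
  jps_out=$(cat)
  for daemon in "$@"; do
    # Match lines of the form "<pid> <daemon>", tolerating trailing spaces.
    printf '%s\n' "$jps_out" |
      grep -Eq "^[0-9]+[[:space:]]+${daemon}[[:space:]]*\$" || echo "$daemon"
  done
}

# Example usage:
#   jps | missing_daemons HMaster HRegionServer HQuorumPeer
```

No output means every daemon you asked about is running.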

Don't just kill the HBase process; use the start/stop scripts instead:

$ ./bin/stop-hbase.sh

Make the necessary changes and start it again:

$ ./bin/start-hbase.sh

P.S. I may have misinterpreted your question completely. If so, let me know in the comments and I'll get back to you to get the solution right for future SO visitors.



Answer 2:

When you look into the log files, you will find that ZooKeeper is unable to connect to a port, for example 543210. That simply means you previously installed Hadoop on your machine, so HBase tries to look up the previous Hadoop installation's ZooKeeper. Rename your existing Hadoop setup or remove Hadoop from your system completely. (But note that ZooKeeper seems to leave things around even after a deletion.)

  • Rename the Hadoop installation folder
  • Remove the Hadoop entry from the .bashrc file
  • Restart the computer
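
Before renaming or deleting anything, it can help to see where Hadoop is still referenced. A minimal sketch follows; the function name hadoop_refs is hypothetical, and it simply lists matching lines with their line numbers.

```shell
# Hypothetical helper: list lines in a shell startup file that mention
# Hadoop (case-insensitive), prefixed with their line numbers.
hadoop_refs() {
  grep -n -i 'hadoop' "$1"
}

# Example usage:
#   hadoop_refs ~/.bashrc
#   env | grep -i hadoop    # variables already exported in this session
```

Entries such as HADOOP_HOME or a Hadoop bin directory on PATH are the kind of leftovers this answer is talking about.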


Answer 3:

It looks like the issue is not related to hbase or zookeeper. It is a system setting issue.

I got the same issue after a Mac OS X update.

It turned out that the DNS settings were changed by the update. I saw this in the HBase logs:

2017-06-09 11:40:18,454 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
    at org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:143)
[SKIP]
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2432)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.net.DNS.reverseDns(DNS.java:92)

After removing the DNS settings in hbase-site.xml the issue disappeared:

  <!--property>
    <name>hbase.zookeeper.dns.interface</name>
    <value>lo0</value>
  </property>
  <property>
    <name>hbase.regionserver.dns.interface</name>
    <value>lo0</value>
  </property>
  <property>
    <name>hbase.master.dns.interface</name>
    <value>lo0</value>
  </property-->


Answer 4:

If it's only a matter of starting ZooKeeper, this should help you. I hope you are aware that ZooKeeper should be up and running before HBase is started.



Answer 5:

I got almost the same error: "ZooKeeper exists failed after 4 retries". It was caused by running ./start-hbase.sh without having permission to connect to port 2181. The solution turned out to be really simple:

sudo ./start-hbase.sh

I've used the same configuration of hbase-site.xml as it is in Nabeel Ahmed's post.
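
Port 2181 is ZooKeeper's default client port, so a quick connectivity check tells you whether anything is listening there at all. This is a rough sketch: the function name zk_port_open is hypothetical, it relies on bash's /dev/tcp feature, and a successful connection only proves that some process accepts TCP connections on that port, not that it is ZooKeeper.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened. Defaults to localhost:2181 (ZooKeeper's default client port).
zk_port_open() {
  zk_host="${1:-localhost}"
  zk_port="${2:-2181}"
  # bash opens a socket for /dev/tcp/<host>/<port>; the child shell exits
  # right away, which closes the connection again.
  bash -c "exec 3<>/dev/tcp/${zk_host}/${zk_port}" 2>/dev/null
}

# Example usage:
#   if zk_port_open; then echo "port 2181 is reachable"; else echo "nothing is listening on 2181"; fi
```

If the port is closed even after start-hbase.sh has run, the ZooKeeper instance did not come up, and the HBase logs are the next place to look.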



Answer 6:

Mine works with the sudo command:

hbase/bin$ sudo ./start-hbase.sh