hbase cannot find an existing table

2019-01-23 13:24发布

I set up a hbase cluster to store data from opentsdb. Recently due to reboot of some of the nodes, hbase lost the table "tsdb". I can still it on hbase's master node page, but when I click on it, it gives me a tableNotFoundException

org.apache.hadoop.hbase.TableNotFoundException: tsdb
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:952)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:818)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:782)
    at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:249)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:213)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:171)
......

I entered hbase shell, trying to locate 'tsdb' table, but got the similar message

hbase(main):018:0> scan 'tsdb'
ROW                                                          COLUMN+CELL

ERROR: Unknown table tsdb!

However when I tried to re-create this table, hbase shell told me the table already exist...

hbase(main):013:0> create 'tsdb', {NAME => 't', VERSIONS => 1, BLOOMFILTER=>'ROW'}

ERROR: Table already exists: tsdb!

And I can also list the table in hbase shell

hbase(main):001:0> list
TABLE
tsdb
tsdb-uid
2 row(s) in 0.6730 seconds

Taking a look at the log, I found this which should be the cause of my issue

2012-05-14 12:06:22,140 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: tsdb, row=tsdb,,99999999999999
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:157)
    at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
    at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
    at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)

It says cannot find row of tsbb in .META., but there are indeed tsdb rows in .META.

hbase(main):002:0> scan '.META.'
ROW                                                          COLUMN+CELL
 tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\ column=info:regioninfo, timestamp=1336311752799, value={NAME => 'tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x
 x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x05\x00 05\x00\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.', STARTKEY => '\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\
 \x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.       x00\x05\x00\x001', ENDKEY => '\x00\x00\x10O\xA3\x8C\x80\x00\x00\x01\x00\x00\x0B\x00\x00\x02\x00\x00\x19\x00\x00\x03\x00\x00\x1A\x00\x00\x05\x00\x001', ENCODED => 7cd0d2205d9ae5f
                                                             cadf843972ec74ec5,}
 tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\ column=info:server, timestamp=1337011527000, value=brycobapd01.usnycbt.amrs.bankofamerica.com:60020
 x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x05\x00
 \x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.
 tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\ column=info:serverstartcode, timestamp=1337011527000, value=1337011518948
......

 tsdb-uid,,1336081042372.a30d8074431c6a31c6a0a30e61fedefa.   column=info:server, timestamp=1337011527458, value=bry200163111d.usnycbt.amrs.bankofamerica.com:60020
 tsdb-uid,,1336081042372.a30d8074431c6a31c6a0a30e61fedefa.   column=info:serverstartcode, timestamp=1337011527458, value=1337011519807
6 row(s) in 0.2950 seconds

Here is the result after I ran "hbck" on the cluster

ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/249438af5657bf1881a837c23997747e on HDFS, but not listed in META or deployed on any region server
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/4f8c65fb72910870690b94848879db1c on HDFS, but not listed in META or deployed on any region server
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/63276708b4ac9f11e241aca8b56e9def on HDFS, but not listed in META or deployed on any region server
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/e54ee4def67d7f3b6dba75a3430e0544 on HDFS, but not listed in META or deployed on any region server
ERROR: (region tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x05\x00\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.) First region should start with an empty key.  You need to  create a new region and regioninfo in HDFS to plug the hole.
ERROR: Found inconsistency in table tsdb
Summary:
  -ROOT- is okay.
    Number of regions: 1
    Deployed on:  master-node,60020,1337011518948
  .META. is okay.
    Number of regions: 1
    Deployed on:  slave-node-2,60020,1337011519845
Table tsdb is inconsistent.
    Number of regions: 5
    Deployed on:  slave-node-2,60020,1337011519845 slave-node-1,60020,1337011519807 master-node,60020,1337011518948
  tsdb-uid is okay.
    Number of regions: 1
    Deployed on:  slave-node-1,60020,1337011519807
5 inconsistencies detected.
Status: INCONSISTENT

I have run

bin/hbase hbck -fix

which unfortunately does not solve my problem

Could someone help me out on this that

  1. Is it possible to recover this table "tsdb"?
  2. If 1 cannot be done, is it a suggested way to gracefully remove 'tsdb', and create a new one?
  3. I'd be greatly appreciated if anybody can let me know what is the most suggested way to reboot a node? Currently, I am leaving my master node always up. For other nodes, I run this command immediately after its reboot.

command:

# start data node
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start jobtracker    
# start hbase
bin/hbase-daemon.sh start zookeeper
bin/hbase-daemon.sh start regionserver 

Many Thanks!

10条回答
太酷不给撩
2楼-- · 2019-01-23 14:02

try to fix meta

  1. hbase hbck
  2. hbase hbck -fixMeta
  3. hbase hbck -fixAssignments
  4. hbase hbck -fixReferenceFiles

after and try again

查看更多
我欲成王,谁敢阻挡
3楼-- · 2019-01-23 14:08

If you are using cdh4.3 then the path in zookeeper should be /hbase/table94/

查看更多
The star\"
4楼-- · 2019-01-23 14:09

hbase-clean.sh --cleanZk

It works well, simple enough.

查看更多
萌系小妹纸
5楼-- · 2019-01-23 14:12

I get a similar error message when I try an HBase connection from a Java client on a machine that doesn't have the TCP privilege to access the HBase machines. The table indeed exists when I do hbase shell on the HBase machine itself.

Does opentsdb has all the privileges/port config to access the HBase machine ?

查看更多
贼婆χ
6楼-- · 2019-01-23 14:13

A bit late, maybe it's helpful to the searcher.

  1. Run the ZooKeeper shell hbase zkcli
  2. In the shell run ls /hbase/table
  3. Run rmr /hbase/table/TABLE_NAME
  4. Restart Hbase
查看更多
Ridiculous、
7楼-- · 2019-01-23 14:14

More instructions on deleting the tables:

~/hbase-0.94.12/bin/hbase shell

> truncate 'tsdb'
> truncate 'tsdb-meta'
> truncate 'tsdb-uid'
> truncate 'tsdb-tree'
> exit

I also had to restart the tsd daemon.

查看更多
登录 后发表回答