Hive Locks entire database when running select on

Posted 2019-05-27 05:46

Hive 0.13 takes a SHARED lock on the entire database (I see a node like LOCK-0000000000 as a child of the database node in ZooKeeper) when running a select statement on any table in that database. Because the shared lock covers the whole schema, CREATE/DELETE statements on other tables in the database freeze until the original query finishes and the lock is released.

Does anybody know a way around this? The following link suggests turning concurrency off, but we can't do that: we replace the entire table, and we have to make sure that no select statement is accessing the table before we replace its contents.

http://mail-archives.apache.org/mod_mbox/hive-user/201408.mbox/%3C0eba01cfc035$3501e4f0$9f05aed0$@com%3E

In the first Hive shell (large_table is very large, and hive.support.concurrency=true):

use mydatabase;
select count(*) from large_table limit 1;

In another Hive shell, while the first query is still executing:

use mydatabase;
create table sometable (id string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;

The problem is that the "create table" does not execute until the first query (the select) has finished.
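To confirm which locks are actually held while the select is running, something like the following could be run from a third Hive shell. This is only a sketch: it reuses the names mydatabase and large_table from the example above, and it assumes that the SHOW LOCKS DATABASE form is available in your build (it is documented for Hive 0.13.0 and later, but a vendor distribution may differ).

```sql
use mydatabase;

-- A "set" with only the property name prints its current value; it changes nothing.
set hive.support.concurrency;

-- Locks held on the database itself (assumed available in this Hive build).
SHOW LOCKS DATABASE mydatabase;

-- Locks held on the table being queried, with extra detail about the holder.
SHOW LOCKS large_table EXTENDED;
```

If the database-level output lists a SHARED lock for the duration of the select, that matches the LOCK-0000000000 node seen in ZooKeeper.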

Update: We are seeing this issue on Cloudera's distribution of Hive (CDH-5.2.1-1).

Tags: hadoop hive
1 answer
萌系小妹纸
#2 · 2019-05-27 06:32

I don't think Hive 0.13 behaves that way. Please check your ResourceManager and verify that you have enough memory when you are executing multiple Hive queries.

As you know, most Hive queries trigger a MapReduce job, and if YARN doesn't have enough resources, the new job waits until the previously running job completes. Try approaching your issue from a memory point of view.

All the best!
