Hive/Beeline, how can I set the job .staging direc

Posted 2019-09-11 05:33

Question:

On the cluster I'm working on, every user is given a 60 GB Hadoop quota. The project I'm working on has historically generated a lot of Hive queries. To make things faster, I'm trying to run these (unrelated) queries in parallel, but as a result the directory /user/{myusername}/.staging/ fills up with job_{someid} directories, which in turn are filled with the Hive jars and consume the 60 GB very quickly. While I can limit the degree of parallelism, I would also like to know whether I can ask Hive to put these jars in a different directory, say /tmp/{myusername}, where I have a lot more space.

Any idea how I can tell Hive/Beeline to create the .staging directory under /tmp/{myusername}?

Answer 1:

The easiest way is to set it when you start your Beeline session:

beeline --hive.exec.stagingdir=/tmp/{myusername}

I think you can also do it via !set inside Beeline, but I don't have the syntax to hand.



Answer 2:

The above doesn't work.

We found that the following works:

beeline --hiveconf hive.exec.stagingdir=/tmp/{myusername}
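
If several parallel queries are launched from a script, the same option can be assembled once and reused. The following is only a sketch: the helper names staging_dir_for and beeline_cmd, the /tmp default, and the JDBC URL are assumptions made for illustration, and the script prints the command rather than running it so it can be checked first.

```shell
#!/bin/sh
# Sketch only: helper names, the /tmp default and the JDBC URL are assumptions.

staging_dir_for() {
    # $1 = user name, $2 = optional root directory (defaults to /tmp)
    printf '%s/%s' "${2:-/tmp}" "$1"
}

beeline_cmd() {
    # $1 = JDBC URL, $2 = user name
    # Prints the beeline invocation instead of executing it.
    printf 'beeline -u %s --hiveconf hive.exec.stagingdir=%s\n' \
        "$1" "$(staging_dir_for "$2")"
}

# Placeholder URL; replace with your HiveServer2 endpoint.
beeline_cmd "jdbc:hive2://localhost:10000" "${USER:-$(id -un)}"
```

Inside a session that is already open, Hive properties are normally changed with a statement such as SET hive.exec.stagingdir=/tmp/{myusername}; whether the cluster allows this property to be overridden at runtime depends on its configuration.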