How do I control a hive job name but keep the stag

Posted 2020-03-09 07:16

I have a number of hive queries that my system executes on a regular basis. When you look at the job tracker, they show up as "SELECT field, other_field ..... (Stage-1)" and similar. That's not particularly helpful to me, so I added:

set mapred.job.name = more helpful name;
to the query. Now I can tell them apart better. However, now my queries that get split into multiple stages all show up as the same name. What I'd ideally like is something along the lines of

set mapred.job.name = more helpful name (Stage-%d);
where the %d would get replaced by the current stage number.
Is this possible, and does anyone know how?

Tags: hadoop hive
3 answers
太酷不给撩
#2 · 2020-03-09 08:00

I'm not sure there is a way to do exactly what you want, but I can offer an alternative.
Instead of using set mapred.job.name, you can add a comment at the beginning of the query with a more helpful name, like this:
-- this is a more helpful name
SELECT field, other_field ....

Then, in the job tracker, you'll see "-- this is a more helpful name ..... (Stage-%d)".

冷血范
#3 · 2020-03-09 08:01

I found this page: https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration

It lists a property called hive.query.string,

so set hive.query.string = even more helpful name; should work.

It works perfectly for me.

Juvenile、少年°
#4 · 2020-03-09 08:17

I know this is a very late reply, but let me know if it helps.

This happens because Hive does not allow certain parameters to be set at run time. If you still want to set it, follow these steps:

  1. Log in to the Ambari UI as admin.
  2. Go to the Hive Configs.
  3. Open the Custom hive-site section.
  4. Add the following key-value pair:
    KEY: hive.security.authorization.sqlstd.confwhitelist.append
    VALUE: mapred.job.name
  5. Restart the Hive service.

You can add any other key to this config value if setting it gives you the same runtime error.
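
On a cluster that is not managed through Ambari, the same setting would presumably go straight into hive-site.xml. A minimal sketch, assuming SQL standard based authorization is what rejects the set command; the property name and value are the ones from the steps above, and the comment about regex syntax reflects how Hive documents this property rather than anything tested here:

```xml
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <!-- Hive treats this value as a Java regex appended to the built-in whitelist;
       to allow several keys, separate them with | -->
  <value>mapred.job.name</value>
</property>
```

After Hive is restarted, set mapred.job.name = more helpful name; should be accepted at run time. Note that this only lets you set the job name; it does not by itself add the stage number to it.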
