HiveQL: Using query results as variables

2020-01-24 10:07发布

问题:

in Hive I'd like to dynamically extract information from a table, save it in a variable and further use it. Consider the following example, where I retrieve the maximum of column var and want to use it as a condition in the subsequent query.

set maximo=select max(var) from table;

select
  *
from
  table
where
  var=${hiveconf:maximo}

It does not work, although

set maximo=select max(var) from table;

${hiveconf:maximo}

shows me the intended result.

Doing:

select '${hiveconf:maximo}'

gives

"select max(var) from table"

though.

Best

回答1:

Hive substitutes variables as is and does not execute them. Use shell wrapper script to get result into variable and pass it to your Hive script.

maximo=$(hive -e "set hive.cli.print.header=false; select max(var) from table;")
hive -hiveconf "maximo"="$maximo" -f your_hive_script.hql

And after this inside your script you can use select '${hiveconf:maximo}'