Storing query result in a variable

Posted 2019-04-25 02:42

I have a query whose result I want to store in a variable. How can I do it? I tried:

./hive -e  "use telecom;insert overwrite local directory '/tmp/result' select
avg(a) from abc;"

./hive --hiveconf MY_VAR =`cat /tmp/result/000000_0`;

I am able to get the average value into MY_VAR, but the second command drops me into the Hive CLI, which I don't want. Also, is there a way to run Unix commands from inside the Hive CLI?

5 answers
乱世女痞
#2 · 2019-04-25 03:01

You can simply achieve this using a shell script.

Create a shell script file, avg_op.sh:

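#!/bin/sh
# Run the first query and save its single-value result to a file.
hive -e 'use telecom;select avg(a) from abc;' > avg.txt
value=$(cat avg.txt)
# Pass the value in as a hiveconf variable. The two "set" statements only print it.
# \$ stops the shell from expanding the variable, so Hive substitutes it instead.
hive --hiveconf avgval="$value" -e "set avgval;set hiveconf:avgval;
use telecom;
select * from abc2 where avg_var=\${hiveconf:avgval};"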

Then execute the script:

$ bash avg_op.sh
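
A possible variant of the script above skips the temporary file and refuses to run the second query when the first one returned nothing. This is only a sketch: the function name run_with_avg and the emptiness check are assumptions added for illustration, while the database, table and column names are the ones used in the script above.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): capture avg(a), then reuse it.
run_with_avg() {
    # -S (silent mode) keeps the captured output down to the query result.
    value=$(hive -S -e 'use telecom; select avg(a) from abc;') || return 1

    # Stop early if the first query produced no output.
    if [ -z "$value" ]; then
        echo "first query returned no value" >&2
        return 1
    fi

    # \$ keeps the shell from expanding the variable; Hive substitutes it instead.
    hive --hiveconf avgval="$value" -e "use telecom;
select * from abc2 where avg_var=\${hiveconf:avgval};"
}
```

Calling run_with_avg at the end of the script (or after sourcing the file) runs both queries in sequence and returns a non-zero status if the first one fails or comes back empty.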
做自己的国王
#3 · 2019-04-25 03:05

You can use BeeTamer for that. It lets you store a result (or part of it) in a variable and use that variable later in your code.

BeeTamer is a macro language / macro processor that extends the functionality of the Apache Hive and Cloudera Impala engines.

select avg(a) from abc;
%capture MY_AVERAGE;
select * from abc2 where avg_var=#MY_AVERAGE#;

Here you save the average value from your query into the macro variable MY_AVERAGE and then reuse it in the second query.

Viruses.
#4 · 2019-04-25 03:10

Storing Hive query output in a variable and using it in another query.

In the shell, create a variable holding the desired value:

var=`hive -S -e "select max(datekey) from ....;"`
echo $var

Then use the variable's value in another Hive query:

hive --hiveconf MID_DATE="$var" -f test.hql
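
The answer does not show test.hql, so here is a guess at what it might contain, written out from the shell. The helper name write_test_hql, the table name my_table and the where clause are all placeholders for illustration. The point is only that the file refers to the value as ${hiveconf:MID_DATE}.

```shell
#!/bin/sh
# Hypothetical helper: write a sample test.hql (file contents are an assumption).
# The quoted 'EOF' stops the shell from expanding ${hiveconf:MID_DATE},
# so the placeholder reaches Hive untouched and Hive fills it in.
write_test_hql() {
    cat > "${1:-test.hql}" <<'EOF'
select * from my_table where datekey = ${hiveconf:MID_DATE};
EOF
}
```

After write_test_hql has created the file, the hive command with -f test.hql shown above would run it with MID_DATE substituted.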
够拽才男人
#5 · 2019-04-25 03:13

Use case: in MySQL the following is valid:

set @max_date := (select max(date) from some_table);
select * from some_other_table where date > @max_date;

This is super useful for scripts that need to repeatedly call this variable since you only need to execute the max date query once rather than every time the variable is called.

Hive does not currently support this. (Please correct me if I'm wrong! I have been trying to figure out how to do this all afternoon.)

My workaround is to store the required variable in a table that is small enough to map join onto the query in which it is used. Because the join is a map-side join rather than a shuffle (reduce-side) join, it should not significantly hurt performance. For example:

drop table if exists var_table;

create table var_table as
select max(date) as max_date from some_table;

-- var_table holds a single row, so the cross join just makes max_date available on every row
select some_other_table.*
from some_other_table
cross join var_table
where some_other_table.date > var_table.max_date;

The solution suggested by @visakh is not optimal because it stores the string 'select count(1) from table_name;' rather than the returned value, so it will not help in cases where you need to use a variable repeatedly during a script.

等我变得足够好
#6 · 2019-04-25 03:13

Try the following:

$ var=$(hive -e "select '12' ")

$ echo $var

12 -- output
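
When the captured value feeds a second query, it can be worth checking that it really is a single number first, since a failed query or a NULL result would otherwise be passed along silently. A rough sketch, where the helper name looks_numeric is made up for illustration and only plain integers and decimals are accepted:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a single plain number such as 12 or -3.5.
looks_numeric() {
    case "$1" in
        # Reject empty input and anything containing characters other than
        # digits, dot or minus (this also rejects newlines and the word NULL).
        ''|*[!0-9.-]*) return 1 ;;
    esac
    # Check the overall shape: optional sign, digits, optional fractional part.
    printf '%s\n' "$1" | grep -Eq '^-?[0-9]+(\.[0-9]+)?$'
}
```

With this in place, a script could run looks_numeric "$var" || exit 1 right after the assignment and stop before the value is used anywhere else. Values in scientific notation would be rejected by this check.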
