Storing query result in a variable

I have a query whose result I want to store in a variable. How can I do it? I tried:

./hive -e  "use telecom;insert overwrite local directory '/tmp/result' select
avg(a) from abc;"

./hive --hiveconf MY_VAR =`cat /tmp/result/000000_0`;

I can get the average value into MY_VAR, but the second command drops me into the Hive CLI, which I don't want. Also, is there a way to run Unix commands from inside the Hive CLI?

Tags: variables hadoop hive

5 answers

乱世女痞

#2 · 2019-04-25 03:01

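You can do this with a shell script.

Create a shell script file, avg_op.sh:

#!/bin/sh
# Run the first query and save its single-value result to a file.
hive -e 'use telecom;select avg(a) from abc;' > avg.txt
value=$(cat avg.txt)
# Pass the value to a second Hive session as a hiveconf variable.
# The two "set" statements only print the variable for debugging.
# The backslash in \${...} stops the shell from expanding it, so Hive does the substitution.
hive --hiveconf avgval="$value" -e "set avgval;set hiveconf:avgval;
use telecom;
select * from abc2 where avg_var=\${hiveconf:avgval};"

Then run the script:

bash avg_op.sh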
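
One caveat with this approach: if the first query fails or returns NULL, an empty or non-numeric string is substituted into the second query. Below is a minimal sketch of a guard, assuming the result is a single decimal number. The helper name is_number is hypothetical and not part of Hive, and the Hive calls are skipped when no hive binary is on the PATH.

```shell
#!/bin/sh
# is_number is a hypothetical helper, not part of Hive.
# It succeeds only for a single-line decimal such as 42, -3.5 or 1.2E7.
is_number() {
    # Reject empty input.
    [ -n "$1" ] || return 1
    # Reject input that spans more than one line (e.g. stray log output).
    [ $(printf '%s\n' "$1" | wc -l) -eq 1 ] || return 1
    # Optional minus, digits, optional fraction, optional exponent.
    printf '%s\n' "$1" | grep -Eq '^-?[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$'
}

# Guarded so the sketch does nothing on a machine without Hive.
if command -v hive >/dev/null 2>&1; then
    value=$(hive -S -e 'use telecom; select avg(a) from abc;')
    if is_number "$value"; then
        hive --hiveconf avgval="$value" -e "use telecom;
select * from abc2 where avg_var=\${hiveconf:avgval};"
    else
        echo "unexpected query result: '$value'" >&2
        exit 1
    fi
fi
```

Capturing the output with $(...) also removes the need for the intermediate avg.txt file.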

做自己的国王

#3 · 2019-04-25 03:05

You can use BeeTamer for that. It lets you store a query result (or part of it) in a variable and use that variable later in your code.

BeeTamer is a macro language and macro processor that extends the functionality of the Apache Hive and Cloudera Impala engines.

select avg(a) from abc;
%capture MY_AVERAGE;
select * from abc2 where avg_var=#MY_AVERAGE#;

Here the average value from the first query is saved into the macro variable MY_AVERAGE and then reused in the second query.

Viruses.

#4 · 2019-04-25 03:10

Store the Hive query output in a shell variable and use it in another query.

In the shell, create a variable holding the desired value:

var=`hive -S -e "select max(datekey) from ....;"`
echo $var

Use the variable's value in another Hive query:

hive --hiveconf MID_DATE="$var" -f test.hql
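
The answer does not show what test.hql contains. Here is a sketch of what it might look like, assuming a hypothetical table some_table with a numeric datekey column. Inside the script the value is referenced as ${hiveconf:MID_DATE}.

```shell
#!/bin/sh
# Write a small HiveQL script. The quoted 'EOF' delimiter stops the shell
# from expanding ${hiveconf:MID_DATE}, so Hive sees the placeholder verbatim.
# some_table is a hypothetical table name used only for illustration.
cat > test.hql <<'EOF'
select * from some_table where datekey = ${hiveconf:MID_DATE};
EOF

# Guarded so the sketch does nothing on a machine without Hive.
if command -v hive >/dev/null 2>&1; then
    var=$(hive -S -e "select max(datekey) from some_table;")
    hive --hiveconf MID_DATE="$var" -f test.hql
fi
```

Hive substitutes the variable as plain text, so if datekey were a string column the placeholder would need quotes in the script: '${hiveconf:MID_DATE}'.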

够拽才男人

#5 · 2019-04-25 03:13

Use case: in MySQL the following is valid:

set @max_date := (select max(date) from some_table);
select * from some_other_table where date > @max_date;

This is very useful for scripts that reference the variable repeatedly, since the max-date query runs only once rather than every time the variable is used.

Hive does not currently support this. (Please correct me if I'm wrong! I have been trying to figure out how to do this all afternoon.)

My workaround is to store the required value in a table small enough to be map-joined onto the query that uses it. Because a map join broadcasts the small table to every mapper and avoids a shuffle, it should not significantly hurt performance. For example:

drop table if exists var_table;

create table var_table as
select max(date) as max_date from some_table;

-- var_table holds a single row, so the cross join just attaches max_date to every row.
select some_other_table.*
from some_other_table
cross join var_table
where some_other_table.date > var_table.max_date;

The solution suggested by @visakh is not optimal because it stores the string 'select count(1) from table_name;' rather than the returned value, so it does not help when you need to reference a variable repeatedly within a script.

等我变得足够好

#6 · 2019-04-25 03:13

Try the following:

$ var=$(hive -e "select '12' ")
$ echo $var
12
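
The same capture works when a query returns more than one column, because the Hive CLI separates columns with tabs. Below is a rough sketch, assuming a single result row with two non-empty columns. The helper split_row and the variable names min_val and max_val are hypothetical.

```shell
#!/bin/sh
# split_row is a hypothetical helper: it splits one tab-separated result row
# into the shell variables min_val and max_val.
split_row() {
    tab=$(printf '\t')
    # IFS is set to a tab only for this read, so spaces inside a value survive.
    IFS=$tab read -r min_val max_val <<EOF
$1
EOF
}

# Guarded so the sketch does nothing on a machine without Hive.
if command -v hive >/dev/null 2>&1; then
    row=$(hive -S -e "select min(a), max(a) from telecom.abc;")
    split_row "$row"
    echo "min=$min_val max=$max_val"
fi
```

Empty columns are not handled: the shell treats consecutive tabs as a single separator, so the values would shift.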
