Concatenate multiple columns into one in hive

2020-06-21 03:50发布

问题:

I need to concatenate column values into a single column. I have column names in a variable as colnames=col1,col2,col3 . I am writing the below query from a unix shell and calling the hive. But when I do this, I am getting only the column names concatenated not the values of those columns.

select concat('regexp_replace("${colnames}",",","^")) as result from table;

I would like the output as:

ABCD^10^XYZ

(ABCD,10,XYZ are the column values)

回答1:

You need concat_ws function to concatenate values with ^ as a delimiter.

Example with constants:

hive> select concat_ws('^','ABCD','10', 'XYZ');
OK
ABCD^10^XYZ

Command with column names after shell variable substitution should look like this:

 select concat_ws('^',col1,col2,col3) as result from table;

In the shell it will look like this:

colnames=col1,col2,col3
hive -e "select concat_ws('^',${colnames}) as result from table"

If columns are not string, wrap them with cast as string using shell, this will allow concat_ws work with strings and not-string columns.

Example

colnames=col1,col2,col3
colnames2=$(echo "cast( $colnames as string)" | sed "s/,/ as string), cast( /g")
echo "$colnames2"

Output:

cast( col1 as string), cast( col2 as string), cast( col3 as string)

Use new variable to pass to hive as in the previous example.