I need to concatenate column values into a single column.
I have column names in a variable as colnames=col1,col2,col3
.
I am writing the below query from a unix shell and calling the hive. But when I do this, I am getting only the column names concatenated not the values of those columns.
select concat('regexp_replace("${colnames}",",","^")) as result from table;
I would like the output as:
ABCD^10^XYZ
(ABCD
,10
,XYZ
are the column values)
You need concat_ws
function to concatenate values with ^
as a delimiter.
Example with constants:
hive> select concat_ws('^','ABCD','10', 'XYZ');
OK
ABCD^10^XYZ
Command with column names after shell variable substitution should look like this:
select concat_ws('^',col1,col2,col3) as result from table;
In the shell it will look like this:
colnames=col1,col2,col3
hive -e "select concat_ws('^',${colnames}) as result from table"
If columns are not string, wrap them with cast as string using shell, this will allow concat_ws work with strings and not-string columns.
Example
colnames=col1,col2,col3
colnames2=$(echo "cast( $colnames as string)" | sed "s/,/ as string), cast( /g")
echo "$colnames2"
Output:
cast( col1 as string), cast( col2 as string), cast( col3 as string)
Use new variable to pass to hive as in the previous example.