Based on the following tutorial, Hive has a map type. However, there does not seem to be a documented way to insert a new key-value pair into a Hive map, via a SELECT
with some UDF or built-in function. Is this possible?
As a clarification, suppose I have a table called foo
with a single column, typed map
, named column_containing_map
.
Now I want to create a new table that also has one column, typed map
, but I want each map (which is contained within a single column) to have an additional key-value pair.
A query might look like this:
CREATE TABLE IF NOT EXISTS bar AS
SELECT ADD_TO_MAP(column_containing_map, "NewKey", "NewValue")
FROM foo;
Then the table bar
would contain the same maps as table foo
except each map in bar
would have an additional key-value pair.
Consider you have a student table which contains student marks in various subjects.
hive> desc student;
id string
name string
class string
marks map<string,string>
You can insert values directly to table as below.
INSERT INTO TABLE student
SELECT STACK(1,
'100','Sekar','Mathematics',map("Mathematics","78")
)
FROM empinfo
LIMIT 1;
Here 'empinfo' table can be any table in your database.
And Results are:
100 Sekar Mathematics {"Mathematics":"78"}
I'm sorry, I didn't quite get this. What do you mean by with some UDF or built-in function?If you wish to insert into a table which has a Map field it's similar to any other datatype. For example :
I have a table called complex1, created like this :
CREATE TABLE complex1(c1 array<string>, c2 map<int,string> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '-' MAP KEYS TERMINATED BY ':' LINES TERMINATED BY '\n';
I also have a file, called com.txt, which contains this :
Mohammad-Tariq,007:Bond
Now, i'll load this data into the above created table :
load data inpath '/inputs/com.txt' into table complex1;
So this table contains :
select * from complex1;
OK
["Mohammad","Tariq"] {7:"Bond"}
Time taken: 0.062 seconds
I have one more table, called complex2 :
CREATE TABLE complex2(c1 map<int,string>);
Now, to select data from complex1 and insert into complex2 i'll do this :
insert into table complex2 select c2 from complex1;
Scan the table to cross check :
select * from complex2;
OK
{7:"Bond"}
Time taken: 0.062 seconds
HTH
I think the combine function from brickhouse will do what you need. Slightly modifying the query in your original question, it would look something like this
SELECT
combine(column_containing_map, str_to_map("NewKey:NewValue"))
FROM
foo;
The limitation with this example is that str_to_map creates a MAP< STRING,STRING >. If your hive map contains other primitive types for the keys or values, this won't work.