I am new to Hive. I have successfully set up a single-node Hadoop cluster for development purposes, and on top of it I have installed Hive and Pig.
I created a dummy table in Hive:
create table foo (id int, name string);
Now I want to insert data into this table. Can I add data one record at a time, just like in SQL? Kindly help me with a command analogous to:
insert into foo (id, name) VALUES (12,"xyz");
Also, I have a CSV file which contains data in the format:
1,name1
2,name2
..
..
..
1000,name1000
How can I load this data into the dummy table?
This is supported starting with Hive 0.14:
INSERT INTO TABLE pd_temp(dept,make,cost,id,asmb_city,asmb_ct,retail) VALUES('production','thailand',10,99202,'northcarolina','usa',20)
Whatever data you have in a text file or log file can be put at a path in HDFS; you can then write a query in Hive to load it.
EXAMPLE:
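A minimal sketch of that approach for the CSV from the question. The HDFS path and the staging table name foo_csv are assumptions made for illustration. A comma-delimited staging table is used because foo was created without a field delimiter, so Hive would not split the CSV columns if the file were loaded into it directly.

```sql
-- Staging table whose storage format matches the CSV layout (id,name).
-- foo_csv is a hypothetical name, not part of the original question.
CREATE TABLE foo_csv (id INT, name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- /user/hive/input/foo.csv is an assumed HDFS location for the uploaded file.
-- LOAD DATA INPATH moves the file into the table's directory.
LOAD DATA INPATH '/user/hive/input/foo.csv' INTO TABLE foo_csv;

-- Copy the parsed rows into the table from the question.
INSERT INTO TABLE foo SELECT id, name FROM foo_csv;
```

If the file is still on the local filesystem, LOAD DATA LOCAL INPATH with a local path copies it instead of moving it.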
You may try this: I have developed a tool that generates Hive scripts from a CSV file. Following are a few notes on how the files are generated. Tool: https://sourceforge.net/projects/csvtohive/?source=directory
Select a CSV file using Browse and set the Hadoop root directory, e.g. /user/bigdataproject/
The tool generates a Hadoop script for all the CSV files. Following is a sample of a generated Hadoop script to insert the CSV into Hadoop
Sample of generated Hive scripts
Thanks Vijay
Hive apparently supports INSERT...VALUES starting in Hive 0.14.
Please see the section 'Inserting into tables from SQL' at: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
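As a sketch of what that looks like for the foo table from the question (the literal values are made up for illustration):

```sql
-- Single-row insert, Hive 0.14 or later.
INSERT INTO TABLE foo VALUES (12, 'xyz');

-- Several rows in one statement.
INSERT INTO TABLE foo VALUES (13, 'abc'), (14, 'def');

-- Explicit column list; as far as I know this form needs Hive 1.2.0 or later.
INSERT INTO TABLE foo (id, name) VALUES (15, 'ghi');
```

Each such statement runs as a separate job and writes at least one new file, so this is best kept to tests and small tables; bulk loading is still the better fit for the 1000-row CSV.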
There's no direct way to insert one record at a time from the terminal; however, here's an easy, straightforward workaround which I usually use when I want to test something:
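A minimal sketch of one such workaround, assuming the foo table from the question and literal values chosen for illustration:

```sql
-- Select constants from any existing table and keep exactly one row,
-- so a single record (12, 'xyz') is appended to foo.
INSERT INTO TABLE foo
SELECT 12, 'xyz' FROM t LIMIT 1;
```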
Assuming that t is a table with at least 1 record. It doesn't matter what the type or number of columns is.
It's a limitation of Hive, at least in versions before 0.14:
1. You cannot update data after it is inserted
2. There is no "insert into table values ... " statement
3. You can only load data using bulk load
4. There is no "delete from " command
5. You can only do bulk delete
But if you still want to insert a record from the Hive console, you can insert the result of a select that uses stack. Refer to this
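A hedged sketch of the stack approach, assuming a helper table named one_row (a hypothetical name) that holds exactly one record. stack is a built-in table-generating function whose first argument is the number of rows to produce; the remaining arguments are spread across those rows.

```sql
-- one_row is an assumed single-record helper table; stack() emits its
-- rows once per input row, so one input row yields exactly two inserts.
INSERT INTO TABLE foo
SELECT stack(2,
             12, 'xyz',
             13, 'abc') AS (id, name)
FROM one_row;
```

The value types must line up with the target columns, here an INT followed by a STRING for each generated row.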