Insert data into hive table

2019-03-15 02:37发布

问题:

Using a Cygwin distribution, I've installed Hadoop 0.20.3 and Hive 0.11.0.

First of all, I don't understand how to use the Hive CLI:

hive> show tables;

Then enter and nothing happens. I can execute queries using hive -e/-f.

Then, I've created a table:

CREATE TABLE tweet_table(
tweet STRING
)
COMMENT 'Table of string'

But how can I insert data into this table? I see some INSERT INTO examples but when I try:

INSERT INTO TABLE tweet_table (tweet) VALUES ("data")

I've got an error:

FAILED: ParseException line 1:30 cannot recognize input near '(' 'tweet' ')' in select clause

How can I append data in my table?

回答1:

You can insert new data into table by two ways.

  1. Load the data of a file into table using load command.

    LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename.
    
  2. You can insert new data into table by using select query.

    INSERT INTO table tablename1 select columnlist FROM secondtable;
    


回答2:

If you already have a table pre_loaded_tbl with some data. You can use a trick to load the data into your table with following query

INSERT INTO TABLE tweet_table 
  SELECT  "my_data" AS my_column 
    FROM   pre_loaded_tbl 
   LIMIT   5;

Also please note that "my_data" is independent of any data in the pre_loaded_tbl. You can select any data and write any column name (here my_data and my_column). Hive does not require it to have same column name. However structure of select statement should be same as that of your tweet_table. You can use limit to determine how many times you can insert into the tweet_table.

However if you haven't' created any table, you will have to load the data using file copy or load data commands in above answers.



回答3:

Try to use this with single quotes in data:

insert into table test_hive values ('1','puneet');


回答4:

I faced similar problem so I used impala to insert the single row into my table and it worked for the same query. This will work for sure.



回答5:

If table is without partition then code will be,

Insert into table table_name select col_a,col_b,col_c from another_table(source table)

--here any condition can be applied such as limit, group by, order by etc...

If table is with partitions then code will be,

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

insert into table table_name partition(partition_col1, paritition_col2) select col_a,col_b,col_c,partition_col1,partition_col2 from another_table(source table)

--here any condition can be applied such as limit, group by, order by etc...