How to load a text file into a Hive table stored a

2020-05-13 03:02发布

问题:

I have a hive table stored as a sequencefile.

I need to load a text file into this table. How do I load the data into this table?

回答1:

You can load the text file into a textfile Hive table and then insert the data from this table into your sequencefile.

Start with a tab delimited file:

% cat /tmp/input.txt
a       b
a2      b2

create a sequence file

hive> create table test_sq(k string, v string) stored as sequencefile;

try to load; as expected, this will fail:

hive> load data local inpath '/tmp/input.txt' into table test_sq;

But with this table:

hive> create table test_t(k string, v string) row format delimited fields terminated by '\t' stored as textfile;

The load works just fine:

hive> load data local inpath '/tmp/input.txt' into table test_t;
OK
hive> select * from test_t;
OK
a       b
a2      b2

Now load into the sequence table from the text table:

insert into table test_sq select * from test_t;

Can also do load/insert with overwrite to replace all.



回答2:

You cannot directly create a table stored as a sequence file and insert text into it. You must do this:

  1. Create a table stored as text
  2. Insert the text file into the text table
  3. Do a CTAS to create the table stored as a sequence file.
  4. Drop the text table if desired

Example:

CREATE TABLE test_txt(field1 int, field2 string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA INPATH '/path/to/file.tsv' INTO TABLE test_txt;

CREATE TABLE test STORED AS SEQUENCEFILE
AS SELECT * FROM test_txt;

DROP TABLE test_txt;


回答3:

The simple way is to create table as textfile and move the file to the appropriate location

CREATE EXTERNAL TABLE mytable(col1 string, col2 string)
row format delimited fields terminated by '|' stored as textfile;

Copy the file to the HDFS Location where table is created.
Hope this helps!!!



标签: hadoop hive