I know there's a known issue with delimiters when saving a table to CSV (or another text file) in Hive, so I'm hoping you guys can help me work around it.
I have an existing table (Table A) and I would like to save it in CSV format to HDFS. From reading other responses, I believe I would have to first create an external table, but I'm not sure what the whole thing would look like.
Can anyone help?
Try this in the Hive shell:
INSERT OVERWRITE LOCAL DIRECTORY '/path/to/hive/csv'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM hivetablename;
Change /path/to/hive/csv to the directory where you want the CSV output stored, and hivetablename to the Hive table you want exported in CSV format.
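Note that LOCAL DIRECTORY writes to the local file system of the machine running the query. Since the goal here is HDFS, on Hive 0.11 or later you can drop the LOCAL keyword and the delimited files land in an HDFS directory instead; the output path below is just an example:
-- Omitting LOCAL writes the comma-delimited files to an HDFS directory (example path)
INSERT OVERWRITE DIRECTORY '/user/hive/output/tableA_csv'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM hivetablename;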
This three-step process worked fine for me:
1. In Hive, create a new temporary table stored as a text file:
CREATE TABLE temp_table( id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
2. Also in Hive, write the source data into the temporary table:
INSERT OVERWRITE TABLE temp_table
SELECT id, name FROM source_table;
3. From the command line, copy the Hive table's files from their location in HDFS to your local file system; inspection should reveal the files as CSV (rename accordingly if desired):
hdfs dfs -copyToLocal /apps/hive/warehouse/temp_table/* /tmp/local_dir/
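If you would rather end up with a single local file instead of one file per reducer, hdfs dfs -getmerge concatenates the table's part files for you (the paths here are just illustrative):
# Merge all of the table's part files into one local CSV (example paths)
hdfs dfs -getmerge /apps/hive/warehouse/temp_table/ /tmp/local_dir/temp_table.csv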
If you run SHOW CREATE TABLE temp_table in Hive, it will tell you the exact location of the table in HDFS, e.g.
| LOCATION |
| 'hdfs://hadoop_cluster/apps/hive/warehouse/temp_table' |
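Since the original question asks for the CSV on HDFS rather than locally, you could also just copy the table's delimited files to another HDFS directory once you know that location (the destination path below is only an example):
# Copy the part files to a different HDFS directory instead of pulling them to the local disk
hdfs dfs -cp /apps/hive/warehouse/temp_table/* /user/hive/csv_output/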
For an external table in Hive, you can follow the steps below:
1. Create the external table in Hive:
CREATE EXTERNAL TABLE external_table( number INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hive/external/mytable/';
2. Load the data file from the local file system into the HDFS location:
hadoop fs -put /home/user1/Desktop/filename.csv /user/hive/external/mytable/
The above two steps can solve your problem.
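As an optional sanity check, you can list the HDFS directory and query the table to confirm Hive sees the uploaded CSV; the identifiers here just reuse the example above:
# Confirm the file landed in the external table's location
hadoop fs -ls /user/hive/external/mytable/
-- Confirm Hive can read the CSV rows through the external table
SELECT * FROM external_table LIMIT 10;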