I am trying to export a table to a local csv file in hive.
INSERT OVERWRITE LOCAL DIRECTORY '/home/sofia/temp.csv'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
ESCAPED BY '\\'
LINES TERMINATED BY '\n'
select * from mytable;
The problem is that some of the values contain the newline "\n" character and the resulting file becomes really messy.
Is there any way of enclosing the values in quotes when exporting in Hive, so that the csv file can contain special characters (and especially the newline)?
One possible solution could be to use
Hive CSV SerDe (Serializer/Deserializer)
. It provides a way to specifycustom delimiters, quote, and escape characters
.Limitation:
It does not handle
embedded newlines
Availability:
The
CSV Serde
is available in Hive 0.14 and greater.Background:
The
CSV SerDe
is based from https://github.com/ogrodnek/csv-serde, and was added to theHive
distribution in HIVE-7777.Usage:
This
SerDe
works for mostCSV data
, but does not handleembedded newlines
. To use theSerDe
, specify the fully qualified class nameorg.apache.hadoop.hive.serde2.OpenCSVSerde
.original documentation is available at https://github.com/ogrodnek/csv-serde.
Default separator, quote, and escape characters if unspecified
Reference: Hive csv-serde