Export table enclosing values with quotes to local

2019-04-14 09:13发布

I am trying to export a table to a local csv file in hive.

INSERT OVERWRITE LOCAL DIRECTORY '/home/sofia/temp.csv' 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ',' 
ESCAPED BY '\\' 
LINES TERMINATED BY '\n'
select * from mytable;

The problem is that some of the values contain the newline "\n" character and the resulting file becomes really messy.

Is there any way of enclosing the values in quotes when exporting in Hive, so that the csv file can contain special characters (and especially the newline)?

1条回答
Bombasti
2楼-- · 2019-04-14 10:05

One possible solution could be to use Hive CSV SerDe (Serializer/Deserializer). It provides a way to specify custom delimiters, quote, and escape characters.

Limitation:

It does not handle embedded newlines

Availability:

The CSV Serde is available in Hive 0.14 and greater.

Background:

The CSV SerDe is based from https://github.com/ogrodnek/csv-serde, and was added to the Hive distribution in HIVE-7777.

Usage:

This SerDe works for most CSV data, but does not handle embedded newlines. To use the SerDe, specify the fully qualified class name org.apache.hadoop.hive.serde2.OpenCSVSerde.

original documentation is available at https://github.com/ogrodnek/csv-serde.

CREATE TABLE my_table(a string, b string, ...)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   "separatorChar" = "\t",
   "quoteChar"     = "'",
   "escapeChar"    = "\\"
)  
STORED AS TEXTFILE;

Default separator, quote, and escape characters if unspecified

DEFAULT_ESCAPE_CHARACTER \
DEFAULT_QUOTE_CHARACTER  "
DEFAULT_SEPARATOR        ,

Reference: Hive csv-serde

查看更多
登录 后发表回答