Hive: using SERDEPROPERTIES gives an error

Posted 2019-08-21 08:18

Question:

I am trying to create a Hive table so that the files it writes to HDFS are in UTF-8 format. The problem is that the query gives an error, and I am not sure what I am doing wrong.

DROP TABLE IF EXISTS output_2057565014;
CREATE TABLE temp.output_2057565014
ROW FORMAT DELIMITED
FIELDS TERMINATED BY 'ธ'
COLLECTION ITEMS TERMINATED BY '|'
MAP KEYS TERMINATED BY '$'
with serdeproperties('serialization.encoding'='UTF-8') 
LOCATION '/tmp/test-2057565014' 
AS
SELECT * from temp.abc

Answer 1:

"the query is giving error": yes, but what kind of error? Reading the error message would help. Without it, this is just guesswork.
So, let's guess.


The ROW FORMAT DELIMITED clause implicitly assumes that each delimiter is a single 7-bit ASCII character, defined either explicitly (when printable) or by its octal code.

Hence FIELDS TERMINATED BY 'ธ' is not valid: 'ธ' is a Thai character outside the ASCII range, and it takes three bytes in UTF-8.
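As a sketch of the octal-code form, the statement from the question could be rewritten with a single-byte delimiter. It assumes that '\001' (Ctrl-A) is an acceptable field separator for whatever reads the files afterwards. It also drops the WITH SERDEPROPERTIES clause, on the assumption that Hive's grammar accepts that clause only after ROW FORMAT SERDE and not after ROW FORMAT DELIMITED. It has not been tested against the asker's cluster.

```sql
-- Sketch only: same CTAS as in the question, with a single-byte delimiter.
-- The DROP is qualified with the database so it targets the table created below.
DROP TABLE IF EXISTS temp.output_2057565014;

CREATE TABLE temp.output_2057565014
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'          -- octal code for Ctrl-A (assumed choice)
  COLLECTION ITEMS TERMINATED BY '|'
  MAP KEYS TERMINATED BY '$'
LOCATION '/tmp/test-2057565014'
AS
SELECT * FROM temp.abc;
```

As far as I know, Hive's default text SerDe already writes UTF-8, so the 'serialization.encoding' property may not be needed here at all.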

You can try different workarounds: changing the delimiter in the upstream file creation process; changing the delimiter in situ before loading to HDFS (e.g. with a good old sed command); or trying a hard-coded column mapping with RegexSerDe (cf. Language Manual DDL / CREATE TABLE under "Row Formats & SerDe").
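A minimal sketch of the RegexSerDe idea, for reading a file that already uses 'ธ' as its field separator. The table name, the two column names and the LOCATION path are hypothetical, and it assumes the client sends the statement as UTF-8 so the Thai character reaches the regex intact.

```sql
-- Sketch only: read-side mapping for lines of the form <col1>ธ<col2>.
-- temp.abc_raw, col1, col2 and the LOCATION path are hypothetical names.
CREATE EXTERNAL TABLE temp.abc_raw (
  col1 STRING,
  col2 STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- One capture group per column; each group matches anything but the delimiter.
  'input.regex' = '([^ธ]*)ธ([^ธ]*)'
)
STORED AS TEXTFILE
LOCATION '/tmp/abc_raw';
```

Note that RegexSerDe only deserializes: it can map existing files to columns, but it cannot write rows, so it would not work as the target of a CREATE TABLE ... AS SELECT like the one in the question.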



Tags: hive hiveql