Thorn character delimiter is not recognized in Hiv

2019-05-26 23:41发布

问题:

As mentioned in post Using the Icelandic Thorn character as a delimiter in Hive The thorn character delimiter is not recognized in Hive

Sample table

CREATE EXTERNAL TABLE IF NOT EXISTS zzzzz_raw ( spot_id INT, activity_type_id INT, activity_type STRING, activity_id INT, activity_sub_type STRING, report_name STRING, tag_method_id INT ) PARTITIONED BY ( dt DATE ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\-2' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/raw/data/networkmatchtablesactivity/activity_cat';

Output

select * from activity_cat_raw limit 1;

4552126þ805759þeaasv101þ2275868þbfeaac01þBF_EA Access_Info Pageþ2       NULL    NULL    NULL    NULL    NULL    NULL    2015-03-24

Am I missing something?

回答1:

I found the answer. Instead of '-2' (thorn delimiter) , i used '-61' delimiter then a substring to remove the additional symbol, something like below

CREATE EXTERNAL TABLE IF NOT EXISTS SSSSSS ( spot_id STRING, activity_type_id STRING, activity_type STRING, activity_id STRING, activity_sub_type STRING, report_name STRING, tag_method_id STRING ) PARTITIONED BY ( dt STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\-61' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION 'SSSSSS';

and then use substring to remove other symbols

INSERT OVERWRITE TABLE vvvvvv PARTITION (dt) SELECT spot_id STRING, substr(activity_type_id,2), dt FROM SSSSS

Hope it helps..