I am trying to load data that is delimited by a double pipe (||) into a Hive table. This is what I tried:
Sample input:
1405983600000||111.111.82.41||806065581||session-id
Creating table in hive:
create table test_hive(k1 string, k2 string, k3 string, k4 string) row format delimited fields terminated by '||' stored as textfile;
Loading data from text file:
load data local inpath '/Desktop/input.txt' into table test_hive;
When I do this, the data is stored in the following format:
1405983600000 tabspace-as-second-column 111.111.82.41 tabspace-as-fourth-column
Whereas I am expecting the data in the table to be:
1405983600000 111.111.82.41 806065581 session-id
Kindly help me out. I have tried different options for this but have been unable to resolve it.
This issue has been resolved in Hive 0.14 with the MultiDelimitSerDe. The documentation is here: https://cwiki.apache.org/confluence/display/Hive/MultiDelimitSerDe
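For reference, a minimal sketch of what a table definition using that SerDe might look like for the data in the question. The table name test_hive_multi is made up for illustration, and the class name below is the one used by the contrib module in Hive 0.14 through 3.x (later releases moved it to a different package), so check it against your version. The hive-contrib jar may also need to be on the classpath.

```sql
-- Sketch only: the table name is hypothetical; the SerDe class path assumes Hive 0.14-3.x.
CREATE TABLE test_hive_multi (k1 STRING, k2 STRING, k3 STRING, k4 STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim" = "||")
STORED AS TEXTFILE;

-- Load the same file as in the question.
LOAD DATA LOCAL INPATH '/Desktop/input.txt' INTO TABLE test_hive_multi;
```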
You could do this if you don't want to use an alternate SerDe or are on an earlier version of Hive:
Then create a view on top:
Query the view. Good luck.
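The statements for this approach are not shown above, so the following is only a sketch of one way it could be done, assuming the idea is to load each line into a single string column and split it in a view. The names test_hive_raw, test_hive_view and line are made up for illustration.

```sql
-- Staging table: each input line lands whole in one string column
-- (the default field delimiter is Ctrl-A, which the sample data does not contain).
CREATE TABLE test_hive_raw (line STRING) STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/Desktop/input.txt' INTO TABLE test_hive_raw;

-- View that splits each line on the literal double pipe.
-- split() takes a regular expression, so each | is wrapped in a character class.
CREATE VIEW test_hive_view AS
SELECT
  split(line, '[|][|]')[0] AS k1,
  split(line, '[|][|]')[1] AS k2,
  split(line, '[|][|]')[2] AS k3,
  split(line, '[|][|]')[3] AS k4
FROM test_hive_raw;

SELECT k1, k2, k3, k4 FROM test_hive_view;
```

The view does the splitting at query time, so the raw file is stored unchanged and no extra SerDe is needed.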
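Multi-character delimiters such as || are not supported in Hive up to version 0.13, so fields terminated by '||' won't work. The default SerDe uses only the first character of the delimiter, so each || is read as two | separators with an empty field between them, which matches the output in the question. There is an alternative for this.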
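The alternative is not spelled out here. One option that is often used on older versions is RegexSerDe, shown below as a sketch; whether this is the alternative originally meant is an assumption. The table name test_hive_regex is made up, and RegexSerDe requires all columns to be strings.

```sql
-- Sketch only: the table name is hypothetical.
-- Each capture group in input.regex maps to one column, in order;
-- [|][|] matches the literal double pipe without backslash escaping.
CREATE TABLE test_hive_regex (k1 STRING, k2 STRING, k3 STRING, k4 STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "(.*?)[|][|](.*?)[|][|](.*?)[|][|](.*)"
)
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/Desktop/input.txt' INTO TABLE test_hive_regex;
```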
The default SerDe can be used. Multi-character delimiters can be used for field, line, and escape characters by specifying them in the SerDe properties.