How to Define nested Collection items in Hive

2020-03-14 22:45发布

问题:

I am trying to create a hive table with nested Collection items. Suppose I have an array of struct.

    CREATE TABLE SAMPLE(
    record array<struct<col1:string,col2:string>>
    )row format delimited
    fields terminated by ','
    collection items terminated by '|';

First level, the separator ',' will override the default delimiter '^A'.

Second level, the separator '|' will override the default second level delimiter '^B' to separate out the outer most structure (i.e. Array).

Third level hive will use the default third level delimiter '^C' as the separator for the Struct

Now my question is how can I define a separator for the second level (i.e. Struct), because '^C' character is hard to read as well as to generate.

Is there any way to explicitly define the separator instead of ^C ?

Thanks in advance.

回答1:

Try something like this:

CREATE TABLE SAMPLE(
id BIGINT,
record array<struct<col1:string,col2:string>>
)row format delimited
fields terminated by ','
collection items terminated by '|'
map keys terminated by ':';

Now you data in text file will look like this:

1345653,110909316904:1341894546|221065796761:1341887508

You can then query it like :

select record.col1 from SAMPLE;


标签: nested hive