How to Define nested Collection items in Hive

2020-03-14 22:49发布

I am trying to create a hive table with nested Collection items. Suppose I have an array of struct.

    CREATE TABLE SAMPLE(
    record array<struct<col1:string,col2:string>>
    )row format delimited
    fields terminated by ','
    collection items terminated by '|';

First level, the separator ',' will override the default delimiter '^A'.

Second level, the separator '|' will override the default second level delimiter '^B' to separate out the outer most structure (i.e. Array).

Third level hive will use the default third level delimiter '^C' as the separator for the Struct

Now my question is how can I define a separator for the second level (i.e. Struct), because '^C' character is hard to read as well as to generate.

Is there any way to explicitly define the separator instead of ^C ?

Thanks in advance.

标签: nested hive
1条回答
叛逆
2楼-- · 2020-03-14 23:26

Try something like this:

CREATE TABLE SAMPLE(
id BIGINT,
record array<struct<col1:string,col2:string>>
)row format delimited
fields terminated by ','
collection items terminated by '|'
map keys terminated by ':';

Now you data in text file will look like this:

1345653,110909316904:1341894546|221065796761:1341887508

You can then query it like :

select record.col1 from SAMPLE;
查看更多
登录 后发表回答