inserting JSON data from one table to another tabl

I want to insert JSON data from one table to other tables based on the key fields on the data.

my data looks like this

{"Rtype":{"ver":"1","os":"ms","type":"ns","vehicle":"Mh-3412","MOD":{"Version":[{"ABC":{"XYZ":"123.dfer","founder":"3.0","GHT":"Florida","fashion":"fg45","cdc":"new","dof":"yes","ts":"2000-04-01T00:00:00.171Z"}}]}}}

{"Rtype":{"ver":"1","os":"ms","type":"ns","vehicle":"Mh-3412","MOD":{"Version":[{"GAP":{"XVY":"123.dfer","FAH":"3.0","GHT":"Florida","fashion":"fg45","cdc":"new","dof":"yes","ts":"2000-04-01T00:00:00.171Z"}}]}}}

{"Rtype":{"ver":"1","os":"ms","type":"ns","vehicle":"Mh-3412","MOD":{"Version":[{"BOX":{"VOG":"123.dfer","FAH":"3.0","FAX":"Florida","fashion":"fg45","cdc":"new","dof":"yes","ts":"2000-04-01T00:00:00.171Z"}}]}}}

Here based on Version, whther it is "BOX" or "GAP" or "ABC" i want to populate the fields on particular JSON rows to another table

for ex: if version is "GAP" then populate particular rows in one table if it is "BOX" then populate to another tables... I mean all rows for BOX...

how can I achieve this using HIVE. Please help.

NOTE: My JSON data is in one table as a column with type string

标签： json hadoop hive

2条回答

倾城　Initia

2楼-- · 2019-07-26 20:57

Demo

create table src (myjson string);

insert into src values
    ('{"Rtype":{"ver":"1","os":"ms","type":"ns","vehicle":"Mh-3412","MOD":{"Version":[{"ABC":{"XYZ":"123.dfer","founder":"3.0","GHT":"Florida","fashion":"fg45","cdc":"new","dof":"yes","ts":"2000-04-01T00:00:00.171Z"}}]}}}')
   ,('{"Rtype":{"ver":"1","os":"ms","type":"ns","vehicle":"Mh-3412","MOD":{"Version":[{"GAP":{"XVY":"123.dfer","FAH":"3.0","GHT":"Florida","fashion":"fg45","cdc":"new","dof":"yes","ts":"2000-04-01T00:00:00.171Z"}}]}}}')
   ,('{"Rtype":{"ver":"1","os":"ms","type":"ns","vehicle":"Mh-3412","MOD":{"Version":[{"BOX":{"VOG":"123.dfer","FAH":"3.0","FAX":"Florida","fashion":"fg45","cdc":"new","dof":"yes","ts":"2000-04-01T00:00:00.171Z"}}]}}}')
;

create table trg_abc (myjson string);
create table trg_gap (myjson string);
create table trg_box (myjson string);

from src
insert into trg_abc select myjson where get_json_object(myjson,'$.Rtype.MOD.Version[0].ABC') is not null
insert into trg_gap select myjson where get_json_object(myjson,'$.Rtype.MOD.Version[0].GAP') is not null
insert into trg_box select myjson where get_json_object(myjson,'$.Rtype.MOD.Version[0].BOX') is not null
;

0人赞添加讨论(0) 举报

疯言疯语

3楼-- · 2019-07-26 21:04

First of all you will need to store your data as json in hive tables:

I assume your hive table is EXTERNAL (usually are - check it with SHOW CREATE TABLE your_table).
If so the entire dataset lies in some hdfs/s3 path, for example s3a://your_bucket/your_jsons_location/

Download json-udf-1.3.7-jar-with-dependencies.jar and run ADD JARS s3a://your_bucket/lib/json-udf-1.3.7-jar-with-dependencies.jar;
Then you must create a dedicated json table for each json schema you have:

CREATE EXTERNAL TABLE boxes
(Rtype struct<ver:string,os:string,type:string,vehicle:string,MOD:struct<Version:array<struct<BOX:struct<VOG:string,FAH:string,FAX:string,fashion:string,cdc:string,dof:string,ts:string>>>>>)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' location 's3a://your_bucket/your_jsons_location/';

CREATE EXTERNAL TABLE gaps
(Rtype struct<ver:string,os:string,type:string,vehicle:string,MOD:struct<Version:array<struct<GAP:struct<XVY:string,FAH:string,GHT:string,fashion:string,cdc:string,dof:string,ts:string>>>>>)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' location 's3a://your_bucket/your_jsons_location/';

CREATE EXTERNAL TABLE abcs
(Rtype struct<ver:string,os:string,type:string,vehicle:string,MOD:struct<Version:array<struct<ABC:struct<XYZ:string,founder:string,GHT:string,fashion:string,cdc:string,dof:string,ts:string>>>>>)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' location 's3a://your_jsons_location/';

Now if you'll run:

SELECT * FROM boxes;
SELECT * FROM gaps;
SELECT * FROM abcs;

you will see that each table parsed correctly only the matching jsons (according to the schema that was specified in the create statment). The unmatching ones in each table are NULL.

To filter out the unrelevant records run:
SELECT * FROM abcs WHERE Rtype.mod.version[0].abc IS NOT NULL;

Note: that this whole explanation assumes that your jsons are stored externally to the hive table (specifically I used S3 but it can also be HDFS)

0人赞添加讨论(0) 举报

inserting JSON data from one table to another tabl

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间