USQL- JsonArray column explode

2019-08-30 08:26发布

问题:

I’ve a TSV file and one column is a json string, which has array of objects. I need to convert rows into multiple rows based on jsonarray. Can you please guide me to extract the data?

Example row: Product ID Customers Azure SQL 465383 [{"Customer": "Dell", "Country": "US"},{"Customer": "HP","Country": "Germany"}]

Output Expected: Product ID Customer Country Azure SQL 465383 Dell US Azure SQL 465383 HP Germany

Thanks in advance!

回答1:

You can use the JsonTuple method in the Microsoft.Analytics.Samples.Formats JSON samples supplied in github here. If you are not sure how to install them, follow the tutorial here.

For example, I got this script to work using this file.

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 

USING Microsoft.Analytics.Samples.Formats.Json;

DECLARE @inputFile string = "/input/input103.tsv";

@input =
    EXTRACT Product string,
            id string,
            Customers string        // JSON column
    FROM @inputFile
    USING Extractors.Tsv(skipFirstNRows:1);


// Convert Customer JSON string to tuple
@working =
    SELECT Product,
           id,
           JsonFunctions.JsonTuple(Customers).Values AS Customers_map
    FROM @input;


// Explode the MAP to get a row per key-value pair
@output =
    SELECT Product,
           id,
           JsonFunctions.JsonTuple(y) ["Customer"] AS Customer,
           JsonFunctions.JsonTuple(y) ["Country"] AS Country
    FROM @working
         CROSS APPLY
             EXPLODE(Customers_map) AS x(y);


OUTPUT @output
TO "/output/output.csv"
USING Outputters.Csv();

My results:



标签: u-sql