I’ve a TSV file and one column is a json string, which has array of objects. I need to convert rows into multiple rows based on jsonarray. Can you please guide me to extract the data?
Example row:
Product ID Customers
Azure SQL 465383 [{"Customer": "Dell", "Country": "US"},{"Customer": "HP","Country": "Germany"}]
Output Expected:
Product ID Customer Country
Azure SQL 465383 Dell US
Azure SQL 465383 HP Germany
Thanks in advance!
You can use the JsonTuple
method in the Microsoft.Analytics.Samples.Formats JSON samples supplied in github here. If you are not sure how to install them, follow the tutorial here.
For example, I got this script to work using this file.
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
USING Microsoft.Analytics.Samples.Formats.Json;
DECLARE @inputFile string = "/input/input103.tsv";
@input =
EXTRACT Product string,
id string,
Customers string // JSON column
FROM @inputFile
USING Extractors.Tsv(skipFirstNRows:1);
// Convert Customer JSON string to tuple
@working =
SELECT Product,
id,
JsonFunctions.JsonTuple(Customers).Values AS Customers_map
FROM @input;
// Explode the MAP to get a row per key-value pair
@output =
SELECT Product,
id,
JsonFunctions.JsonTuple(y) ["Customer"] AS Customer,
JsonFunctions.JsonTuple(y) ["Country"] AS Country
FROM @working
CROSS APPLY
EXPLODE(Customers_map) AS x(y);
OUTPUT @output
TO "/output/output.csv"
USING Outputters.Csv();
My results: