Following this question: how to cross join unnest a json array in presto
I tried to run the example provided, but I get an error while doing so.
The SQL command:
select x.n
from
unnest(cast(json_extract('{"payload":[{"type":"b","value":"9"},
{"type":"a","value":"8"}]}','$.payload') as array<varchar>)) as x(n)
The error I got:
Value cannot be cast to array<varchar>
java.lang.RuntimeException: java.lang.NullPointerException: string is null
SELECT JSON_EXTRACT('{"payload":[{"type":"b","value":"9"}, {"type":"a","value":"8"}]}','$.payload')
gives:
[{"type":"b","value":"9"}, {"type":"a","value":"8"}]
which has the shape ARRAY<MAP<VARCHAR,VARCHAR>>, not ARRAY<VARCHAR>: each element is a JSON object, so it cannot be cast to VARCHAR. You can change your query to:
SELECT x.n
FROM UNNEST(CAST(JSON_EXTRACT('{"payload":[{"type":"b","value":"9"},{"type":"a","value":"8"}]}','$.payload') AS ARRAY<MAP<VARCHAR, VARCHAR>>)) AS x(n)
You can use JSON_EXTRACT, then CAST, and finally UNNEST to expand the array into the respective columns.
gives output as below
One possible interpretation of the return datatype is an array of maps, ARRAY(MAP(VARCHAR, VARCHAR)), but this has the downside that values in a map can't be accessed using dot notation.
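A minimal sketch of the map-based cast, using the JSON literal from the question. Map values are read with the subscript operator rather than dot notation. The output aliases item_type and item_value are hypothetical names chosen for illustration.

```sql
-- Cast the extracted JSON array to an array of maps, then unnest it.
-- Each unnested element n is a MAP(VARCHAR, VARCHAR), so its values
-- are read with a subscript (n['type']), not with dot notation.
SELECT
  x.n['type']  AS item_type,   -- hypothetical alias
  x.n['value'] AS item_value   -- hypothetical alias
FROM UNNEST(
  CAST(
    JSON_EXTRACT(
      '{"payload":[{"type":"b","value":"9"},{"type":"a","value":"8"}]}',
      '$.payload'
    ) AS ARRAY(MAP(VARCHAR, VARCHAR))
  )
) AS x(n)
```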
An alternative datatype to assume would be this:
ARRAY(ROW(type VARCHAR, value VARCHAR))
Which resembles the
ARRAY<STRUCT<
Hive datatype equivalent.

Massive digression here >> JSON is a bit ambiguous. Which one is correct? Is a JSON object a representation of a map (hashmap, dictionary, key-value pairs, whatever your language calls it), or is it more like a struct (object, class, bag of named properties, whatever your language calls it)? It originates from JavaScript (JavaScript Object Notation) and was intended to cater for arrays, objects and primitive types, but more widespread usage means it has an ambiguous mapping (ha) in other languages. The two are perhaps functionally equivalent, but in theory the MAP should be quicker for random reads/writes and the ROW probably has some extra object-oriented overhead. Then again, this is all implemented in Java, where everything is an object anyway, so I have no answer. Use whatever you like. << I digress.

You found this a bit verbose:
Here's the alternative
It's just as verbose; the names of the columns are just shifted to the CAST expression, but perhaps (subjective!) easier to look at.
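As a sketch of what the ROW-based form might look like, assuming a Presto or Trino version in which UNNEST expands an array of rows into one column per field. Some older Presto releases may instead return a single ROW column, in which case the alias would be x(n) and the fields would be read as n.type and n.value.

```sql
-- Cast the extracted JSON array to an array of rows with named fields.
-- Assumption: UNNEST expands each ROW into one column per field, so the
-- alias lists two columns. The field names live in the CAST expression.
SELECT
  x.type,
  x.value
FROM UNNEST(
  CAST(
    JSON_EXTRACT(
      '{"payload":[{"type":"b","value":"9"},{"type":"a","value":"8"}]}',
      '$.payload'
    ) AS ARRAY(ROW(type VARCHAR, value VARCHAR))
  )
) AS x(type, value)
```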