I have tried the JSON SerDe that Amazon provides for EMR instances, and it works great if you need to map JSON dictionary fields to columns. However, I haven't been able to figure out how to do the same with JSON arrays. For example, if there is a JSON array as follows:
[23123.32, "Text Text", { "key1": "value1" } ]
Is there a way to map the first element of the array to a column in a Hive table? What about the fields of the embedded dictionary?
I was struggling with the same problem until I found the Hive-JSON-Serde project on GitHub (https://github.com/rcongiu/Hive-JSON-Serde). Just include it using the 'add jar' command once you start Hive, and it works like a charm.
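
For what it's worth, here is a rough sketch of what the setup might look like for the array in the question. The jar path, table name, column names, and S3 location are all placeholders I made up, and I'm assuming the version you build supports mapping a top-level JSON array to columns by position (check the project's README for your version before relying on this):

```sql
-- Hypothetical jar path: point this at wherever you built or downloaded the SerDe jar.
ADD JAR /path/to/json-serde-with-dependencies.jar;

-- Hypothetical table for records shaped like:
--   [23123.32, "Text Text", { "key1": "value1" } ]
-- Assumption: columns are matched to array elements by position.
CREATE EXTERNAL TABLE json_array_records (
  amount  DOUBLE,                 -- first array element
  label   STRING,                 -- second array element
  details STRUCT<key1:STRING>     -- embedded dictionary
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://your-bucket/path/to/json/';  -- placeholder location

-- Fields of the embedded dictionary are addressed with dot notation.
SELECT amount, details.key1 FROM json_array_records;
```

If positional mapping is not available in your version, wrapping each array in a JSON object before loading it is another option.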