I have read many existing questions at SO but none of them answers what I am looking for. I know it is difficult to parse json in bash using sed/awk but I only need a few key-value pairs per record out of a whole list of key-value pairs per record. I want to do this because it will be faster as the main JSON is pretty big with millions of records.
The JSON format is like following:
{
"documents":
[
{
"title":"a", //needed
"description":"b", //needed
"id":"c", //needed
....(some more:not useful)....
"conversation":
[
{
"message":"",
"id":"d", //not needed
.....(some more)....
"createDate":"e", //not needed
},
...(some more messages)....
],
"createDate":"f", //needed
....(many more labels).....
}
],
....(some more global attributes)....
}
Now for this I require attributes which are marked as needed but their common key make it a problem to get by simple sed/awk. Could anyone suggest if we can do it with sed/awk. if possible any help to achieve the same would be appreciated.
P.S.: I know about jsawk
but I do not want to introduce any dependency, so if possible please suggest usage of sed/awk.
EDIT: Multiple extries of the format given below(as in document we have a list)
"title":"a",
"description":"b"
"id":"c"
"createDate":"f"
EDIT: The JSON is without any spaces. It has been formated for readability.