I'm building out a JSON record in jq and need to generate some text, pipe that text to an MD5 sum function, and use the result as the value for a key. Something like this:
echo '{"first": "John", "last": "Big"}' | jq '. | { id: (.first + .last) | md5 }'
From looking at the manual and the GH issues, I can't figure out how to do this, since a function can't call out to a shell and there is no built-in that provides unique, hash-like functionality.
Edit
A better example of what I'm looking for is this:
echo '{"first": "John", "last": "Big"}' | jq '. | {first, last, id: (.first + .last | md5) }'
to output:
{
"first": "John",
"last": "Big",
"id": "cda5c2dd89a0ab28a598a6b22e5b88ce"
}
Edit2
And a little more context: I'm creating NDJSON files for use with esbulk, and I need to generate a unique key for each record. Initially, I thought piping out to the shell would be the simplest solution, so that I could easily use sha1sum or some other hash function, but that is looking more challenging than I thought.
A better example of what I'm looking for is this:
echo '[{"first": "John", "last": "Big"}, {"first": "Justin", "last": "Frozen"}]' | jq -c '.[] | {first, last, id: (.first + .last | md5) }'
to output:
{"first":"John","last":"Big","id":"cda5c2dd89a0ab28a598a6b22e5b88ce"}
{"first":"Justin","last":"Frozen","id":"af97f1bd8468e013c432208c32272668"}
Here is an efficient solution to the restated problem. There are altogether just two calls to jq, no matter the length of the array:
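A minimal sketch of such a two-call approach, assuming bash, jq 1.5 or later (for `--argjson` and `inputs`), and GNU `md5sum` (on macOS the equivalent tool is `md5`). The variable names `json` and `result` are illustrative only, and the sketch assumes the `first` and `last` values contain no newlines:

```shell
json='[{"first": "John", "last": "Big"}, {"first": "Justin", "last": "Frozen"}]'

# First jq call: print first+last for every record, one raw string per line.
# The while loop hashes each line with md5sum; jq is not called inside it.
# Second jq call: read the digests as raw lines and attach them by index.
result=$(
  printf '%s' "$json" |
  jq -r '.[] | .first + .last' |
  while IFS= read -r key; do
    printf '%s' "$key" | md5sum | cut -d' ' -f1
  done |
  jq -c -R -n --argjson json "$json" '
    [inputs] as $ids
    | [ range(0; $json | length) as $i | $json[$i] + {id: $ids[$i]} ]'
)
printf '%s\n' "$result"
```

Note that `printf '%s'` hashes the string without a trailing newline, whereas `echo "$key" | md5sum` would include one and so yield a different digest.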
This produces an array. Just tack on |.[] at the end to produce a stream of the elements.
Or a bit more tersely, with the goal of emitting one object per line without calling jq within the loop:
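One way this could look, as a sketch under the same assumptions (bash, jq 1.5+, GNU `md5sum`, no newlines in the field values; `json` and `stream` are illustrative names). The first jq call prints each object followed by its key string, the loop turns the key into a digest, and the second jq call pairs each object with the digest that follows it:

```shell
json='[{"first": "John", "last": "Big"}, {"first": "Justin", "last": "Frozen"}]'

# First jq call: for each record, print the compact object on one line
# and the raw first+last string on the next line.
# The loop reads those two lines at a time and replaces the key string
# with its MD5 digest, written as a JSON string.
# Second jq call: consume the stream in pairs (object, digest) and merge them.
stream=$(
  printf '%s' "$json" |
  jq -cr '.[] | ., (.first + .last)' |
  while IFS= read -r obj && IFS= read -r key; do
    digest=$(printf '%s' "$key" | md5sum | cut -d' ' -f1)
    printf '%s\n"%s"\n' "$obj" "$digest"
  done |
  jq -c -n 'inputs as $obj | input as $id | $obj + {id: $id}'
)
printf '%s\n' "$stream"
```

Each output line is a complete JSON object, so the result can be written directly to an NDJSON file.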
Distinct Digest for each Record
It would therefore make sense to compute the digest based on each entire JSON object (or more generally, the entire JSON value), i.e. use jq -c '.[]' with the jq+md5sum trick:
Sample output:
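As an illustration of hashing each whole record, here is a minimal sketch, assuming bash, jq, and GNU `md5sum`; the names `json` and `records` are illustrative. Unlike the variants that avoid jq inside the loop, this one calls jq once per record to attach the digest:

```shell
json='[{"first": "John", "last": "Big"}, {"first": "Justin", "last": "Frozen"}]'

# jq -c '.[]' prints each array element as one compact line.
# The digest is computed over that entire line, so two records get the
# same id only if their compact serializations are identical.
records=$(
  printf '%s' "$json" |
  jq -c '.[]' |
  while IFS= read -r obj; do
    digest=$(printf '%s' "$obj" | md5sum | cut -d' ' -f1)
    printf '%s' "$obj" | jq -c --arg id "$digest" '. + {id: $id}'
  done
)
printf '%s\n' "$records"
```

Because the digest depends on the exact serialization, key order matters; adding jq's `-S` (sort keys) option to the first call is one way to make the hashed text canonical.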
Using tee allows a pipeline to be used, e.g.:
Output:
Edit2:
The following uses a while loop to iterate through the elements of the array, but it calls jq twice at each iteration. For a solution that does not call jq at all within the loop, see elsewhere on this page.
Looking around a little farther, I ended up finding this: "jq json parser hash the field value", which was helpful in getting to my answer of:
output