How do I sum all the numbers in the output of jq?

Posted 2020-04-08 14:09

I have a command whose output is a list of numbers, and I would like to sum them.

The command looks like this:

$(hadoop fs -ls -R /reports/dt=2018-08-27 | grep _stats.json | awk '{print $NF}' | xargs hadoop fs -cat | jq '.duration')

It recursively lists everything under /reports/dt=2018-08-27, keeps only the _stats.json files, passes their contents from hadoop fs -cat to jq, and extracts .duration from each JSON document. The result looks like this:

1211789 1211789 373585 495379 1211789

I would like the command to add those numbers together and print 4504331.

Tags: jq
6 answers
手持菜刀,她持情操
#2 · 2020-04-08 14:36

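You can use add, provided jq sees the values as a single array. The -s (--slurp) option collects all the inputs into one array first:

jq -s 'map(.duration) | add'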
小情绪 Triste *
#3 · 2020-04-08 14:37

Use a for loop.

total=0
for num in $(hadoop fs -ls -R /reports/dt=2018-08-27 | grep _stats.json | awk '{print $NF}' | xargs hadoop fs -cat | jq '.duration')
do
    ((total += num))
done
echo "$total"
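
A variant of the loop above, as a minimal sketch: reading the numbers line by line with while read avoids relying on word splitting of the command substitution. The function name sum_lines is made up for illustration, and shell arithmetic handles integers only, so this assumes every duration is a whole number.

```shell
# sum_lines is a hypothetical helper: it sums integers read from stdin, one per line.
sum_lines() {
  total=0
  while IFS= read -r num; do
    # Skip blank lines so they do not break the arithmetic.
    [ -n "$num" ] || continue
    total=$(( total + num ))
  done
  echo "$total"
}

# Sample values taken from the question.
printf '%s\n' 1211789 1211789 373585 495379 1211789 | sum_lines
```

With the sample values this prints 4504331. In practice the printf line would be replaced by the hadoop/jq pipeline from the question.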
SAY GOODBYE
#4 · 2020-04-08 14:39

Another option (and one that works even if not all your durations are integers) is to make your jq code do the work:

sample_data='{"duration": 1211789}
{"duration": 1211789}
{"duration": 373585}
{"duration": 495379}
{"duration": 1211789}'

jq -n '[inputs | .duration] | reduce .[] as $num (0; .+$num)' <<<"$sample_data"

...properly emits as output:

4504331

Replace the <<<"$sample_data" with a pipeline on stdin as desired.
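
One way to do that replacement, as a rough sketch: wrap the same jq filter in a small shell function that reads from stdin. The name sum_durations is hypothetical, and the sketch assumes every input document has a numeric .duration field.

```shell
# sum_durations is a hypothetical wrapper name; the jq filter is the one shown above.
# It reads a stream of JSON objects on stdin and prints the sum of their .duration values.
sum_durations() {
  jq -n '[inputs | .duration] | reduce .[] as $num (0; . + $num)'
}
```

With this in place, the pipeline from the question could end in "| xargs hadoop fs -cat | sum_durations" instead of "| jq '.duration'".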

Deceive 欺骗
#5 · 2020-04-08 14:39

awk to the rescue!

$ ... | awk '{sum+=$0} END{print sum}'

4504331
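
One caveat, sketched under the assumption that some durations may be fractional: awk's print formats non-integer values with the default OFMT of %.6g, so a large non-integer sum can come out rounded or in exponent notation. Using printf with an explicit format avoids that. The name sum_decimal and the two-decimal precision are illustrative choices, not part of the original answer.

```shell
# sum_decimal is a hypothetical helper: it sums numbers read from stdin, one per line,
# and prints the total with two decimal places instead of relying on awk's default OFMT.
sum_decimal() {
  awk '{ sum += $0 } END { printf "%.2f\n", sum }'
}

# Illustrative fractional values (not from the question).
printf '%s\n' 1211789.25 373585.5 | sum_decimal
```

For these two values the function prints 1585374.75.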
手持菜刀,她持情操
#6 · 2020-04-08 14:41

For clarity and generality, it might be worthwhile defining sigma(s) to add a stream of numbers:

... | jq -n '
  def sigma(s): reduce s as $x (0; . + $x);
  sigma(inputs | .duration)'
Root(大扎)
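#7 · 2020-04-08 14:42
jq -s '[.[].duration] | add'

will do, as long as -s (--slurp) is used so that all the inputs are collected into one array before add runs.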
