How to find out the max value of the third field a

Posted 2020-04-30 03:16

The file content is as follows:

333379266       834640619       88
333379280       834640621       99
333379280       834640621       66
333376672       857526666       99
333376672       857526666       78
333376672       857526666       62

The first two columns may contain duplicates. For each distinct pair of values in the first two columns, I want to output that pair together with the maximum value of the third column. In this case, the result file should be as follows:

333379266       834640619       88
333379280       834640621       99
333376672       857526666       99

My attempt is:

awk '{d[$1" "$2]=$3;if ($3>=d[$1" "$2]){num[$1" "$2]=$3} else{num[$1" "$2]=d[$1" "$2]} }END{for(i in num) print i,num[i]}'

But it does not work, because $3>=d[$1" "$2] is always true, so num is always set to $3. Since awk reads the file line by line, num ends up holding the last value for each key, not the maximum.

I would appreciate it if anyone could give me a solution. Thanks in advance.

Tags: shell awk
2 answers
兄弟一词,经得起流年.
#2 · 2020-04-30 03:58

This one-liner applies the same idea as your code; the only difference is that it uses FS instead of a literal space to build the key.

awk '{k=$1FS$2;a[k]=a[k]>$NF?a[k]:$NF}END{for(i in a)print i,a[i]}' file
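Note that the order in which `for (i in a)` visits keys is unspecified in awk, so the output lines may not appear in the order shown in the question. The following is a minimal sketch of an order-preserving variant of the same idea, not a replacement for the one-liner above. The names `input`, `result`, `max`, `order`, and `n` are illustrative assumptions, and the sample data is copied from the question.

```shell
# Sample data copied from the question; the temp file is only for illustration.
input=$(mktemp)
printf '%s\n' \
  '333379266       834640619       88' \
  '333379280       834640621       99' \
  '333379280       834640621       66' \
  '333376672       857526666       99' \
  '333376672       857526666       78' \
  '333376672       857526666       62' > "$input"

# Keep the running maximum per key and remember the order keys first appear in.
result=$(awk '
{
  k = $1 FS $2
  if (!(k in max)) {            # first time this key is seen
    order[++n] = k
    max[k] = $3
  } else if ($3 + 0 > max[k] + 0) {   # force a numeric comparison
    max[k] = $3
  }
}
END {
  for (j = 1; j <= n; j++) print order[j], max[order[j]]
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

With the sample data, this prints the three keys in first-seen order, each followed by its maximum (88, 99, 99). The `k in max` test also avoids relying on an unset element comparing as 0, which matters if the third column can be negative.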
走好不送
#3 · 2020-04-30 04:04

Could you please try the following.

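awk '
{
  # Key on the first two fields; keep the larger of the stored value and $3.
  array[$1,$2]=array[$1,$2]>$3?array[$1,$2]:$3
}
END{
  for(i in array){
    # The key is joined with SUBSEP, so split it back into its two fields.
    split(i,key,SUBSEP)
    print key[1],key[2],array[i]
  }
}
'  Input_file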

Issues with OP's code:

In your code, d[$1" "$2]=$3;if ($3>=d[$1" "$2]) assigns the current line's third field to array d before comparing against it, so the condition is always true. That is the major issue I can see in OP's attempt.
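As a small illustration of that point, the sketch below repeats the assign-then-compare step on two lines that share a key (values taken from the question's sample data) and prints the outcome of the comparison. The variable names `trace` and `k` are illustrative only.

```shell
# Two lines with the same key; the second value is smaller than the first.
trace=$(printf '%s\n' '333376672 857526666 99' '333376672 857526666 78' |
  awk '{
    k = $1 " " $2
    d[k] = $3          # assignment happens first, as in the original attempt
    # d[k] now equals $3, so the comparison cannot be false
    print $3, ($3 >= d[k] ? "true" : "false")
  }')

printf '%s\n' "$trace"
```

Both lines report "true", including the one with 78, even though 99 was seen earlier for the same key.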

OP's attempt fix: IMHO my solution above should be good, but here is a fix for OP's attempt. It compares $3 against the maximum stored so far in num rather than against the previous line's value, so array d is no longer needed.

awk '{k=$1" "$2; if (!(k in num) || $3>num[k]) num[k]=$3} END{for(i in num) print i,num[i]}'  Input_file