Finding the max value for a specific date with awk

I have a file with several rows, each containing data in the following format:

name 20150801|1 20150802|4  20150803|6  20150804|7  20150805|7  20150806|8  20150807|11532  20150808|12399  2015089|12619   20150810|12773  20150811|14182  20150812|27856  20150813|81789  20150814|41168  20150815|28982  20150816|24500  20150817|22534  20150818|3  20150819|4  20150820|47773  20150821|33168  20150822|53541  20150823|46371  20150824|34664  20150825|32249  20150826|29181  20150827|38550  20150828|28843  20150829|3  20150830|23543  20150831|6  

name2 20150801|1    20150802|4  20150803|6  20150804|7  20150805|7  20150806|8  20150807|11532  20150808|12399  2015089|12619   20150810|12773  20150811|14182  20150812|27856  20150813|81789  20150814|41168  20150815|28982  20150816|24500  20150817|22534  20150818|3  20150819|4  20150820|47773  20150821|33168  20150822|53541  20150823|46371  20150824|34664  20150825|32249  20150826|29181  20150827|38550  20150828|28843  20150829|3  20150830|23543  20150831|6

Each field is a date|value pair: the number after the pipe is the value for that date in the month. Every row has the same format and the same number of columns. The first column is a unique name for the row. Dates are in yyyymmdd format, e.g. 20150818 is 18 August 2015.

Given a specific date, how do I extract the name of the row that has the largest value on that day?

Tags: date awk date-format

3 Answers

Deceive 欺骗

#2 · 2019-03-07 04:41

You couldn't have taken 5 seconds to give your sample input different values? Anyway, this may work when run against input that actually has different values for the dates:

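$ cat tst.awk
# Split on pipes and whitespace: $1 is the name, then date,value pairs.
BEGIN { FS="[|[:space:]]+" }
# On the first line, find the field that holds the value for the target date.
FNR==1 {
    for (i=2;i<=NF;i+=2) {
        if ( $i==tgt ) {
            f = i+1
        }
    }
    max = $f
}
# Keep the name of the row with the largest value (the last row wins on ties).
$f >= max { max=$f; name=$1 }
END { print name }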

$ awk -v tgt=20150801 -f tst.awk file
name2


仙女界的扛把子

#3 · 2019-03-07 04:46

As a quick and dirty solution, we can do this with the following Unix commands:

yourdatafile=<yourdatafile>
yourdate=<yourdate>

cat $yourdatafile | sed 's/|/_/g' | awk -F "${yourdate}_" '{print $1" "$2}' | sed 's/[0-9]*_[0-9]*//g' | awk '{print $1" "$2}' |sort -k 2n | tail -n 1

With following sample data:

$ cat $yourdatafile
Alice 20150801|44 20150802|21  20150803|7  20150804|76  20150805|71
Bob 20150801|31 20150802|5 20150803|21 20150804|133 20150805|71

and yourdate=20150803 we get:

$ cat $yourdatafile | sed 's/|/_/g' | awk -F "${yourdate}_" '{print $1" "$2}' | sed 's/[0-9]*_[0-9]*//g' | awk '{print $1" "$2}' |sort -k 2n | tail -n 1
Bob 21

and for yourdate=20150802 we get:

$ cat $yourdatafile | sed 's/|/_/g' | awk -F "${yourdate}_" '{print $2" "$1}' | sed 's/[0-9]*_[0-9]*//g' | awk '{print $2" "$1}' | sort -k 2n | tail -n 1
Alice 21

The drawback is that only one line is printed even when the highest value for a day was achieved by more than one name, as can be seen with:

$ yourdate=20150805; cat $yourdatafile | sed 's/|/_/g' | awk -F "${yourdate}_" '{print $2" "$1}' | sed 's/[0-9]*_[0-9]*//g' | awk '{print $2" "$1}' | sort -k 2n | tail -n 1
Bob 71
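
If ties matter, one possible workaround is a single awk pass that collects every name sharing the maximum. This is only a sketch: max_names_for_date is a made-up helper name, and the temporary file simply reproduces the sample data above.

```shell
# Hypothetical helper: print every name that reaches the maximum on a date.
# $1 = date (yyyymmdd), $2 = data file
max_names_for_date() {
    awk -v date="$1" '
    {
        for (f = 2; f <= NF; f++) {
            split($f, a, "|")            # a[1] = date, a[2] = value
            if (a[1] != date) continue
            v = a[2] + 0                 # force a numeric value
            if (!seen || v > max) {      # new maximum: restart the list
                max = v; n = 0; seen = 1
            }
            if (v == max) names[++n] = $1
        }
    }
    END { for (i = 1; i <= n; i++) print names[i], max }
    ' "$2"
}

# Sample data from this answer, written to a temporary file.
datafile=$(mktemp)
printf '%s\n' \
    'Alice 20150801|44 20150802|21  20150803|7  20150804|76  20150805|71' \
    'Bob 20150801|31 20150802|5 20150803|21 20150804|133 20150805|71' > "$datafile"

max_names_for_date 20150805 "$datafile"
```

For 20150805 this should list both Alice and Bob with 71, one per line, instead of only Bob.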

I hope that helps anyway.


Melony?

#4 · 2019-03-07 04:55

I think you mean this:

awk -v date=20150823 '{for(f=2;f<=NF;f++){split($f,a,"|");if(a[1]==date&&a[2]>max){max=a[2];name=$1}}}END{print name,max}' YourFile

So, you pass the date you are looking for in as a variable called date. You then iterate through all fields on the line, and split the date and value of each into an array using | as separator - a[1] has the date, a[2] has the value. If the date matches and the value is greater than any previously seen maximum, save this as the new maximum and save the first field from this line for printing at the end.
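
For readability, the same logic can be laid out over several lines. This is a sketch rather than a different method: the function name max_for_date and the rowA/rowB sample rows are invented for illustration and are not from the question.

```shell
# Hypothetical wrapper around the one-liner, expanded with comments.
# $1 = date (yyyymmdd), $2 = data file
max_for_date() {
    awk -v date="$1" '
    {
        for (f = 2; f <= NF; f++) {       # field 1 is the row name, so start at 2
            split($f, a, "|")             # a[1] = date, a[2] = value
            if (a[1] == date && a[2] > max) {
                max = a[2]                # new maximum for this date
                name = $1                 # remember which row it came from
            }
        }
    }
    END { print name, max }
    ' "$2"
}

# Made-up sample rows, for illustration only.
sample=$(mktemp)
printf '%s\n' \
    'rowA 20150822|10 20150823|300 20150824|7' \
    'rowB 20150822|50 20150823|120 20150824|9' > "$sample"

max_for_date 20150823 "$sample"    # prints: rowA 300
```

Because the comparison is a strict >, the first row seen keeps the maximum when two rows tie, and values of zero or below never replace the initially empty max.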
